Wednesday, 27 November 2013

Sorting Data in a MySQL query before grouping

In some cases you have the problem that you need to sort your data before grouping it in a single SQL query. Normally it's impossible to do this in a single SQL SELECT statement, because the result is first grouped, then sorted.

Given a database table which contain images in different resolutions for products like this:

+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| id         | int(11)      | NO   | PRI | NULL    | auto_increment |
| article_id | int(11)      | YES  |     | NULL    |                |
| width      | int(11)      | YES  |     | NULL    |                |
| height     | int(11)      | YES  |     | NULL    |                |
| filename   | varchar(255) | YES  |     | NULL    |                |
+------------+--------------+------+-----+---------+----------------+

Imagine to select only the biggest image for a couple of article ids you could try something like this:

SELECT article_id, width, height, filename
FROM article_images
WHERE article_id IN(23, 42)
GROUP by article_id
ORDER by width*height

But you'll only get the first entry for a given article_id, not for the biggest resolution.

So let's try another approach:

SELECT article_id, width, height, filename
FROM (SELECT article_id, width, height, filename
      FROM article_images
      WHERE article_id IN(23, 42)
      ORDER by width*height
) as ghoti
GROUP by article_id

This query first selects all images in all resolutions for the given article_ids and sorts it by resolution, copying it into a temporary table named ghoti. The query around this subselect then groups the data by article_id. Because we have sorted by resolution before it results in the biggest image for a given article_id.

ghoti is our name used internally for that kind of SELECT, because we need it sometimes to transfer less data to our PHP scripts and crunch data on the database server. What ghoti means is described in this wikipedia article.

Friday, 22 November 2013

Splunk Data Analytics

We're using Splunk> Enterprise for about 3 month now and our conclusion is: It's one of the best decisions for our data analytics and processing we could have made.

Our previous process for logging and analysing data was to store data in a custom mysql table created for that specific logging purpose and reading it with some PHP scripts and pass it to the google charting library on a custom created page.

Every new analysis took us some hours to implement which reduced our willingness to log anything to nearly zero.

With Splunk> the logging just went from "to complicated, won't implement" to "what could we log next?"

We've crafted a logging class which can be used in our current store as easy as it could get:

Log::info('fun', 'woot', array(
    'monkey' => $amountOfMonkeys
));


This results in a key value log event like specified in the Splunk> logging best practice guide  like this:

2013-11-22T10:22:18+00:00 mod=fun evt=woot monkey=13

Now its really easy to do some analytics in splunk with some easy search queries.


The big advantage is, that we can specify our log format by ourselves and don't have to rely on the log format of third party tools. But even with some custom log events you can extract data with the built-in field extractions using regex and start analyzing your data within minutes.

Thanks Splunk!

Thursday, 21 November 2013

The definition of rock 'n roll: when your ads get rejected by youporn.

We're about to put some suggestive ads on several porn sites, but somehow YouPorn rejected our "GILP - guitars I'd like to play" ad. Probably too hardcore for them or something.


Monday, 18 November 2013

Multiple Monitors on Linux are quite complicated

As developers we're using a third display for all our important stuff - or at least my colleagues who are using windows... :(

I tried several times to get my third display up and running on a Ubuntu 12.04 using an Nvidia Quadro NVS 450 graphics card with 4 display ports - and it looks like there's the problem.

The nvidia graphics card consists of two GPUs with each 2 display ports which can be joined with some tricks.

First Try:


First i fired up nvidia-settings and configure my third display to be a seperate X server. Nvidia supports the mode "TwinView" as you suggest for only two monitors. TwinView provides a layer above the first two displays to behave like a single one to my window manager Gnome.



After storing these changes to my Xorg.conf and rebooting the machine i ended up with a gray background on the third display - which looks like a bug in nautilus. Furthermore i can't move the application windows to the seperate X screen by drag and drop.

But even worse:  After disabling the third screen i ended up with additional menus on the top and bottom of my main screen every time I login.

Which looks like this:


The only solution for me was to delete all my gnome config settings in .gnome2 and .config/gnome-* and set all my configurations (shortcurts, etc..) again.

Second Try:


Another possibility for this setup is to use "Xinerama" on top of TwinView and the seperate X-Server. Xinerama is a layer on top of this configuration with groups all displays to one big as TwinView would do... but actually it behaves different

If i click on fullscreen on a single application, my window is stretched across all displays like this:


Nevertheless the performance of Xinerama on high display resolutions is quite bad. You can see flickering by scrolling in your browser. For me that's a no-go

Third Try:


My for-now last try was to use different window managers like XFCE and awesome (which is my favorite). In XFCE i've experienced quite the same issues like in gnome and awesome behaves totally different as my current workflow is.

awesome has several workspaces per screen, on my current setup the workspace is shared across all screens which has several advantages like split them into duties (coding, communication (skype, mail...)).

Conclusion:


I gave up for now and currently working on two screens, accepting the fun comments from my windows-using colleagues. If you have a solution for this, i would really appreciate if you could share it with me. It totally drives me crazy ;)


PS: Switching to windows is not an option