Thomann /dev/blog: 2013

Thursday, 12 December 2013

Looking into the high resolution timer of node.js

The Node.js core includes a high resolution timer. It is a method of the global process object called hrtime.
When the method is called, an array is returned:

$ node
> process.hrtime();
[ 31013, 815378921 ]

Well, that does not look too useful.
The official documentation states:

It is relative to an arbitrary time in the past. It is not related to the time of day and therefore not subject to clock drift. The primary use is for measuring performance between intervals.

So measuring the time between two events is the actual purpose. The first event is recorded via a call to process.hrtime():

var start = process.hrtime();

Then when a certain operation has been finished (using setTimeout to simulate this), process.hrtime is called again with the start marker as a parameter:

setTimeout(function() {
  var elapsed = process.hrtime(start);
  console.log(elapsed);
}, 1000);

Now the output looks like this:

[ 1, 14933877 ]

The first element in the array is the elapsed time in whole seconds, the second element is the remaining time in nanoseconds.
So to get an actually useful value out of this array we have to do a little calculation:

var timeInMilliseconds = elapsed[0] * 1e3 + elapsed[1] / 1e6;

Now the result is something like 1014.933877 (for the array shown above).
Success!

The complete code example:

var start = process.hrtime();

setTimeout(function() {
    var elapsed = process.hrtime(start);
    var timeInMilliseconds = elapsed[0] * 1e3 + elapsed[1] / 1e6;

    console.log(timeInMilliseconds);
}, 1000);
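
The same calculation can be wrapped in a small helper. This is only a sketch: the names hrtimeToMs and timeSync are made up for illustration and are not part of Node.js.

```javascript
// Convert a [seconds, nanoseconds] array, as returned by process.hrtime(),
// into milliseconds.
function hrtimeToMs(tuple) {
    return tuple[0] * 1e3 + tuple[1] / 1e6;
}

// Hypothetical helper: run a synchronous function and report how long it took.
function timeSync(fn) {
    var start = process.hrtime();
    var result = fn();
    return { result: result, ms: hrtimeToMs(process.hrtime(start)) };
}

var run = timeSync(function() {
    var sum = 0;
    for (var i = 0; i < 1e6; i++) {
        sum += i;
    }
    return sum;
});

console.log(run.result, run.ms.toFixed(3) + ' ms');
```

The measured time will differ from run to run, so only the conversion itself is deterministic.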

If you don't want to repeat this calculation over and over again, there is a high resolution timer module on npm: https://npmjs.org/package/hirestime

Install via:

npm install hirestime

Invocation:

var hirestime = require('hirestime');

//startpoint of the time measurement
var getElapsed = hirestime();

setTimeout(function() {
    //returns the elapsed milliseconds
    console.log(getElapsed());
}, 1000);

Optionally a time unit can be passed:

var hirestime = require('hirestime');

//startpoint of the time measurement
var getElapsed = hirestime();

setTimeout(function() {
    //returns the elapsed seconds
    console.log(getElapsed(hirestime.S));
}, 1000);

Possible time units are:

hirestime.S the elapsed time in seconds
hirestime.NS the elapsed time in nanoseconds
hirestime.MS the elapsed time in milliseconds

The time unit defaults to milliseconds.
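
To illustrate the idea behind such a module, here is a rough sketch of a timer with selectable units built on process.hrtime. It is not the source of hirestime; startTimer and UNITS are hypothetical names chosen for this example.

```javascript
// Hypothetical unit constants, only for this sketch.
var UNITS = { S: 's', MS: 'ms', NS: 'ns' };

// Remember the start point and return a function that reports the elapsed time.
function startTimer() {
    var start = process.hrtime();

    return function getElapsed(unit) {
        var elapsed = process.hrtime(start);

        switch (unit) {
            case UNITS.S:
                return elapsed[0] + elapsed[1] / 1e9;
            case UNITS.NS:
                return elapsed[0] * 1e9 + elapsed[1];
            default:
                // milliseconds are the default unit
                return elapsed[0] * 1e3 + elapsed[1] / 1e6;
        }
    };
}

var getElapsedSketch = startTimer();
console.log(getElapsedSketch(), getElapsedSketch(UNITS.S), getElapsedSketch(UNITS.NS));
```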

Wednesday, 27 November 2013

Sorting Data in a MySQL query before grouping

In some cases you need to sort your data before grouping it in a single SQL query. Normally this is impossible in a single SQL SELECT statement, because the result is first grouped, then sorted.

Given a database table that contains product images in different resolutions, like this:

+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| id         | int(11)      | NO   | PRI | NULL    | auto_increment |
| article_id | int(11)      | YES  |     | NULL    |                |
| width      | int(11)      | YES  |     | NULL    |                |
| height     | int(11)      | YES  |     | NULL    |                |
| filename   | varchar(255) | YES  |     | NULL    |                |
+------------+--------------+------+-----+---------+----------------+

To select only the biggest image for a couple of article ids, you could try something like this:

SELECT article_id, width, height, filename
FROM article_images
WHERE article_id IN(23, 42)
GROUP BY article_id
ORDER BY width*height DESC

But you'll only get the first entry for a given article_id, not the one with the biggest resolution.

So let's try another approach:

SELECT article_id, width, height, filename
FROM (SELECT article_id, width, height, filename
      FROM article_images
      WHERE article_id IN(23, 42)
      ORDER BY width*height DESC
) AS ghoti
GROUP BY article_id

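This query first selects all images in all resolutions for the given article_ids and sorts them by resolution, putting the result into a temporary table named ghoti. The query around this subselect then groups the data by article_id. Because the rows were sorted by resolution beforehand, the result is the biggest image for each article_id. Note that this relies on which row MySQL picks for the non-aggregated columns of a group, which the server does not guarantee, so it is worth verifying on your MySQL version.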
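
To make the intended result concrete, here is the same "biggest image per article" selection as a small JavaScript sketch over in-memory rows. The sample rows and the function name biggestPerArticle are made up; only the column names come from the table above.

```javascript
// Made-up sample rows shaped like the article_images table.
var rows = [
    { article_id: 23, width: 100, height: 100, filename: 'a-small.jpg' },
    { article_id: 23, width: 800, height: 600, filename: 'a-large.jpg' },
    { article_id: 42, width: 640, height: 480, filename: 'b-medium.jpg' },
    { article_id: 42, width: 320, height: 240, filename: 'b-small.jpg' }
];

// Keep, per article_id, the row with the largest width*height.
function biggestPerArticle(rows) {
    var best = {};

    rows.forEach(function(row) {
        var current = best[row.article_id];
        if (!current || row.width * row.height > current.width * current.height) {
            best[row.article_id] = row;
        }
    });

    return Object.keys(best).map(function(id) {
        return best[id];
    });
}

var biggest = biggestPerArticle(rows);
console.log(biggest);
```

This is the result the SQL query is expected to produce, one row per article_id.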

ghoti is the name we use internally for this kind of SELECT. We need it now and then to crunch data on the database server and transfer less of it to our PHP scripts. What ghoti means is described in this Wikipedia article.

Friday, 22 November 2013

Splunk Data Analytics

We've been using Splunk> Enterprise for about 3 months now and our conclusion is: it's one of the best decisions we could have made for our data analytics and processing.

Our previous process for logging and analysing data was to store it in a custom MySQL table created for that specific logging purpose, read it with some PHP scripts and pass it to the Google charting library on a custom-built page.

Every new analysis took us some hours to implement, which reduced our willingness to log anything to nearly zero.

With Splunk> the logging just went from "too complicated, won't implement" to "what could we log next?"

We've crafted a logging class which is as easy to use in our current store as it could get:

Log::info('fun', 'woot', array(
    'monkey' => $amountOfMonkeys
));

This results in a key-value log event as specified in the Splunk> logging best practice guide:

2013-11-22T10:22:18+00:00 mod=fun evt=woot monkey=13
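
How such a line can be assembled is sketched below in JavaScript. This is not our PHP logging class; formatLogEvent is a hypothetical function, and quoting values that contain whitespace is an assumption made so that each value stays a single field.

```javascript
// Hypothetical formatter; the field names mod and evt follow the example line above.
function formatLogEvent(mod, evt, fields, date) {
    // ISO 8601 timestamp in UTC, written with an explicit +00:00 offset.
    var timestamp = date.toISOString().replace(/\.\d{3}Z$/, '+00:00');
    var pairs = ['mod=' + mod, 'evt=' + evt];

    Object.keys(fields).forEach(function(key) {
        var value = String(fields[key]);
        // Assumption: quote values containing whitespace so they stay one field.
        if (/\s/.test(value)) {
            value = '"' + value.replace(/"/g, '\\"') + '"';
        }
        pairs.push(key + '=' + value);
    });

    return timestamp + ' ' + pairs.join(' ');
}

var line = formatLogEvent('fun', 'woot', { monkey: 13 },
    new Date(Date.UTC(2013, 10, 22, 10, 22, 18)));

console.log(line);
```

Called with the values from the example, this reproduces the log line shown above.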

Now it's really easy to do some analytics in Splunk> with a few simple search queries.

The big advantage is that we can specify our log format ourselves and don't have to rely on the log format of third party tools. But even with some custom log events you can extract data with the built-in field extractions using regex and start analyzing your data within minutes.

Thanks Splunk!

Thursday, 21 November 2013

The definition of rock 'n roll: when your ads get rejected by youporn.

We're about to put some suggestive ads on several porn sites, but somehow YouPorn rejected our "GILP - guitars I'd like to play" ad. Probably too hardcore for them or something.

Monday, 18 November 2013

Multiple Monitors on Linux are quite complicated

As developers we're using a third display for all our important stuff - or at least my colleagues who are using Windows are... :(

I tried several times to get my third display up and running on Ubuntu 12.04 using an Nvidia Quadro NVS 450 graphics card with 4 display ports - and it looks like that's where the problem is.

The Nvidia graphics card consists of two GPUs with 2 display ports each, which can be joined with some tricks.

First Try:

First I fired up nvidia-settings and configured my third display to be a separate X server. Nvidia supports the "TwinView" mode, which, as the name suggests, covers only two monitors. TwinView provides a layer above the first two displays so that they behave like a single one to my window manager Gnome.

After storing these changes in my Xorg.conf and rebooting the machine I ended up with a gray background on the third display - which looks like a bug in nautilus. Furthermore, I can't move application windows to the separate X screen by drag and drop.

But even worse: after disabling the third screen I ended up with additional menus at the top and bottom of my main screen every time I log in.

Which looks like this:

source: askubuntu.com/177226

The only solution for me was to delete all my Gnome config settings in .gnome2 and .config/gnome-* and set up all my configurations (shortcuts, etc.) again.

Second Try:

Another possibility for this setup is to use "Xinerama" on top of TwinView and the separate X server. Xinerama is a layer on top of this configuration which groups all displays into one big display, as TwinView would do... but it actually behaves differently.

If I click on fullscreen in a single application, my window is stretched across all displays like this:

On top of that, the performance of Xinerama at high display resolutions is quite bad. You can see flickering when scrolling in your browser. For me that's a no-go.

Third Try:

My last try for now was to use different window managers like XFCE and awesome (which is my favorite). In XFCE I experienced pretty much the same issues as in Gnome, and awesome behaves totally differently from my current workflow.

awesome has several workspaces per screen. In my current setup a workspace is shared across all screens, which has several advantages, like splitting the screens by duty (coding, communication (skype, mail...)).

Conclusion:

I gave up for now and am currently working on two screens, accepting the fun comments from my Windows-using colleagues. If you have a solution for this, I would really appreciate it if you could share it with me. It totally drives me crazy ;)

PS: Switching to Windows is not an option

Thursday, 31 October 2013

Finding good employees is hard

We’ve been in a very busy hiring phase lately, looking for someone to work in our web development department. We got a lot of applications, a few good guys and many really weird ones. My favorite is the one who claims:

"…by learning several programming languages like HTML and CSS it is possible for me to work on a large spectrum of challenges and to fulfill them."

Another one was a constructor of metal planes who would like to work on our website. Well, what about building our own plane? Maybe that could speed up our deliveries overseas… The third candidate was an artist (painter) who had worked as an electrician, with CNC machines, then as a painter and a technical drawer, studied architecture, worked at a gas station and in fact also did vocational training as an IT guy.

We’re still hiring. If you know why HTML and CSS are not programming languages, actually have a web background and ideally love to play and listen to music, we want to talk to you! PS: Bring toilet paper.

Wednesday, 30 October 2013

how to kill internet explorer by using css

When we started our re-design process earlier in March 2013, we decided to change a few wordings on our webpage (e.g. changing "Thomann Cyberstore" to "Musikhaus Thomann" n stuff). To bypass our quite complicated deploy process for different language files, some developers in our team came up with the great idea of transforming the first letter of buttons to uppercase using CSS (which was a behaviour requested by our UX designer). Later that day customers reported crashes of their browser (Internet Explorer, obviously), so we started investigating.

To cut a long story short:

Using a CSS selector with the pseudo-element :first-letter and assigning the text-transform: uppercase; property really isn't a good idea if you care about an at least somewhat usable site for IE users (you can see an explanation/demo here). It somehow crashes all the lonely and unloved Internet Explorers up to and including version 9.

Fun fact: IE10 was the only version the responsible developer tested before going live. Success!

We also reported this bug to Microsoft, but I guess they're right; it has a limited impact, because transforming a first letter to uppercase by using CSS is simply a bad idea. Period.

memcached key limitations

We had observed weird cache behaviour on our website, resulting in old or uncached page modules. Our page consists of different parts which are cached separately, and it looked like some of them couldn’t write into our memcached cluster. This problem was not reproducible for a single page module, but appeared randomly across the whole page.

Our default behaviour for cache writing is to disable the whole cache for the rest of the page if something goes wrong.

Memcached has some constraints about its key and the data saved inside.
Some of them are:

Keys are limited to max. 250 characters
Serialized data can’t exceed 1 MB

After adding more and more debug messages, which produced gigabytes of log files, we found another relevant memcached key constraint that was new to us:

You can’t use spaces in your cache key

That’s even specified in the memcached protocol document.

But what’s not specified in the document is that memcached keys can only contain ASCII characters.

After fixing these issues our cache hit rate went up again.
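
One way to guard against all three key constraints is to validate a key before using it and to fall back to a hash when it does not pass. The following JavaScript sketch is not our production code; safeCacheKey and the h_ prefix are made up for illustration, and restricting keys to printable ASCII is our reading of the constraints listed above.

```javascript
var crypto = require('crypto');

// Printable ASCII without spaces, 1 to 250 characters.
var VALID_KEY = /^[\x21-\x7e]{1,250}$/;

// Return the name unchanged if it is a valid key, otherwise a hashed replacement.
function safeCacheKey(name) {
    if (VALID_KEY.test(name)) {
        return name;
    }
    // Fall back to a fixed-length ASCII digest of the original name.
    return 'h_' + crypto.createHash('sha1').update(name, 'utf8').digest('hex');
}

console.log(safeCacheKey('product_42'));
console.log(safeCacheKey('page module größe'));
```

The hashed key is no longer human readable, so it can help to log the original name next to it while debugging.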