Archive for the ‘browser’ Category.

WebKit Page Cache

Performance on the web has always been a hot topic to talk about specially when you talk about page load, but you don’t see as often people talking about perceived load which is related to how fast the interface responds to the user interaction, in this area WebKit based browsers like Safari and Chrome from my point of view have always been ahead.

In the article WebKit Page Cache I – The Basics Brady Eidson talks about the Page Cache feature on WebKit, as he stated it’s not about caching in the HTTP sense, storing raw files on disk or even caching resources in memory. “More simply, when you leave the page we pause it and when you come back we press play.”, this totally affects the perceived performance, when you hit the back button, even if you have files cached locally the browser will still have to load it again, run all your scripts, recreate the DOM and so on, so basically “When the Page Cache works it makes clicking the back button almost instantaneous.”

The last time they updated Page Cache was back in ‘02 the web has evolved a lot since then and it’s not easy to keep up to date, you have to take in considetration HTTPS, plugins, frames, page events, although some of these issues have been fixed already like the frames problem, it still a long way to go, anyway it’s great to see that they are working on it.

String searching algorithms in JavaScript engines

I’ve just finished chapter 7: Writing Efficient JavaScript by Nicholas Zakas on Steve Souders‘ new book, Even Faster Web Sites, where he presents several string optimization techniques to improve JavaScript performance and wondered which algorithm does String.indexOf method implements on JavaScript engines (aka ECMAScript engines).
A few months ago I’ve asked this question to Yahoo! fellow Douglas Crockford and he said the ECMAScript standard does not require a specific algorithm, so it could vary with each browser. You can check that on section 15.5.4.7 of Standard ECMA-262. I decided then to download the most popular open-source JavaScript engines source codes and found mainly 3 algorithms:

  • Naïve: simple and least inefficient way to search in strings, it basically checks every position in the haystack then every position of the needle. The advantage of using this algorithm is that it needs no initial overhead such as auxiliary tables creation. It has O(mn) complexity, where m is the needle length and n the haystack length.
  • Boyer-Moore: Efficient searching string algorithm that preprocesses the needle and doesn’t check every position in the haystack but rather skips some of them on each unsuccessful attempt. It has O(n) complexity with O(3n) in the worst case.
  • Boyer-Moore-Horspool: It’s a simplification of Boyer-Moore’s algorithm with less overhear during initial needle preprocessing. It also has O(n) complexity but O(mn) in the worst case.

Algorithms by engines

The String.indexOf algorithms by JavaScript engines follows:

JavaScript Engine Layout Engine Browsers String.indexOf algorithm
SpiderMonkey Gecko Firefox up to 3.0.* Boyer-Moore-Horspool
TraceMonkey Gecko Firefox from 3.1.* Boyer-Moore-Horspool
KJS KHTML Konqueror Naïve
JavaScriptCore WebKit Safari up to 3.* Naïve
SquirrelFish WebKit Safari from 4.* Naïve
JScript Trident Internet Explorer ?
V8 WebKit Chrome Strategy: Naïve, Boyer-Moore-Horspool and Boyer-Moore
Linear B Presto Opera 7.0 - 9.50[ ?
Futhark Presto Core 2 Opera from 9.50 ?
Rhino - - java.lang.String.indexOf

SpiderMonkey

Implements the String.indexOf in C with some verifications in string lengths prior to run BMH algorithm in order to avoid long searching for relatively small strings.

Source code available at: ftp://ftp.mozilla.org/pub/mozilla.org/firefox/releases/3.0.13/source/firefox-3.0.13-source.tar.bz2

TraceMonkey

It has exactly the same String.indexOf implementation as SpiderMonkey but in C++.

Source code available at: ftp://ftp.mozilla.org/pub/mozilla.org/firefox/releases/3.5.2/source/firefox-3.5.2-source.tar.bz2

KJS

The main part of the naïve implementation of indexOf follows*:

/* ... */
for (const UChar* c = data_ + pos; c <= end; c++)
    if (c->uc == fchar && !memcmp(c + 1, fdata, fsizeminusone))
        return (c - data_);
 
return -1;

* taken from KDE 4.0 API reference

Files related to String.indexOf method:

  • string_object.cpp: defines the prototype for String object where indexOf is in a switch case statement and calls find function.
  • ustring.cpp: defines the find function where the naïve algorithm is implemented.

Browse the source code online: http://api.kde.org/4.0-api/kdelibs-apidocs/kjs/html/files.html

JavaScriptCore & SquirrelFish

These engines are known as JavaScriptCore in WebKit Project and was originally derived from KJS, hence still shares the same algorithm for String.indexOf.

Files related to String.indexOf method:

  • root/trunk/JavaScriptCore/runtime/StringPrototype.cpp: this is where indexOf method is defined and call find function
  • root/trunk/JavaScriptCore/runtime/UString.cpp: look for find function

Source code available at: http://webkit.org/building/checkout.html
Browse the source code online: http://trac.webkit.org/browser

V8

A very smart strategy is applied to the string searching in order to choose the best algorithm based on the length of the needle:

  • First of all it checks if there is a non-ASCII needle for an ASCII haystack and bail out if there is.
  • Checks if the needle length is less than 5 then uses a naïve solution called simpleIndexOf, because the max shift of Boyer-Moore on such needle length doesn’t compensate for the overhead. This simpleIndexOf function never bails out, it means that the needle will be checked for a match in the whole haystack.
  • If the needle length is greater than or equals to 5 another simpleIndexOf function is called. This one considers how much work have been done (unsuccessful matches) in order to stop trying and switch for a better algorithm. This is called the “badness count” which once reached the max, stop the search and returns the index in the haystack where the next algorithm should continue from.
  • The next algorithm in the strategy chain is Boyer-Moore-Horspool which also consider the badness count prior to jump to the next algorithm.
  • The last one is Boyer-Moore which has some initial overhead when creating good and bad shift tables.

Source code available at: http://code.google.com/p/v8/wiki/Source?tm=4

Rhino

Rhino runs on top of Java Virtual Machine and uses the java.lang.String.indexOf from Java language implemented for such JVM. Interestingly there is a comment saying:

“Uses Java String.indexOf(). OPT to add - BMH searching from jsstr.c”.

Where jsstr.c is the file for SpiderMonkey JavaScript String implementation. Implementing such algorithm in Java might degrade the search performance, unless the java.lang.String.indexOf implementation is much worse than that.

Source code available at: http://www.mozilla.org/rhino/download.html

Other engines

What about Internet Explorer, Opera and other browsers JavaScript engines? As they aren’t open-source projects I could not check their codes out. :-(

Benchmark

By running a simple test across some browsers we can have an idea how fast/slow is String.indexOf on some JavaScript engines although this doesn’t necessarily mean an algorithm is better than another because the performance of the engine itself might affect the outcome.
The test consists of the average of a 100 times running a search for the word “foobar” in the middle of a ~1200 length “ipsum lorem” text iterating 1 million times each search. Try it yourself.

The results in the follow table were taken by running this test on the same machine (Pentium 4HT, 3GHz, 1Gb RAM, Windows XP SP3).

JavaScript Engine Browser Version Average (ms)
V8 Chrome 2.0.172.39 827.66
SpiderMonkey Firefox 3.0.13 947.97
TraceMonkey Firefox 3.5.2 1169.25
SquirrelFish Safari 4.0.2 1207.02
KJS* Konqueror 4.2.2 1361.59
SpiderMonkey Firefox 2.0.0.20 1456.57
Futhark Opera 10.00 beta 2 1549.06
Futhark Opera 9.64 1613.02
JScript** Internet Explorer 8.0 3101.23
Rhino*** - - 4103.64
JScript Internet Explorer 6.0 4479.82
JScript Internet Explorer 7.0 4515.08

* running on the same machine with Ubuntu 9.04 live cd
** running on a VM on a different computer
*** running on Sun JRE 6 - 1.6.0_14

Again, these results don’t prove which algorithm is the best due to different browser performances, however it is worth noting that Firefox 3.0.13 performed better than Firefox 3.5.2 on this benchmark. Internet Explorer had the worst results, it can be either the algorithm or the browser performance itself or even both. :-)

Cross-browser event listener with design patterns

A lot of the power of JavaScript comes from the ability to be an event-driven language. Dealing with event listeners is not a simple task considering different implementations by Internet Explorer and the rest of the world. After some searching on internet you have probably found some cross-browser solutions, although they are not always as efficient as it possibly could.
I’ll present some of them and explain their design pros and cons. All them do exactly the same, attach actions to events, none of them does browser sniffing to determine which event attachment technique to use, they use feature detection instead. For clarity sake, browser sniffing is a common technique on which an application tries to find out the user’s browser, usually by examining the useragent string looking for vendor/version and is a common source of detection problems. On the other hand, feature detection is a much safer technique where the application checks if a certain feature (function, property) exists (is implemented) in the browser and once this feature is defined, use it, otherwise another feature is checked as a fallback.
Some design patterns presented here are well covered in Ross Harmes & Dustin Diaz’s Pro JavaScript Design Patterns book.

facade

Very intuitive and also popular when quick searching on internet. Here the facade pattern simplifies the interface for attaching events by providing a common method to be used on your code which does the task of detecting features in order to provide a safe cross-browser implementation.

1
2
3
4
5
6
7
8
9
var addEvent = function (el, ev, fn) {
    if (el.addEventListener) {
        el.addEventListener(ev, fn, false);
    } else if (el.attachEvent) {
        el.attachEvent('on' + ev, fn);
    } else {
        el['on' + ev] = fn;
    }
};

pros: easy and clear to understand, small.
cons: inefficient when used consecutive times, every time it’s called a new checking is made in order to determine which feature is available for attaching event listeners.

memoization version a

Function rewriting is a powerful feature on JavaScript, it allows a function to be redefined by itself. When a function is invoked it can perform some tasks and before it returns, it can be redefined. Here memoization happens on lines 4, 9 and 14 depending on which feature is available in the host environment (browser).
During the feature detection on this version a of memoization, the appropriate event attacher is first invoked then the function is rewritten.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
var addEvent = function (el, ev, fn) {
    if (el.addEventListener) {
        el.addEventListener(ev, fn, false);
        addEvent = function (el, ev, fn) {
            el.addEventListener(ev, fn, false);
        };
    } else if (el.attachEvent) {
        el.attachEvent('on' + ev, fn);
        addEvent = function (el, ev, fn) {
            el.attachEvent('on' + ev, fn);
        };
    } else {
        el['on' + ev] = fn;
        addEvent = function (el, ev, fn) {
            el['on' + ev] =  fn;
        };
    }
};

pros: very efficient, feature detection is made only once as the addEvent function is rewritten with the appropriate event listener attacher function.
cons: cost of the event listener function invocation prior to function rewriting, cost of function rewriting.

memoization version b

This technique is pretty similar to the one above except that during the feature detection the function is first rewritten with the appropriate event attacher then it is invoked only in the first execution.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
var addEvent = function (el, ev, fn) {
    if (el.addEventListener) {
        addEvent = function (el, ev, fn) {
            el.addEventListener(ev, fn, false);
        };
    } else if (el.attachEvent) {
        addEvent = function (el, ev, fn) {
            el.attachEvent('on' + ev, fn);
        };
    } else {
        addEvent = function (el, ev, fn) {
            el['on' + ev] =  fn;
        };
    }
    addEvent(el, ev, fn);
};

pros: as efficient as version a, feature detection is made only once too.
cons: cost of function rewriting first then cost of first invocation.

closure

A closure is a protected variable space, created by using nested functions.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
var addEvent = (function () {
    if (window.addEventListener) {
        return function (el, ev, fn) {
            el.addEventListener(ev, fn, false);
        };
    } else if (window.attachEvent) {
        return function (el, ev, fn) {
            el.attachEvent('on' + ev, fn);
        };
    } else {
        return function (el, ev, fn) {
            el['on' + ev] =  fn;
        };
    }
}());

pros: detection is made as soon as the script where this function resides in is executed (note parenthesis on line 15), hence the detection is made only once, returning the appropriate event listener attacher to be used on new calls, useful when using it consecutively.
cons: initial execution cost as it is executed right after loading, but considering its size and complexity can barely be noticed. Can be a bit obscure for beginners unfamiliar with closures. For those I suggest reading Douglas Crockford’s Private Members in JavaScript. Closure consumes more memory because it carries with them the containing function’s scope.

closure + singleton + lazy loading

This technique makes use of 3 patterns:

  • closure is defined by nested functions (lines 1 and 4) where the inner function (line 4) has access to its outer function variable setListener (line 2), it is invoked on line 22.
  • singleton is defined by using a single variable (line 2) that defines a function on lines 7, 11 and 15. It is checked on line 5 and always invoked on line 20.
  • lazy loading is used here for assigning the singleton variable when demanded as opposed to immediately after singleton definition.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
var addEvent = (function () {
    var setListener;
 
    return function (el, ev, fn) {
        if (!setListener) {
            if (el.addEventListener) {
                setListener = function (el, ev, fn) {
                    el.addEventListener(ev, fn, false);
                };
            } else if (el.attachEvent) {
                setListener = function (el, ev, fn) {
                    el.attachEvent('on' + ev, fn);
                };
            } else {
                setListener = function (el, ev, fn) {
                    el['on' + ev] =  fn;
                };
            }
        }
        setListener(el, ev, fn);
    };
}());

pros: no time consuming when script is loaded, first execution (closure) is very fast as it only returns a function, event listener attacher is defined only once on demand.
cons: it can be a bit obscure by mixing up 3 different techniques, after the first invocation every subsequent invocation has a verification on line 5 which will always return false however this is very subtle and also very cheap in JavaScript. All cons listed on closure pattern above.

closure + memoization version a

Here, closure takes part by returning an inner function that has access to its outer function setListener on line 2, it immediately invoked on line 24 returning a function that invokes setListener function. Within setListener function definition (lines 2-19) the memoization takes part inside every if-else by rewriting itself with the appropriate event attacher. In this technique the appropriate event attacher is first invoked during feature detection then the setListener function is rewritten.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
var addEvent = (function () {
    var setListener = function (el, ev, fn) {
        if (el.addEventListener) {
            el.addEventListener(ev, fn, false);
            setListener = function (el, ev, fn) {
                el.addEventListener(ev, fn, false);
            };
        } else if (el.attachEvent) {
            el.attachEvent('on' + ev, fn);
            setListener = function (el, ev, fn) {
                el.attachEvent('on' + ev, fn);
            };
        } else {
            el['on' + ev] =  fn;
            setListener = function (el, ev, fn) {
                el['on' + ev] =  fn;
            };
        }
    };
 
    return function (el, ev, fn) {
        setListener(el, ev, fn);
    };
}());

pros: very efficient when used consecutively, pretty elegant code (you can use to impress your boss)
cons: it can also be a bit obscure, specially for those not familiar to JavaScript patterns. All cons listed on closure pattern above.

closure + memoization version b

This technique is pretty much the same as above the only difference is that during feature detection which happens only once, the setListener function is first rewritten (lines 4,8,12) then it’s invoked on line 16, only during the first call.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
var addEvent = (function () {
    var setListener = function (el, ev, fn) {
        if (el.addEventListener) {
            setListener = function (el, ev, fn) {
                el.addEventListener(ev, fn, false);
            };
        } else if (el.attachEvent) {
            setListener = function (el, ev, fn) {
                el.attachEvent('on' + ev, fn);
            };
        } else {
            setListener = function (el, ev, fn) {
                el['on' + ev] =  fn;
            };
        }
        setListener(el, ev, fn);
    };
 
    return function (el, ev, fn) {
        setListener(el, ev, fn);
    };
}());

pros: same pros on version a above
cons: same cons on version a above

conclusion

Very different techniques were presented in this post, all them have their pros and cons, it’s up to the developer to choose which one to use, however the facade one is not indicated for performance issues but it can be useful to explain/introduce this design patterns for beginners.
The examples above uses the most common JavaScript design patterns and can be combined in your applications for different purposes rather then event attacher, chosen for being a real world situation where developers face everyday.

What’s going on with Firebug?

[UPDATE - july 2nd]

Since my post yesterday, the firebug developers have released another version of the tool (1.4.0b4) it seems more stable, great job!

————

When Mozilla launched Firefox 3.5 a couple days ago, I was forced to upgrade my FB (firebug) to the new 1.4 beta version, although the new FB has been greatly improved it still has a lot bugs as you’d expect on a beta version, but that’s not the bad part, the new FF3.5 is really fast and has really cool new features but FB is making the browser choke and freeze all the time, it’s no news that FB has become more and more slow and buggy each version.

I don’t mean to sound harsh and I have always been a fan of FB, it was the one tool that brought us from the dark ages of alert() debugging to a more professional debugging, but something has to be done urgently, rightnow you can’t even think about debugging heavy web apps.

So, what could be done? Maybe Mozilla could adopt it as part of the browser, I know it would probably need major refactoring, but as far as I can tell built-in tools are much faster, like Webkit’s Web Inspector (Safari only, the Chrome implementation has a lot of bug fixing to do) and also Opera’s Dragonfly, as for IE… well you know, not worth mentioning ;) Or maybe the FB developers could focus more on performance than adding new features.

I know I could switch to Safari, Chrome or even Opera, but the point is FF really deserves a great developer tool like FB was a while ago :)

Limitation on call stacks

Fellow Yahoo! Nicholas Zakas has blogged about some browsers limitations and recently ran some tests in order to check browsers call stack sizes.

Not surprisingly, different browsers have different call stack sizes. Also not surprisingly, the method that they use to determine the call stack varies as well. The various call stack sizes I could measure are (give or take, might be off by 1 or 2):

  • Internet Explorer 7: 1,789
  • Firefox 3: 3,000
  • Chrome 1: 21,837
  • Opera 9.62: 10,000
  • Safari 3.2: 500

In each case, the browser will end up stopping your code and (hopefully) display a message about the issue:

  • Internet Explorer 7: “Stack overflow at line x”
  • Firefox 3:”Too much recursion”
  • Chrome 1: n/a
  • Opera 9.62: “Abort (control stack overflow)”
  • Safari 3.2:”RangeError: Maximum call stack size exceeded.”

We ran some tests ourselves in order to get the call stack sizes on other browsers/hosts/OSs using this simple code:

var i = 0;
 
function recurse () {
    i++;
    recurse();
}
 
try {
    recurse();
} catch (ex) {
    alert('i = ' + i + '\nerror: ' + ex);
}

And got some interesting numbers/errors results:

on Ubuntu 8.04:
stack sizes

  • Firefox 3.0.10: 3,000
  • Chromium 2.0.174.0~svn20090412r13572: 21,828
  • Rhino 1.4r2: 338
  • SpiderMonkey 1.7.0: 1,000

errors

  • Firefox 3.0.10: InternalError: too much recursion
  • Chromium 2.0.174.0~svn20090412r13572: RangeError: Maximum call stack size exceeded
  • Rhino 1.4r2: js: exception from uncaught JavaScript throw: java.lang.StackOverflowError
  • SpiderMonkey 1.7.0: InternalError: too much recursion

on Windows XP SP3:
stack sizes

  • Internet Explorer 6.0.2900.5512 SP3: 2,541
  • Internet Explorer 7.0.5730.13: 2,555
  • Internet Explorer 8.0.6001.18702: 3,076
  • Firefox 2.0.0.20: 1,000
  • Firefox 3.0.10: 3,000
  • Firefox 3.5 Beta 4: 3,000
  • Chrome 2.0.172.31: 21,838
  • Safari 4 Public Beta (528.16): 43,688
  • Opera 9.64: n/a

errors

  • Internet Explorer 6.0.2900.5512 SP3: Error: Out of stack space
  • Internet Explorer 7.0.5730.13: Error: Out of stack space
  • Internet Explorer 8.0.6001.18702: Error: Out of stack space
  • Firefox 2.0.0.20: InternalError: too much recursion
  • Firefox 3.0.10: InternalError: too much recursion
  • Firefox 3.5 Beta 4: InternalError: too much recursion
  • Chrome 2.0.172.31: name: RangeError, type: stack_overflow, message: Maximum call stack size exceeded
  • Safari 4 Public Beta (528.16): RangeError: Maximum call stack size exceeded
  • Opera 9.64: Abort (control stack overflow). Script terminated.

You can see this code live and running here.