1 July 2008

Coding idioms

Just like human languages, programming languages have idioms: commonly used structures that don’t really fit with the typical use of the language. The number of idioms in a language seems closely related to how flexible it is. Perl, which is often called an idiomatic language, seems to have dozens, whereas Java doesn’t seem to have so many.

Idioms seem to arise where it’s difficult to construct something in a particular language. For instance, in Java there’s no simple list or map initialisation syntax. So one idiom I’ve seen in Java is this way of initialising a list:

List<String> list = new ArrayList<String>() {{
    add("one");
    add("two");
    add("three");
}};

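The double braces aren’t a special syntax; they’re just the combination of an anonymous inner class (the outer braces) and an instance initialiser block (the inner braces), which adds the items to the list at construction time. This means the list is actually an instance of a new anonymous subclass of ArrayList rather than a plain ArrayList. Instance initialiser blocks are quite rare, so the above code looks like magic to many Java developers.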

Of course, I’d never use a structure like this in production Java code — the idiom is too unusual and hard to understand. I don’t want other developers wondering what the code does when they need to fix bugs in it.

One idiom in Perl is making the equivalent of a switch statement. Perl has traditionally had no built-in switch syntax like C and Java (given/when only arrived in Perl 5.10, and has to be enabled with a feature pragma), so Perl programmers often have a coding idiom like this:

for ($action) {
    /list/ && do { print list($cgi, $articles, $template); last; };
    /edit/ && do { print edit($cgi, $articles, $template); last; };
    /view/ && do { print view($cgi, $articles, $template); last; };
    /add/ && do { print add($cgi, $articles, $template); last; };
    /delete/ && do { print del($cgi, $articles, $template); last; };
    die "Action not supported: $action";
}

This idiom uses many quirks in Perl to work its magic. First, there’s the for loop, which sets the ‘topic’ of the block to $action. Normally, it would iterate through all the values provided, but by just providing one value, you’re using the topic-setting functionality of the loop but not the looping behaviour.

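The regex matches at the start of each switching line, like /list/, are matched against the topic, $action. When a match succeeds, the && runs the do block that follows it. The last command in each block exits the loop, so it is the equivalent of the break statement in other languages. If nothing matches, execution falls through to the die line, which plays the part of a default case. Note that the patterns are unanchored, so /list/ would also match an action such as ‘playlist’.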

A simpler example of this idiom is applying several regexes to the same variable:

for ($date) {
    s/^\[//; # strip leading bracket
    s/\]$//; # strip trailing bracket
}

In JavaScript, a language I’ve become familiar with only recently, many of the idioms seem to be around anonymous functions. Here’s a very common idiom for limiting the scope of variable declarations, sometimes (but not very accurately) called ‘namespacing’:

var obj = (function () {
    var foo = "bar";
    return {
        prop: function () {
            alert(foo); // "bar" -- foo is visible here
        }
    };
})();
alert(typeof foo); // "undefined" -- foo is not visible here

This can be extended by using the call and apply methods on JavaScript’s Function object to set the value of this and to pass arguments to the anonymous function:

(function ($) {
    // ...
    // within this block, 'this' is obj and '$' is jQuery
    // ...
}).call(obj, jQuery);
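
As a rough sketch of how the two methods differ: call takes the arguments one by one, whereas apply takes them as a single array. The settings object and the values passed in are made up for illustration; they stand in for obj and jQuery above.

```javascript
// Hypothetical object standing in for `obj` in the text.
var settings = { prefix: "item" };

// call: the `this` value first, then the arguments one by one.
var viaCall = (function (first, second) {
    return this.prefix + ":" + first + "," + second;
}).call(settings, "a", "b");

// apply: the `this` value first, then the arguments as a single array.
var viaApply = (function (first, second) {
    return this.prefix + ":" + first + "," + second;
}).apply(settings, ["a", "b"]);

console.log(viaCall);  // item:a,b
console.log(viaApply); // item:a,b
```

Both forms run the anonymous function immediately, so which one to use mostly comes down to whether the arguments are already sitting in an array.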

There are probably many more idioms like this in programming languages. What are the ones you’ve come across?