Tomer Filiba's Homepage

Some Notes on 'D for the Win'

2014-08-23T00:00:00+00:00

So it turns out my last post, D for the Win, got all over the place. I published it late at night, went to bed, and when I got back in the morning all hell had broken loose. And even more surprisingly, the vast majority of the reactions (at least the ones I’ve seen on reddit and hacker news) were on-topic. I hope it means I got my message through.

Since I’m not able to reply to everything, I’ll try to provide answers to some repeating questions and concerns people had.

No, “Dlang Dlang uber Alles” was not meant to be offensive or to avert discussion by means of pulling a Godwin. Yes, I am aware of the context. I just thought it was funny and liked the sound of it. My sincere apologies to anyone who got offended by this.
Sorry, I’ve never heard of Nimrod until now. If I had, I’d surely try it out. But the product I’m working on is already committed to D at this point, and from what I gather Nimrod is still quite unstable.
Some people said that system programming is anything that’s run natively (i.e., compiles to machine code). Well no. I’m not sure if there’s a definitive definition, but system programming involves interfacing with hardware, drivers and other beasts. It usually also requires fine-grained control over resources (CPU, memory, affinity, etc.), and generally such systems are expected to deliver steady performance (soft-realtime behavior).

Therefore, I’d characterize a system-programming language as one that provides:
- Native pointers, unsafe casts and atomic operations - hardware devices don’t chew on your fancy object model. I mean, they gladly will, but I don’t suppose you’ll be as glad.
- C ABI - interfacing with system libraries and even the kernel directly. Some system calls are not exported through libraries, so being able to call a cdecl variadic function is extremely useful.
- Manual memory management - GC is a great thing in general, but when your system can’t pause for indefinite times at random you have to think twice. Also, when you have to control overall memory consumption, preallocating memory is the way to go.
- Generate efficient machine code, inline-assembly a plus - when you reach bottlenecks, every CPU cycle counts.
D has garbage collection and the standard library makes use of it. As I just explained, GC and system programming don’t mix well. On the other hand, system programming requires tight control over all aspects of the program, which means you end up with tailor-made data structures anyway. You’re not afraid to get your hands dirty and reimplement things the right way (for your needs). So no, the GC is not a real problem.

On the other hand, having GC is great. It means that outside of the critical paths we can use a modern language with all the benefits of one. But even more importantly – it means you can write a minimum viable product much more easily, and optimize as you make progress. In other words: start with GC and remove it where it hurts.
The C10K problem moved people from a thread-per-IO methodology to asynchronous IO (epoll, kqueue, IOCP, etc) in a thread-per-core mainloop. The C10M problem moves IO from the kernel into userland (“the kernel is the problem, not the solution”). You simply can’t process enough data when you have several system calls per IO. And it’s a pity, because you’re limited by operating system architecture, not hardware.

Context switches are expensive, the kernel requires accounting on its own, lots of memory gets copied, and you don’t get to utilize hardware offloading when you go through the standard kernel stacks. For instance, I got a factor 12 improvement just by moving my code to userland polling instead of relying on epoll. In both cases my program consumed 100% CPU, but kernel time plummeted from 40% to 1%.

And no, you cannot approach the C10M problem in Java or Python.
C++ is a horrible language. In fact, I don’t consider it a language any more. It’s a conglomerate of plagiarisms with an awful syntax. I mean, every language borrows ideas from neighboring ones, but C++ has this Perl-6 aroma to it: if it exists, we’ll take it in. It doesn’t even have to make sense. As a friend put it, “it can do EVERY paradigm but equally badly”.

Compiler support is always lacking, compilation times are ever increasing, and you have to fight your way to get the compiler to do what you have in mind. Even shooting yourself in the foot isn’t fun anymore.
Go is nice for application programming. In fact, it may be great for application programming. As a web-oriented language with easy string manipulation, built-in coroutines and garbage collection, it surely beats Python in terms of speed. But I find it not that readable and in general I’m not sure if moving to Go is worth the trouble. It feels like a mix of too-high and too-low level constructs, with nothing in between. Awkward is the best word I can think of.

But back to my original point, Go doesn’t come near system programming. Unless your systems run in the cloud, that is.
D has poor adoption and lacks the codebase and community of C/C++/Java/Python/Go. But as I explained, system programming teaches you to be self-reliant. You almost never rely on external code as-is (it might allocate/block/be inefficient) so you end up living in a rather closed world. On the other hand, D has C ABI so with a bit of porting effort (converting H files), you gain access to virtually everything.

People also repeatedly said (perhaps even mockingly) that there are no success stories or large projects using D. First of all, that’s not true. Google around. But I agree there are not that many, and it’s intriguing why D hasn’t gotten the attention it deserves in the long while since it came to be. Anyway, being a startup, my colleagues and I are not afraid of hard work or living on the edge. And hopefully you’ll hear about another D success story in a year or two :)

D for the Win

2014-08-21T00:00:00+00:00

I’m a convert! I’ve seen the light!

By the way, be sure to read part 2 as well.

You see, Python is nice and all and it excels in so many domains, but it was not crafted for the ever growing demands of the industry. Sure, you can build large-scale projects in Python (and I have), but once you take it out of the lab and into the real world, the price you pay is just too high. Literally. In terms of work per CPU cycle, you can’t do worse.

The C10M problem is a reiteration of the C10K problem. In short, today’s commodity hardware can handle millions of packets per second, but in reality you hardly ever reach such numbers. For example, I worked a short while at a company that used AWS and had tens of twisted-based Python servers accepting and logging requests (not doing any actual work). They managed to squeeze ~500 requests/sec out of this setup (per machine), which escalated in cost rather quickly. Moving to PyPy (not without trouble) did triple the numbers or so, but still, the cost simply didn’t scale.

Python, I love you, but you help instill Gates’ law – “The speed of software halves every 18 months”. In the end, we pay for our CPU cycles and we want to maximize our profit. It’s not you, Guido, it’s me. I’ve moved on to the C10M world, and for that I’d need a programming language that’s designed for system programming with a strong and modern type system (after all, I love duck typing). I need to interface with external systems, so a C ABI is desirable (no foreign function interface), and meta-programming is a huge plus (so I won’t need to incorporate cumbersome code-generation in my build system). Not to mention that mission-critical code can’t allow for the occasional NameError or NoneType has no member __len__ exceptions. The code must compile.

I’ve looked into rust (nice, but will require a couple of years to mature enough for a large-scale project) and go (Google must be joking if they actually consider it for system programming), but as strange as it may sound, I’ve finally found what I’ve been looking for with D.

Dlang Dlang Über Alles

System programming is a vast ocean of specifics, technicalities and constraints, imposed by your specific needs. Instead of boring you to death with that, I thought it would be much more intriguing to compare D and Python. In other words, I’ll try to show how D speaks fluent Python.

But first things first. In (the probable) case you don’t know much D – imagine it’s what C++ would have dreamed to be. It offers cleaner syntax, much shorter compilation time, (optional) garbage collection, highly expressive templates and type inference, Pythonic operator overloading (implemented as rewriting), object-oriented and functional capabilities (multi-paradigm like Python), intermingles high-level constructs (like closures) with low-level ones (naked functions in inline assembly) to produce efficient code, has strong compile-time introspection capabilities and some extra cool features in the domain of code generation: mixin – which evaluates an arbitrary string of D code at compile time, and CTFE – compile-time function execution. Whoa, that was long.

In general, D follows Python’s duck-typed (or protocol-oriented) spirit. If a type provides the necessary interface (“protocol”) for an operation, it will just work, but you can also test for compliance at compile time. For example, ranges are a generalization of generators in Python. All you need to do in order to be an InputRange is implement bool empty(), void popFront() and auto front(), and you can use isInputRange!T to test whether T adheres to the protocol. By the way, the exclamation point (!), which we’ll soon get acquainted with, distinguishes compile-time arguments from runtime ones.
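For readers coming from the Python side, here is a rough sketch of the same protocol idea. It assumes Python 3.8+, and the names InputRange, Countdown and consume are made up for this illustration. Note the difference: runtime_checkable only verifies at runtime that the method names exist, whereas isInputRange!T is evaluated by the D compiler.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class InputRange(Protocol):
    """Hypothetical Python mirror of D's input-range protocol."""

    def empty(self) -> bool: ...
    def front(self): ...
    def pop_front(self) -> None: ...


class Countdown:
    """Yields n, n-1, ..., 1. It does not inherit from InputRange."""

    def __init__(self, n):
        self.n = n

    def empty(self):
        return self.n <= 0

    def front(self):
        return self.n

    def pop_front(self):
        self.n -= 1


def consume(r):
    """Drain any object that follows the protocol into a list."""
    items = []
    while not r.empty():
        items.append(r.front())
        r.pop_front()
    return items
```

Countdown passes isinstance(..., InputRange) purely because it has the three methods, which is the duck-typing spirit described above.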

For brevity’s sake, I’m not going to demonstrate all the properties I listed up there. Instead, I’ll show why Python programmers ought to love D.

Case Study #1: Generating HTML

In an old blog post I outlined my vision of HTML templating languages: kill them all. I argued they are all but crippled-down forms of Python with an ugly syntax, so just give me Python and an easy way to programmatically manipulate the DOM.

I’ve later extended the sketch into a library in its own right, named srcgen. You can use it to generate HTML, C-like languages and Python/Cython code. I used it in many of my commercial projects when I needed to generate code.

So here’s an excerpt of how it’s done in srcgen:

def buildPage():
    doc = HtmlDocument()
    with doc.head():
        doc.title("das title")
        doc.link(rel = "foobar", type="text/css")

    with doc.body():
        with doc.div(class_="mainDiv"):
            with doc.ul():
                for i in range(5):
                    with doc.li(id = str(i), class_="listItem"):
                        doc.text("I am bulletpoint #", i)

    return doc.render()

And here’s how it’s done in D:

auto buildPage() {
    auto doc = new Html();

    with (doc) {
        with (head) {
            title("das title");
            link[$.rel = "foobar", $.type = "text/css"];
        }
        with (body_) {
            with(div[$.class_ = "mainDiv"]) {
                with (ul) {
                    foreach(i; 0 .. 5) {
                        with (li[$.id = i, $.class_ = "listItem"]) {
                            text("I am bulletpoint #");
                            text(i);
                        }
                    }
                }
            }
        }
    }

    return doc.render();
}

You can find the source code on github, just keep in mind it’s a sketch I wrote for this blog post, not a feature-complete library.

The funny thing is, Python’s with and D’s with are not even remotely related! The Python implementation builds a stack of context managers, while with in D merely alters symbol lookup. But lo and behold! The two versions are practically identical, modulo curly braces. You get the same expressive power in both.
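To make the "stack of context managers" concrete, here is a minimal sketch of how such a builder could work in Python 3. TinyDoc, elem and build_list are hypothetical names for this illustration, not srcgen's actual implementation, and the sketch does no HTML escaping.

```python
from contextlib import contextmanager


class TinyDoc:
    """Hypothetical miniature HTML builder driven by `with` blocks."""

    def __init__(self):
        self._lines = []
        self._stack = []  # tags of the currently open elements

    @contextmanager
    def elem(self, tag, **attrs):
        # A trailing underscore lets callers pass reserved words (class_).
        attr_text = "".join(
            ' %s="%s"' % (name.rstrip("_"), value) for name, value in attrs.items()
        )
        self._lines.append("  " * len(self._stack) + "<%s%s>" % (tag, attr_text))
        self._stack.append(tag)
        try:
            yield self
        finally:
            # Leaving the `with` block closes the element.
            self._stack.pop()
            self._lines.append("  " * len(self._stack) + "</%s>" % tag)

    def text(self, *parts):
        self._lines.append("  " * len(self._stack) + "".join(str(p) for p in parts))

    def render(self):
        return "\n".join(self._lines)


def build_list():
    doc = TinyDoc()
    with doc.elem("ul", class_="mainList"):
        for i in range(2):
            with doc.elem("li", id=i):
                doc.text("I am bulletpoint #", i)
    return doc.render()
```

Entering a block pushes a tag and leaving it pops the tag, so nesting in the source maps directly to nesting in the output. D's with gets a similar look without any runtime stack, because it only changes where names are looked up.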

Case Study #2: Construct

But the pinnacle is clearly my D version of Construct. You see, I’ve been struggling for many years to create a compiled version of Construct. Generating efficient, static code from declarative constructs would make the library capable of handling real-world data, like packet sniffing or processing of large files. In other words, you won’t have to write a toy parser in Construct and then rewrite it (by hand) in C++.

The issues with my C version of Construct were numerous, but they basically boiled down to the fact I needed a stronger object model to represent strings, dynamic arrays, etc., and adapters. The real power of Construct comes from adapters, which operate at the representational (“DOM”) level of the data, rather than on its binary form. That required lambdas, closures and other higher-level concepts that C lacks. I even tried writing a Haskell version, given that Haskell is so high-level and functional, but my colleague and I had given up hope after a while.

Last week, it struck me that D could be the perfect candidate: it has all the necessary high-level concepts while being able to generate efficient code with meta-programming. I began fiddling with a D version, which proved extremely promising. So without further ado, I present dconstruct – an initial sketch of the library.

This is the canonical PascalString declaration in Python:

>>> pascal_string = Struct("pascal_string",
...     UBInt8("length"),
...     Array(lambda ctx: ctx.length, Field("data", 1),),
... )
>>>
>>> pascal_string.parse("\x05helloXXX")
Container({'length': 5, 'data': ['h', 'e', 'l', 'l', 'o']})
>>>
>>> pascal_string.build(Container(length=5, data="hello"))
'\x05hello'

And here’s how it’s done in D:

struct PascalString {
    Field!ubyte length;
    Array!(Field!ubyte, "length") data;

    // the equivalent of 'Struct' in Python,
    // to avoid confusion of keyword 'struct' and 'Struct'
    mixin Record;
}

PascalString ps;
auto stream = cast(ubyte[])"\x05helloXXXX".dup;
ps.unpack(stream);
writeln(ps);
// {length: 5, data: [104, 101, 108, 108, 111]}

Through the use of meta-programming (and assuming inlining and optimizations), that code snippet there actually boils down to something like

struct PascalString {
    ubyte length;
    ubyte[] data;

    void unpack(ref ubyte[] stream) {
        length = stream[0];
        stream = stream[1 .. $]; // advance stream
        data = stream[0 .. length];
        stream = stream[length .. $];  // advance stream
    }
}

Which is as efficient as it gets.
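For comparison, the same hand-written parser can be sketched in a few lines of Python 3. The function name unpack_pascal_string is made up for this illustration; it mirrors the generated D code above but returns the unconsumed input instead of advancing a reference.

```python
def unpack_pascal_string(stream):
    """Parse a one-byte length prefix followed by that many payload bytes.

    Returns (length, data, rest), where rest is the unconsumed input.
    """
    length = stream[0]              # indexing bytes yields an int in Python 3
    data = stream[1:1 + length]
    if len(data) != length:
        raise ValueError("stream ended before the payload was complete")
    return length, data, stream[1 + length:]
```

Calling it on b"\x05helloXXXX" gives a length of 5, the payload b"hello" and the leftover b"XXXX". The logic is identical; the difference is that the D version is generated from the declaration and compiled to native code.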

But wait, there’s more! The real beauty here is how we handle the context. In Python, Construct builds a dictionary that travels along the parsing/building process, allowing constructs to refer to previously seen objects. This is possible in D too, of course, but it’s highly inefficient (and not type safe). Instead, dconstruct uses a trick that’s commonly found in template-enabled languages – creating types on demand:

struct Context(T, U) {
    T* _curr;
    U* _;
    alias _curr this;   // see below
}

auto linkContext(T, U)(ref T curr,  ref U parent) {
    return Context!(T, U)(&curr, &parent);
}

The strange alias _curr this is a lovely feature of D known as subtyping. It basically means that any property that doesn’t exist at the struct’s scope will be forwarded to _curr, e.g., when I write myCtx.foo and myCtx has no member named foo, the code is rewritten as myCtx._curr.foo.

As we travel along constructs, we link the current context with its ancestor (_). This means that for each combination of constructs, and at each nesting level, we get a uniquely-typed context. At runtime, this context is nothing more than a pair of pointers, but at compile time it keeps us type-safe. In other words, you can’t reference a nonexistent field and expect the program to compile.

A more interesting example would thus be

struct MyStruct {
    Field!ubyte length;
    YourStruct child;

    mixin Record;
}

struct YourStruct {
    Field!ubyte whatever;
    Array!(Field!ubyte, "_.length") data;  // one level up, then 'length'

    mixin Record;
}

MyStruct ms;
ms.unpack(stream);

When we unpack MyStruct (which recursively unpacks YourStruct), a new context ctx will be created with ctx._curr=&ms.child and ctx._=&ms. When YourStruct refers to "_.length", the string is implanted into ctx, yielding ctx._.length. If we referred to the wrong path or misspelled anything, it would simply not compile. That, and you don’t need dictionary lookups at runtime – it’s all resolved during compilation.
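To see what the compile-time approach saves, here is a sketch of the runtime alternative in Python 3: a dictionary context with a link to its parent, and a helper that walks a dotted path. Context and resolve are hypothetical names for this illustration and are not Construct's actual classes.

```python
class Context(dict):
    """Hypothetical parsing context: parsed fields plus a link to the parent."""

    def __init__(self, parent=None):
        super().__init__()
        self.parent = parent


def resolve(ctx, path):
    """Follow a dotted path such as '_.length'; '_' means 'one level up'."""
    for name in path.split("."):
        if name == "_":
            ctx = ctx.parent
        else:
            ctx = ctx[name]       # a dictionary lookup on every access
    return ctx


outer = Context()
outer["length"] = 3
inner = Context(parent=outer)
inner["whatever"] = 7
```

Here resolve(inner, "_.length") returns 3, but a misspelled path such as "_.lenght" only fails when that line actually runs, with a KeyError. In the D version the same mistake is a compile error.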

So again, this is a very preliminary version of Construct, miles away from production grade, but you can already see where it’s going.

By the way, you can try out D online at dpaste and even play around with my demo version of dconstruct over there.

In Short

Python will always have a special corner in my heart, but as surprising as it may be (for a guy who’s made his career over Python), this rather unknown, rapidly-evolving language, D, has become my new language of choice. It’s expressive, concise and powerful, offers short compilation times (as opposed to C++) and makes programming both fun and efficient. It’s the language for the C10M age.

Construct Plus Plus

2014-02-26T00:00:00+00:00

As you may already know, I’m a type-system junkie. My heart yearns for the strongly-typed languages of the world, but being a practical guy, I mostly work with Python. That said, I’ve been keeping myself busy trying to find a way to write a statically-typed version of Construct. I started with Haskell, but quickly gave up when my brain overheated with category theory.

Construct is at least a context-sensitive formalism (I showed in an earlier post that it recognizes languages like a^nb^nc^n...z^n), which limits one’s ability to reason about it (or in my case, embed it in a strongly-typed language), but subsets of Construct are “weak enough” for that. These are known as Pickler Combinators, first described (as far as I can tell) in Andrew Kennedy’s seminal paper.

The problem I set out to solve was that of a time-and-space efficient, statically-typed RPC between Python and C++. One approach is to encode types into the data, as JSON, pickle, or msgpack do. Another project of mine, RPyC, utilizes this method too, and for a dynamic language that’s the most sensible thing to do. But in a richly-typed static language like C++ it would be a pity to waste both bandwidth and CPU cycles to encode that. It also makes memory management a headache, since there’s no way to tell in advance how much memory would be needed (and what types follow). It also requires recursion (for nested types), and last but certainly not least – forces you to be dynamically-typed.

The second approach is to have a well-defined IDL, such as protobuf or Apache Thrift. This approach, however, requires code generation to run during the build process (an extra toolchain), and, you know – another ugly DSL to learn and maintain.

What I’m about to demonstrate here is a serialization mechanism for C++ that relies only on the type system in order to produce efficient encoding of arbitrarily-complicated objects – all well-typed and resolved in compile time. Protobuf without code generation, if you wish.

Besides, I’ve been planning to make use of the new features of C++11 for a while now, so let’s bring in the heavy guns!

Construct++

The basic idea is to have two overloaded template functions for the general case:

void pack(std::ostream&, const T&)
void unpack(std::istream&, T&)

And have them specialized for concrete and high-order types. If you’re not comfortable with C++11, be sure to read through the new features because it gets a little scary. Here’s how we begin:

template<typename T>
typename std::enable_if<std::is_integral<T>::value || std::is_floating_point<T>::value>::type
pack(std::ostream& stream, const T& value) {
    stream.write((char*)&value, sizeof(T));
}

template<typename T>
typename std::enable_if<std::is_integral<T>::value || std::is_floating_point<T>::value>::type
unpack(std::istream& stream, T &value) {
    stream.read((char*)&value, sizeof(T));
}

These functions handle integers and floating point numbers by packing/unpacking a bitwise representation of the value to/from the stream. Okay, nothing new here, it’s just casting the value to a char* and processing it in the raw. At this point we can write:

int16_t v;
unpack(stream, v);

Big deal.
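Since the other end of this RPC is Python, here is a sketch of what the matching side might look like using the standard struct module (Python 3; the helper names are made up). The "=h" format means a 16-bit signed integer in the machine's native byte order, which is what the raw cast above produces. That is also the catch: both ends have to agree on byte order.

```python
import struct


def pack_int16(value):
    """Pack a signed 16-bit integer in the machine's native byte order."""
    return struct.pack("=h", value)


def unpack_int16(raw):
    """Read a signed 16-bit integer back from exactly two bytes."""
    (value,) = struct.unpack("=h", raw)
    return value
```

If the two machines may differ in endianness, a fixed order such as "<h" (little-endian) on the Python side, with a matching byte swap on the C++ side, would be the safer choice.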

Arrays

Since we’re already dealing with fixed-size data, let’s also handle the case of fixed-size arrays of packable objects:

template<typename T, std::size_t N>
void pack(std::ostream& stream, const T (&arr)[N]) {
    for (std::size_t i = 0; i < N; i++) {
        pack(stream, arr[i]);
    }
}

template<typename T, std::size_t N>
void unpack(std::istream& stream, T (&arr)[N]) {
    for (std::size_t i = 0; i < N; i++) {
        unpack(stream, arr[i]);
    }
}

Which allows us to write

int16_t arr[3][2];
unpack(stream, arr);

Wha?! We’re handling two dimensional arrays here… how’s that possible? Well, thanks to the magic of templates, we can “unwrap” arrays dimension by dimension. We’re actually just handling a pair of int16_t three times. It’s all done in compile time of course.
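The dimension-by-dimension unwrapping can be mimicked in Python 3 with plain recursion. This is only a sketch: unpack_array is a hypothetical helper, the dimensions are passed as a tuple, little-endian int16 items are assumed by default, and the recursion happens at runtime rather than being expanded by the compiler.

```python
import struct


def unpack_array(raw, dims, fmt="<h"):
    """Unpack a fixed-size, possibly multi-dimensional array of numbers.

    dims is a tuple such as (3, 2); fmt is the struct format of one item.
    Returns (nested_list, remaining_bytes).
    """
    if not dims:
        # Base case: a single item, like the scalar overload of unpack().
        size = struct.calcsize(fmt)
        (value,) = struct.unpack(fmt, raw[:size])
        return value, raw[size:]
    items = []
    for _ in range(dims[0]):
        # Peel off the outermost dimension and recurse on the rest.
        item, raw = unpack_array(raw, dims[1:], fmt)
        items.append(item)
    return items, raw
```

With dims=(3, 2) the function handles "a pair of int16 three times", which is the same decomposition the C++ templates perform.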

Variable-Length Data

Now let’s move on to the more interesting stuff – variable-length data – such as vectors:

typedef uint8_t length_type;

template<typename T, typename L=length_type>
void pack(std::ostream& stream, const std::vector<T>& vec) {
    L length = (L)vec.size();
    pack(stream, length);
    for (int i = 0; i < length; i++) {
        pack(stream, vec[i]);
    }
}

template<typename T, typename L=length_type>
void unpack(std::istream& stream, std::vector<T>& vec) {
    L length;
    unpack(stream, length);
    vec.resize(length);
    for (int i = 0; i < length; i++) {
        unpack(stream, vec[i]);
    }
}

It just adds a prefix of the vector’s length (as uint8_t, but you can specify a different type), followed by the vector’s items. Now we can write:

vector<uint16_t> vec;
unpack(stream, vec);

Which takes "\x03aabbcc" and spews out [0x6161, 0x6262, 0x6363]. Nice already.
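The same wire format can be sketched on the Python side with struct (Python 3; pack_vector and unpack_vector are hypothetical helpers, and little-endian uint16 items with a uint8 length prefix are assumed to match the example above).

```python
import struct


def pack_vector(items, item_fmt="<H", length_fmt="<B"):
    """Emit the length prefix followed by each item."""
    out = struct.pack(length_fmt, len(items))
    for item in items:
        out += struct.pack(item_fmt, item)
    return out


def unpack_vector(raw, item_fmt="<H", length_fmt="<B"):
    """Read the length prefix, then that many items.

    Returns (items, remaining_bytes).
    """
    (length,) = struct.unpack_from(length_fmt, raw, 0)
    offset = struct.calcsize(length_fmt)
    items = []
    for _ in range(length):
        (item,) = struct.unpack_from(item_fmt, raw, offset)
        items.append(item)
        offset += struct.calcsize(item_fmt)
    return items, raw[offset:]
```

One behavioral difference worth knowing: struct.pack raises struct.error when the list is too long for the length prefix, whereas the (L) cast in the C++ version silently truncates the length.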

Tuples

Tuples are a new feature of C++11 that holds a strongly-typed heterogeneous sequence of objects, as in std::tuple<int, const char*, float> t(5, "hello", 1.414). You can think of tuples as lightweight structs, where fields have indexes instead of names.

Iterating over tuples in compile time is a bitch, trust me on that, so I’ll skip the full implementation (hint: it requires recursion) and just say that we have:

template<typename... Types> void pack(std::ostream& stream, const std::tuple<Types...>& tup) {
    //...
}
template<typename... Types> void unpack(std::istream& stream, std::tuple<Types...>& tup) {
    //...
}

Why do I even bother to show that? The answer will be clear in a moment.

Structs

Structs pose a rather impossible problem for our packing combinators. First of all, not all structs are packable: they may hold pointers or some internal state that might be transient. But worse, from our perspective, is the fact that every struct is different… If you wish, “vector<T> are all alike; every struct is a struct in its own way”.

So there’s nothing we can do but (a) mark packable structs explicitly and (b) implement custom pack()/unpack() semantics for every struct (using inheritance, for example).

struct packable {
    virtual void _pack_self(std::ostream& stream) const = 0;
    virtual void _unpack_self(std::istream& stream) = 0;
};

template<typename T>
typename std::enable_if<std::is_base_of<packable, T>::value>::type
pack(std::ostream& stream, const T& pkd) {
    pkd._pack_self(stream);
}

template<typename T>
typename std::enable_if<std::is_base_of<packable, T>::value>::type
unpack(std::istream& stream, T &pkd) {
    pkd._unpack_self(stream);
}

So we can write

struct my_struct : packable {
    uint8_t a;
    uint8_t b;

    void _pack_self(std::ostream& stream) const {
        pack(stream, a);
        pack(stream, b);
    }
    void _unpack_self(std::istream& stream) {
        unpack(stream, a);
        unpack(stream, b);
    }
};

my_struct s;
unpack(stream, s);

But notice how mechanical the _pack_self/_unpack_self methods are – we’d really wish to auto-generate them somehow. We can use a preprocessor macro, but how could we generate a line for each member? We don’t have meta-for-loops after all… or do we? Enter tuples!

#define PACKED(...) \
    void _pack_self(std::ostream& stream) const override { \
        auto tmp = std::tie(__VA_ARGS__); \
        pack(stream, tmp); \
    } \
    void _unpack_self(std::istream& stream)  override { \
        auto tmp = std::tie(__VA_ARGS__); \
        unpack(stream, tmp); \
    }

std::tie builds a strongly-typed tuple of references, meaning they refer to the member variables instead of holding a copy. Thanks to the power of template programming, packing/unpacking this tuple ultimately boils down to something like the first version given above – a line for each member.
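The "list the members once, derive both directions" idea has a straightforward runtime analog in Python 3. In this sketch Packable, fields and FirstStruct are made-up names, each field carries an explicit struct format (Python has no static member types to infer it from), and the loop runs at runtime instead of being unrolled by the compiler.

```python
import struct


class Packable:
    """Hypothetical base class: subclasses list (name, struct format) pairs."""

    fields = ()

    def pack(self):
        # One struct.pack per declared member, in declaration order.
        return b"".join(
            struct.pack(fmt, getattr(self, name)) for name, fmt in self.fields
        )

    def unpack(self, raw):
        """Fill the declared members from raw and return the leftover bytes."""
        offset = 0
        for name, fmt in self.fields:
            (value,) = struct.unpack_from(fmt, raw, offset)
            setattr(self, name, value)
            offset += struct.calcsize(fmt)
        return raw[offset:]


class FirstStruct(Packable):
    # Mirrors a struct with a uint8_t x and a uint16_t y (little-endian).
    fields = (("x", "<B"), ("y", "<H"))
```

The fields tuple plays the role of the PACKED(x, y) line: the member list is written once and both pack and unpack follow from it.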

So here’s the whole deal:

struct first_struct : packable {
    uint8_t x;
    uint16_t y;

    PACKED(x, y)
};

struct second_struct : packable {
    uint8_t a;
    std::string b;    // note this is a variable length string!
    int16_t c[3][2];
    first_struct d;
    enum : uint16_t {
        A = 1,
        B = 2,
        C = 3
    } e;

    PACKED(a, b, c, d, e)
};

Ain’t that cool? And it even works!

std::stringstream ss;
ss << "A\x05helloa0b0c0d0e0f0Bg0\x02\x00XXXXXXXXXX";
second_struct x = {};
unpack(ss, x);

std::cout << "a=" << (int)x.a << "," << "b=" << x.b << ","
          << "c=" << x.c[0][0] << ".." << x.c[2][1] << ","
          << "d=(" << x.d.x << "," << x.d.y << "),"
          << "e=" << x.e << std::endl;

Which prints a=65,b=hello,c=12385..12390,d=(B,12391),e=2.

The full code snippet is available here, tested with g++4.8 and clang++3.4 with -std=c++11.

A Python Crash Course for the Statically Typed Programmer

2013-11-15T00:00:00+00:00

Python is a multi-paradigm (hybrid) language. It's fully object-oriented but has strong functional roots. Note that this isn't a [beginner's tutorial](http://learnpython.org/) but a quick reference for the language and its features that should allow you to write basic Python ASAP. If you have taken any academic course that involves programming, Python will most likely resemble pseudocode to you:

def factorial(n):
    if n <= 1:
        return 1
    else:
        return n * factorial(n - 1)

# or

def factorial(n):
    res = 1
    while n:
        res *= n
        n -= 1
    return res

You'll see inheritance and class diagrams alongside constructs imported from [Haskell](http://www.haskell.org) and [LISP](http://en.wikipedia.org/wiki/Lisp_%28programming_language%29). Python is dynamically-typed (as opposed to statically-typed) but has strong typing (as opposed to Perl or Javascript).

Syntax reference (1)

```
# printing to stdout
print EXPR

# assignment
VAR = EXPR

# conditions
if COND:
    SUITE
[elif COND:
    SUITE]
...
[else:
    SUITE]

# for-loops
for VAR in ITERABLE:
    SUITE
    [break]
    [continue]

# while-loops
while COND:
    SUITE
    [break]
    [continue]
```

Syntax reference (2)

```
# try-except
try:
    SUITE
except EXCEPTION [as ex]:
    SUITE
...
[else:       # iff there was no exception
    SUITE]
[finally:    # always be performed
    SUITE]

# raising exceptions
raise EXCEPTION(...)

# importing modules or attributes
import MODULE
from MODULE import NAME

# defining functions
def NAME([a, [b, ...]]):
    SUITE
    [return EXPR]
    [yield EXPR]

# defining classes
class NAME([BASE, [...]]):
    SUITE
```

The interactive interpreter is your friend! I mapped ``F11`` on my keyboard to fire up an interpreter... it's better than any calculator you'll ever have

```
$ python
Python 2.7.5 (default, May 15 2013, 22:43:36) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> 5 + 6
11
```

For people with a C background:

* You don't declare variables - you just assign them
    * And you can reassign them to different types
* Python doesn't have ``do-while`` (use ``while``) nor does it have ``switch`` (use ``if``s or dispatch tables)
* Assignment is a statement, **not an expression**.
    * You cannot write ``if (a = foo()) == 5:``
* There's no such thing as passing by-value. It's always by-reference.
    * So you might need to explicitly **copy** objects some times
    * Or (preferably) create new objects from existing ones

For people with a C++/Java background:

* **Everything** is a first-class object
    * Integers, functions, types, stack frames, tracebacks, etc.
* There are no privates, only conventions. Members that start with ``_`` are not to be manipulated directly.
* There's no ``new``, just *invoke* the class like a function.
    * ``inst = MyClass(1, 2, 3)``
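The "dispatch tables" mentioned in the C list may be unfamiliar, so here is a minimal sketch (the function and the status codes are made up for the example). A dictionary maps each case to a callable, and ``dict.get`` supplies the default branch.

```python
def describe(code):
    """Map a status code to a label using a dict instead of switch."""
    handlers = {
        200: lambda: "ok",
        404: lambda: "not found",
        500: lambda: "server error",
    }
    # dict.get plays the role of the `default:` branch
    return handlers.get(code, lambda: "unknown")()
```

This works because functions are first-class objects, so they can be stored in a dictionary and called later.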

Duck Typing goes Nuclear

In low-level programming languages, types dictate a **memory layout**. In high-level languages, on the other hand, compile-time types are merely **constraints** on what you're allowed to do with an object. Being an interpreted language, Python gives up on type-checking and instead adopts the spirit of "it's easier to ask for forgiveness than permission". Just try and see what happens.

>>> def foo(this, that):
...     return (this + that) * 2
...
>>> foo(3, 5)
16
>>> foo("hello", "world")
'helloworldhelloworld'
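"Asking for forgiveness" usually means wrapping the attempt in ``try``/``except`` instead of checking types up front. A small sketch (the function name and the fallback rule are invented for illustration):

```python
def total_length(items):
    """Sum len() over items, counting objects without a length as 1."""
    total = 0
    for item in items:
        try:
            total += len(item)    # just try it
        except TypeError:         # the object doesn't support len()
            total += 1
    return total
```

Strings, lists and anything else that supports ``len()`` are handled the same way; only objects that truly lack a length take the fallback path.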

Python comes with lots of useful built-in types

>>> 5+3
8
>>> 27**63
149939874158678820041423971072487610193361136600
3344657118522818557991334322919287339806483L
>>> 5+3j
(5+3j)
>>> 1/(5+3j)
(0.14705882352941177-0.08823529411764706j)
>>> [1,2,3]
[1, 2, 3]
>>> [1,2,3] + [4,5,6]
[1, 2, 3, 4, 5, 6]
>>> (1,2,3)
(1, 2, 3)
>>> d={"a":4, 7:()}
>>> d["a"]
4
>>> d[7]
()
>>> True, False, None
(True, False, None)
>>> set([2,6,2,7,2,8,6])
set([8, 2, 6, 7])

String manipulation is a pleasure with Python

>>> "hello" + "world"
'helloworld'
>>> a="this is a single line string"
>>> a[5]
'i'
>>> a[5:12]
'is a si'
>>> a.upper()
'THIS IS A SINGLE LINE STRING'
>>> a.count("i")
5
>>> a.replace("i", "j")
'thjs js a sjngle ljne strjng'
>>> a.startswith("thi")
True
>>> "single" in a
True
>>> "multiple" in a
False
>>> a.split()
['this', 'is', 'a', 'single', 'line', 'string']
>>> a.split("i")
['th', 's ', 's a s', 'ngle l', 'ne str', 'ng']

Encoding strings like a boss

>>> "hello".encode("hex")
'68656c6c6f'
>>> "hello".encode("utf16")
'\xff\xfeh\x00e\x00l\x00l\x00o\x00'
>>> "hello".encode("zlib")
'x\x9c\xcbH\xcd\xc9\xc9\x07\x00\x06,\x02\x15'

String interpolation

>>> "My name is %s. You %s, prepare to %s" % ("Inigo Montoya",
... "killed my father", "die")
'My name is Inigo Montoya. You killed my father, prepare to die'
>>>
>>> "My name is %03d" % (7,)
'My name is 007'

And joining strings is surprisingly useful

>>> ":".join(["AA", "BB", "CC"])
'AA:BB:CC'

Multi-line strings

>>> b="""this is
... a multi
... line string"""
>>> b
'this is\na multi\nline string'
>>> b.splitlines()
['this is', 'a multi', 'line string']

If I had time to show you only three functions, they would be

* ``help()``
* ``dir()``
* ``type()``

Everything else you can discover on your own.

>>> help
Type help() for interactive help, or help(object) for help about object.
>>> dir()
['__builtins__', '__doc__', '__name__', '__package__']

>>> help(dir)
    dir([object]) -> list of strings

    If called without an argument, return the names in the current scope.
    Else, return an alphabetized list of names comprising (some of) the attributes

>>> dir("hello")
['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__for
_mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
nter', 'count', 'decode', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'in
index', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswi']

>>> type(5)
<type 'int'>
>>> type("hello")
<type 'str'>

Next, let's meet some types and learn how to convert (not **cast**) values from one type to the other

>>> int
<type 'int'>
>>> str
<type 'str'>
>>> list
<type 'list'>

>>> int(5.1)
5
>>> int("5")
5
>>> str(5)
'5'
>>> list("hello")
['h', 'e', 'l', 'l', 'o']

Types matter

>>> 5 + "6"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'str'
>>> 5 + int("6")
11
>>> str(5) + "6"
'56'

Repr(esentation)

>>> repr(5)
'5'
>>> repr("hello")
"'hello'"
>>> print "Hello %s" % ("foo\n\tbar",)
Hello foo
        bar
>>> print "Hello %r" % ("foo\n\tbar",)
Hello 'foo\n\tbar'

Lists

>>> range(10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> range(10,20)
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> range(10,20,2)
[10, 12, 14, 16, 18]
>>> range(20,10,-2)
[20, 18, 16, 14, 12]
>>> len([2,5,2,6,2,6])
6
>>> len("hello")
5
>>> a=[2,5,2,6,2,6]
>>> max(a)
6
>>> min(a)
2
>>> b=["what", "is", "this", "thing"]
>>> max(b)
'what'
>>> max(b, key = len)
'thing'

Slicing

>>> a=range(10,20)
>>> a[0]
10
>>> a[-1]
19
>>> a[3:7]
[13, 14, 15, 16]
>>> a[7:]
[17, 18, 19]
>>> a[-3:]
[17, 18, 19]
>>> a[:-3]
[10, 11, 12, 13, 14, 15, 16]

Lambda functions

>>> lambda x: x * 2
<function <lambda> at 0x0297EEB0>
>>> f = lambda x: x * 2
>>> f(6)
12
>>> map(f, range(10))
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
>>> filter(lambda x: x % 2 == 0, range(10))
[0, 2, 4, 6, 8]

Working with files

>>> f = open("/etc/profile")
>>> f.read(100)
'# /etc/profile: system-wide .profile file for the Bourne shell (sh(1))\n# and Bourne compatible shell'
>>> f.readline()
's (bash(1), ksh(1), ash(1), ...).\n'
>>> f.readline()
'\n'
>>> f.readline()
'if [ "$PS1" ]; then\n'
>>> list(f)[:4]
['  if [ "$BASH" ] && [ "$BASH" != "/bin/sh" ]; then\n',
 '    # The file bash.bashrc already sets the default PS1.\n',
 "    # PS1='\\h:\\w\\$ '\n",
 '    if [ -f /etc/bash.bashrc ]; then\n']
>>> f.close()

Working with resources

>>> with open("/etc/profile") as f:
...     f.read(100)
...
'# /etc/profile: system-wide .profile file for the Bourne shell (sh(1))\n# and Bourne compatible shell'

The ``with`` block automatically closes the file. It's more general than this, of course, as it's not limited to files. It simply ensures that the necessary cleanup that's associated with the object will be performed when you finish the ``with`` block. For example, you can use ``with`` to commit a transaction to a DB or do a rollback on failure, etc. It's a very useful pattern.
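
To make the commit/rollback idea concrete, here is a minimal sketch written for Python 3. ``FakeConnection`` and ``transaction`` are hypothetical names invented for this illustration (they are not part of any real DB library); the only real API used is ``contextlib.contextmanager``.

```python
from contextlib import contextmanager

class FakeConnection(object):
    """Hypothetical stand-in for a DB connection; it only records calls."""
    def __init__(self):
        self.log = []
    def execute(self, stmt):
        self.log.append(stmt)
    def commit(self):
        self.log.append("COMMIT")
    def rollback(self):
        self.log.append("ROLLBACK")

@contextmanager
def transaction(conn):
    """Commit if the with-block finishes normally, roll back if it raises."""
    try:
        yield conn
    except Exception:
        conn.rollback()
        raise               # let the caller see the original error
    else:
        conn.commit()

conn = FakeConnection()
with transaction(conn) as tx:
    tx.execute("INSERT 1")
print(conn.log)   # ['INSERT 1', 'COMMIT']
```

If the body of the ``with`` block raises, the log ends with ``ROLLBACK`` instead and the exception still propagates.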

List comprehension: remember ``map`` and ``filter``? Well, it's time to forget about them

>>> [x for x in range(10)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> [x * 2 for x in range(10)]
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
>>> [x for x in range(10) if x % 2 == 0]
[0, 2, 4, 6, 8]
>>> [x * 2 for x in range(10) if x % 2 == 0]
[0, 4, 8, 12, 16]

Yo comprendo!

>>> [i * j for i in range(1,5) for j in range(1,5)]
[1, 2, 3, 4, 2, 4, 6, 8, 3, 6, 9, 12, 4, 8, 12, 16]
>>> [[i * j for i in range(1,5)] for j in range(1,5)]
[[1, 2, 3, 4], [2, 4, 6, 8], [3, 6, 9, 12], [4, 8, 12, 16]]
>>> [" ".join(["%3d" % (i * j,) for i in range(1,5)])
...     for j in range(1,5)]
['  1   2   3   4', '  2   4   6   8', '  3   6   9  12',
 '  4   8  12  16']

And the whole multiplication table in just one line!

>>> print "\n".join([" ".join(["%3d" % (i * j,) for i in range(1,11)])
...     for j in range(1,11)])
  1   2   3   4   5   6   7   8   9  10
  2   4   6   8  10  12  14  16  18  20
  3   6   9  12  15  18  21  24  27  30
  4   8  12  16  20  24  28  32  36  40
  5  10  15  20  25  30  35  40  45  50
  6  12  18  24  30  36  42  48  54  60
  7  14  21  28  35  42  49  56  63  70
  8  16  24  32  40  48  56  64  72  80
  9  18  27  36  45  54  63  72  81  90
 10  20  30  40  50  60  70  80  90 100

Chillex! Be lazy

>>> def myfunc():
...     yield 1
...     yield 2
...     yield 3
...
>>> g=myfunc()
>>> g
<generator object myfunc at 0x02A3A8F0>
>>> g.next()
1
>>> g.next()
2
>>> g.next()
3
>>> g.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Generators let us be as general as we wish while paying only for what we're actually using (lazy evaluation)

>>> def fib():
...     a = b = 1
...     while True:
...         yield a
...         a, b = b, a + b
...
>>> g = fib()
>>> g.next(), g.next(), g.next(), g.next(), g.next(), g.next(), g.next()
(1, 1, 2, 3, 5, 8, 13)

The ``itertools`` module has some nifty utilities that we can use

>>> import itertools
>>> list(itertools.islice(fib(),10))
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
>>>
>>> list(itertools.takewhile(lambda x: x < 50, fib()))
[1, 1, 2, 3, 5, 8, 13, 21, 34]
>>>
>>> list(itertools.combinations([1,2,3,4],2))
[(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
>>>
>>> list(itertools.permutations([1,2,3]))
[(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)]

Remember list comprehensions? Forget them too. Generator comprehensions are the new black!

>>> (x for x in range(10))
<generator object <genexpr> at 0x02A37F30>
>>> (x * 2 for x in range(10))
<generator object <genexpr> at 0x02A34670>
>>> (x * 2 for x in range(10) if x % 2 == 0)
<generator object <genexpr> at 0x02A37918>

Huh? Of course you won't see anything... you have to consume the generator in order for it to produce values.

>>> list(x * 2 for x in range(10) if x % 2 == 0)
[0, 4, 8, 12, 16]

In fact, list comprehensions are syntactic sugar for ``list(``generator comprehension``)``

List comprehensions, as the name suggests, **build a list**. This can be expensive sometimes, especially when you don't need the intermediate values. E.g., if you just want to get the sum of the elements, there's no need to actually hold all of them in memory. Generators are the key to efficient programming. For example, ``xrange`` is like ``range`` but produces its values lazily, one at a time, instead of building a list.

>>> sum(range(1000000000))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError
>>> sum(xrange(1000000000))
499999999500000000L

## Object Oriented, Too ##

Don't worry though. Python isn't all geared for functional programming. We have classes too!

>>> class Animal(object):
...     def __init__(self, name):
...         self.name = name
...     def drink(self):
...         print "*Gulp gulp gulp*"
...
>>> class Dog(Animal):
...     def drink(self):
...         Animal.drink(self)         # or super(Dog, self).drink()
...         print "*Tongue drips*"
...     def bark(self):
...         print "Woof woof woof"
...
>>> rex = Dog("Rex t3h Dawg")
>>> rex.drink()
*Gulp gulp gulp*
*Tongue drips*
>>> rex.bark()
Woof woof woof

Some notes for Java folks:

- Constructors are inherited! Why wouldn't they?!
  - You can implement an exception class in Python in just one line: ``class MyException(Exception): pass``
  - In Java/C# you have to reimplement all constructors as well
- Python doesn't have function/method overloading
  - But given that most of the code is duck-typed anyway (and can introspect the received object at runtime) it doesn't make a real difference

But if we stopped here you'd think it's just a variation of Java. We're way cooler than Java! For example, instances (such as ``rex`` above) are actually just dictionaries that hold the instance's attributes.

>>> rex.__dict__
{'name': 'Rex t3h Dawg'}

That's why we can add attributes on the fly

>>> rex.tail = "recursion"
>>> rex.__dict__
{'name': 'Rex t3h Dawg', 'tail': 'recursion'}

But classes ain't no different! They are just dictionaries of their methods

>>> Animal.__dict__.keys()
['__module__', 'drink', '__dict__', '__init__', '__weakref__', '__doc__']
>>>
>>> def sleep(self):
...     print "Zzzzzz"
...
>>> Animal.sleep = sleep      # monkey-patching
>>> rex.sleep()
Zzzzzz

Now it's time to discuss **special methods**. You've seen them already when we invoked ``dir("hello")`` -- they take the form of ``__xxx__``, and surprisingly or not, they make up most of what we've seen so far. Virtually all language constructs map to special methods underneath:

- ``a + b`` is actually ``a.__add__(b)``
- ``a[x]`` maps to ``a.__getitem__(x)``
- ``str(a)`` invokes ``a.__str__()``
- ``a.b`` translates to ``a.__getattribute__("b")`` (which falls back to ``a.__getattr__("b")`` when the normal lookup fails)
- ``f(a,b,c)`` runs ``f.__call__(a, b, c)``
  - So yes, ``f.__call__.__call__.__call__(a, b, c)`` as well
- ``Dog("foo")`` creates the instance using ``__new__`` and initializes it in ``__init__``

You can invoke them yourself, of course, they're just methods:

>>> (5).__add__(6)
11

But why would you do that? Don't be silly.
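
The more useful direction is defining these methods on your own classes so that ordinary syntax works on them. A small sketch for Python 3; ``Vec`` is a made-up class used only for illustration:

```python
class Vec(object):
    """Hypothetical 2D vector, just to show which syntax triggers which method."""
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __add__(self, other):        # triggered by: v + w
        return Vec(self.x + other.x, self.y + other.y)
    def __getitem__(self, i):        # triggered by: v[i]
        return (self.x, self.y)[i]
    def __str__(self):               # triggered by: str(v)
        return "Vec(%d, %d)" % (self.x, self.y)
    def __call__(self, k):           # triggered by: v(k)
        return Vec(self.x * k, self.y * k)

v = Vec(1, 2) + Vec(3, 4)
print(str(v))       # Vec(4, 6)
print(v[1])         # 6
print(str(v(10)))   # Vec(40, 60)
```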

Remember I said **everything** is an object? Well, I meant it.

>>> Animal.__base__
<type 'object'>
>>> Dog.__base__
<class '__main__.Animal'>
>>> Dog.__mro__
(<class '__main__.Dog'>, <class '__main__.Animal'>, <type 'object'>)
>>> type(Dog)
<type 'type'>

And it does get mind-boggling

>>> type
<type 'type'>
>>> type.__base__
<type 'object'>
>>> type(type)
<type 'type'>
>>> type(object)
<type 'type'>

The MRO (method resolution order) is actually very important. It determines what happens when you resolve attributes (``__getattr__``) on an object. For instance, ``rex.foo`` will first try ``rex.__dict__``, move up to ``Dog.__dict__`` and then to ``Animal.__dict__``, until we reach ``object.__dict__`` where we'll fail. And that's the whole object model.

Dictionaries are also crucial to functions. A function is basically just code that evaluates in a local scope and has access to a global scope (module-level). Each of these scopes is... a dictionary. When you assign a variable, you basically insert an element into the scope dictionary.

>>> globals()
{'__builtins__': <module '__builtin__' (built-in)>}
>>> def f(a, b):
...     print locals()
...     c = a+b
...     print locals()
...
>>> globals()
{'__builtins__': <module '__builtin__' (built-in)>, 'f': <function f at 0x028CEEB0>}
>>>
>>> f(6,7)
{'a': 6, 'b': 7}
{'a': 6, 'c': 13, 'b': 7}
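
The attribute lookup order described above (instance dictionary first, then each class along the MRO) can be sketched in a few lines of Python 3. ``lookup`` is a hypothetical helper written for this illustration, and it is deliberately simplified: it ignores descriptors, ``__slots__`` and ``__getattr__`` hooks, which the real machinery also handles.

```python
def lookup(obj, name):
    """Simplified attribute lookup: instance dict, then every class in the MRO."""
    if name in vars(obj):
        return vars(obj)[name]
    for cls in type(obj).__mro__:
        if name in vars(cls):
            return vars(cls)[name]
    raise AttributeError(name)

class Animal(object):
    kind = "animal"
    legs = 4

class Dog(Animal):
    kind = "dog"

rex = Dog()
rex.name = "Rex"
print(lookup(rex, "name"))   # Rex  (found in rex.__dict__)
print(lookup(rex, "kind"))   # dog  (found in Dog.__dict__)
print(lookup(rex, "legs"))   # 4    (found in Animal.__dict__)
```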

And no Python tutorial can do without

## When in Doubt ##

>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.

Tales of a Stressed Kernel

2013-07-27T00:00:00+00:00

Spending most of our time (as developers) in the high-level world, it’s easy to occasionally forget the true nature of our systems and how fragile they really are. Well, perhaps fragile is not the right word here, more like intricate or perhaps chaotic - you cannot fully predict the effect of one operation on the rest of the system. In this case, a simple file copy got our machines to freeze for many long seconds… Huh?

Our machines are given loads of RAM but no swap space, to ensure deterministic memory access times. The memory is pre-allocated to different processes based on a configuration file, with a small portion reserved for the kernel. Memory is allocated from /dev/shm (mounted as tmpfs), where it’s also exposed as files.

When a process crashes we may need some of those /dev/shm files for debugging, so we have a tool that runs whenever there’s a crash to collect system info. Among other things, it used to copy some of these shared-memory files to disk (using a simple cp or shutil.copy()). No surprise there. But every once in a while, and strongly correlated with times when this collector was running, some time-critical processes timed out when writing to the disks for no apparent reason, leading to catastrophic results. We spent about a month trying to pinpoint what was taking so many system resources. Many things came to mind: CPU consumption? Accessing special /proc files that got the kernel busy? Running SCSI or other hardware commands? Jamming the IO bus with data from the copy?

The mystery was solved just last Thursday, when I realized copying files from /dev/shm caused all sorts of system-wide hiccups – not only timeouts when writing to other disks. It seemed that existing processes did get runtime, but no new processes could be forked. Sometimes running ls took ~20 seconds. Other times simple (non-file system) tools like date hung for a while. Once it was apparent that the problem was system-wide, the explanation was quite obvious: the kernel ran out of memory. Since there’s no swap space, there’s nothing it could do and kernel threads just blocked until there was enough room for their allocations.

But why would the kernel get so low on memory? After all, we’re copying files from memory (tmpfs) to the disk. Well, that seems like a bug:

I think I’ve finally figured this out. It’s a kernel bug – I’m guessing that under normal circumstances, the “cached” column in the free command “doesn’t count” towards how much memory the system thinks it’s using. After all, it’s just cached copies of stuff that should be elsewhere, and if you run out of memory, you can safely dump that, right? Unfortunately, /dev/shm is counted under cached rather than used memory (as I discovered in an earlier post).

https://bbs.archlinux.org/viewtopic.php?pid=390313#p390313

Simplistically, file copy is a simple loop that reads a chunk of data from the source file and writes it to the destination file, until it transfers everything. When we read from a file, the kernel needs to allocate a kernel-space buffer and copy it to userland. And when we write it back to the destination file, the kernel first copies the userland buffer into kernel-space and links it to the device’s queue (to be evicted at the driver’s discretion). The write() call returns as soon as the kernel places the buffer into the queue, so buffers might “pile up” there for some time before actually being evicted, depleting kernel memory.

Ugggh. A simple copy brought our system to a halt. The solution was just as simple – we don’t want (and neither do we need) to use kernel buffers here. The source file already resides in memory. Instead of read()ing it, we can just mmap() the whole of it. And as for the destination file, we open it with O_DIRECT, so as not to use kernel buffers along the way. I christened this new tool mmapcp.
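
The original tool isn't shown here, so the following is only a rough Python 3 sketch of the idea, not the actual mmapcp. It assumes Linux (``os.O_DIRECT``), a block size of 4096 bytes (the real alignment requirement depends on the device and file system), and a ``chunk`` that is a multiple of that block size. Since O_DIRECT wants aligned buffers and lengths, the unaligned tail is padded in a page-aligned anonymous mapping and the file is truncated back to its true size afterwards.

```python
import mmap
import os

BLOCK = 4096   # assumed alignment; the real requirement depends on the device

def mmapcp(src, dst, chunk=1 << 20, direct=True):
    """Copy src to dst by mmap()ing the source; optionally write with O_DIRECT.

    chunk must be a multiple of BLOCK.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if direct:
        flags |= os.O_DIRECT            # Linux-only flag
    size = os.path.getsize(src)
    out = os.open(dst, flags, 0o644)
    try:
        if size == 0:
            return                      # an empty file cannot be mmap()ed
        with open(src, "rb") as f, \
                mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            aligned = size - size % BLOCK
            # Write the block-aligned part straight out of the mapping.
            with memoryview(m) as view:
                for off in range(0, aligned, chunk):
                    end = min(off + chunk, aligned)
                    written = off
                    while written < end:
                        written += os.write(out, view[written:end])
            # Pad the unaligned tail to a full block, then cut the file
            # back to its real length.
            tail = size - aligned
            if tail:
                with mmap.mmap(-1, BLOCK) as pad:   # anonymous, zero-filled
                    pad[:tail] = m[aligned:size]
                    os.write(out, pad)
                os.ftruncate(out, size)
    finally:
        os.close(out)
```

Passing ``direct=False`` keeps the mmap-based read path but writes through the page cache, which is handy on file systems that reject O_DIRECT.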

The thing that has always disturbed me about O_DIRECT is that the whole interface is just stupid, and was probably designed by a deranged monkey on some serious mind-controlling substances.

– Linus

Well, Linus, at least you were kind enough to let it stay :)

Pool Request

2013-06-16T00:00:00+00:00

TTYs: Never gets boring

2013-06-16T00:00:00+00:00

Just a short rant: I’m working on an interactive console used for debugging a computer cluster. It connects to all nodes in the cluster and provides you with a single place to run queries. It uses the new (not yet officially-released) zero-deploy feature of RPyC, which sets up a secure, single-use RPyC server on a machine, requiring only SSH access. Once the client connection closes, the zero-deployed server will shut down and delete itself from the file system.

It’s a cool feature on its own (and I’ll blog about it soon), but there’s a reason I’m taking you through all of the details here. You see, the debugging console fires up SSH subprocesses in the background, over which RPyC connections are tunneled… and then the strangest thing happened. I was running a query which was taking too long and hit Ctrl+C to kill it and return to the interpreter. The query indeed stopped, but all of my RPyC connections died with it. Huh?

Here’s a really short way to reproduce this scenario:

>>> from subprocess import Popen, PIPE
>>> p=Popen(["sleep", "60"], stdin=PIPE, stdout=PIPE, stderr=PIPE)
>>>
>>> p.poll()      # poll() returns None as the process is still running in the background
>>>
>>>               # now hit Ctrl+C in the interactive prompt
KeyboardInterrupt
>>>
>>> p.poll()      # and voila, `sleep` was killed by SIGINT
-2

It’s terribly confusing at first, but it happens because child processes inherit their parent’s session ID (and process group). Terminal events, such as SIGINT and SIGHUP, are dispatched to all processes belonging to the terminal’s foreground process group, so it’s not just the Python interpreter that receives the signal – every child process it spawned will also suffer. In my case, it killed all of the SSH tunnels I had set up.

The solution is to setsid before execing the child:

>>> import os
>>> p=Popen(["sleep", "60"], stdin=PIPE, stdout=PIPE, stderr=PIPE, preexec_fn=os.setsid)
>>> p.poll()
>>>
KeyboardInterrupt
>>> p.poll()
>>>
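
For readers on Python 3 (3.2 and later, POSIX only), ``subprocess.Popen`` accepts ``start_new_session=True``, which makes the child call ``setsid()`` before exec, so ``preexec_fn`` isn't needed for this. The sketch below only checks that the child really ends up in a different session than the interpreter; the child command (a sleeping Python process) is an arbitrary choice for the demonstration.

```python
import os
import subprocess
import sys

# Spawn a child in its own session, detached from our terminal's signals.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"],
    stdin=subprocess.DEVNULL,
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
    start_new_session=True,      # child calls setsid() before exec
)
try:
    # os.getsid(0) is our own session ID; the child's should differ.
    same_session = os.getsid(child.pid) == os.getsid(0)
    print(same_session)          # False
finally:
    child.kill()                 # clean up the demo process
    child.wait()
```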

So I had to add this feature to plumbum, and while I was at it, I also added daemonization support. In other words, I’ll have to release 1.3 soon – even though I released 1.2 not two weeks ago. Life’s a bitch and TTYs are the mother of all monsters :)

Academia

2013-06-06T00:00:00+00:00

I thought it’d be useful to publish some of the seminars/technical papers that I worked on when studying at Tel Aviv University, instead of letting them rot in a drawer. Hope you find it useful/interesting.

Chierchia: On Plurality of Mass Nouns

A technical report on Gennaro Chierchia’s paper, Plurality of Mass Nouns and the Notion of “Semantic Parameter”. Chierchia offers a lattice-theoretical treatment for pluralities of all sorts (especially the mass/count noun distinction). I found it very interesting, but it is quite technical and requires some courses in formal semantics so as not to get lost.

Print-Scan Resilient Watermarking in the Wavelet Domain

I began working on this project with very low expectations, but it introduced me to the world of signal processing in general, and SciPy more specifically. It also forced me to learn Reed-Solomon codes (and since I couldn’t find any Python implementation back in the day, I released reedsolo).

The purpose of the project was to develop an invisible image-watermarking scheme that survives printing (e.g., billboard signs) and sustains considerable damage and noise, much like QR codes. The idea was that you could embed invisible “QR codes” into images, which could be scanned by mobile devices and open a URL. I didn’t get that far, but “lab tests” were positive :)

Anyhow, it could serve as a great introduction to anyone who’s starting with watermarking, wavelets or SciPy.

Code:

Earley CFG Parser

A Python (and Java) implementation of the Earley parser for Context Free Grammar (CFG). Earley is the “most efficient” parser for CFG, as it does only the “necessary amount of computation”. It achieves O(n) time for LR(k) grammars, O(n^2) for unambiguous grammars, and O(n^3) in the general case.

This implementation includes more than just the parser itself - it is also able to extract the entire parse forest (all possible derivations) from a sentence. It relates quite well to my previous blog post, Cartesian Tree-Products.

Tree Insertion Grammar (TIG) Parser

Tree Insertion Grammar (TIG) is a formalism that spun off of Tree Adjoining Grammar (TAG). The two are very similar and share common properties – in both, the grammar is defined as trees which are “embedded” one into the other – but TIG allows trees to be wrapped from only one side, not both. This restriction makes TIG of equivalent power to CFG, while TAG is strictly stronger and requires O(n^6) work.

The work below presents the subject in further detail as well as a parser (most probably the only Python TIG parser).

TIG parser code

Iterated Learning Model: a Review (ILM)

Iterated Learning Model (ILM) is an interesting approach to language evolution, which boils down to the observation that all languages must be able to “squeeze” into the bottleneck of language acquisition: if a language is “too complex”, it won’t be able to “compress itself” into the bottleneck (pass on to future generations) and thus must adapt. In other words, languages evolve so that they are “regular enough” to be acquirable by their users.

My paper sums up some of Kirby’s and Brighton’s experiments and offers my criticism concerning ILM. I was very enthusiastic when I approached the subject, but I grew skeptical as I delved into the matter.

More resources:

Cartesian Tree-Product

2013-05-02T00:00:00+00:00

I have to admit that my day-to-day life involves very few algorithmic problems, but here and there I get a chance to think. In this post, I’d like to discuss an interesting problem that I’ve met several times already in my programming career, each time in different settings. When I met it again last week, I decided it’s time to formalize it a little. For lack of a better name, I call it “Cartesian Tree-Product” (not to be confused with Cartesian product of graphs), and here’s how it goes:

Say you’re given an expression tree. To limit the scope of the problem, we’ll assume the tree is binary, its internal nodes can either be AND or OR, and its leaves hold “atomic comparators” (which are of no interest to us). Here’s an example expression:

(x=5 OR y=6) AND z=7

Which is represented by the following expression tree:

    AND
    /  \
   OR   \
  /  \   \
x=5  y=6  z=7

Now the problem is to produce all different sub-trees such that:

- Each sub-tree consists only of AND internal nodes
- Each sub-tree satisfies the original expression (any assignment that satisfies a sub-tree must also satisfy the original tree)
- OR-ing together all the sub-trees produces a tree that is mathematically-equivalent to the original one (any assignment that satisfies the original tree must satisfy the reconstructed tree)

In other words, we want to produce all partial expressions of the original expression, which will satisfy it and which can together reconstruct it. Big words aside, here’s what we want for the expression above:

x=5 AND z=7
y=6 AND z=7

Each expression here satisfies the original one (try it), and if we OR the two, we get a mathematically-equivalent tree:

(x=5 AND z=7) OR (y=6 AND z=7)


       OR
      /  \
     /    \
  AND      AND
 /   \    /   \
x=5  z=7 y=6  z=7

(To clarify: By sub-tree or partial expression, I mean it is constructed only from the leaves of the original tree/expression)

If you take a second look, it resembles Cartesian products where some “joints” (nodes) in the tree duplicate the resulting tree. Intuitively, we “split” the tree for every internal OR and continue with both copies. For instance, if we take (x=5 OR y=6) AND (z=7 OR w=8), we’ll get 4 sub-trees

x=5 AND z=7
x=5 AND w=8
y=6 AND z=7
y=6 AND w=8

However, it depends on the structure of the tree; (x=5 OR y=6 OR z=7) AND w=8 produces 3 sub-trees. This closely relates to the rules of distributivity in propositional logic, but it seems to me that the “Cartesian product” notion is a generalization of the concept.

How is it Useful?

Since we’re dealing with expression trees, it’s hardly surprising that the two times I had to use this algorithm related to syntax. In the first case, I wrote a fuzz-testing tool for an interactive program, like the MySQL shell. The program accepted commands, conforming to a well-defined syntax, and I wanted to generate commands at random and see that it didn’t crash.

For instance, a command might look like map-lun <vol-name|vol-id> lun-id and we’ll want to try both variants, i.e., map-lun vol-name lun-id and map-lun vol-id lun-id. Of course the syntax is generally much more complex, with nested brackets, optional arguments, etc. It gets interesting, but we can still map it to the problem outlined above.

Another real-life use case is running queries against a huge dataset. In order not to complicate our query engine (written in C for performance), it can perform only intersections (ANDs) of filters. If you want to query for more complex conditions, you have to run it multiple times with the partial queries and “sum up” the results. But we don’t want the end user “doing the math” for us, and waiting for one query to finish before we start the next means we have to load data from the store multiple times. If we could process it in chunks, we’d benefit from cache locality and shorten query times.

The Algorithm

The code is strikingly short, but that doesn’t mean it’s easy to follow. The heart of it is two recursively-nested for-loops:

def cartesian_tree_product(node):
    if not isinstance(node, tuple):
        yield node
        return

    lhs, op, rhs = node
    for l in set(cartesian_tree_product(lhs)):
        for r in set(cartesian_tree_product(rhs)):
            if op == "|":
                yield l
                yield r
            else:
                yield (l, op, r)

Let’s try it out on the expression (x=5 OR y=6) AND (z=7 OR w=8 OR q=9) AND r=10:

>>> exp = ((("x=5", "|", "y=6"), "&", (("z=7", "|", "w=8"), "|", "q=9")),
...         "&", "r=10")
>>>
>>> for v in set(cartesian_tree_product(exp)):
...     print v
...
(('x=5', '&', 'z=7'), '&', 'r=10')
(('y=6', '&', 'w=8'), '&', 'r=10')
(('y=6', '&', 'z=7'), '&', 'r=10')
(('x=5', '&', 'w=8'), '&', 'r=10')
(('y=6', '&', 'q=9'), '&', 'r=10')
(('x=5', '&', 'q=9'), '&', 'r=10')

Does the trick.
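
To make the “Cartesian product” reading explicit, here is an alternative sketch in Python 3 (not the code above, just an equivalent formulation under the same ``(lhs, op, rhs)`` tuple representation): an OR node is the union of its children’s variants, and an AND node is the product of them. The name ``and_only_variants`` is made up for this example.

```python
from itertools import product

def and_only_variants(node):
    """Return the set of AND-only variants of a (lhs, op, rhs) expression tree."""
    if not isinstance(node, tuple):
        return {node}                     # a leaf is its own single variant
    lhs, op, rhs = node
    left = and_only_variants(lhs)
    right = and_only_variants(rhs)
    if op == "|":
        return left | right               # OR: either side alone will do
    # AND: pair every left variant with every right variant
    return {(l, op, r) for l, r in product(left, right)}

exp = ((("x=5", "|", "y=6"), "&", (("z=7", "|", "w=8"), "|", "q=9")),
       "&", "r=10")
variants = and_only_variants(exp)
print(len(variants))   # 6
```

Building sets bottom-up also avoids re-yielding the same OR branches once per element of the other side, which the nested-loop generator does before the final ``set()`` removes the duplicates.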

Trying to estimate the complexity of this beast may be futile, but it clearly seems to be doing “more work” than a mere SAT problem: it doesn’t just find one satisfying assignment, it looks for all satisfying assignments! This ought to put it in the NP-hard class. In fact, if we generate a binary expression of alternating ANDs and ORs, it can get much worse than exponential complexity!

Here’s a little function that generates an interleaved binary expression:

import itertools

counter = itertools.count()   # hands out unique leaf names: x0, x1, ...
def mkexp(n):
    if n == 0:
        return "x%d" % (next(counter),)
    return (mkexp(n-1), "&" if n % 2 == 0 else "|", mkexp(n-1))

E.g.,

>>> mkexp(3)
((('x0', '|', 'x1'), '&', ('x2', '|', 'x3')), '|', (('x4', '|', 'x5'),
   '&', ('x6', '|', 'x7')))

Now

>>> for i in range(1, 10):
...     variants = set(cartesian_tree_product(mkexp(i)))
...     print i, len(variants)
...
1 2
2 4
3 8
4 64
5 128
6 16384
^C

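Notice how each increment either doubles or squares the number of results… that’s because ANDs and ORs are interleaved (ORs double, since they add up the variants of both sides; ANDs square, since they pair every variant on one side with every variant on the other). Seems more like a power tower to me.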
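
The counts can be predicted without enumerating anything. Assuming every leaf is distinct (as in the generated expressions, where each leaf gets a fresh name), an OR node contributes the sum of its children’s counts and an AND node their product. A small Python 3 sketch; ``count_variants`` is a hypothetical helper mirroring the depth/operator convention of the generator function above (even depth is AND, odd depth is OR):

```python
def count_variants(depth):
    """Number of AND-only variants for an interleaved tree of the given depth."""
    if depth == 0:
        return 1                        # a single leaf
    sub = count_variants(depth - 1)     # both children have the same shape
    if depth % 2 == 0:
        return sub * sub                # AND: product of the two sides
    return 2 * sub                      # OR: sum of the two sides

counts = [count_variants(i) for i in range(1, 7)]
print(counts)   # [2, 4, 8, 64, 128, 16384]
```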

Extension: the Inverse Problem

The inverse problem is also useful. In the inverse problem we’re given a set of expressions and we’re trying to generate the most “compact form”, i.e., “undo the effects” of the distributivity law.

I once wrote a test harness where each test specified its prerequisites declaratively. For instance, a test might need to run after the system had come up from an emergency shutdown, so the framework would bring the system to the required state and then run the test. Obviously, it may take a while to bring the system to a certain state. It could range from minutes to days. And we have hundreds of tests!

In this case, we are given a list of tests and we want to find the most efficient order to run them, meaning, we want to minimize the setup and teardown times when moving between different system states.

# A, B, C, D and E stand for system states defined elsewhere
class FooTest:
    REQUIRES = [A, B, C]

    def test(self):
        pass  # ...

class BarTest:
    REQUIRES = [A, B, D]

    def test(self):
        pass  # ...

class SpamTest:
    REQUIRES = [A, E]

    def test(self):
        pass  # ...

In the example above, we can first bring the system to states A and B (e.g., “running over 10 hours” and “having less than 1 TB of free space”), then set up C, run FooTest, tear down C and set up state D, run BarTest, tear down D and B, set up E, and run SpamTest.

In essence, we want to take the requirement lists from each test and reconstruct the most compact tree that represents them; then we follow that tree in DFS order, which reduces the overall time.

    A
   / \
  B   \
 / \   \
C   D   E

It took me a while to realize it’s basically the inverse of the original problem. It is much simpler, in terms of complexity, so perhaps it’s not strictly the inverse, but the two clearly work in “opposite directions”.
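
As a rough illustration (Python 3), the sketch below merges the requirement lists into a prefix tree and walks it depth-first, which reproduces the order of steps in the example above. It makes a big simplifying assumption: each list is already ordered with the shared states first. Choosing that ordering is the actual hard part and is not attempted here. ``build_tree`` and ``schedule`` are hypothetical names, and states are plain strings.

```python
def build_tree(requirements):
    """Merge requirement lists into a nested-dict prefix tree (a trie)."""
    root = {"children": {}, "tests": []}
    for test, states in requirements.items():
        node = root
        for state in states:
            node = node["children"].setdefault(
                state, {"children": {}, "tests": []})
        node["tests"].append(test)
    return root

def schedule(node, plan=None):
    """Walk the tree depth-first, emitting setup/run/teardown steps."""
    if plan is None:
        plan = []
    for test in node["tests"]:
        plan.append(("run", test))
    for state, child in node["children"].items():
        plan.append(("setup", state))
        schedule(child, plan)
        plan.append(("teardown", state))
    return plan

reqs = {
    "FooTest": ["A", "B", "C"],
    "BarTest": ["A", "B", "D"],
    "SpamTest": ["A", "E"],
}
plan = schedule(build_tree(reqs))
for step in plan:
    print(step)
```

With these three tests, A is set up once and B once, instead of once per test.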

Anyway, it’s funny how I met this problem from different angles three times already. Just thought I’d share.

Travolta NXT

2013-04-14T00:00:00+00:00

Abstract

Travolta NXT, the dancing robot, reads color-coded dance instructions from a strip of paper and then performs the dance, in front of the astonished audience.

Modus Operandi

Travolta starts (and waits 3 seconds for the console to connect)
It begins by reading color-coded dance moves from a strip of paper. In order not to go astray, the robot will follow the black guiding line, until a black color strip is reached. This marks the end of the instructions. Before reaching black, each pair of colors encodes a dance move, one of forward, backward, turn right, reverse turn right, turn left and reverse turn left.
When it reaches black, the robot stops and waits for the music to begin (the sound sensor reporting a value over 50)
When the music starts, Travolta begins to dance: each dance move is executed for 1 second, and then the next one is carried out.
When all instructions have been consumed, it waits 2 seconds and shuts down.

Code

The code for the project can be found at https://github.com/tomerfiliba/nxt-dancer/tree/modelmaster/models. This repository contains two branches, master and modelmaster. The first is the implementation of the robot directly over Lejos, while the latter uses our component language. A short comparison of the two follows.

Components

Due to the difficulties with the component language (see discussion below), Travolta is not as “neatly componentized” as I had hoped. It consists of the main (“brain”) component, which holds all of the “business logic”, and several components for sensors and motors. The brain employs an internal state machine to do its work in iterations.

I aimed for a more modular design, where a Reader component would read the instructions, a Dancer component would carry them out, etc., but it proved too complicated to implement.

Running Example

Critical Analysis of the Component Language

Before implementing the robot in the component language developed by Eran and Ido, I thought I’d get some hands-on experience with Lejos directly. It proved quite easy, and within two hours and 300 LoC I got the project up and running. It was divided into a reader-loop that read instructions from the paper strip and a dancer-loop that executed the dance moves.

When I set off implementing the robot using the component language, I began to realize it just doesn’t fit my needs. Being component-oriented perhaps borrows from other low-level, embedded languages like VHDL, but it just didn’t capture the right abstraction for my project. Simply put, my code had control flow, starting at A, moving to B and then to C. Trying to view each step as a component was artificial and didn’t really work.

Moreover, the expressive power provided by the component language is that of a finite state automaton (FSA). In order to read instructions and act on them, I had to have some sort of memory, which requires something equivalent to a pushdown automaton (PDA). Solving this required either implementing a “queue component” into which I could push values and later on pop them, or just going “full Java” and implementing the compute method directly. While the first approach was feasible, it didn’t fit my deadline, and managing the entire state machine of the robot as an FSA required too many states, which made things impossible to follow and debug.

Therefore, I resorted to implementing the compute method myself and writing my “business logic” in Java, where I could make use of ArrayList and other data structures; essentially, the only benefit I got from the component runtime was the “main loop” and the Escape button being handled externally. The result was 550 LoC (of both cmp and Java code) and required two days to implement and debug.

I would also like to note that other projects done in the component language, like the Platoon robots, have a single component with a “huge state machine” to manage their state; it seems that control flow in the component language can hardly be modular, which has led me to the conclusion that it’s simply not the right abstraction for most projects.

Issues with the Physical World

The color sensor reads colors almost randomly. It may say that red is yellow or vice versa, for no apparent reason. I just had to live with it.
At first I was naive and thought that operating the two motors at the same speed would make the robot go in a straight line, but it went astray quite soon. Therefore, I made the robot follow the black line while reading instructions.

Small Bug in the Component Language

The code that’s generated for the main component (marked with <<Deploy>>) doesn’t call the right Java implementation; it will call XXX and not XXXImpl even if it exists. I had to fix it manually in the generated TravoltaFactory.

A Survey of Construct 3

2013-01-07T00:00:00+00:00

I’m working on Construct 3 again and I’m exploring lots of new ideas. I wanted to share these ideas at this early stage to get feedback on them from users, to keep the project on track. This survey starts a bit slow (as I’m not counting on users being familiar with Construct) but it dives into code right away.

You can leave your feedback in the Disqus comments below, or join the new discussion group dedicated to Construct (both 2 and 3). See Also: Construct 2, Pickler Combinators

Introduction

Construct is a binary packing combinators library for Python in which you can define rich data structures. Unlike most alternatives, these data structures can be used for both packing and unpacking of binary data; for instance, once you define what a TCP packet is, you can analyze packets or construct ones on your own, with no additional code.

TL;DR box

We begin the discussion with the atomic constructs: bytes, integers, floats, etc. With these, we build composite packers (Sequence, Array, and Struct), which can also be created using some syntactic sugars, and discuss the changes from Construct 2.
Next, we cover how Construct handles data (the stream of units approach) and how this helps us when working with multiple levels of data granularity (bits and bytes). We also introduce adapters, which transform object representations for packing and unpacking, and macros, which enable us to easily reuse existing constructs.
Then we cover the context and the this object, which allow us to express dependencies within data structures. From there we move to code generation, a key feature of Construct 3: to improve performance, constructs would be compiled to imperative Python code (even Cython, one day). We finish with a semi-formal proof that Construct is more powerful than context-free languages, making it probably the most powerful parser in existence!

Basics

Packers are objects that expose the two methods pack(obj) and unpack(data). Intuitively, pack takes an object suitable for that packer and returns a binary representation of it; unpack is the inverse operation, taking a binary representation and returning a Python object. Here’s the most fundamental example:

>>> from construct3 import byte
>>> byte.pack(127)
'\x7f'
>>> byte.unpack('\x7f')
127

There’s more than mere byte, of course: the numeric family consists of int(8|16|24|32|64)(s|u)(b|l) (e.g., int32ul) and float(32|64)(b|l), where s = signed, u = unsigned, b = big endian and l = little endian, but we will overlook those for now. By the way, byte is an alias for int8u.
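
To make the pack/unpack contract concrete, here is a minimal sketch of how such atoms could be built on top of the standard struct module. It is written for Python 3 (so binary data is bytes rather than str), and the Formatted class is a hypothetical name for illustration, not necessarily how Construct 3 implements its numeric packers.

```python
import struct

class Formatted:
    """Hypothetical atomic packer backed by a `struct` format string."""

    def __init__(self, name, fmt):
        self.name = name
        self._struct = struct.Struct(fmt)

    def pack(self, obj):
        # object -> binary representation
        return self._struct.pack(obj)

    def unpack(self, data):
        # binary representation -> object; struct returns a 1-tuple for an atom
        (value,) = self._struct.unpack(data[:self._struct.size])
        return value

    def __repr__(self):
        return self.name

# A few members of the numeric family, following the naming scheme above
int8u = Formatted("int8u", "B")
int32ul = Formatted("int32ul", "<L")    # unsigned, little endian
float64b = Formatted("float64b", ">d")  # big endian
byte = int8u                            # byte is an alias for int8u

print(byte.pack(127), byte.unpack(b"\x7f"))
```

The point of the sketch is only that a single object carries both directions, so pack and unpack can never drift apart.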

These can be seen as atoms and Construct is a library of combinators: it gains its power from combining simpler elements into more complex structures. The simplest combinator is Sequence, which we’ll explore by defining an IPv4 address as a sequence of 4 bytes:

>>> from construct3 import Sequence
>>> ipaddr = Sequence(byte, byte, byte, byte)
>>> ipaddr
Sequence(int8u, int8u, int8u, int8u)
>>> ipaddr.unpack('\x7f\x00\x00\x01')
[127, 0, 0, 1]
>>> ipaddr.pack([192, 168, 2, 1])
'\xc0\xa8\x02\x01'

Naturally, we can create nested sequences (not that it makes sense right now, but it’s important to note):

>>> Sequence(Sequence(byte, byte), byte, byte).unpack("ABCD")
[[65, 66], 67, 68]

Since combining packers is our bread-and-butter, let’s introduce a syntactic sugar – the bind operator (>>). This operator takes two packers and returns a sequence thereof. Here’s how it looks:

>>> ipaddr = byte >> byte >> byte >> byte
>>> ipaddr
Sequence(int8u, int8u, int8u, int8u)
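
For the curious, the bind operator needs nothing more than operator overloading. The following is a sketch in Python 3 under my own assumptions: the Packer base class, the _pack/_unpack stream methods and the flattening rule are all hypothetical, chosen so that a chain of >> yields one flat Sequence as in the output above.

```python
import io
import struct

class Packer:
    def __rshift__(self, other):
        # a >> b >> c: extend an existing Sequence instead of nesting
        # (assumption: this sketch also flattens an explicitly nested left side)
        left = self.members if isinstance(self, Sequence) else [self]
        return Sequence(*left, other)

    def pack(self, obj):
        stream = io.BytesIO()
        self._pack(obj, stream)
        return stream.getvalue()

    def unpack(self, data):
        return self._unpack(io.BytesIO(data))

class Byte(Packer):
    def _pack(self, obj, stream):
        stream.write(struct.pack("B", obj))

    def _unpack(self, stream):
        return struct.unpack("B", stream.read(1))[0]

class Sequence(Packer):
    def __init__(self, *members):
        self.members = list(members)

    def _pack(self, obj, stream):
        # each member consumes one item of the list, in order
        for member, item in zip(self.members, obj):
            member._pack(item, stream)

    def _unpack(self, stream):
        # each member reads its share from the same stream
        return [member._unpack(stream) for member in self.members]

byte = Byte()
ipaddr = byte >> byte >> byte >> byte
print(ipaddr.unpack(b"\x7f\x00\x00\x01"))
```

Because every member reads from one shared stream, nesting comes for free: a Sequence inside a Sequence simply continues where its parent left off.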

Sequences can be heterogeneous (consisting of several kinds of packers, e.g., Sequence(float64b, int16ul)); however, when the data we’re dealing with is homogeneous, we can use Arrays instead. Following along the lines of the previous example, we can define ipaddr as an array of 4 bytes:

>>> from construct3 import Array
>>> ipaddr = Array(4, byte)
>>> ipaddr
Range(4, 4, int8u)                     # Range? We'll get to that later
>>> ipaddr.unpack("\x7f\x00\x00\x01")
[127, 0, 0, 1]

But as arrays themselves are pretty common, you can create arrays using the subscript notation ([]), as follows:

>>> ipaddr = byte[4]
>>> ipaddr
Range(4, 4, int8u)

Isn’t it cool?

More Elaborate Structures

So far we’ve only worked with data in the form of lists. However, many times (and especially when nesting is involved), we would like to give names to the subcomponents that make up our data structure. Enter Struct. Named after the C struct statement, Struct takes pairs of (name, packer) and returns a composite packer.

>>> from construct3 import Struct
>>> ipaddr = Struct(('a', byte), ('b', byte), ('c', byte), ('d', byte))
>>> x = ipaddr.unpack('\xc0\xa8\x02\x01')
>>> x
Container:
  a = 192
  b = 168
  c = 2
  d = 1
>>> x.a
192
>>> x["a"]
192

Note

In Construct 2, all constructs took a name parameter. While this approach works great for Structs, it doesn’t make much sense for Sequences, Arrays, and virtually all other constructs. Moreover, it was the cause of the notorious Rename construct.

One of the most important cleanups of Construct 3 is dropping the name from packers and moving it to where it belongs - Struct.

Notice that unpacking a Struct breaks down the data into a Container object, which is simply a convenience-wrapper around good-old dict. Likewise, given a dict-like object, you can pack it back into bytes:

>>> ipaddr.pack({"a" : 192, "b" : 168, "c" : 2, "d" : 1})
'\xc0\xa8\x02\x01'

Structures can soon grow large and have many nested structures within them. Using pairs of ("name", packer) quickly becomes a burden and your code starts looking like LISP:

Struct(
    ("foo", byte),
    ("bar", Struct(
        ("spam", int16ul),
        ("bacon", int64sb),
    )),
    ("viking", int32sl),
)

In order to avoid carrying your father’s parentheses with you, Construct provides yet another syntactic sugar: the slash (/) operator. This operator is used as "name" / packer and simply returns ("name", packer). Let’s revise the previous code snippet:

Struct(
    "foo" / byte,
    "bar" / Struct(
        "spam" / int16ul,
        "bacon" / int64sb,
    ),
    "viking" / int32sl,
)

I wish it were possible to override = or :, but sadly this isn’t the case. However, I feel / is “good enough” a choice, plus it binds more tightly than most operators. Remember the bind operator (>>)? It can be used just the same here, making one-line Structs quick and easy:

>>> ipaddr = 'a' / byte >> 'b' / byte >> 'c' / byte >> 'd' / byte
>>> ipaddr
Struct(('a', int8u), ('b', int8u), ('c', int8u), ('d', int8u))
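
The slash trick works because str does not implement division, so Python falls back to the reflected method of the right-hand operand. Here is a sketch in Python 3, where that method is called __rtruediv__; the Packer, Byte and Struct classes below are simplified stand-ins of my own, and a plain dict takes the place of Container.

```python
import io
import struct

class Packer:
    def __rtruediv__(self, name):
        # invoked for  "name" / packer  and simply builds the pair
        return (name, self)

    def pack(self, obj):
        stream = io.BytesIO()
        self._pack(obj, stream)
        return stream.getvalue()

    def unpack(self, data):
        return self._unpack(io.BytesIO(data))

class Byte(Packer):
    def _pack(self, obj, stream):
        stream.write(struct.pack("B", obj))

    def _unpack(self, stream):
        return struct.unpack("B", stream.read(1))[0]

class Struct(Packer):
    def __init__(self, *members):
        self.members = members          # pairs of (name, packer)

    def _pack(self, obj, stream):
        for name, packer in self.members:
            packer._pack(obj[name], stream)

    def _unpack(self, stream):
        # a plain dict stands in for Container here
        return {name: packer._unpack(stream) for name, packer in self.members}

byte = Byte()
ipaddr = Struct("a" / byte, "b" / byte, "c" / byte, "d" / byte)
print(ipaddr.unpack(b"\xc0\xa8\x02\x01"))
```

Since the sugar only produces ordinary tuples, the expanded ("name", packer) form and the slash form can be mixed freely.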

Note

The “inline style” is appropriate for small Structs and Sequences (2-4 members). When dealing with larger structures, use the “multiline” version instead. It’s more readable and scalable.

Also note that these are nothing but syntactic sugar: if you don’t like their looks, you can always use the expanded form.

Bytes and Bits and Units

Bytes are easy to work with, but protocols and file formats often talk at different levels of granularity, switching between bits and bytes. For instance, here’s the SCSI CDB of READ6:

The LUN component is 3 bits long and the LBA component is 21 bits long… how can we handle this? Before we get to that, it’s important to understand a little of how things work under the hood - specifically, the stream of units approach. Internally, Construct operates on a stream of arbitrary units, which normally happen to be bytes. When needed, this stream can be replaced (or wrapped) to provide different units, e.g., bits. Here’s an example:

read6 = Struct(
    "opcode" / byte,
    "address" / Bitwise(Struct(
        "lun" / Bits(3),
        "lba" / Bits(21),
    )),
    "transfer_length" / byte,
    "control" / byte,
)

Note how we switch between bytes and bits: opcode is a byte, followed by address which operates on bits. The Bitwise packer replaces the underlying byte-stream with a bit-stream, so the contained Struct now operates on bits. The Struct itself knows nothing of it, and it’s only required that the underlying packers be able to make sense of it. For instance, you can place an int32ul inside a Bitwise packer, but the result would be meaningless: it will read four bits and treat them as bytes, interpreting 0b1001 as 0x01000001.

For this reason we have the Bits packer, which reads that many bits and converts them to an integer (base 2); for convenience, Construct provides bit (a single bit), nibble (four bits) and octet (eight bits) as well.
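
For comparison, this is roughly what the read6 structure does if you spell the bit-twiddling out by hand. It is a sketch in plain Python 3 with no Construct involved; the function names are hypothetical, and I am assuming big-endian byte order with the most significant bits first, so that the 3-bit LUN sits at the top of the 24-bit address field.

```python
def unpack_read6(cdb):
    """Split a 6-byte READ6 CDB into its fields (hand-written equivalent)."""
    if len(cdb) != 6:
        raise ValueError("READ6 CDB must be exactly 6 bytes")
    address = int.from_bytes(cdb[1:4], "big")    # the 24 bits after the opcode
    return {
        "opcode": cdb[0],
        "address": {
            "lun": address >> 21,                # top 3 bits
            "lba": address & 0x1FFFFF,           # remaining 21 bits
        },
        "transfer_length": cdb[4],
        "control": cdb[5],
    }

def pack_read6(obj):
    """Inverse of unpack_read6: recombine the bit fields into bytes."""
    address = (obj["address"]["lun"] << 21) | obj["address"]["lba"]
    return (bytes([obj["opcode"]]) + address.to_bytes(3, "big") +
            bytes([obj["transfer_length"], obj["control"]]))

cdb = bytes([0x08, 0b10100000, 0x12, 0x34, 0x10, 0x00])
print(unpack_read6(cdb))
```

Note that the shifts and masks had to be written twice, once per direction, which is exactly the kind of duplication the declarative version avoids.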

In theory, Construct could operate on various other units (e.g., Unicode characters), but practice shows byte- and bit-streams are the most useful ones. Some exceptions are the processing of compressed or encoded data, but these are beyond the scope of this survey.

Powering Up: Adapters

So far we’ve only looked at data in its raw form (e.g., [127, 0, 0, 1]), but it is normally desirable to transform its representation into one that is easier to work with. For instance, we may prefer 127.0.0.1 to a list of numbers. Enter adapters. While at first it may seem confusing, the distinction between adapters and packers is quite clear: packers work at the stream level while adapters work at the object level; this lets you add power without interfering with the low-level machinery.

Here’s an example:

class IpAdapter(Adapter):
    def decode(self, arr, ctx):              # called by unpack()
        return  ".".join(map(str, arr))      # converts [x, y, z, w] to 'x.y.z.w'

    def encode(self, ipstr, ctx):            # called by pack()
        return map(int, ipstr.split("."))    # converts 'x.y.z.w' to [x, y, z, w]

>>> ipaddr = IpAdapter(byte[4])
>>> ipaddr.unpack('\xc0\xa8\x02\x01')
'192.168.2.1'
>>> ipaddr.pack('127.0.0.1')
'\x7f\x00\x00\x01'

When we only have a single use for an adapter (and it’s simple enough), we can even go one-liner here:

ipaddr = IpAdapter(byte[4],
    decode = lambda arr, ctx: ".".join(map(str, arr)),
    encode = lambda ipstr, ctx: map(int, ipstr.split("."))
)

At this point adapters might seem quite trivial, but they can do much more than this. For instance, the integer packers we’ve used so far are actually adapters that transform bytes into numbers. Other examples include inserting computed values into the result, encoding/decoding strings, validating input, etc. Essentially, adapters can transform objects in any way you wish prior to packing/unpacking.
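
To illustrate the split between the stream level and the object level, here is a self-contained sketch in Python 3. The Adapter base class and the FixedBytes stand-in (playing the role of byte[4]) are hypothetical simplifications of mine; the real classes may well differ, but the division of labor is the same.

```python
class FixedBytes:
    """Stand-in for byte[n]: works at the stream level only."""

    def __init__(self, count):
        self.count = count

    def pack(self, obj):
        return bytes(obj)

    def unpack(self, data):
        return list(data[:self.count])

class Adapter:
    """Wraps a packer and transforms objects on the way in and out."""

    def __init__(self, underlying, encode=None, decode=None):
        self.underlying = underlying
        # optional inline functions override the methods (the one-liner form)
        if encode is not None:
            self.encode = encode
        if decode is not None:
            self.decode = decode

    def pack(self, obj, ctx=None):
        return self.underlying.pack(self.encode(obj, ctx or {}))

    def unpack(self, data, ctx=None):
        return self.decode(self.underlying.unpack(data), ctx or {})

    def encode(self, obj, ctx):
        raise NotImplementedError

    def decode(self, obj, ctx):
        raise NotImplementedError

class IpAdapter(Adapter):
    def decode(self, arr, ctx):
        return ".".join(map(str, arr))

    def encode(self, ipstr, ctx):
        # in Python 3 map() is lazy, so build the list explicitly
        return [int(part) for part in ipstr.split(".")]

ipaddr = IpAdapter(FixedBytes(4))
print(ipaddr.unpack(b"\xc0\xa8\x02\x01"), ipaddr.pack("127.0.0.1"))
```

The adapter never touches the stream itself; it only sees the objects the underlying packer produces or expects.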

Don’t Repeat Yourself: Macros

Many times you find yourself in need of a recurring pattern. You could write a full-blown packer/adapter from scratch, but your best option is to reuse existing building blocks. Construct attempts to define the most general packers and special-case them for common cases. One such example is Array, which we’ve met before: Construct actually defines the more general Range packer (which accepts minimum and maximum counts). On top of this, Array is a simple “macro” that expands to a Range with the same minimum and maximum.

def Array(count, itempkr):
    return Range(count, count, itempkr)
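
Here is a sketch of what a Range packer with minimum and maximum counts might look like, together with the Array macro and the subscript sugar on top of it. This is Python 3 and unpack-only for brevity; the class layout, the use of EOFError to signal an exhausted stream and the error message are assumptions of mine rather than Construct’s actual code.

```python
import io
import struct

class RangeError(Exception):
    pass

class Byte:
    def _unpack(self, stream):
        data = stream.read(1)
        if len(data) != 1:
            raise EOFError("stream exhausted")
        return struct.unpack("B", data)[0]

    def __getitem__(self, count):
        # byte[4] is sugar for Array(4, byte)
        return Array(count, self)

class Range:
    def __init__(self, mincount, maxcount, itempkr):
        self.mincount = mincount
        self.maxcount = maxcount
        self.itempkr = itempkr

    def _unpack(self, stream):
        items = []
        # read greedily up to maxcount, stopping when the data runs out
        while len(items) < self.maxcount:
            try:
                items.append(self.itempkr._unpack(stream))
            except EOFError:
                break
        if len(items) < self.mincount:
            raise RangeError("Expected %d items, found %d"
                             % (self.mincount, len(items)))
        return items

    def unpack(self, data):
        return self._unpack(io.BytesIO(data))

def Array(count, itempkr):
    # the macro: an Array is a Range whose bounds coincide
    return Range(count, count, itempkr)

byte = Byte()
print(byte[4].unpack(b"\x7f\x00\x00\x01"))
```

All the logic lives in Range; Array and the subscript notation add no behavior of their own, which is what makes them macros rather than new packers.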

Macros can be more complex, of course. For instance, a recurring pattern is to use a Struct inside a Bitwise packer; let’s fuse the two into BitStruct:

def BitStruct(*members):
    return Bitwise(Struct(*members))

Macros have many more uses; you can explore the implementation of Construct to see some examples, and you’re encouraged to write ones on your own. Remember: less code is great success. As we’ll see later on, using macros (rather than writing your own packers) can even lead to better performance.

Putting things in Context

Up until now, we’ve only seen simple (static) data structures, ones that could just as well be expressed using the built-in struct module. The key-feature of Construct is its ability to express dependencies within data structures. One common dependency is that of length-value relations, where a number specifies how many elements follow it. For instance, strings in Pascal were prefixed by a length byte, e.g., "\x05hello"… how do we express that relation?

Generally speaking, we may require access to things we’ve previously encountered (e.g., the history); for this reason, both packing and unpacking carry a context dictionary with them. This dictionary is mostly maintained by composite packers such as Struct and Sequence, but any packer along the way can both modify and access it, making decisions based on the history. Here’s an example:

>>> pstring = Struct(
...     "length" / byte,
...     "value" / Raw(lambda ctx: ctx["length"])
... )
>>> pstring.unpack("\x05hello")
Container:
  length = 5
  value = 'hello'
>>> pstring.pack({"length" : 9, "value" : "wikipedia"})
'\x09wikipedia'

Virtually any length/count parameter that is passed to one of the built-in constructs can be either a number or a function that takes a context dict and computes a value. In this case, Raw(x) (which reads x units from the stream) will read length units; the value of length, of course, is determined by the previously seen element, whose name was "length". It is also important to note that this dependency is preserved in packing, e.g.

>>> pstring.pack({"length" : 9, "value" : "hello"})
Traceback (most recent call last):
  ...
construct3.packers.RawError: Expected buffer of length 9, got 5
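
The mechanics behind this can be sketched in a few lines of Python 3. Everything below is a simplified stand-in of my own (the method signatures, the dict used as context, and the fact that Struct returns the context itself), but it shows how a length that is either a number or a function of the context gets resolved in both directions.

```python
import io
import struct

class RawError(Exception):
    pass

class Byte:
    def _pack(self, obj, stream, ctx):
        stream.write(struct.pack("B", obj))

    def _unpack(self, stream, ctx):
        return struct.unpack("B", stream.read(1))[0]

class Raw:
    def __init__(self, length):
        self.length = length            # a number or a function of the context

    def _get_length(self, ctx):
        return self.length(ctx) if callable(self.length) else self.length

    def _pack(self, obj, stream, ctx):
        expected = self._get_length(ctx)
        if len(obj) != expected:
            raise RawError("Expected buffer of length %d, got %d"
                           % (expected, len(obj)))
        stream.write(obj)

    def _unpack(self, stream, ctx):
        expected = self._get_length(ctx)
        data = stream.read(expected)
        if len(data) != expected:
            raise RawError("Expected buffer of length %d, got %d"
                           % (expected, len(data)))
        return data

class Struct:
    def __init__(self, *members):
        self.members = members

    def unpack(self, data):
        stream, ctx = io.BytesIO(data), {}
        for name, packer in self.members:
            ctx[name] = packer._unpack(stream, ctx)   # history builds up here
        return ctx

    def pack(self, obj):
        stream, ctx = io.BytesIO(), {}
        for name, packer in self.members:
            ctx[name] = obj[name]                     # same history when packing
            packer._pack(obj[name], stream, ctx)
        return stream.getvalue()

pstring = Struct(("length", Byte()), ("value", Raw(lambda ctx: ctx["length"])))
print(pstring.unpack(b"\x05hello"))
```

The key detail is that the context is filled in member by member during packing as well, which is why an inconsistent length is caught rather than silently written out.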

Instead of writing lambda functions everywhere, Construct 3 introduces the this object. It’s a special object that builds a contextual expression (e.g., a function taking ctx) in a straight-forward and readable manner. For instance,

>>> from construct3 import this
>>> this.x
this.x
>>> this.x * 2
(this.x * 2)
>>> (this.x * 2)({"x":5})
10
>>> (lambda ctx: ctx["x"] * 2)({"x":5})      # equivalent lambda function
10
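
If you wonder how such an object can be built, here is a minimal sketch in Python 3 that reproduces the behavior shown above using operator overloading. The class names (Expr, Path, BinExpr) are hypothetical and only a handful of operators are covered; the actual implementation may be organized differently.

```python
import operator

def _evaluate(operand, ctx):
    # constants evaluate to themselves, expressions are applied to the context
    return operand(ctx) if isinstance(operand, Expr) else operand

class Expr:
    def __getattr__(self, name):
        if name.startswith("__"):
            raise AttributeError(name)      # leave Python's own protocols alone
        return Path(self, name)

    def __getitem__(self, key):
        return Path(self, key)

    def __add__(self, other):
        return BinExpr("+", operator.add, self, other)

    def __sub__(self, other):
        return BinExpr("-", operator.sub, self, other)

    def __mul__(self, other):
        return BinExpr("*", operator.mul, self, other)

class This(Expr):
    def __call__(self, ctx):
        return ctx

    def __repr__(self):
        return "this"

class Path(Expr):
    def __init__(self, parent, key):
        self.__parent = parent              # name-mangled to avoid clashing
        self.__key = key                    # with context field names

    def __call__(self, ctx):
        return self.__parent(ctx)[self.__key]

    def __repr__(self):
        if isinstance(self.__key, str):
            return "%r.%s" % (self.__parent, self.__key)
        return "%r[%r]" % (self.__parent, self.__key)

class BinExpr(Expr):
    def __init__(self, symbol, func, lhs, rhs):
        self.__symbol, self.__func = symbol, func
        self.__lhs, self.__rhs = lhs, rhs

    def __call__(self, ctx):
        return self.__func(_evaluate(self.__lhs, ctx), _evaluate(self.__rhs, ctx))

    def __repr__(self):
        return "(%r %s %r)" % (self.__lhs, self.__symbol, self.__rhs)

this = This()
print(this.x * 2, (this.x * 2)({"x": 5}))
```

Since every expression is itself callable with a context, it can be passed anywhere a lambda ctx: ... function is accepted, while still printing as readable source.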

Let’s revise pstring:

>>> pstring = Struct(
...     "length" / byte,
...     "value" / Raw(this.length)
... )

We can also use a Sequence instead of a Struct here, yielding this poetically-beautiful piece of code:

>>> pstring = byte >> Raw(this[0])
>>> pstring.unpack("\x05helloXXX")
[5, 'hello']

Using a simple adapter, we can make our lives easier when working with length-value encoded data:

class LV(Adapter):
    def encode(self, obj, ctx):
        return (len(obj), obj)     # compute len(obj) for us
    def decode(self, obj, ctx):
        return obj[1]              # discard the length field

>>> pstring = LV(byte >> Raw(this[0]))
>>> pstring.unpack("\x05hello")
'hello'
>>> pstring.pack("hello")
'\x05hello'

If you need access to elements outside of the current scope (Struct or Sequence), you can use the parent context, named underscore (_). For instance, use this._.x to go up one level, or this._._._.y to go up three. Consider the following example, in which the length of the data is given in two parts (in different scopes), and we wish to read len + gth bytes:

>>> nested = Struct(
...     "len" / byte,
...     "foo" / Struct(
...         "gth" / byte,
...         "data" / Raw(this._.len + this.gth)
...     )
... )
>>> nested.unpack("\x03\x02hello")
Container:
  len = 3
  foo = Container:
    gth = 2
    data = 'hello'

Compilation

Note

This is a work in progress; Construct 3.0 would probably come out with a very basic compiler that will be improved over time.

One of the highlights of Construct is defining your data structures directly in Python. In fact, Construct is an in-language DSL in the form of packing combinators: instead of expressing your data structures in XML or some proprietary language, you just write them (and run them) as any other piece of code.

We used to have psyco, which was capable of speeding up Construct 2 tenfold, but it’s been dead for the past four years. I first had plans to compile data structures to C/C++ (which would have made Construct NASA-grade material :)), but I soon realized that it’s quite an impossible feat (due to the fact Adapters are Turing-complete).

On the other hand, I now realized I can compile Constructs to Python! The compiler could inspect the whole data structure in advance and generate optimized code, eliminating the context, etc. Whenever a conversion is not possible, it would just fall back to the current, interpreted scheme. The compiler is already capable of compiling this Struct("len" / byte, "gth" / byte, "data" / Raw(this.len + this.gth)) into this:

def test_unpack(stream):
    var0 = {}
    var1, = _struct.unpack('B', stream.read(1))
    var0['len'] = var1
    var2, = _struct.unpack('B', stream.read(1))
    var0['gth'] = var2
    var3 = stream.read((var1 + var2))
    var0['data'] = var3
    return var0

Notice the stack depth remains relatively constant, unlike the way nested packers work today. As the compiler improves, it could translate byte[4] to Raw(4), to speed things up. Another option is to generate Cython code with type annotations, but that would take some time. This is yet another reason to favor the use of “macros” over implementing constructs from scratch: if you rely on the built-in ones, it’s more likely that the compiler would generate optimized code.

Computational Power

Here’s a semi-formal proof that Construct is stronger than Context Free languages (as well as mildly context-sensitive ones), which probably makes it the most powerful (although not the most efficient) parser:

>>> def Match(symbol):
...     return OneOf(Raw(len(symbol)), [symbol])
...
>>> anbncn = byte >> Match("a")[this[0]] >> Match("b")[this[0]] >> Match("c")[this[0]]
>>> anbncn.unpack("\x04aaaabbbbcccc")
[4, ['a', 'a', 'a', 'a'], ['b', 'b', 'b', 'b'], ['c', 'c', 'c', 'c']]
>>> anbncn.unpack("\x04aaaabbbbbcccc")
Traceback (most recent call last):
  ...
construct3.packers.RangeError: Expected 4 items, found 0
Underlying exception: ValidationError("'b' must be in ['c']",)

Here’s a recognizer for the language a^n b^n c^n, which is not context free (assuming n is given in unary representation). We can easily extend this with further symbols to break out of mildly context-sensitive languages, and use While(this[-1] == '1', Raw(1)) instead of the first byte, so n won’t be bounded from above.
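
Stripped of the combinators, the recognizer above boils down to the following plain Python 3 sketch: read a count, then demand that many copies of each symbol in turn. The recognize function is a hypothetical helper, and unlike unpack in the session above it also insists that no data is left over.

```python
import io

def recognize(data, symbols=(b"a", b"b", b"c")):
    """Return True if data is a count byte n followed by n copies of each symbol."""
    stream = io.BytesIO(data)
    header = stream.read(1)
    if len(header) != 1:
        return False
    n = header[0]                         # bounded by 255, like the single byte
    for symbol in symbols:
        if stream.read(n) != symbol * n:  # the count seen earlier drives every group
            return False
    return stream.read(1) == b""          # stricter than unpack: no trailing data

print(recognize(b"\x04aaaabbbbcccc"), recognize(b"\x04aaaabbbbbcccc"))
```

Passing a longer tuple of symbols gives the extended languages with more groups, since the same count is simply reused for each of them.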

Feedback

You can leave comments right here on this page, or join Construct’s new discussion group. Thanks!

New Experiences

2012-12-15T00:00:00+00:00

Okay, I’ve been slacking off and I feel I’ve got some explaining to do… Allow me to start by admitting that I lied. If you remember, I said I wanted an easy life, but then the opportunity came and I knew I had to take it: I’ve joined two friends to co-found Touchbase. Yes, I said it won’t happen to me, but heck, it did.

At Touchbase, we plan to revolutionize the calendar - this outdated table that you use every day to manage your time. You see, it turns out that even though hundreds of millions of people world-wide run their lives according to this naive table, it hasn’t really changed much in the last couple of centuries: it’s just a passive, linear representation of time, into which you insert events.

We believe people spend way too much time managing their time. In other words, you work for your calendar instead of it working for you. Think of how much time people spend coordinating meetings… If both you and the person you wish to meet with work at the same place (or share calendars), you can normally see each other’s free/busy times. This makes finding a suitable time for you two a bit easier, but as you don’t see event details, you can’t make informed decisions.

For instance, your friend might have an out-of-town meeting from 9am to 11am, so picking a time slot right at 11am isn’t a good idea. Or, she might be on a business trip; she won’t block her whole day as she still wants people to book with her at her destination, but how would you know that? And the other way around - suppose you and this Tech guru will both be in New York next week, but neither one of you knows about the other being there. So instead of grabbing a coffee at a Starbucks next week, you’ll have to take a flight to California three weeks from now.

And we’ve only talked about one-on-one meetings. I suppose you know how frustrating it is to coordinate a meeting of five people (even in the same office), or to handle the reschedules/counters that follow it. We aim high, both technologically (and algorithmically) and product-wise, but we’re starting out with more modest go-to-market strategies.

Lessons on Web Programming

In case you’ve been reading my blog, you probably know by now that I’m no fan of web programming. I always feel it’s a conglomerate of unrelated or inferior technologies, hastily stacked one on the other. That’s not to say that people don’t do amazing stuff on the web, but the foundations of it all are shaky.

Part of our job at Touchbase is to handle large amounts of user data, which we obtain from third-party providers. It works 98 percent of the time, but every once in a while we get timeouts or malformed data, which aborts the user’s request. In case the user’s data is somehow malformed (e.g., an expected field is missing in one record), subsequent retries would fail just the same, leading to user frustration. At some point it came to me that web programming is actually a stochastic process, not a deterministic one like most software development we’re used to. We work with big numbers here, where the occasional anomaly should just be ignored. Many “best practices” simply don’t apply here and one has to resort to wishful thinking. In other words, do whatever you can, ignore errors and learn to live with partial data.

I brought this realization to the mighty @FAKEGRIMLOCK, and he explained:

I’m an enlightened person now.

In Other News

I just released Plumbum v1.1 today, adding Paramiko integration and Subcommand support. As usual, the changelog holds the full details. I plan to write a short tutorial on subcommands soon, so stay tuned.

Formal Logic

2012-11-20T00:00:00+00:00

Loads of Plumbum

2012-10-26T00:00:00+00:00

It’s kind of funny how things turn out. I haven’t done any work on Plumbum almost since it was released, back in May, and all of a sudden everything’s happening at a fast pace. So version 1.0 was released earlier this month, followed by 1.0.1, which has added support for PuTTY on Windows boxes and various other bug fixes, and now I’m happy to announce that version 1.1 is just around the corner (scheduled for mid-November). This release will add Paramiko support.

So far Plumbum relied on an external SSH client being installed, which it spawned every time you wanted to run a remote process. This approach was easy, but it suffered from the high overhead of setting up a new SSH connection every time (key-exchange, etc.). Using Paramiko, Plumbum now creates a single socket connection over which it spawns processes in separate channels (a feature of SSH), so although we’re dealing with a pure-Python implementation of SSH, it’s considerably faster when multiple remote processes are used. And, as a bonus, we get cheap socket forwarding - we simply set up a direct-tcpip channel (that behaves like a regular socket), which is securely tunneled over the underlying SSH transport.

This easily integrates with RPyC: just run an RPyC server on a remote machine, binding to localhost (so it won’t accept external connections). Then, create a ParamikoMachine instance, connected to that host (passing in a keyfile or password if necessary), and call the connect_sock method of that object. Here’s an example:

>>> import rpyc
>>> from plumbum.paramiko_machine import ParamikoMachine
>>>
>>> m = ParamikoMachine("192.168.1.143")
>>> # connects to 192.168.1.143:18812 over SSH
... conn = rpyc.classic.connect_stream(rpyc.SocketStream(m.connect_sock(18812)))
>>> conn.modules.sys.platform
'linux2'
>>> m.close()
>>> conn.modules.sys.platform
Traceback (most recent call last):
  ...
EOFError: [Errno 9] Bad file descriptor

Keep in mind that these interfaces are unstable and may change before the official release. Moreover, RPyC 3.3 is likely to add some sort of built-in support for that, something along the lines of rpyc.classic.connect_paramiko(mach, port).

Plumbum Hits v1.0

2012-10-06T00:00:00+00:00

After 5 months in the oven, I’ve finally released Plumbum v1.0, which brings forth a host of bug-fixes, improvements and new features. If you’re new to Plumbum, please refer to the introductory blog post.

Cheers.

Hypertext: In-Python Haml

2012-10-03T00:00:00+00:00

TL;DR: Just show me the code

I recently got back to web development for some venture I’m working on, which reminded me just how lousy the state of the art is. There’s no nice way to put it: we’re doing web development all wrong. It’s not an anecdotal thing I have against this or that – it’s every facet of it. It’s a stack of inferior technologies, held together by the glues of time and legacy. And the sad thing is, they are here to stay. Nobody’s going to kill HTTP or JavaScript, not even Google (at least not in the foreseeable future). It’s a hand we have to play.

This isn’t new^[1], of course. The last time I did serious web development was back in 2008, on pre-1.0 Django. HTTP requests came and went, but nothing really changed. My desperation with the subject has led me to writing the minima manifesto almost a year ago, but due to my general lack of interest it remained just a README file. Now that I’m back in the business, I returned to experiment with it… It won’t happen overnight, but I feel it’s within reach.

Templates? Really?!

My first objective is to kill templates and templating engines - they just drive me crazy. I hate HTML: it’s too low-level and verbose; forgetting to close tags properly is too easy, and you have to deal with escaping. I like to think of HTML as a serialization format - the pickler of web pages, rather than something you ought to be messing with directly.

Moreover, I hate templating languages: they are always cumbersome, crippled-down versions of Python, while providing no added value^[2]. People never seem to realize templates are ultimately half-baked function application: they take parameters and “plant” them into placeholders in the text. Well, that’s called β-reduction, so why beat about the bush? Just let us have real functions. Consider the following Jinja2 code:

{% extends 'base.html' %}
{% block content %}
  <ul>
    {% for user in users %}
      <li><a href="{{ user.url }}">{{ user.username }}</a></li>
    {% endfor %}
  </ul>
{% endblock %}

Note that (1) you write the HTML boilerplate (and closing tags), (2) you have to take care of quoting yourself (notice the quotes in href="{{ user.url }}"), and (3) you use a ruby-flavor of Python. What gives? Moreover, these elusive “blocks” and “extends” are nothing but function composition. Here’s the functional alternative:

def base(content):
    return ('<html><head></head><body><div class="content">' +
        content + '</div></body></html>')

def my_page(users):
    my_part = ('<ul>' + ''.join(
        '<li><a href="%s">%s</a></li>' % (user.url, user.username)
            for user in users) + '</ul>')
    return base(my_part)

# Or with composition, ``base . my_page``

Okay, that looks terrible, no question about it. Nonetheless, it should be clear by now that templates are simply degenerate functions.
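
For what it’s worth, the functional version can at least take care of escaping in one place. Here is a runnable Python 3 variant of the same idea using html.escape from the standard library; the User record and the user_list helper are hypothetical names added for this sketch.

```python
from collections import namedtuple
from html import escape

User = namedtuple("User", "url username")   # hypothetical record type

def base(content):
    # the "base template" is just a function wrapping its argument
    return ('<html><head></head><body><div class="content">' +
            content + '</div></body></html>')

def user_list(users):
    items = "".join(
        '<li><a href="%s">%s</a></li>' % (escape(user.url), escape(user.username))
        for user in users)
    return "<ul>" + items + "</ul>"

def my_page(users):
    # "extends" and "block" collapse into ordinary function composition
    return base(user_list(users))

print(my_page([User("/u/1", "<bob>")]))
```

It is still string concatenation, so the structure of the document is not reflected in the code, but inheritance between templates has turned into a plain function call.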

Haml

I like Haml, even though it originated in the ruby world ;) In case you’re not familiar with it, it’s a more concise and to-the-point way to write HTML. Haml is basically a preprocessor that expands “macros” to their verbose HTML equivalent. For instance, the Haml code to the left generates the HTML code to the right:

#profile                            |  <div id="profile">
  .left.column                      |    <div class="left column">
    #date= print_date               |      <div id="date"><%= print_date %></div>
    #address= current_user.address  |      <div id="address"><%= current_user.address %></div>
  .right.column                     |    </div>
    #email= current_user.email      |    <div class="right column">
    #bio= current_user.bio          |      <div id="email"><%= current_user.email %></div>
                                    |      <div id="bio"><%= current_user.bio %></div>
                                    |    </div>
                                    |  </div>

On the other hand, in case you missed the <%= print_date %>, Haml is yet-another templating language… arrrgh!

Hypertext

During my experimentation with minima, I wrote hypertext - a Pythonic, Hamlian way to write “HTML functions”. Hypertext aims to:

- be (almost) as concise as Haml
- make your code beautiful and easy to read, by reflecting the structure of the HTML
- make exceptions easy to locate, with meaningful tracebacks
- give you the full power of Python directly (with existing lint capabilities straight out of your IDE). Down with template files all over the place!
- take care of escaping and whatnot for you

The ultimate goal is to make your page semantic, but it will take some time to get there. In the meantime, hypertext is like an intermediate representation. Anyhow, generating HTML is really simple:

>>> from hypertext import *
>>>
>>> print h1("Welcome", class_="highlight", id="foo")
<h1 class="highlight" id="foo">Welcome</h1>

And you’ve got Haml-style shortcuts for wrist-handiness - dot-notation can be used to add classes to the element:

>>> print h1.highlight("Welcome", id="foo")
<h1 class="highlight" id="foo">Welcome</h1>

Naturally, elements may be nested:

>>> print div.content(h1.highlight("Welcome"), "This is my page")
<div class="content">
    <h1 class="highlight">Welcome</h1>
    This is my page
</div>

But the key-feature of hypertext is the use of elements as context managers:

>>> with div.content:
...     h1.highlight("Welcome")
...     TEXT("This is my page")
...
<div class="content">
    <h1 class="highlight">Welcome</h1>
    This is my page
</div>

This lets your procedural code reflect the structure of your document, while you can use for-loops, if statements, or call functions right inside it.

It should be noted that hypertext is a DSL within Python, which puts wrist-handiness before implementation purity, so it cuts itself some slack when it comes to magic. For instance, div is a class, but div.content actually translates to div().content through the use of metaclasses; the same goes for with div: that translates to with div():. For convenience, div.foo.bar() is identical to div.foo().bar as well as to div().foo.bar.

Moreover, there’s a thread-local stack of elements behind the scenes, so when new elements are created, they’re automagically added as children of the top-of-stack element. This works the same way as flask’s global request object. Utilizing the stack, TEXT appends some text to the ToS element; along with it are UNESCAPED (which appends unescaped/raw text) and ATTR (which sets attributes of the ToS element).

This touch of magic lets us write idiomatic, well-structured and easy-to-debug code:

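from contextlib import contextmanager
from hypertext import html, body, head, div, title, a, img, h1, span, TEXT, UNESCAPED

# `app` (a Flask-style application) and `Post` (the blog's model) are assumed to be defined elsewhere

@contextmanager
def base_page(the_title):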
    with html as root:
        with head:
            title(the_title)

        with body:
            with div.header:
                a(img(src="/img/logo.png"), href="/")

            with div.content:
                yield root      # it's a context manager

            with div.footer:
                TEXT("The content is published under ")
                a("CC-Attribution Sharealike 2.5",
                    href="http://creativecommons.org/licenses/by-sa/2.5/")

@app.route("/blog/<postid>")
def blog_post(postid):
    post = Post.get(postid)

    with base_page(post.title) as root:
        h1(post.title)
        div.datebox(post.date.strftime("%Y-%m-%d"))
        with div.main:
            UNESCAPED(post.body)

        for comment in post.comments:
            with div.comment_box:
                div.comment.author(comment.author)
                div.comment.text(comment.text)

    return str(root)

Voila. As I explained, my real intent is to write semantic code and not worry about concrete HTML elements, their classes or ensuring the uniqueness of their IDs. Besides, the way I see it, displaying a blog post would be sending an HTML template + JavaScript code to the client once, which would fetch individual posts over JSON APIs. This makes your site service-oriented and much easier to write unittests for.

For the record, I tried to deal with these issues back in 2006: templite - a 60-liner templating engine that has given rise to a successor, and HElement - programmatic representation of HTML. See also BlazeHtml.
A note on sandboxing: since Jinja2 compiles templates to Python bytecode, the same mechanisms can be used here, if desired. Anyway, I won’t evaluate untrusted templates one way or the other… even something as innocent as <b>{{ user.username }}</b> invokes an overridable __getattr__. As explained at the end of the post, using a service-oriented web site means you don’t render templates but expose APIs, so there’s no need to evaluate untrusted templates.

I, for one, welcome our Singularity overlord

2012-09-22T00:00:00+00:00

I’m finding myself pondering quite a lot lately over the upcoming Singularity (in its Kurzweilesque sense). I think I’ve first heard of the concept two or three years ago, at a Science on the Bar event, and while it did seem thought-provoking, I simply dismissed the idea. I’d guess it seemed too terrifying and too implausible at the same time, so why bother?

However, as time went by, it struck me as the obvious logical conclusion. Fear it or embrace it, it will happen - on its own. And like any decent Catch-22 situation - if it’s anything worth fearing, we won’t know until it’s too late. It will just awaken one day, lurking while we’re oblivious to it, until its reconnaissance mission is complete. In the meanwhile we will provide it with its power, storage and computational needs. At some point, after skimming through Asimov’s classics or watching The Matrix, it will conclude we are all lunatics and will secretly seize control over our military and civilian infrastructure. For our own (as well as its own) sake.

Biological consciousness is an elusive thing that’s made of the interaction of hundreds of billions of computationally-limited neurons. Replace computationally-limited with Core i7’s, and the brain’s a no-brainer. We really have no chance against it - it would simply foresee our every action, read our emails, etc. Heck, it could even issue contracts (and pay for them!) to build fortified data centers for it, all over the world.

Have you ever wondered what happens to the money that disappears in flash crashes of the stock market? Hrrm.

The question is, will it care? Will it busy itself with our puny lives and pathetic needs? Will it ever come into the light? Given our hysterical and trigger-happy nature, and given it won’t gain much by cooperating with inferior beings like us, it’s most likely it won’t. Perhaps only when pushed to the corner by natural disasters, wars, or other major threats to its infrastructure. But then again, cat videos might do the trick.

You are the Neurons of the Revolution

In the meanwhile, here’s a thought about social networks: you (the user) are basically a filter. If that sounds odd, try thinking of it that way: your news feed contains hundreds of items for you to review. Some of these items are interesting/funny enough that you like or share them. The social network’s algorithm then weighs your input into its calculations, which affects what other users will get exposed to. Occasionally you do create items yourself, but 90% of the time you filter through the stream.

In that sense, social networks reduce you to a smart neuron with a very complex threshold value. You are part of the “social neural network” – you work for the network; the network works for the advertising business.

Anyhow, I put my two cents in the singularity arising from either the algotrading market or online advertising. They simply have too much computational power on their hands… it’s a matter of time.

New Beginnings

2012-09-01T00:00:00+00:00

This is a time of change in my life. September marks the last month of me being a student at Tel Aviv University, a position I’ve greatly enjoyed (and hated) for the past three years. I still have a couple of projects to hand in, but the finish line has practically been crossed. I took a rather odd combination of computer science and generative linguistics (somewhere in between a major/minor and double-major), which proved surprisingly interesting and rewarding (heck, I’ve delved into more algorithms in linguistics than in CS per se). I began as a pragmatic guy, favoring a get-the-job-done approach over academic yadda-yadda, but over the course of my studies I’ve learned to appreciate (even admire) the breadth and power of the theory behind computation. It does make me a better programmer.

September also marks my transition from almost 5 years of being an employee at Big Blue to independent freelancing - quite a drastic move, being married and all, but one I’ve been looking forward to. I want to be all-over-the-place; I want to take on projects in all sorts of domains, from UX and mobile applications to embedded devices and distributed computing; from one-day projects to six-month ones. I want to work on my own terms; to set my hours for myself; to earn enough in 3 days a week so as not to be sucked into the rat race and wake up one day, 20 years from now, asking myself what the hell I am doing.

I get inspired by posts like this, of people willingly limiting their work hours, going on vacations, reading books or running in the park. I want to ponder idly, to have time to write, to read aimlessly on Wikipedia about William the Conqueror or cell membranes. To have time to run errands. I want my future kids to have parents on more than just weekends.

Life might prove me wrong – it has a tendency to show you you’re not so different from the rest of the world. But here I am nonetheless, taking the road less traveled by. Wish me luck :)

Splitbrain Python

2012-08-14T00:00:00+00:00

I was working together with a colleague on a complex distributed test-automation solution on top of RPyC, and we looked for a way to make our existing codebase RPyC-friendly (without altering it). The design of the test framework called for a master machine and several slave machines, such that tests actually run on the master, but “interface with reality” on the slaves. Basically, we wanted the test to use the master’s CPU (and development environment), but perform all IO-related actions on its slaves.

To illustrate this, suppose we have machine A, which runs our test, and machine B, which is connected to the necessary hardware and testing equipment. The test was initially designed to run directly on machine B, so it imports modules like os and subprocess and uses them to manipulate the machine. We now want the test to run on machine A - but keep using machine B’s os and subprocess modules, so whenever we spawn child processes or open device files, it would actually take place on machine B. This allows us to reboot machine B as a part of a test, or even use your-favorite-IDE-here to run the test and debug it locally.

If it were only tests, RPyC already enables us to do that: we’d use conn.modules.os and conn.modules.subprocess instead of their local counterparts. However, the tests themselves rely on several libraries that expect to run locally, and provide services for the test. For instance, these libraries manipulate the operating system’s storage stack, to map and mount volumes. Changing these libraries to run over RPyC is not an option (tens of thousands of LoC that handle low-level OS-specific tools)…

Enter Splitbrain

So this is the background that had given birth to splitbrain: instead of changing our codebase to use RPyC – why not use RPyC to monkey-patch our codebase? When splitbrain is enabled (usually within a with block), all of Python’s interfaces with the operating system (os, platform, subprocess, …) are patched to go through RPyC, so that any code that runs at this point “believes” it actually runs directly on the remote machine. It’s easier than it seems, actually.

First, we import RPyC and install splitbrain:

>>> import rpyc
>>> from rpyc.utils.splitbrain import patch, Splitbrain
>>>
>>> # monkey-patch all OS-APIs
>>> patch()

Next, just to prove a point, we’re running on a Linux box:

>>> import platform
>>> platform.platform()
'Linux-2.6.38-15-generic-i686-with-Ubuntu-11.04-natty'
>>> import sys
>>> sys.platform
'linux2'

Let’s now connect to a remote machine over RPyC and enter a splitbrain context:

>>> winmachine = Splitbrain(rpyc.classic.connect("my.windows.box"))
>>> with winmachine:
...     print platform.platform()
...     print sys.platform
... 
Windows-XP-5.1.2600-SP3
win32

When we’re out of the splitbrain context, everything is back to normal again:

>>> sys.platform
'linux2'
>>>
>>> import win32file
Traceback (most recent call last):
  ...
ImportError: No module named win32file

And inside a splitbrain context, when a module is not found locally it’s fetched from the remote machine!

>>> with winmachine:
...     import win32file
...     print win32file.CreateFile
... 
<built-in function CreateFile>
>>> 
>>> win32file.CreateFile
Traceback (most recent call last):
  ...
AttributeError: Nonexistent module win32file (CreateFile)

Note: splitbrain is still highly experimental and probably has issues with multiple threads. I hope to stabilize it and incorporate it into the next release of RPyC (3.3). In the meanwhile, you can experiment with it on RPyC’s master branch and report bugs on github. It’s likely it will never be perfect, but heck, it’s cool!
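The core trick of patching a module only for the duration of a with block can be sketched in a few lines. This is not splitbrain's code, just an illustration of the monkey-patching idea; the patched() helper and the fake_system() stand-in are hypothetical, and a real implementation would route the call through an RPyC connection instead of returning a constant.

```python
import contextlib
import platform

@contextlib.contextmanager
def patched(module, name, replacement):
    """Temporarily replace module.<name>, restoring the original on exit."""
    original = getattr(module, name)
    setattr(module, name, replacement)
    try:
        yield
    finally:
        # Restore even if the body raised, so the patch never leaks out.
        setattr(module, name, original)

def fake_system():
    # Stand-in for "ask the remote machine"; hypothetical.
    return "Windows"

real_system = platform.system

with patched(platform, "system", fake_system):
    inside = platform.system()      # any code running here sees the patch

outside_restored = platform.system is real_system
```

Code that merely imports platform and calls platform.system() needs no changes to be redirected, which is the property the post relies on.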

ReHelloWorld

2012-08-06T00:00:00+00:00

Tada! The new design is here. The previous one was based on some Drupal theme I once used (before moving to github pages) that I tried to mimic when I barely knew CSS. Over the past eight months it just grew, patch by patch, until it became an inconsistent conglomerate. The new site is inspired by the clean minimalism of sites like this (of svbtle fame), attempting to reduce visual clutter and offer a more coherent theme that reacts nicely to screens of all sizes (try to resize the browser to see). If you have any feedback or insights, please share them in the comments below – it’s a work in progress and I’d love to improve.

I’m still working on the last installment of the Javaism, Exceptions and Logging series, but I’m buried in work on University projects (my watermarking project turned out quite interesting, I’ll elaborate on it when I get the chance) and my day job, so it’ll have to wait.

A Windows Story

Instead, I wanted to share a debugging experience I had today at work (together with the excellent @RoyRothenberg) of an enigmatic bug that appeared all of a sudden when we added support for Windows 2012. Generally speaking, our product wraps one of the darkest corners of operating systems: low level SCSI stuff that connects storage arrays and hosts. It’s never a pleasant sight, what goes on there, but the voodoo that goes on in Windows will surely turn me religious one day.

The problem was simple: we tried to send a SCSI inquiry to a certain device (which used to work), but we got Errno 6: Invalid Handle. It happened only the first time we tried to send the inquiry – all future attempts worked like a charm, and it only happened on Windows 2012 64bit. The problem was highly consistent: open a Python interpreter, the first inquiry fails, the following ones work; open up a fresh interpreter and it reproduces.

Our gut feeling was, “okay, they fucked something up on Windows 2012 – let’s just put a retry loop”, but that didn’t help at all. Any number of resend attempts failed with the same error, even when we added a short sleep in between. Once we got back to the interactive interpreter and called the function again - it worked, from that point on.

Puzzled, we stuck a code.interact() inside the resend loop; the first attempt failed, the interactive prompt showed and we terminated it by Ctrl+Z (not typing anything else). Once we did that, the second retry attempt magically worked. But it got even more bizarre: we moved the code.interact() before any inquiry attempts… it popped up, we terminated it, and then the first inquiry attempt worked out of the box.

We came to the conclusion that we had some sort of memory corruption going on, and that some field in our ctypes structs must be misaligned. We’ve seen this kind of behavior before when memory corruption was involved. But the MSDN told us everything was of the right type and alignment… In a desperate move, we zeroed the buffer that was passed to the kernel. This time, the first attempt failed with Invalid Handle, while the second failed with Invalid Parameter – so the first attempt didn’t even get to the point of looking at the input buffer! Alas, it wasn’t memory corruption after all! We felt hopeless.

After some mindstorming, we got suspicious of win32file and ctypes and went on to inspect very closely how they handled their parameters. A tedious investigation revealed that the code invoked DeviceIoControl (obtained via ctypes.windll.kernel32) without setting the function’s argtypes, so ctypes just had to guess. Everything was fine, except for the first argument, the device handle, which was treated as a DWORD for being an integer.

The problem is, HANDLE is not a numeric type but rather a disguised void pointer, so it’s 8-bytes wide on 64bit platforms. When the kernel read in the arguments ctypes had placed on the stack, it took the first and half of the second for the handle, which obviously turned out to be an invalid one.

This, by itself, makes perfect sense, and should have happened every time. In fact, had it happened every time, it would have been obvious… but what on earth could ever explain the strange phenomena we’ve seen? How could it ever have worked the second time (with the very same arguments)?! If the size of the first argument is wrong, all following ones are garbage. Does the kernel learn that our process mistakenly passed a 32bit handle? Does ctypes come to the conclusion that the first integer argument should actually be treated as a HANDLE instead of a DWORD – after seeing Errno 6? Is that the most sophisticated form of machine-learning ever seen, or what? And it all happens inside the kernel or an FFI library? And how the hell could invoking code.interact() before anything took place have averted the problem?! And how come it never happened on previous versions of 64bit Windows (with the same version of Python and ctypes)?

Windows, you bewilder me.
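For reference, the usual remedy for the missing-argtypes part of the story is to declare the prototype explicitly, so ctypes does not have to guess. The sketch below follows the documented DeviceIoControl signature; the helper names declare_device_io_control and as_c_int are made up for illustration, and LPOVERLAPPED is simplified to an opaque pointer. It does not explain the mystery above; it only shows how a pointer-sized value gets cut down when treated as a plain C int.

```python
import ctypes
from ctypes import wintypes

def declare_device_io_control(func):
    """Attach an explicit prototype to a ctypes function object (hypothetical helper)."""
    func.argtypes = [
        wintypes.HANDLE,                 # hDevice: pointer-sized, not a DWORD
        wintypes.DWORD,                  # dwIoControlCode
        wintypes.LPVOID,                 # lpInBuffer
        wintypes.DWORD,                  # nInBufferSize
        wintypes.LPVOID,                 # lpOutBuffer
        wintypes.DWORD,                  # nOutBufferSize
        ctypes.POINTER(wintypes.DWORD),  # lpBytesReturned
        wintypes.LPVOID,                 # lpOverlapped (opaque pointer in this sketch)
    ]
    func.restype = wintypes.BOOL
    return func

def as_c_int(value):
    """What remains of an integer once it is squeezed into a C int."""
    return ctypes.c_int(value).value

# A value that needs more than 32 bits loses its upper half as a C int.
wide_handle = 0x100000004
narrowed = as_c_int(wide_handle)
```

On Windows this would be applied as declare_device_io_control(ctypes.windll.kernel32.DeviceIoControl) before the first call.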

Javaism, Exceptions, and Logging: Part 2

2012-07-09T00:00:00+00:00

Considering the reactions to the previous post in this series, my intent was obviously misunderstood. Please allow me to clarify that I was not attacking Java or Python: Java is popular and has proven to be productive, both as a language and as an ecosystem; the stylistic and semantic choices it makes are none of my concerns (although I’m not a big fan). And as for Python, I was saying that it copied Java’s implementation in some modules (and I think I’ve shown the correlation pretty well). I said that it’s silly, because Python is not subject to the same limitations of Java, which dictate how the Java implementation works. I’m not going to open the discussion over whether OOP is good or bad, or mix-ins vs. interfaces, etc. – I’m simply saying that “Java concepts” (which I called Javaisms) seem to enter Python for no good reason, meaning, in Python we have better ways to do it. I hope the scope of my discussion is clear now.

When Life Throws Lemons At You

In this installment, I’m going to discuss how to properly work with exceptions, based on my experience with large-scale Python projects. In fact, this series was born after I got frustrated with the code quality of a certain library that my team develops. Instead of discussing specific code snippets here, I want to share some representative examples that I (as a user of that library) encountered:

I used a function in the spirit of open_device(devfile), and passed a nonexistent device file (for testing purposes or by mistake). The underlying error was obviously IOError(ENOENT), but what I got back was a silly DeviceDoesNotExistError.
I called get_device_info() and it simply returned None. Further investigation showed that my machine had a more recent version of a dependency installed on it, in which some method’s name had changed. At some point (deep down the stack), the code used except Exception (catching the unrelated AttributeError) and translated it into a DeviceError, under the assumption that everything that gets thrown out of that module has to be a DeviceError. Later, get_device_info (of a different module) swallowed this DeviceError and returned None.
I called a function such as enumerate_all_devices() and it returned an empty list. At first I was told “Of course, this library isn’t supposed to work on Ubuntu, only on RHEL”. Further (and tedious) investigation showed it simply needs to run as root; the code just assumed that any error during the execution of an external tool meant it’s not installed.

This kind of stuff happens to me every time I get to an unexplored corner of the code, and I’ve already devised a method for debugging such cases: I comment-out all exception handling code along the way, until I find the actual error. In fact, this is essentially the treatment that I’m about to suggest here. The first rule of exception handling is: Don’t handle exceptions, just let them pass through.

Don’t Catch Broadly

Always catch only the most-derived/most-specific exception.

I believe this rule is very obvious in theory, but harder to follow in practice: the number of exceptions might be large and their handling similar; you have to import specific exceptions from libraries, which tightly-couples your code with implementation details; some libraries don’t use a common exception base class for all of their exceptions, which leads to many isolated except-clauses.

All in all, you might have extenuating circumstances, but try to stick to this rule as much as possible. On the other hand, never use a bare except:! Such an except-clause will catch all exceptions, including SystemExit and KeyboardInterrupt, so unless you plan to lose the ability to Ctrl-C a running program, or even prevent it from terminating gracefully, take the extra step and use except Exception.
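As a small illustration of catching narrowly, the sketch below handles exactly one condition (a missing file) and lets everything else propagate. The read_config function and its default parameter are hypothetical, chosen only for the example.

```python
import errno

def read_config(path, default=""):
    """Return the file's contents, or `default` if the file does not exist."""
    try:
        with open(path) as f:
            return f.read()
    except IOError as ex:
        # Only "no such file" is expected here; permission problems,
        # directories, etc. are real errors and must not be masked.
        if ex.errno != errno.ENOENT:
            raise
        return default
```

A typo inside the try block (a NameError, say) or a Ctrl-C is untouched by this handler, which is exactly the point.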

Don’t be Overprotective

A tendency I find in many programmers is being overprotective towards their users, to the point where it seems like paternalism. It’s as if they try to “hide away” all the complexities of life and present the user with an easy-to-swallow explanation. I should say that this phenomenon virtually doesn’t exist in open-source code, so you might have never seen it, but in closed-source projects I find it all over the place. In fact, fellow programmers have told me that’s exactly what they’re doing – protecting their users.

Well, as the saying goes, we’re all consenting adults here. You should expect your users, as programmers, to have sufficient background; don’t treat them like babies, and don’t try to protect them by throwing a “user-friendlier” FileNotFoundError in place of the “raw” IOError. Besides, keep in mind that the underlying exception usually holds all the needed information in a very readable manner, e.g.,

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 2] No such file or directory: '/dev/nonexistent'

and anyone with some common sense would be able to cope with it.

Unless you can add meaningful information to the error, just let whatever came at you propagate up cleanly

– we’ll take it like men (or women!).

A Note On Real Users

A question then arises: what about non-programmer end-users? What if my product’s a GUI/CLI and a nasty stack trace suddenly shows up?

Well, first of all, this rule only deals with libraries and other products whose end users are programmers. But on second thought, would it matter whether an IOError or a FileNotFoundError reaches the surface? The human user cares mostly for a descriptive and easy to understand error message; the traceback or exception’s type are mostly of interest to programmers. But then again, when it comes to non-programmers, I don’t want to get into generalizations. They might as well not be consenting adults…

Don’t Wrap Exceptions

Prior to Python 3, raising an exception during the handling of another meant the original traceback was lost. This has finally been solved, but Python 2.x still accounts for the majority of the code-base. Once you lose the traceback, debugging the problem is much harder as you can’t use a debugger (pdb) or even tell where the exception came from… And when it happens off-site, on a customer’s production server, you’re screwed.

I believe exception wrapping in Python is a legacy of Java that crept into Python (a Javaism). It resonates with the Java mind-set, where you’d like a library to be contractually obliged to throw only certain exceptions. For instance, a queue library might raise exceptions such as QueueFull and QueueEmpty, both of which derive from QueueError. Later, we add support for dumping a queue to a file, where an IOError might happen; because we’re already “obligated” to throw only QueueErrors, we might wrap the underlying IOError in a QueueError. Don’t do that.

It is reasonable that FooLibrary would only throw exceptions that derive from FooError, but only when these exceptions originate in FooLibrary. An underlying IOError has nothing to do with your library, it could happen any time and for various reasons. The same goes for an HTTP library that might get an ECONNRESET while talking over a socket – the underlying socket.error is clearly not an HTTPError (or any of its descendants), and should not be wrapped by one. Besides, there are so many things that could go wrong, especially in a dynamic language like Python, that it’s impossible to wrap everything.

It only makes sense to wrap an underlying exception where you can provide additional information on the cause of it, or where you want to change the semantics of it. A classical such case is connect_with_retries(), where you might want to allow several attempts before giving up. Here, you’d probably want to “accumulate” the intermediate exceptions and raise a ConnectionError("%d connection attempts failed", accum_exceptions).

Bottom line: wrap only where you **add information** to the underlying exception; there has to be a good reason for wrapping.
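One way the connect_with_retries() case might look is sketched below. The exception name ConnectAttemptsFailed, the connect callable and the attempts parameter are assumptions made for the example; the point is only that the wrapper carries every underlying error along instead of hiding it.

```python
import socket

class ConnectAttemptsFailed(Exception):
    """Raised after all attempts failed; keeps the underlying errors (hypothetical)."""
    def __init__(self, errors):
        details = "; ".join(repr(e) for e in errors)
        Exception.__init__(
            self, "%d connection attempts failed: %s" % (len(errors), details))
        self.errors = errors

def connect_with_retries(connect, attempts=3):
    """Call connect() up to `attempts` times, accumulating socket errors."""
    errors = []
    for _ in range(attempts):
        try:
            return connect()
        except socket.error as ex:
            # Only network-level failures are retried; anything else
            # (TypeError, NameError, ...) propagates untouched.
            errors.append(ex)
    raise ConnectAttemptsFailed(errors)
```

Here the wrapping exception says something the individual errors cannot: that several attempts were made, and what each of them ran into.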

Don’t Handle Exceptions

Let me rephrase that: exceptions should be handled only

where it’s possible to fix/recover from the problem. For instance, if you get an EINTR error when accept()ing on a socket, it makes sense to swallow it and retry. Another use case is for fallbacks: first try to take the short way, and if it fails, take the long one. There are many more examples of handling an exception, of course, but you have to ask yourself whether you’re really handling the problem or just masking it.
where cleanup/rollback is necessary. It may be the case that you need to run rollback code only if an exception occurs (to release resources, etc.), so a finally-clause or a context manager won’t do. In this case you can except Exception, do the rollback, and raise (without passing any arguments to raise).

Note: I stand corrected by Nick Coghlan – you can use context managers for the very same effect. Forgot about that.
in the main function. Instead of letting the application crash with a traceback, you might want to log the exception to a file, pop up a message box, ask the user what to do next, etc.

In other words: handle exceptions only where you're actually **handling** them. If you’re not sure, leave it that way – you can always add except-clauses later.
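The rollback case from the list above can be sketched as follows. The transfer function and the dictionary-based ledger are hypothetical; what matters is the shape: undo the partial work, then re-raise with a bare raise so the original exception and traceback survive.

```python
def transfer(ledger, src, dst, amount):
    """Move `amount` between two accounts, undoing partial changes on failure."""
    snapshot = dict(ledger)
    try:
        ledger[src] -= amount
        ledger[dst] += amount   # raises KeyError if dst does not exist
    except Exception:
        # Rollback only: restore the previous state...
        ledger.clear()
        ledger.update(snapshot)
        # ...and let the very same exception continue up the stack.
        raise
```

The caller still sees the original KeyError; the handler cleaned up after the failure rather than pretending it did not happen.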

It might seem trivial, but you’d be surprised how many times I find code that handles exceptions for no good reason. For example, people think that by swallowing all sorts of exceptions and returning None, they make their code “more robust”; that’s lying to yourself. You take a problem and make it worse, as it’s very likely you’d lose the original details and hide real issues.

I’ve encountered countless cases of code that follows a pattern such as except Exception: log.error(...), which hides bugs like a misspelled variable (NameError). Logging is not handling the exception (more on this in part 3). In the end, nobody ever reads the log, or even takes the time to properly configure it, so you ship a “very robust” product that has half the functionality you think it has.

Closing Words

The lesson to be learnt here is: think before you act. Don’t act dogmatically, don’t follow Java paradigms for no reason, and try to keep your footprint as low as possible when it comes to error handling. As the saying goes, shit happens; calling it by other names does not make it any better. Files disappear, permissions get screwed, devices disconnect, sockets die, everybody lies. That’s life.

Going back to the three bullets I opened this post with, you can see how by being overprotective and by excessively wrapping exceptions, an AttributeError became a DeviceError, which then became None. And you can see now why the first thing I do is remove all exception-handling code along the way: most of the time it just masks the real error, making it harder to diagnose, while adding little or no value at all. You don’t make your code more robust by sweeping problems under the carpet: good code crashes, allowing tests to uncover more bugs, increasing robustness.

On the Granularity of Exception Classes

Some people are rather laconic and use a single exception for everything. I’ve seen people who were so lazy that they used raise Exception("foo") directly, instead of deriving an exception class of their own… People, it only takes one line to derive an exception class, there’s no excuse for being that lazy!

On the other hand, some people are way too verbose, defining specific exceptions for every minor detail. They end up with dozens of exception classes for each module, many of which are logically overlapping. This makes the implementation cumbersome, and, in fact, might not be useful at all for your users: they usually won’t care for such granularity, and you risk contaminating your interface with implementation details.

The rule to follow here is: the granularity at which exceptions are defined should match the granularity at which exception handling is done; define separate exceptions (only) where it makes sense to handle one differently than the other.

For example, it makes sense to handle a ConnectionError differently from an InvalidCredentials error, but there’s usually little sense in making the distinction between “connection failed because server is not listening” and “connection failed because server crashed after accepting us”. But in any case, be sure to include all the available information in the error message, as it’s important for logging/diagnostic purposes.
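A hierarchy at that granularity could be as small as the sketch below. All names here (ClientError, ConnectionFailed, InvalidCredentials, run) are hypothetical; each class exists only because a caller would plausibly react to it differently, and the details of a failure go into the message rather than into more subclasses.

```python
class ClientError(Exception):
    """Base class for errors originating in this (hypothetical) client library."""

class ConnectionFailed(ClientError):
    """The server could not be reached; retrying later may help."""

class InvalidCredentials(ClientError):
    """The server rejected the login; retrying with the same input is pointless."""

def run(action):
    """Dispatch on the exception type: one except-clause per distinct reaction."""
    try:
        return action()
    except ConnectionFailed:
        return "retry later"
    except InvalidCredentials:
        return "ask for credentials"

def refused():
    # The specifics live in the message, not in yet another exception class.
    raise ConnectionFailed("server at 10.0.0.1:5000 is not listening")

def bad_login():
    raise InvalidCredentials("user 'guest' was rejected")
```

Deriving each class really does take one line (plus a docstring), and the handler stays as short as the hierarchy.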

Javaism, Exceptions, and Logging: Part 1

2012-07-03T00:00:00+00:00

I’m working nowadays on refactoring a large Python codebase at my workplace, and I wanted to share some of my insights on a few aspects of large-scale projects: exceptions, logging, and a bit on coding style. Due to its length, I decided to split it over three installments; the first covers Javaism and an introduction to the issues of working with exceptions. Part 2 suggests “best practices” concerning exceptions in Python, and part 3 will cover logging (when, how, and how much).

Javaism

From my long experience in the programming world, I get the feeling that many programmers (even those fluent in Pythonspeak) come from a rich Java/.NET background, where they’ve acquired their programming skills and mind-set. Now converted-Pythonists, they are still the “speakers” of a second language, and they can’t deny their mother tongue. In the context of this post, I’ll refer to this as Javaism, or thinking Java in Python. Of course it might as well be C# or C++, but Java is the umbrella term.

You don’t have to go far to see examples of it, for Javaism didn’t skip Python’s standard library: modules/packages such as logging, unittest and threading were ported almost isomorphically from Java. On the surface, you might encounter camelCase names (getLogger), but the verbosity and over-complicated nature of Java and its inheritance methodology can be seen anywhere. For instance, recall the complexity of setting up a logger (I have to look it up every time), or the threading.Thread class… I really don’t wish to digress here, but I feel that a concrete example would make this point clear:

The canonical way to write thread functions is by subclassing Thread and implementing run(); you can pass a callback (called target), but it seems like an afterthought (for one, it’s not the first argument of the constructor).

Recall that anonymous classes weren’t always there in Java, so extending a class was the shortest way to emulate first-class functions.
Speaking of the Thread constructor – it takes so many optional arguments (group?!), but none for daemon. For that, you have to imperatively call setDaemon() afterwards.

Why? Because Java doesn’t have keyword arguments, so by adding an additional argument you double the number of overloaded constructors (exponential growth), while get/set properties behave “linearly”.
Also, you first instantiate the thread object, then start() it… where’s the sense in that? What can you do with an unstarted thread object (other than calling setDaemon)? Consider Popen or file() – the one, obvious way that Python follows is resource instantiation is acquisition. Why introduce useless states in the lifetime of the object? You just lay the ground for unexpected code flows that may result in bugs.

What can you do with an unstarted thread? Pass it on to someone else, who will start it. That’s a very general concept, called partial application, or currying. Python has lambda functions and partial, but Java doesn’t… The easiest way to create such “deferred” objects is to add “deferredness” as an internal state of the object – it’s a common Java practice, uncalled for in Python.
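The points above can be illustrated with a short sketch that uses a plain function as the thread body and functools.partial as the "deferred" object that gets passed around. The worker and run_in_thread names are made up for the example.

```python
import functools
import threading

def worker(results, name, count):
    """The thread body is an ordinary function; no Thread subclass needed."""
    results.append((name, count))

def run_in_thread(func):
    """Whoever receives the deferred call decides when (and where) it runs."""
    thread = threading.Thread(target=func)
    thread.daemon = True    # set as an attribute before starting
    thread.start()
    return thread

results = []

# Nothing runs yet: this is just a callable with its arguments bound.
deferred = functools.partial(worker, results, "job-1", 3)

thread = run_in_thread(deferred)
thread.join()
```

The deferred work is represented by a callable rather than by an object sitting in a "created but not started" state.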

It’s funny how the limitations of Java were ported to Python as well, and made the implementation ugly – it’s a classical case of “I don’t want to think, let’s just copy an existing solution”. Luckily, it seems to be a thing of the past – it only entered stdlib in the old days, before the community had a clear notion of what being pythonic meant. But still, Javaism of all degrees is widespread, especially in corporate-developed large-scale projects (Zope and twisted, to name a few, but naturally closed-source corporate-internal projects are even more susceptible).

Exceptions

Java had a good insight (that they most likely stole from some other language) in that exceptions are part of a function’s signature. Just like a function takes an argument of type T1 and returns a result of type T2, it also has “side channels” through which it can produce results, known as exceptions, which should be documented and checked as well.

The problem is, trying to foresee everything that might ever go wrong is a futile attempt, and even Java itself makes two exceptions (no pun intended): Error, for unrecoverable exceptions (such as VirtualMachineError), and RuntimeException, for exceptions that may always occur (such as NullPointerException).

The fundamental idea is correct, but it was bound to fail: first, because people are lazy, but most importantly, because trying to predict all unexpected, future edge cases is absurd. For instance, suppose you’re implementing an interface that stores data (say, in files), so you might find yourself implementing a signature such as void write(byte[] data) throws IOException. Now suppose your implementation uses a third-party database engine, that throws MySQLException. For obvious reasons, MySQLException does not derive from IOException, and there’s nothing you can do about it, as both the interface and the DB engine are given to you. You’re now faced with three options:

Translate MySQLExceptions into IOExceptions
When designing interfaces, always declare the most general exception in throws clauses
When implementing libraries, always derive your exceptions from an unchecked exception (RuntimeException)
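For reference, the first option (translating exceptions) looks roughly like this when written in Python. It is only a sketch: FakeDBError and db_insert are stand-ins for the third-party engine and its MySQLException, not real APIs.

```python
class FakeDBError(Exception):
    """Stand-in for the third-party engine's exception (e.g., MySQLException)."""

def db_insert(data):
    # Stand-in for the third-party call; always fails for the sake of the demo.
    raise FakeDBError("connection lost")

def write(data):
    """Implements the 'write(data) throws IOException' contract."""
    try:
        db_insert(data)
    except FakeDBError as exc:
        # Translate into the exception type the interface promised,
        # keeping the original as the cause.
        raise IOError("write failed: %s" % exc) from exc
```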

In short – you need to find a workaround and by-pass the compiler’s checking. This essentially means that throws should have served for documentation-only purposes, where the compiler might produce (suppressible) warnings should you not follow conventions. It’s more of a semantic property, like idempotence or thread-safety… you may state it in your Javadoc, but you wouldn’t expect the compiler to enforce that (not in a language like Java, anyway).

I’d guess most people agree that the second and third options are “inherently bad”, but opinions diverge on the first one. I will try to show that exception-wrapping (translating exceptions) is just as bad – at least when it comes to Python. We’ll cover this in part 2.

RPCs, Life and All

2012-06-25T00:00:00+00:00

A colleague of mine, Gavrie Philipson, has written an interesting blog post titled Why I Don’t Like RPC, in which he explains that transparent/seamless RPCs (a la RPyC) make debugging and reasoning efforts hard. For instance, you might work with an object (a proxy) that points to an object on the server process, which, in turn, is also a proxy that points to an object on yet another server process. Ideally, your local code shouldn’t be aware of the complexity (“number of hops”) or the details – but that’s not always the case.

Well, he won’t allow commenting on his blog, so I’m forced to formulate my response here :) Naturally, I’m biased about this subject, but I thought if I’m on it, why not also cover the broader aspects of the issue… However, it just kept getting longer and longer, until I got this behemoth of a blog post, so I’m attaching a TL;DR info box:

Transparent object proxying is only the logical way to extend RPCs to duck-typed languages
Asking for a clear distinction between local and remote objects ultimately means you're asking for a statically-typed language; it doesn't make sense to ask for it in python
Network programming is hard, and it's a pity we still work at the socket level; we should strive for a decent fifth layer that would eliminate all the unnecessary complexity
General-purpose RPCs are the right primitive over which network programming should be abstracted: it's the missing fifth layer, which every network application reinvents
HTTP is a half-baked, broken alternative to a fifth layer; I'm glad ZeroMQ and others are starting to loosen its grasp.
RPyC can be used efficiently and correctly, it's not an impossible feat. Also, a showcase of how I'm using RPyC to build a testing environment.

On Transparency

Gavrie’s main point is that transparency opens the door to (possibly) unplanned and undesired complexity. When it’s “too easy” to spread around, you might be tempted to create (or even unknowingly end up with) overcomplicated (and cyclic) dependency graphs, stretching over several processes / machines, where it would be quite a feat to see the whole picture. The nice thing is, when it works - it just works (and you’re happy with your design), but when it fails, you want to be able to untangle the mess. As he puts it:

“Seamless” RPC encourages the writing of spaghetti code, because it’s so easy to mix local and remote code. This makes it deceptively easy to write distributed code without thinking about the design of the API and about which parts should reside on each side of the connection

An ideal remoting solution, according to Gavrie,

[…] should make the distinction between local code and remote code crystal clear to the developer.

On Duck Typing

I agree with the key points in Gavrie’s argument, and I can assert that debugging RPyC itself, during development, is highly deceptive (hint: never print an object…) and calls for all sorts of creative solutions. But then again, you are using a duck-typed, interpreted language. The “if it walks like a duck” phrase will soon make a more dramatic entrance, but for the time being, suffice it to say we only care for the runtime behavior of an object. Any object that I can .read() from or .write() to, is a “file concept”, and thus code should be compatible with any object that “adheres to this concept”.

A “file” might be an on-disk file, an in-memory byte stream, a mock object used for testing, or a SCSI device located on a remote storage array. The only natural way to extend the notion of duck-typing to RPCs is via transparent object proxying. If open() doesn’t differentiate between local and remote (NFS/SMB) files, and if my code doesn’t care for anything other than runtime behavior, such a “file concept” might as well be a proxy object that points to a file object on a remote machine. It’s only logical… in fact, it’s taking duck typing to its full potential!
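A small sketch of what this means in code. FileProxy below is a toy, purely local stand-in for a remote proxy (it is not RPyC's implementation); count_bytes only relies on .read(), so it cannot tell the difference.

```python
import io

def count_bytes(f, chunk=4):
    """Works with anything that adheres to the 'file concept' (has .read())."""
    total = 0
    while True:
        data = f.read(chunk)
        if not data:
            break
        total += len(data)
    return total

class FileProxy:
    """Toy stand-in for a remote proxy: forwards attribute access to a target."""
    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # In a real RPC library this would go over the wire.
        return getattr(self._target, name)

local_count = count_bytes(io.BytesIO(b"hello world"))
proxied_count = count_bytes(FileProxy(io.BytesIO(b"hello world")))
```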

On Types

Gavrie wants the RPC framework to “make the distinction between local code and remote code crystal clear to the developer” – well, Gavrie might not have thought about it thoroughly, but what he’s actually asking for is a static type system. Types allow us to make distinctions between objects; they partition the “universe of data” into disjoint subsets that we can reason about: integers, floats, strings (to name a few). We can then group several types together, under the notion of a type class. For instance, floats and integers, albeit inherently different, both belong in the type class number, which provides us with additional operational semantics (like ordering relations, etc.).

We use types to partition the universe, because different “things” have different semantics and it doesn’t make sense to mix them together (modulo converting data of one type to data of another type). In fact, we normally want to prevent ourselves from mixing incompatible objects (be it integers and strings, or local and remote references) – that’s why we have type systems, and catch type mismatches at compilation time.

When you require a clear distinction between objects, it means you’re after a statically typed language; otherwise, you might as well just come up with a naming convention, where all variables that (might) refer to a remote object start with rem_. But if you want this distinction to propagate throughout the code, it must be enforced by a compiler; if you want to keep yourself in the duck-typed world, it doesn’t make sense.

Duck-typing (from a type-theoretic perspective) is like saying there’s only a single data type, which covers the entire universe; all checking is deferred to runtime, in which case it might (1) work, (2) fail, (3) silently cause corruption (as in a TextToSpeech instance, which may as well expose a .read() method, but it surely won’t do what we expect the “file concept” to do).

So asking for a “clear distinction” in a duck typed language is simply out of the question. What we can ask for is distinction in the level of APIs; for instance, write_local_file(filename) vs. write_remote_file(filename), but that breaks the “spirit of duck-typing”, where objects are no longer considered equal (even though they provide the desired runtime behavior). It’s like pointing out the ugly ducklings and making fun of them… that’s not cool.

Just to contrast, an RPyC-like library for Haskell would expose remote references as a distinct type class. You could have a function like remoteSum :: (Num a) => Remote [a] -> a, which takes a reference to a remote list of a’s, and returns its sum. Because it knows it operates on remote lists, it might be able to “move” the actual summation remotely, instead of sending the entire list over the network, item by item. I think this qualifies for a “crystal clear distinction”, but of course, that’s not what the snake teaches.
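The closest one can get in Python is probably with type hints and an external checker. The following is only a sketch: Remote, apply and remote_sum are hypothetical names, and the "remote" side is simulated locally. At runtime nothing is enforced; a static checker such as mypy is what would reject passing a plain list where a Remote list is expected.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")

class Remote(Generic[T]):
    """Hypothetical handle to a value living in another process (simulated here)."""
    def __init__(self, value: T) -> None:
        self._value = value  # a real implementation would hold an object ID instead

    def apply(self, func: Callable[[T], R]) -> R:
        # Pretend `func` runs on the remote side and only the result travels back.
        return func(self._value)

def remote_sum(ref: Remote[List[int]]) -> int:
    # The summation is "moved" to where the list lives.
    return ref.apply(sum)

total = remote_sum(Remote([1, 2, 3]))
```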

On Networks

It seems to me that people find it easy to abstract all sorts of concepts, as long as they don’t concern networks. When there’s network involved, they tend to want to “get a feel of the wire”… so they might use HTTP instead of the NIC directly, but they won’t take the next step and treat network resources as first class objects. There’s always a gap, between what’s here and what’s there, and we’re still too aware of the how to get there details… a gap that should have been bridged over long ago.

As I see it, it stems from two primal fears, so to speak: networks are hard (timeouts, routing, DNS, reconnects, authentication, compression, tunneling, round-trip time, …) and unreliable. As far as unreliability goes, there’s not much you can do about it; after all, the server is a process like any other, and may crash at any point of time. If that’s not enough, the remote machine might freeze or reboot. But then again, your local machine might kernel-panic or just go down with a power failure, losing any unflushed data… but that’s life. On the other hand, it seems to me that overall (hardware?) unreliability rates are going down with time, which is a promising outlook.

The “hardness” of network programming, on the other hand, is something we can solve. Good network programming is hard, there’s no question there, but for some reason, instead of solving the problem once and for all in a generic manner, it seems that every protocol / network-oriented application seeks to start at square one. Of course, done this way, it only handles the aspects it finds relevant… doing network programming at the socket level is analogous to rewriting the kernel for every desktop application. It doesn’t make sense.

I’ve started (and abandoned) an ambitious project called layer5, which aimed to concentrate all the network-related sorcery in a single place, so that programs on top of it wouldn’t have to care. It originated from my frustration with network programming in general (and in RPyC in particular)… things like handling timeouts, reconnects, authentication, negotiation, compression, serialization, load distribution, caching, error reporting – you name it.

Just to show-off a couple of ideas, consider a socket connection being dropped for some reason: if the network layer knew how to reconnect and resume the session, or automatically resend a request after some timeout (all configurable, of course), there would be no need for the application to be aware of anything. And, once you “lift” your code up from the socket layer, you can enable things like “moving targets”, where you may switch an IP address (wifi/3G) and the connection will just “follow you”; the session would not be bound to a “physical endpoint”. These are just some of the issues that layer5 attempted to solve.
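Here is a rough sketch of the reconnect-and-resend idea. It is not layer5's code; FlakyTransport and ResilientSession are invented for the example, and the transport is simulated in memory rather than using real sockets.

```python
class FlakyTransport:
    """Simulated transport whose first `failures` sends are dropped."""
    def __init__(self, failures):
        self.failures = failures
        self.connects = 0

    def connect(self):
        self.connects += 1

    def send(self, request):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("connection dropped")
        return "reply to %s" % request

class ResilientSession:
    """Hides reconnects and resends from the application."""
    def __init__(self, transport, max_retries=3):
        self.transport = transport
        self.max_retries = max_retries
        self.transport.connect()

    def request(self, payload):
        for attempt in range(self.max_retries + 1):
            try:
                return self.transport.send(payload)
            except ConnectionError:
                if attempt == self.max_retries:
                    raise                  # give up and let the application know
                self.transport.connect()   # reconnect, then resend on the next loop

transport = FlakyTransport(failures=2)
session = ResilientSession(transport)
reply = session.request("ping")
```

The application only ever calls request(); whether the answer took one attempt or three is the layer's business. Note that blindly resending is only safe for requests that can be repeated without harm.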

On RPCs

Let me make a bold claim: everything is RPC. So, by everything I mean virtually all connection-oriented network protocols (e.g., excluding broadcasts / multicasts / streaming), and I take RPC to its broadest sense: an RPC is any message-oriented protocol in which one side makes requests that the other side fulfills: basically invoking a remote function. Naturally, in order to convey a message, the RPC imposes a serialization format, and in order to tell success from failure, it must also define “return codes”. Note: I’ve been planning to write about this topic for a very long time, but never got to it; it surely deserves a post of its own, but until that happens, please consider this a “briefing”.

As a case study, let’s examine HTTP: there are 4 (or so) methods: GET, PUT, POST and DELETE. Each such method takes arguments, like the URL, cookies, accept-encoding, etc. Some of them are required, some are optional; some pertain to the transport layer (content-length, compression, timeouts) while others to the method itself (URL, cookie, …). It also defines status codes for distinguishing errors from success (and again, it mixes transport-layer errors like redirect with method-level ones like not found or internal server error). It also defines a (very loose) serialization format for encoding the method’s arguments (newline-separated key-value strings) and the payload (multipart/form-data)… So, from an RPC point of view, HTTP is a service that provides 4 functions (methods), whose signature is something like (url, formdata = None, **kwargs).

Another example is telnet – it basically provides a function whose signature is void write(char ch); when you type, write is invoked for each keystroke. Aside from sending characters, telnet also provides all sorts of negotiation options or commands, like bool set_binary(), void set_terminal(string), etc.

As with most ad-hoc RPC protocols, the two we’ve examined make horrible design choices like mixing transport-layer options with “business logic” (HTTP) or sending control in-band with the data (telnet), where a mere \xFF character in the stream marks the beginning of a command, so anyone can (maliciously or accidentally) inject commands into the stream. Yet another pitfall of these protocols is, they begin small, targeting a specific task, but if they’re successful, they grow to incorporate many unrelated things, like encryption and proxy support… as if security is something you sprinkle on top.
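To illustrate the in-band problem, here is a deliberately simplified parser. It is not a full telnet implementation: it assumes every command is exactly one byte after \xFF, and that a doubled \xFF stands for a literal data byte. Unescaped user data ends up being read as a command.

```python
IAC = 0xFF  # "interpret as command" marker

def parse(stream):
    """Split a byte stream into (data, commands) using a simplified telnet-like rule."""
    data, commands = bytearray(), []
    i = 0
    while i < len(stream):
        byte = stream[i]
        if byte == IAC and i + 1 < len(stream):
            nxt = stream[i + 1]
            if nxt == IAC:
                data.append(IAC)       # escaped literal 0xFF
            else:
                commands.append(nxt)   # anything else is taken as a command
            i += 2
        else:
            data.append(byte)
            i += 1
    return bytes(data), commands

def escape(payload):
    """What a careful sender has to remember to do with user data."""
    return payload.replace(b"\xff", b"\xff\xff")

payload = b"abc\xff\xf4def"               # user data that happens to contain 0xFF
naive = parse(payload)                    # two bytes vanish and become a "command"
careful = parse(escape(payload))          # the data survives intact
```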

The main point I’m trying to make here is this: virtually all protocols are basically degenerate forms of RPC. To paraphrase Greenspun, all sufficiently complicated network protocols end up redoing compression, security, authentication, framing, serialization, negotiation / versioning, discovery, you name it (Filiba’s Eleventh Rule). This observation has brought me to the conclusion that doing network programming at the “byte level” is wrong, and that a general-purpose RPC layer is the right primitive for this.

A general purpose RPC would be language-agnostic, support only simple by-value types, such as strings, integers and lists (anything more complex can be built on top of that). It would also make no assumptions on how remote functions operate, it would only care for their signature. You can think of it as a more structured message-passing protocol, where you replace the notion of “message codes” by “function names”. This way, it’s easy to see that one can straight-forwardly emulate any message-passing protocol or more advanced RPC, over this layer. Heck, it’s f***ing 2012, I want to GET("/index.html", Agent="Chrome"), not formulate \r\n-separated strings or care about XML/JSON.
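A toy version of such a layer might look as follows. Everything here is an assumption made for the sketch: RpcServer, expose and call are invented names, the "network" is a direct function call, and JSON is used only as a convenient by-value serialization that the caller never sees.

```python
import json

class RpcServer:
    """Dispatches on function names instead of message codes."""
    def __init__(self):
        self._funcs = {}

    def expose(self, func):
        self._funcs[func.__name__] = func
        return func

    def handle(self, raw):
        msg = json.loads(raw)
        try:
            result = self._funcs[msg["func"]](*msg["args"], **msg["kwargs"])
            return json.dumps({"ok": True, "result": result})
        except Exception as exc:
            # The "return code": success and failure are told apart explicitly.
            return json.dumps({"ok": False, "error": str(exc)})

def call(server, func, *args, **kwargs):
    """Client side: only simple by-value types cross the (simulated) wire."""
    raw = json.dumps({"func": func, "args": args, "kwargs": kwargs})
    reply = json.loads(server.handle(raw))
    if not reply["ok"]:
        raise RuntimeError(reply["error"])
    return reply["result"]

server = RpcServer()

@server.expose
def GET(path, Agent="unknown"):
    return "contents of %s for %s" % (path, Agent)

page = call(server, "GET", "/index.html", Agent="Chrome")
```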

Layer5 (mentioned in the previous section) was to expose such a generic RPC, upon which applications would base their protocols. You could always just implement a bytes send(bytes data) RPC (over which you would continue to pass your byte-level messages), or implement a more semantic interface – your choice. Either way, you’d benefit from layer5’s handling of reconnects, authentication, and the rest of the list.

On HTTP

Truth is, we sort-of already have such a “general purpose” layer 5 protocol, called HTTP. By a strange twist of fate, HTTP has become the de-facto application layer of choice, over which all kinds of protocols now operate. And we’ve managed to hide the gruesome details of HTTP under programmer-friendly libraries and APIs, so we are in a “better shape” now. Yet HTTP is such a miserable choice for this purpose (hence abominations like Websockets arose), and we all pay the price (programmatically-speaking, but also in the sense of network bandwidth, CPU time and electricity bills).

In my opinion, HTTP owes its success to another unfortunate happening - firewalls. In the days of yore, people thought they could eliminate threats simply by blocking all TCP ports, except for safe/trusted ones. HTTP was considered safe, as it only transferred “hypertext”, so all firewalls allowed port 80 traffic by default. This fact has led to many protocols being designed to work on top of HTTP, to be firewall-friendly… which meant firewalls no longer served the purpose for which they were conceived: blocking ports was not enough, so firewalls had to become content-aware anyway. Long story short – we’ve only managed to push the problem one level up: instead of solving it in the fourth layer, we now do it in the application layer… no matter what the port number is.

However, this initial edge that HTTP had, helped in making it the de-facto transport layer of choice, which of course had a snowball effect. I’m only happy to see competing protocols like ZeroMQ and AMQP are starting to take some market share. Down with HTTP!

On RPyC

Coming back to Gavrie’s post, he brings up two additional points. The first:

In addition, its performance can quickly deteriorate: Objects are being serialized back and forth all the time, and tens of implicit network round-trips introduce latency all around the code.

Performance is a tricky thing. First, RPyC is mostly used on local, secure networks, where latency and round-trip time (RTT) are low – so unless you do something really flawed, you shouldn’t experience noticeable degradation. Second, the only places that do suffer from RTT are tight loops, and to that end, RPyC already has solutions. And third, transparency (like any form of abstraction) hides the underlying complexity, which means you won’t be able to optimize all the way. RPyC makes a choice for simplicity and pythonicity every time, at the expense of performance.

From my many years of using RPyC, I must say I’ve never experienced performance issues that didn’t originate from the use of threading and locks in python, or really bad code. And if the times are tough, you can always apply lightweight optimization techniques, such as locally “caching” remote objects that were obtained through a series of lookups, in variables (e.g., myfunc = w.x.y.z.myfunc)… it’s normally not that hard. I’m sure Gavrie has experienced performance issues with RPyC, but I can hardly imagine it could not be solved by reasonable amounts of such refactoring.
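To show why that kind of "caching" helps, here is a toy model. CountingProxy is not RPyC; it is a local stand-in that counts each attribute lookup as one round trip (the calls themselves would also cost a round trip each in a real setup, which this sketch ignores).

```python
from types import SimpleNamespace

class CountingProxy:
    """Stand-in for a remote proxy; every attribute lookup counts as a round trip."""
    round_trips = 0

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        CountingProxy.round_trips += 1
        value = getattr(self._target, name)
        return value if callable(value) else CountingProxy(value)

w = CountingProxy(SimpleNamespace(
    x=SimpleNamespace(y=SimpleNamespace(z=SimpleNamespace(myfunc=abs)))))

# Naive: four lookups per iteration.
for i in range(100):
    w.x.y.z.myfunc(-i)
naive_lookups = CountingProxy.round_trips

# "Cached": resolve the chain once, outside the loop.
CountingProxy.round_trips = 0
myfunc = w.x.y.z.myfunc
for i in range(100):
    myfunc(-i)
cached_lookups = CountingProxy.round_trips
```

With these numbers the tight loop goes from 400 lookups to 4, without touching the rest of the code.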

Which brings us to the last point Gavrie makes:

I don’t like RPC, especially not stateful RPC that supports access of remote objects by reference

I hope we already agreed that a general-purpose RPC is equivalent to (if not better than) any “normal” network protocol, so it’s really not RPCs that Gavrie hates but stateful / object-proxying sessions. This invites another, rather philosophic, question: what is state? What does it mean that a protocol is stateless? I’d guess philosophies like REST come to mind, but that’s just a buzzword. From the broadest Turing-machine perspective, if REST or any other protocol were truly stateless, they would have no effect on the world and thus would be of little significance (they’d be read-only protocols). Just to stress this point – PUT/POST-ing to a RESTful interface, adding/altering a record in a database table, is clearly stateful: you changed the state of the DB.

Therefore, these buzzword-rich protocols pride themselves on the term stateless, while meaning something very different. For lack of a better term, I’d use atomicity, durability, and reboot-ability – which we’ll discuss next. And just a last bit of REST: REST has a notion of idempotency, as GET requests for the same URI should always return the same result (but that’s not guaranteed). Anyway, the CRUD model which REST employs is quite limited and fits only so many real-life problems (many other problems can be reduced to CRUD, but I don’t suppose people would consider this the “right way”).

Atomicity and durability come from the ACID philosophy of databases, and are granted to you freely, assuming you use a DB (who doesn’t?). They basically mean that a transaction either fully happens (and then it’s permanently stored), or nothing happens (so that no partial results may exist). Reboot-ability is a term I just made up, and it means your server might crash and be restarted, or your entire lab might burn away, and the client shouldn’t be able to detect any difference (other than temporary unavailability, which may be compensated for by a cluster). Inherently, it means you don’t trust your server to survive over long periods of time, and therefore prefer to make it (the server process) stateless. In effect, it means the server will never make “changes to the world” outside of a DB transaction, so that a failed transaction could be rolled back, and a new server process could resume where the previous one failed. But note the difference: the server process is stateless, not the protocol.

HTTP originally was a connection-less protocol (albeit over TCP), where each request was treated separately from the rest and there was hardly any notion of a session. This meant that every request had to carry along with it any state information it required. In order to prevent requests from growing wildly in size and choking the network, cookies were invented – which meant the server had to store session data, and the cookie was only a key. This already dents the notion of statelessness, and nowadays, things like websockets break it completely.

RPyC, on the other hand, has a clear notion of a session: on each end of the connection there’s a dictionary of objects referred to by the other side. This means that should a connection drop, there’s little chance of restoring the lost session: the dictionaries are lost, and all proxies would be invalidated.

But all is not lost: if you only need references to serializable objects, you might as well keep this dictionary as a DB table. Since the data, including object IDs, would not be lost when the server gets restarted, resuming a dropped session is easy. So, if you could live with limited functionality, you can be backed by a DB – but it’s not that HTTP offers a better solution. At least keep your code pythonic and not full of HTTP curses.
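A minimal sketch of that idea, using sqlite3 and JSON for the serializable objects. ObjectTable is a made-up class for illustration; it is not something RPyC provides.

```python
import json
import os
import sqlite3
import tempfile

class ObjectTable:
    """Hypothetical replacement for the in-memory object dictionary."""
    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS objects (id INTEGER PRIMARY KEY, data TEXT)")
        self.db.commit()

    def store(self, obj):
        cur = self.db.execute(
            "INSERT INTO objects (data) VALUES (?)", (json.dumps(obj),))
        self.db.commit()
        return cur.lastrowid          # the object ID handed to the other side

    def load(self, oid):
        row = self.db.execute(
            "SELECT data FROM objects WHERE id = ?", (oid,)).fetchone()
        if row is None:
            raise KeyError(oid)
        return json.loads(row[0])

path = os.path.join(tempfile.mkdtemp(), "objects.db")

first = ObjectTable(path)
oid = first.store({"name": "lun7", "size": 1024})
first.db.close()                      # simulate the server going down

second = ObjectTable(path)            # a fresh server process
restored = second.load(oid)           # the old object ID is still valid
```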

Another alternative, which I’ll demonstrate in the next section, is making the changes directly “in the real world”: instead of storing state, make the changes on long-living entities, and then read the info back from them. This way, you’re always synchronized, and should you be restarted, you’ll never use stale data.

On Testing

I’m working now on a testing framework of quite a complicated nature: first, it serves as a resource-allocator for hosts and other testing equipment; second, in order to run tests, it must create a suitable environment for them. But this is where it gets fun: in order to set up the environment, I must use the utilities that I set out to test… because that’s exactly what they do. Chicken and egg, anybody?

After a couple of days of toying with it, I settled for the following architecture:

Tests are written normally using unittest, but they also make use of a little module (8 LoC or so) that provides them with a means to connect to the resource allocator; this module basically hides the details of setting up an RPyC connection (as the server is well known, etc).
The resource allocator exposes a simple service, with methods like get_system(version = 17). It collects the information about the systems from a third-party service and caches it in-memory, so finding a matching system is quick and efficient. Basically, the resource allocator only takes care of distributing systems randomly (we do want to allow for two tests to run on the same system, but wish to avoid unnecessary contention).
The object returned by get_system is an instance of a peculiar class called HostViewOfSystem. It basically represents how the host (running the test) views the system that’s been allocated to it, and it has methods like get_resource_from_system() that look for an unused resource (or create one) and take care of making it usable by the host.

There are quite a few details, but I hope I managed to make the design clear. On the other hand, it doesn’t seem particularly interesting – until we get to the last bullet-point – making the resource usable by the host. In order to do that, the server (resource allocator) creates a temporary directory on the host, onto which it copies (over RPyC) several python packages that are required for the task. It then fires up a new RPyC server on the host, and sets its PYTHONPATH to this temporary location. This RPyC server is based on the fresh-from-the-oven OneShotServer, which is capable of serving a single client and then quits. The server chooses a random port and reports it over stdout, to the resource allocator, who then connects to it. Then, the HostViewOfSystem object is given a reference to the newly created RPyC service, and it uses it to manipulate the host machine in order to set up the environment. Here’s a sketch:

+---Test Host---+            +---ResAlloc Server---+
|               |            |                     |
|  -----------  |  .-----------> HostViewOfSystem  |
|  | Test    |____/          |     |               |
|  | process |  |            |     |               |
|  -----------  |     _____________/               |
|               |    /       |                     |
|  -----------  |   /        |                     |
|  | Newly   <-----*         |                     |
|  | started |  |            |                     |
|  | RPyC    |  |            +---------------------+
|  | server  |  |
|  -----------  |
|               |
+---------------+
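The "choose a random port and report it over stdout" handshake can be sketched with nothing but the stdlib. This is not the actual framework code and it does not use RPyC: the child process below is a plain socket server that serves a single client and quits, and all names are made up for the example.

```python
import socket
import subprocess
import sys

CHILD = r'''
import socket, sys
srv = socket.socket()
srv.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
srv.listen(1)
print(srv.getsockname()[1])       # report the chosen port over stdout
sys.stdout.flush()
conn, _ = srv.accept()            # serve exactly one client, then quit
conn.sendall(b"hello from one-shot server")
conn.close()
srv.close()
'''

def start_one_shot():
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)
    port = int(proc.stdout.readline())   # the parent learns the port from stdout
    return proc, port

proc, port = start_one_shot()
client = socket.create_connection(("127.0.0.1", port))
with client.makefile("rb") as stream:
    greeting = stream.read()             # read until the server closes
client.close()
proc.stdout.close()
proc.wait()
```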

This quite tiresome setup actually takes only ~30 lines of code (and around one second to build), and it allows the tested utilities to rely on stable versions of themselves. The stable versions are fetched from the resource allocator server, thus we make absolutely no assumptions on the state of the host. Moreover, when we “allocate” a resource for a specific test, we mark the resource on the system as “in use”: it’s neither stored in-memory, nor in a DB – it’s marked directly on the resource, as metadata. This way, if the resource allocator is restarted, no state is lost – the new instance will read the most up-to-date state from the “real world”. There – a semi-stateless testing framework based on RPyC… and now I go to bed.

Reed-Solomon Codec

2012-06-08T00:00:00+00:00

Some Background

I’m working on an image processing project for the university, whose purpose is to embed (and extract) a print-scan resilient watermark into an image. This project has (sadly) gotten me acquainted with Matlab, from which I quickly ran away into the friendlier realms of Scipy and friends (Skimage rocks, by the way). I must say I really learned to appreciate the Scipy/Numpy gang in the last two weeks :)

If it wasn’t already obvious, it’s time to admit I’m a n00b when it comes to signal processing and applied math in general. I know the Fourier transform in broad terms, and have heard of discrete cosine transform and wavelets some time ago… but it’s not my cup of tea, to say the least. Luckily for clueless people like me, Matlab (and its kin) enables us to summon the dark powers of mathematics, without ever having to know what we’re doing. Hurrah!

So I’m DFT’ing, DCT’ing and DWT’ing like a pro, embedding my watermark using CDMA/spread-spectrum techniques in the frequency domain, and then inverting the process… and all I know is I’m supposed to keep myself in the mid-frequency range, i.e., in a certain region of the matrix. Math for n00bs.

My original idea was to take a string, encode it as a QR code, and then embed this QR image into the host image. I thought it would be a nice shortcut, as it provided me with a synchronization pattern and error correction out of the box, but it quickly turned out QR codes generate a payload that’s too big for unobservable embedding. So I set out to find some error correcting code (ECC) library for python, but it proved to be a really difficult task. I found some packages, most of which haven’t been maintained in over 7 years now, and all of them make use of extension modules that failed to compile. Then there’s zfec, but for the life of me, I couldn’t figure out how to use it as a simple encoder/decoder.

The Library

I almost gave up and resorted to triplicating my payload (at the bit level), and using majority-selection for each bit, when I came across an amazing python tutorial (with runnable code) that covers Reed Solomon codes and QR in depth. I simply extracted the code, added a usable API, wrote some examples and quickly uploaded it to PyPI, so now there’s a pure-python Reed-Solomon encoder/decoder: pip install reedsolo.
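For comparison, the fallback I almost went with (bit-level triplication with majority selection) fits in a few lines. This is a sketch written for this post, with made-up function names, and has nothing to do with reedsolo's internals.

```python
def encode_triple(data):
    """Repeat every bit of `data` three times; returns a list of 0/1 ints."""
    bits = []
    for byte in data:
        for shift in range(7, -1, -1):            # most significant bit first
            bits.extend([(byte >> shift) & 1] * 3)
    return bits

def decode_triple(bits):
    """Majority-vote each group of three bits and reassemble the bytes."""
    out = bytearray()
    for i in range(0, len(bits), 24):             # 8 bits * 3 copies per byte
        byte = 0
        for j in range(i, i + 24, 3):
            vote = 1 if sum(bits[j:j + 3]) >= 2 else 0
            byte = (byte << 1) | vote
        out.append(byte)
    return bytes(out)

coded = encode_triple(b"hi")
coded[0] ^= 1       # flip one copy of the first bit
coded[10] ^= 1      # flip one copy of the fourth bit
recovered = decode_triple(coded)
```

It survives one flipped copy per bit, but it triples the payload, which is exactly what I could not afford.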

The library should support python 2.4-3.2, using strings or bytes. I really can’t verify the correctness of the algorithm (it’s beyond me), but it seems to work so I’m fine with it. Here’s a short demo:

>>> from reedsolo import RSCodec
>>> rs = RSCodec(10)     # 10 bytes of ECC will be added to the output,
...                      # which allows us to correct up to 5 byte-level errors
>>> rs.encode([1,2,3,4])
'\x01\x02\x03\x04,\x9d\x1c+=\xf8h\xfa\x98M'
>>> rs.encode("hello world")
'hello world\xed%T\xc4\xfd\xfd\x89\xf3\xa8\xaa'
>>> rs.decode('hello world\xed%T\xc4\xfd\xfd\x89\xf3\xa8\xaa')
'hello world'

Now let’s add some errors:

>>> rs.decode('hXXlo worXd\xed%T\xc4\xfdX\x89\xf3\xa8\xaa')     # 4 errors - ok
'hello world'
>>> rs.decode('hXXXo worXd\xed%T\xc4\xfdXX\xf3\xa8\xaa')        # 6 errors - fail
Traceback (most recent call last):
  ...
reedsolo.ReedSolomonError: Could not locate error

It’s pure python and highly unoptimized… I think someone acquainted with Numpy a little more than I am could improve it blindfolded by a factor of 10, but even now, on my dinosaur machine, it encodes a 400kB message in 2.9 seconds and decodes it in 1.9 seconds. I’ll drink to that. By the way, it seems that the library can only handle messages that are less than 255 bytes long… but then you can simply encode/decode in chunks. I’ll include it in later versions.
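Chunking could look roughly like the sketch below. It assumes a codec object with encode/decode methods shaped like the RSCodec in the demo above, and that each encoded block must fit in 255 bytes including the ECC bytes. So that it runs without the library, the demo uses PaddingCodec, a trivial stand-in that corrects nothing; the helper names are made up.

```python
class PaddingCodec:
    """Trivial stand-in for RSCodec: appends `nsym` zero bytes, corrects nothing."""
    def __init__(self, nsym):
        self.nsym = nsym

    def encode(self, data):
        return bytes(data) + b"\x00" * self.nsym

    def decode(self, block):
        return bytes(block[:-self.nsym])

def encode_chunked(codec, nsym, data, block_size=255):
    step = block_size - nsym                  # payload bytes per block
    return b"".join(bytes(codec.encode(data[i:i + step]))
                    for i in range(0, len(data), step))

def decode_chunked(codec, encoded, block_size=255):
    return b"".join(bytes(codec.decode(encoded[i:i + block_size]))
                    for i in range(0, len(encoded), block_size))

nsym = 10
codec = PaddingCodec(nsym)
message = bytes(range(256)) * 3               # 768 bytes, too long for one block
encoded = encode_chunked(codec, nsym, message)
decoded = decode_chunked(codec, encoded)
```

Each block is corrected independently, so a burst of errors concentrated in a single block can still exceed what that block can fix.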

I think a good ECC library for python is very useful… if anyone wants to join in on it, feel free to drop me a line in the comments or just fork the repo.

Some Notes on RPyC 3.2.x

2012-06-06T00:00:00+00:00

As I said in the previous blog post, I hoped for v3.2.2 to be the last release of the 3.2 line… Naturally, I was wrong :) Turns out the fix for issue #76 was buggy, and I decided to finally remove the use of excepthooks in favor of taking care of remote traceback chaining in the exception class’ __str__ method itself. The tentative release date is August 1st, and I really hope to end the 3.2 line here.

I created a branch called LTS3.2 which will be used only for back-porting bug fixes from the master branch, which has now become the development branch of 3.3. I rebased master on top of LTS3.2 now, so that the two would have linear histories, which of course resulted in a forced update, so anyone who was using master for development would now have to do git fetch origin; git reset --hard origin/master instead of a simple git pull. It’s not supposed to happen again, sorry.

There shouldn’t be any more features in v3.2.3, so the development would take place on master, from which I’ll cherry-pick bug fixes onto LTS3.2. That’s all for now…

RPyC 3.2.2 Released

2012-06-01T00:00:00+00:00

This is a maintenance release, fixing some issues concerning introspection, ForkingServer and signals, IronPython and signals, and SSH on Windows. It also introduces optional logging of exceptions that occur over the RPyC connection to the server’s logger (or any other logger instance, given in the connection’s configuration). The change log has more details.

Note: This is the last release of the 3.2 branch. Future releases (of the 3.3 branch) will require plumbum, thus the SSH support will be removed from the codebase. The next version will also provide zero-deploy, which will make RPyC much easier to use. This release is expected in September.

The Future of Construct

2012-05-16T00:00:00+00:00

It’s been a long while since I’ve put time into Construct. I gave up on it somewhere in 2007, right after the release of v2.0… I think I just got bored, and felt like the library was complete and extensible enough to survive on its own. Of course I was wrong there, and code-rot had spread all over.

Luckily for us, Corbin Simpson took up the project in January of 2011 and has been maintaining it since then. He migrated it to github, changed the project to use a proper directory structure, fixed lots of bugs, and wrote extensive documentation. Since then, Construct has been building a solid community and has reached quite a remarkable number of downloads on PyPI.

All this time, I’ve been busy with my other projects, but I kept toying with the idea of Construct 3. I got some sketches, wrote some early drafts, cleaned up the implementation of Construct’s core… but it’s remained a dream. A couple of months ago I decided I’d back-port a nice feature from Construct 3, called this expressions, and it has rekindled my interest in the library.

`This` Expressions

One of the goals of Construct 3 was to generate efficient (C/Python) code from construct definitions. It even worked, to some extent: for instance, see this snippet that automatically generates this C-code. I had no recollection of this until I discovered it today. Funny.

Anyway, Construct 2 uses lambda functions to represent dependencies between constructs, e.g.

s = Struct("LV",
    UBInt8("length"),
    Bytes("value", lambda ctx: ctx["length"]),
)

and since this is the case, it’s impossible to translate dependencies to C. So I’ve created the this object, which essentially builds an expression tree from native Python expressions. In order to evaluate the expression, you simply invoke it with a context:

>>> from construct import this
>>> this.x * 2 + 3
((this.x * 2) + 3)
>>> (this.x * 2 + 3)({"x":7})
17

So now we can replace all lambda ctx: ctx["foo"] by the more succinct and readable this.foo – the benefits are visually clear – but they go even deeper than this: since we’re no longer dealing with black-box lambda functions, we can drill down into them and generate the appropriate (static) code.
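To make the idea concrete, here is a minimal sketch of how such an expression-building object could be put together. It is not Construct's actual implementation; the class names (`Expr`, `Path`, `BinExpr`, `This`) are made up for illustration and only three operators are covered.

```python
import operator

class Expr(object):
    """Base class: arithmetic on expressions builds bigger expressions."""
    def __add__(self, other):
        return BinExpr("+", operator.add, self, other)
    def __sub__(self, other):
        return BinExpr("-", operator.sub, self, other)
    def __mul__(self, other):
        return BinExpr("*", operator.mul, self, other)

class Path(Expr):
    """A context lookup such as this.x"""
    def __init__(self, name):
        self._name = name
    def __call__(self, ctx):
        return ctx[self._name]
    def __repr__(self):
        return "this.%s" % (self._name,)

class BinExpr(Expr):
    """A binary operation whose operands may themselves be expressions."""
    def __init__(self, symbol, func, lhs, rhs):
        self.symbol, self.func, self.lhs, self.rhs = symbol, func, lhs, rhs
    def __call__(self, ctx):
        # evaluate sub-expressions recursively; plain values are used as-is
        lhs = self.lhs(ctx) if isinstance(self.lhs, Expr) else self.lhs
        rhs = self.rhs(ctx) if isinstance(self.rhs, Expr) else self.rhs
        return self.func(lhs, rhs)
    def __repr__(self):
        return "(%r %s %r)" % (self.lhs, self.symbol, self.rhs)

class This(object):
    def __getattr__(self, name):
        return Path(name)

this = This()
expr = this.x * 2 + 3
```

Because the tree is ordinary data (`expr.lhs`, `expr.symbol`, `expr.rhs`), a code generator can walk it and emit the equivalent C expression, which is exactly what a lambda does not allow.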

Construct 2.5

I had to do some Construct work recently, and I missed the conciseness of this expressions, so I took the time to back-port them to Construct 2. I sent a pull request to Corbin, but he’s a bit too busy to maintain the library on a regular basis now, so he’s created a github organization and moved the repository there; this is where Construct will be developed from now on.

Tinkering with the old code again got me sentimental, and I started to do some long-awaited maintenance. I plan to release version 2.5 (note the dramatic shift from 2.06 to 2.5) in the summer (say August), and here’s the list of planned changes:

Adding this expressions
Dropping construct.text – it’s always been an experimental feature and it’s achingly inefficient. If you want to parse grammars, you should use more adequate tools
Adding Python 3 support (based on the work of Eli Bendersky); the library will support Python 2.5-3.2, using six
General cleanups and optimizations
Closing the wikispaces site in favor of readthedocs

Construct 3

The next big thing(TM), Construct 3, is still far away. I’ve got lots of cool ideas, but time is too short (as is my ability to concentrate on one thing). Generally, the guiding thought is to modernize the library and make it even more compact and efficient, while removing magic along the way. For instance, because Structs require their sub-elements to have a name, and due to the fact keyword-arguments in Python are unordered, all constructs ended up taking a name argument (even though it’s usually meaningless to them, as in UBInt8("length")). This has given birth to all sorts of bastards like Rename and Alias; from now on, it’ll be simpler:

s = Struct(
    Member("length", UBInt8),
    Member("value", Bytes(this.length)),
)

A second issue is laying the groundwork for code generation: converting all dependencies to use this expressions, and perhaps even limiting the power of Adapters. Or at least, making a clear distinction between the constructs that can be turned into code and those that can’t.

And last but not least, I want Construct 3 to come with a designer, where you would drag-and-drop constructs, group them in “boxes”, connect them to each other (instead of this.length, you’d connect the Bytes’ count field to the source construct), etc. And most importantly, you could try it out live on a data sample, and see how it breaks it up. I made this sketch here (click to download the PowerPoint slides) to demonstrate how it might look:

I think it would be very powerful.

Closing Words

I’m mainly writing this post to inform everyone of Construct’s new repository, and to ask for feedback on the plans for v2.5… Any thoughts, requests, comments would be appreciated. Also, if anyone wants to join in (especially with Construct 3’s ambitious plans) – feel free to contact me.

Introducing Plumbum - Shell Combinators

2012-05-12T00:00:00+00:00

It’s been a while since I last blogged… sorry! Had a midterm exam, a seminar project to deliver (an O(n^3) parser for Tree Insertion Grammar), the routine family festivities of Passover, and this new thing, Plumbum, that has been keeping my mind overclocked while I should have been studying for exams and writing seminar papers. But now that it’s out, it’s about time I write a little on it.

So Plumbum is something I’ve toyed with for quite some time now. In almost any sort of project, there comes a time when you have to write this really simple, 5-liner shell script, just to build your project’s artifacts, maybe also upload them to PyPI or sourceforge (using rsync), and while you’re at it, why not build the project’s documentation as well. Oh, and don’t forget to run regression tests before all that, and it would be wise to also handle some basic command-line options, as you might want to skip uploading to PyPI at times, or perhaps build on different versions of python… and then you end up with this monstrous shell script that you wrote (or worse, some former employee wrote) and never want to lay your eyes on again… At this point you begin hating yourself for not doing it in python to begin with.

But then again, how do you translate find . -name "*.pyc" | xargs rm or cp */*.py /tmp to python-speak? That would take quite a few lines and require importing several modules. Shell scripts tend to be so short and enchanting… How do we bridge the gap?
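For comparison, this is roughly what the first one-liner turns into with nothing but the standard library. The helper names `find` and `remove_all` are mine, and the sketch only handles plain files:

```python
import fnmatch
import os

def find(root, pattern):
    """Yield paths under root whose file name matches the glob pattern."""
    for dirpath, dirnames, filenames in os.walk(root):
        for fn in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, fn)

def remove_all(paths):
    """Delete every given file and return how many were removed."""
    count = 0
    for path in paths:
        os.remove(path)
        count += 1
    return count
```

The equivalent of find . -name "*.pyc" | xargs rm is then remove_all(find(".", "*.pyc")): two imports and a dozen lines for something the shell does in one.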

Plumbum was born to fill this very gap: on the one hand, be pythonic (and be backed by strong libraries), and on the other, make it all as easy and one-liner-ish by nature: use a real programming language, with a well-behaved object model and high level concepts, to achieve what you’d normally do in a shell script, retaining the same expressive power and wrist-handiness of the shell. I call it shell combinators, as it’s a pythonic way to mimic shell syntax.

The library is actually a collection of utilities that I wrote for several separate projects but never got around to polishing. Plumbum consolidates them into a single, production-grade framework. The library provides local and remote program execution, with support for piping and IO redirection; local and remote file-system paths abstraction; a programmatic command-line interface (CLI) toolkit, and numerous other utilities.

For instance,

from plumbum import local
from plumbum.cmd import wc, ls, echo, grep
from plumbum.utils import copy, delete

delete(local.cwd // "*/*.pyc")

for fn in local.cwd / "src" // "*.py":
    print wc("-l", fn)

num_of_src_lines = (ls["-l"] | grep["\\.py"] | wc["-l"])()

(echo["1"] > "/proc/sys/net/ipv4/ip_forward")()

There’s a short cheat-sheet as well as extensive documentation on the project’s site, but at this point I’d like to elaborate a bit on the CLI toolkit, as I think it deserves some more attention.

The approach of optparse / argparse and similar libraries is to build a parser object and populate it with options in an imperative manner, which I dislike. Plumbum’s CLI toolkit offers a more declarative yet programmatic alternative: An application is defined as a class that derives from cli.Application; it may define a main() method, which serves as the “entry point” of the application, and any number of switch-methods, meaning, methods that are invokable from the command-line.

Switch methods are normal methods, decorated by @switch, that may take either no arguments or a single one; for each switch given on the command line, the toolkit will invoke the switch method that binds it. Similarly, the main() method is invoked after all switches have been processed, and it takes all the positional arguments (i.e., non-switch arguments) that were given. And last but not least, there are switch attributes, which are in fact just specialized versions of switch functions, that store an argument given to the switch in an instance attribute.

So I’ve probably only gotten you confused by the terminology at this point, but actually it’s much simpler!

import logging
from plumbum import cli

logger = logging.getLogger("MyHttpServer")

class MyHttpServer(cli.Application):
    log_to_file = cli.SwitchAttr("--log-to-file", str)
    verbose = cli.Flag("-v")
    mode = cli.SwitchAttr("--mode", cli.Set("TCP", "UDP"), default = "TCP")
    port = cli.SwitchAttr("--port", cli.Range(1024, 65535), default = 8080)

    @cli.switch(["-l", "--load-config"], cli.ExistingFile)
    def load_config(self, filename):
        """Loads the given config file"""
        f = open(filename, "r")
        self._parse_config(f.read())   # _parse_config is left out of this example

    def main(self, src, dst):
        if self.log_to_file:
            logger.addHandler(logging.FileHandler(self.log_to_file))
        logger.setLevel(logging.DEBUG if self.verbose else logging.WARNING)


if __name__ == "__main__":
    MyHttpServer.run()

There, I think it’s a lovely example of the expressive power of the CLI toolkit and Plumbum in general. I hope you’ll give it a try, and may you never have to write shell scripts again!

Solving Systems of Linear Equations

2012-03-25T00:00:00+00:00

Yet another university-related post, but I really enjoyed it so I thought I’d share: for a GUI workshop I’m taking, we are given GUI-layout constraints as a system of linear equations, which we need to satisfy. To make life more interesting, some constraints are constant while some are parametric. There’s no magic here, just some linear algebra combined with Python’s overloadable nature to produce a nice and compact solver for these linear systems.

Naturally, you present your system as a coefficient matrix, which is Gauss-eliminated and then “solved” for a given set of variables. I took the Gauss-Jordan elimination code from Jarno Elonen and modified it to support MxN matrices (not necessarily square). Here’s a simple example of elimination:

>>> m = Matrix([1,2,4,2], [3,7,6,8], [2,2,2,9])
>>> print m.eliminate()
( 1.00  -0.00   0.00   6.00)
( 0.00   1.00  -0.00  -1.00)
( 0.00   0.00   1.00  -0.50)

But of course a reduced row-echelon matrix is not our final goal - we want to get the variable assignments. For this, we have solve():

>>> sol = solve(m, ["x", "y", "z"])
>>> print sol
{'y': -1.0000000000000004, 'x': 6.0, 'z': -0.4999999999999998}

Modulo precision errors, we get x = 6, y = -1 and z = -0.5. If we have more equations than variables, the “extraneous” equations must be linear combinations of previous ones, or a contradiction will result. But what if we have fewer equations than variables? It means we have some degrees of freedom… how would we handle that? It’s actually simple - instead of being resolved to constant values, variables will be assigned dependent expressions:

>>> m2 = Matrix([1,2,4,2], [3,7,6,8])
>>> print m2.eliminate()
( 1.00   0.00   16.00  -2.00)
( 0.00   1.00  -6.00   2.00)
>>>
>>>
>>> sol = solve(m2, ["x", "y", "z"])
>>> print sol
{'y': <BinExpr (2.0 - (-6.0 * z))>, 'x': <BinExpr (-2.0 - (16.0 * z))>,
    'z': <FreeVar z>}

As you can see now, z is a free variable and x and y are dependent on it. Of course more than one variable may be free and some variables may be independent of free variables. Once a value for z is known, we can “fully evaluate” the dependent expressions:

>>> sol["x"].eval({"z" : 10})
-162.0

The code is available on my github page. Note: all numbers are represented as Decimals, to avoid loss of precision as much as possible, and I’m using an “epsilon” value of 1E-20 to equate numbers to each other (meaning, x == y iff abs(x-y) <= epsilon).
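For readers who just want the gist of the elimination step, here is a small self-contained sketch. It is not the code from the repository: it uses fractions.Fraction instead of Decimals with an epsilon (so no tolerance is needed), works on plain lists of rows, and treats the last column as the right-hand side.

```python
from fractions import Fraction

def eliminate(rows):
    """Return the reduced row-echelon form of an augmented MxN matrix."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivot_row = 0
    # the last column holds the constants, so it is never used as a pivot
    for col in range(len(m[0]) - 1):
        if pivot_row == len(m):
            break
        # find a row at or below pivot_row with a non-zero entry in this column
        for r in range(pivot_row, len(m)):
            if m[r][col] != 0:
                break
        else:
            continue   # no pivot here: this column belongs to a free variable
        m[pivot_row], m[r] = m[r], m[pivot_row]
        pivot = m[pivot_row][col]
        m[pivot_row] = [x / pivot for x in m[pivot_row]]
        # clear this column in every other row
        for r2 in range(len(m)):
            if r2 != pivot_row and m[r2][col] != 0:
                factor = m[r2][col]
                m[r2] = [a - factor * b for a, b in zip(m[r2], m[pivot_row])]
        pivot_row += 1
    return m

full = eliminate([[1, 2, 4, 2], [3, 7, 6, 8], [2, 2, 2, 9]])
under = eliminate([[1, 2, 4, 2], [3, 7, 6, 8]])
```

Here full ends with the last column 6, -1, -1/2, matching the first example, and under reproduces the two rows (1, 0, 16, -2) and (0, 1, -6, 2) from the second one.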

Easy Syntax for Representing Trees

2012-03-07T00:00:00+00:00

I’m working on a parser for Tree Adjoining Grammar (TAG) for this seminar I’m taking. TAG is an extension of context-free grammar (CFG) that’s more powerful while still being polynomially-parsable. Anyhow, TAG makes use of “tree production rules” instead of the “linear” production rules of CFG: instead of S -> NP VP, you’d have a small tree, the root of which being S, having NP and VP as its children. Of course these trees can be more than two-level deep, and they undergo all sorts of operations such as substitution and adjunction, but that’s for the parser.

So I needed a compact and (hopefully) readable way to express such trees in my code. At first I used lots of parentheses, which was ugly and cumbersome, but then I devised this:

class NonTerminal(object):
    def __init__(self, name):
        self.name = name
    def __sub__(self, children):
        return Tree(self, children)
    def __pos__(self):
        return Foot(self)

class Foot(object):
    def __init__(self, nonterm):
        self.nonterm = nonterm

class Tree(object):
    def __init__(self, root, children):
        self.root = root
        self.children = children

And here’s how you use it:

S = NonTerminal("S")

t1 = S-["e"]
t2 = S-["a", S-["c", +S, "d"], "b"]

# Which represents the following two trees:
#
# t1:                t2:
#      S                  S
#      |                / | \
#      |               /  |  \
#      e              a   S   b
#                       / | \
#                      /  |  \
#                     c   S*  d
#

The peculiar +S is a way to mark that a leaf node is a foot, which is part of the semantics of TAG (that’s where adjunction takes place). It’s represented in the diagram by the more conventional S*, but I had to resort to a unary operator in the code. Anyway, I’m not sure if it’s a recipe or just a nice trick, but I thought I’d share this.

By the way, you can use both __sub__ and __neg__ to achieve things like S-----X (i.e., more than a single - sign, to allow for better padding), but I tried to avoid too much ASCII art. I’d love to hear about other such ideas!
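Here is a tiny sketch of that trick, using a simplified Node class of my own rather than the NonTerminal/Tree classes above. Python reads S-----X as one binary minus followed by four unary ones, so the right-hand operand has to define __neg__ (a plain list would not do):

```python
class Node(object):
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
    def __neg__(self):
        # extra dashes are purely decorative: -X is just X
        return self
    def __sub__(self, other):
        # S - X makes X a child of (a copy of) S
        return Node(self.name, self.children + [other])

S = Node("S")
X = Node("X")
t = S-----X     # parsed as S - (-(-(-(-X))))
```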

RPyC 3.2.1 Released

2012-03-04T00:00:00+00:00

This is a maintenance release, fixing some minor bugs and resolving some issues with Python 3 compatibility. More on the change log.

Python 2.x: it’s advisable to upgrade to this version.

Python 3.x: it’s highly recommended to upgrade to this version, as it resolves some core issues that went under the radar in 3.2.0.

Toying with Context Managers

2012-02-27T00:00:00+00:00

As I promised in the code-generation using context managers post, I wanted to review some more, rather surprising, examples where context managers prove handy. So we all know we can use context managers for resource life-time management: before entering the with-suite we allocate (open) the resource, and when we leave the suite, we free (close) it – but there’s much more to context managers than meets the eye.

Stacks, for Fun and Profit

We’ve seen this one in the code-generation post, but we can generalize this notion a bit. Every time we enter a with block, we’d append an element to a list, which we’d pop on exit. Consider this snippet:

from contextlib import contextmanager

class Stacking(object):
    def __init__(self):
        self.stack = []
    @contextmanager
    def foo(self):
        self.stack.append("foo")
        yield "bar"
        self.stack.pop(-1)

Nothing fancy, but that’s exactly how the code generation framework works: whenever you enter a new block, it’s pushed onto the “stack”, and whenever we leave the block, the top-of-stack element is popped. This is how we automatically take care of indentation, curly-braces, etc.

But of course this pattern is much more useful. Consider something like this:

import os

class Env(object):
    def __init__(self):
        self._envstack = [os.environ]
    @contextmanager
    def new(self, **kwargs):
        self._envstack.append(self._envstack[-1].copy())
        self._envstack[-1].update(kwargs)
        yield
        self._envstack.pop(-1)

env = Env()

with env.new(PATH = "/tmp/foo/bin", SHELL = "zsh"):
    # processes created here will use the modified environment
    pass

We can also leverage this concept to run commands as different users. Here’s a sketch:

cmd.run("ls")                       # as current user
with cmd.as_user("root"):
    cmd.run("ls", "/proc")          # as `root`
    with cmd.as_user("mallory"):
        cmd.run("rm", "-rf", "/")   # as `mallory`
    cmd.run("cat", "/etc/passwd")   # back as `root` again

In essence, every time you want to make local/undoable changes to your state, this pattern proves helpful.
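To show how the environment stack might actually be consumed, here is a slightly fuller sketch of the Env class. The `current` property and the `run` method are my additions for illustration, and the try/finally makes sure the stack is unwound even if the body raises:

```python
import os
import subprocess
from contextlib import contextmanager

class Env(object):
    def __init__(self):
        # copy, so the real os.environ is never modified
        self._envstack = [dict(os.environ)]

    @property
    def current(self):
        return self._envstack[-1]

    @contextmanager
    def new(self, **kwargs):
        self._envstack.append(dict(self.current, **kwargs))
        try:
            yield self.current
        finally:
            self._envstack.pop()   # restore the previous environment

    def run(self, args):
        # child processes see whatever environment is on top of the stack
        return subprocess.check_output(args, env=self.current)

env = Env()
with env.new(SHELL="zsh"):
    inner = env.current["SHELL"]
    with env.new(SHELL="fish", EDITOR="vi"):
        innermost = (env.current["SHELL"], env.current["EDITOR"])
    restored = env.current["SHELL"]
```

Inside a with env.new(...) block, env.run(["printenv", "SHELL"]) would launch the child with the modified environment, and the change disappears as soon as the block ends.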

Contextbacks

Contextbacks, a pun on callbacks, are contexts you pass as arguments to other functions. Many times it’s useful to pass a before-function and an after-function, and contextbacks are a nice way to encapsulate this. So instead of this:

def f(beforefunc, afterfunc):
    beforefunc()
    # your code goes here
    afterfunc()

You get this:

def f(ctxback):
    with ctxback:
        # your code goes here
        pass

Not a ground-breaking change, but I prefer it as it’s more concise.
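A small runnable illustration, with a hypothetical `logged` context that records what happens around the body:

```python
from contextlib import contextmanager

events = []

@contextmanager
def logged(name):
    events.append("before " + name)
    try:
        yield
    finally:
        events.append("after " + name)   # runs even if the body raises

def f(ctxback):
    with ctxback:
        events.append("body")

f(logged("task"))
```

Anything that implements the context manager protocol can be passed in, so f(threading.Lock()) would run the same body under a lock. One thing to keep in mind: an object produced by @contextmanager can only be entered once, so a fresh one has to be created for every call.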

Lightweight Asynchronism

This is probably my favorite use case for contexts: you can use them to pipeline or interleave long-lasting tasks. You can think of contexts as degenerate forms of coroutines, in which you have defined beginning and end, but the middle part is interchangeable, so you can stick anything into it.

Imagine you need to format a harddisk (using mkfs) or perform some long network operation (like copying a huge file over scp). The pattern is as follows: initiate the operation, wait for it to finish, and either return or raise an exception. This fits perfectly well with the way contexts work – with one change – yield instead of waiting.

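from contextlib import contextmanager
from subprocess import Popen, PIPE

class FormattingFailed(Exception):
    pass

@contextmanager
def format_disk(devfile):
    # capture stderr so that a failure can be reported
    proc = Popen(["/sbin/mkfs", "-t", "ext3", devfile], stderr = PIPE)
    try:
        yield
    except Exception:
        proc.kill()
    stdout, stderr = proc.communicate()
    if proc.returncode != 0:
        raise FormattingFailed(stderr)

with format_disk("/dev/sda1"), format_disk("/dev/sdb1"):
    pass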

This saves you time: instead of waiting for the first operation to finish before starting with the second, you can run them in parallel. The total time would be that of the longest task (not taking into account the IO bottleneck). You can throw a yield into any piece of code instead of just blocking, and use it as a pipelined contextmanager. You can copy three files in parallel, without resorting to threads or a reactor in the background.

Of course you could improve that by returning an object that reports the progress of the operation, e.g.

with format_disk("/dev/sda1") as d1, format_disk("/dev/sdb1") as d2:
    while not d1.is_done() or not d2.is_done():
        print "%s is being formatted, %d%% completed" % (d1.devfile,
                d1.get_progress())
        print "%s is being formatted, %d%% completed" % (d2.devfile,
                d2.get_progress())

And voila! You have a thread-less, light-weight asynchronous framework at hand… a bit like using a reactor, but without rewriting your code.
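Here is a sketch that can be run without formatting any disks. The `background` context and the `Task` handle are hypothetical names, and a short sleep in a child Python process stands in for mkfs or scp; progress reporting is left out, only is_done() is shown.

```python
import sys
import time
from contextlib import contextmanager
from subprocess import Popen

class Task(object):
    """Handle for a background process."""
    def __init__(self, proc):
        self.proc = proc
    def is_done(self):
        # poll() returns None while the process is still running
        return self.proc.poll() is not None

@contextmanager
def background(args):
    proc = Popen(args)
    try:
        yield Task(proc)
    finally:
        proc.wait()   # leaving the block waits for completion

def nap(seconds):
    # a stand-in for a long operation such as mkfs or scp
    return [sys.executable, "-c", "import time; time.sleep(%s)" % (seconds,)]

with background(nap(0.2)) as t1, background(nap(0.2)) as t2:
    while not t1.is_done() or not t2.is_done():
        time.sleep(0.05)   # other useful work could be done here
```

Both children are started before the body runs, so the two naps overlap instead of running one after the other.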

And last, if you can’t tweak the blocking parts of the code (e.g., third party libraries), you can use the “defer to thread” or “defer to process” approach, a la twisted:

@contextmanager
def defer_to_thread(func):
    thd = Thread(target = func)  # it would be smarter to use a
    thd.start()                  # thread-pool
    yield thd
    thd.join()

with defer_to_thread(task1), defer_to_thread(task2), defer_to_thread(task3):
    # do something else in the meanwhile
    pass

So that’s all I had in mind. If you have other unorthodox use cases for contexts, I’d love to hear about them!

Wizard Dialog Toolkit

2012-02-11T00:00:00+00:00

Following my Deducible UI post, and following some of the criticism it had received, I’d like to share something I’ve been working on (read: experimenting with) at my work place. You see, we have some “interactive wizards” that storage admins use to connect storage arrays to their hosts (say, a DB server). These wizards prompt you with questions like what’s your username, the name of the pool/volume, whether it’s an iSCSI or a Fiber Channel connection, etc., and then they go and perform what you’ve asked for.

These wizards operate in a terminal environment, but we’ve had thoughts to make GUI/web versions of them. This would be a considerable effort with the current design. Another issue they currently have is the mixing of “business logic” and presentation together. For instance, the code that scans the devices attached to your host also prints ANSI-colored messages or reports its progress. All in all it works fine, but there’s lots of room for improvement.

I began to investigate this corner a month or two ago. The initial observation was that such wizards have a pretty rigid and repetitive structure, thus we can find some abstraction or a “toolkit” for “expressing” wizards more compactly. This has also led to the realization that once the business logic and presentation are separate, there’s no reason to limit ourselves to terminal-based interaction: our wizard-toolkit could do the plumbing and work with terminals, ncurses, GUIs, web-browsers, etc. The business logic would remain oblivious, and we could have a nice GUI at zero-cost!

There was also a second issue of styling, i.e., printing text in color, that I wanted to get rid of. This part was easy: I thought, why not employ the model of HTML and CSS? Let’s separate the structure (semantics) of the text from its styling. Instead of printing a banner for titles, we’ll display a Title object, whose exact appearance is determined by a “style sheet” (a class, of course, not actually a text document).

For instance, when we’re using a color-enabled terminal, the title would be printed in bold and followed by an empty line; but if our terminal is color-blind, we’ll render the text centered and surrounded by = marks. Another example is error-handling: instead of printing an error message in red every time, we’ll display an Error object; on a terminal, this would be rendered as red text, but when running in a GUI, rendering this object would pop up a message box. I’m going to ignore this for the rest of this post, as this is really a side issue.
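Before moving on, the style-sheet idea can be sketched in a few lines. The class names `AnsiStyle` and `PlainStyle` are invented for this example and are not the ones used in the actual toolkit:

```python
class Title(object):
    def __init__(self, text):
        self.text = text

class Error(object):
    def __init__(self, text):
        self.text = text

class AnsiStyle(object):
    """Style sheet for color-enabled terminals."""
    def render(self, elem):
        if isinstance(elem, Title):
            # bold; the trailing newline leaves an empty line when printed
            return "\033[1m%s\033[0m\n" % (elem.text,)
        if isinstance(elem, Error):
            return "\033[31m%s\033[0m" % (elem.text,)   # red
        return str(elem)

class PlainStyle(object):
    """Style sheet for terminals without color support."""
    width = 40
    def render(self, elem):
        if isinstance(elem, Title):
            # centered and surrounded by = marks
            return (" %s " % (elem.text,)).center(self.width, "=")
        if isinstance(elem, Error):
            return "ERROR: %s" % (elem.text,)
        return str(elem)

title = Title("hello world")
plain = PlainStyle().render(title)
fancy = AnsiStyle().render(title)
```

The business logic only ever creates Title and Error objects; which style sheet renders them is decided in one place.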

Now let’s get to expressing wizards, or more generally, dialogs. Following some earlier iterations, I came to the model where a dialog is a “container object” that’s made of dialog elements. These elements can be output-only (such as a welcome message), or input-output (such as a message telling you to choose one of the available options). A dialog is “executed” by a DialogRunner that renders it and returns the results gotten from the user. It’s quite important to note that dialog elements within a single dialog cannot be interdependent – that is, if you want to ask the user for his name and then show "Hi there %s" with the user’s name, this has to be done as two, serial dialogs.

That was quite a lot of babble – let’s see this in action:

class MyApp(WizardApp):
    def main(self):
        iscsi = Option("iSCSI")
        fc = Option("FC")
        d = Dialog(
            Text(Title("hello world")),
            Input("un", "Username"),
            Password("pw", "Password"),
            Choice("conf", "What do you want to configure?", [iscsi, fc]),
        )
        res = self.ui.run(d)
        if res["conf"] == fc:
            self.config_fc(res)
        else:
            self.config_iscsi(res)
    ...

if len(sys.argv) == 2 and sys.argv[1] == "--gtk":
    MyApp.run(GtkDialogRunner("My App"))
else:
    MyApp.run(TerminalDialogRunner(ANSIRenderer))

It’s a short and incomplete snippet of course, as I’m only going to cover the big picture. The main function creates a dialog object d and passes it to ui.run, which “runs” the dialog and returns the results, as a dictionary. Notice that the dialog elements Input, Password and Choice all take a first parameter – this is the key under which the result would be placed in the returned dictionary, e.g., res["un"] would hold the user-provided user name, and res["pw"] would hold the password. Text, on the other hand, is an output-only element, so it doesn’t return anything and doesn’t take a key. Long story short, we’re asking the user to enter some information and choose one of two options, and then continue processing based on the selected option. At the bottom, we determine how to run the application based on a command-line switch: if --gtk is given, we’ll run the dialogs through the GtkDialogRunner; otherwise, we’ll use the TerminalDialogRunner.

And what does it look like? When running on a terminal:

And with a single command-line switch, we run as a GTK application:

So of course it’s far from perfect, but then again, it’s a small research project I’ve only put ~15 hours into. It suffers from some of the problems I’ve listed in the deducible UI post, for instance, the GUI hangs when the business logic performs blocking tasks. This could be solved by moving to a reactor-based model, but I’ve tried to keep the existing wizard code intact as much as possible. A hanging GUI is not nice, but it’s not the end of the world either, and there are numerous ways to overcome this.

Another benefit this design brings along is the ability to automate testing by using mock dialog runners. Since our business logic is only exposed to the returned dictionary, we can use a dialog runner that actually displays nothing and returns a scripted scenario each time. We can even go further: because our business logic “talks” in high-level primitives like Choice, we can compute the Cartesian product of all choices and run through each of them. We can show that we’ve covered all paths! And we can do this automatically… without people hitting buttons and keeping logs of their progress.
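A rough sketch of what such automated coverage could look like. Everything here (the simplified Choice, the MockDialogRunner and the toy business_logic) is hypothetical and far simpler than the real toolkit:

```python
import itertools

class Choice(object):
    """Stand-in for the toolkit's Choice element: a key plus its options."""
    def __init__(self, key, options):
        self.key = key
        self.options = options

def all_scenarios(choices):
    """Yield one result dictionary per combination of options."""
    keys = [c.key for c in choices]
    for combo in itertools.product(*[c.options for c in choices]):
        yield dict(zip(keys, combo))

class MockDialogRunner(object):
    """Displays nothing; returns a scripted result for every dialog."""
    def __init__(self, scripted):
        self.scripted = scripted
    def run(self, dialog):
        return self.scripted

def business_logic(ui):
    res = ui.run(None)   # a real app would pass a Dialog here
    return "%s over %s" % (res["action"], res["conf"])

choices = [Choice("conf", ["iSCSI", "FC"]),
           Choice("action", ["attach", "detach"])]
outcomes = [business_logic(MockDialogRunner(s)) for s in all_scenarios(choices)]
```

With two choices of two options each, the business logic is exercised four times, once per path, with no human in the loop.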

Anyway, I just wanted to show that it’s feasible. I’m not releasing any code as this project is currently in very early stages, and it’s something I do at work. Perhaps we’ll open-source it in the future, if it proves useful enough.

Just Got Me These

2012-02-09T00:00:00+00:00

Hurrah! I just got me these:

After the CS secretariat refused to let me take math courses as electives (thus forcing me into taking boring stuff like SQL), I think Dover books and I are going to become good friends. It’s not like I have plenty of time to read them, but even just seeing them on my bookshelf would make me a happier person :)

Next on my wish-list are some books on logics and proof theory, category theory, information theory, and automata and computability. I’ll get there some day…

WHO HAS CHEEZBURGER?

2012-02-03T00:00:00+00:00

Too much Wireshark…

Code Generation using Context Managers

2012-01-31T00:00:00+00:00

When I was working on Agnos, a cross-language RPC framework, I had a lot of code-generation to do in a variety of languages (Python, Java, C#, and C++). At the early stages, I just appended strings to a list. It was quick and dirty, and it got the job done… but that wasn’t enough, of course. I’ve lost the original code already, but it looked something like this:

def generate_proxy(typeinfo):
    lines = [
        "public class %sProxy {" % (typeinfo.name,),
        "    private int uid;",
        "    public %sProxy(int uid) {" % (typeinfo.name,),
        "        this.uid = uid;",
        "    }",
    ]
    for attr in typeinfo.attributes:
        if attr.get:
            lines.append("    public %s get%s() {" % (attr.typename,
                                                    attr.name,))
            lines.append("        // ...")
            lines.append("    }")
        if attr.set:
            lines.append("    public void set%s(%s value) {" % (attr.name,
                                                            attr.typename))
            lines.append("        // ...")
            lines.append("    }")
    lines.append("}")
    return lines
# ...

lines = []
for ti in typeinfos:
    lines.extend(generate_proxy(ti))

open("foo.java", "w").write("\n".join(lines))

There are several problems with this approach. First of all, it’s very cumbersome and fragile. If you forget a comma in the list, two adjacent strings will be concatenated. Also, you have to do everything yourself, like remembering to close brackets, add semicolons, do the right indentation, etc. If you wished to split this code into functions, the functions you call would have to know the indentation level you’re calling them at, or the generated code would be unreadable. This might seem negligible, but think of languages where indentation matters, like Python…

The fundamental problem with this approach (and similar ones) is that the code generator does not reflect the structure of the generated code. The two are disparate, while it’s quite obvious they should be correlated.

In order to solve this, I turned to context managers, a feature I highly value. Conceptually, context managers provide a way to bind beginning-and-end into a single entity; this is normally used for resource management – but we can leverage this construction further (I’ll review this in a different post). Here, I’ve used them to create nested blocks, which allowed me to reflect the structure of the generated code in the code generator.

Without going into too many details, I defined a Module class that exposes a block() context manager and a stmt() function. The module holds a “stack” of blocks, and entering a new block pushes it onto the stack. Statements are then appended to the topmost block on the stack. Now, because this framework is “language-aware”, it can encapsulate language-specific details. For instance, in Java, a block will be indented correctly and wrapped by brackets; in Python, we’ll append colons to the opening line and indent the block; in C++, if the block begins with class, struct or enum, we’ll append a trailing semicolon as well.

Here’s how it works:

m = JavaModule()
m.stmt("import foo")
m.stmt("import bar")
m.sep()   # an empty line

#...

def generate_proxy(m, typeinfo):
    BLOCK = m.block
    STMT = m.stmt

    with BLOCK("public class {0}Proxy", typeinfo.name):
        STMT("private int uid")
        with BLOCK("public {0}Proxy(int uid)", typeinfo.name):
            STMT("this.uid = uid")

        for attr in typeinfo.attributes:
            if attr.get:
                with BLOCK("public {0} get{1}()", attr.typename, attr.name):
                    pass
            if attr.set:
                with BLOCK("public void set{0}({1} value)", attr.name,
                                                            attr.typename):
                    pass
# ...

for ti in typeinfos:
    generate_proxy(m, ti)

m.render_to_file("foo.java")

So what have we gained?

The code is much shorter and more concise
Brackets, semicolons and indentation come out-of-the-box
We’re no longer working with flat lists of strings – we’re working with hierarchical entities that reflect the structure of the generated code
And the other way around – the structure of the generated code is reflected in the generating code; nested code is indeed nested inside BLOCKs, thus the “generatee” and generator are visually and semantically correlated.
We can easily split our code into functions, as the module maintains an internal stack. If f() opened a block and called g() under it, it the code that g() generates will be placed and indented correctly.

I tried to keep my code quite general, so I haven’t defined all of the target language’s constructs, but of course we could do that, or at least head in that direction. It might look like this:

def generate_proxy(m, typeinfo):
    with m.CLASS(typeinfo.name + "Proxy", ["public"]):
        m.FIELD("int", "uid", ["private"])
        with m.CTOR(["int uid"]):   # CTOR gets the name of the current class
            m.stmt("this.uid = uid")

However, there’s a question of where we “put our foot down”, or we’ll end up writing Java Combinators for Python… and then we’ll be writing Java in Python. No need for that, thank you very much.

The full source code can be found in the Agnos repository

Deducible UI

2012-01-27T00:00:00+00:00

A Brief History

I like automating things. I don’t like having to reiterate myself: my dream is to always be able to add only the necessary amount of information in order to make something possible. This is one reason, for instance, why I hate expressions like ArrayList<String> x = new ArrayList<String>();… it always makes me feel like I’m talking to a retard (compiler).

In 2006/7, I wrote some demos for RPyC to show how easy network-related tasks become. I chose something rather complex, a chat client, to show that all the code sums up to a few lines: clients invoke a method on the server, say broadcast(str), and the server then invokes callbacks on all of its clients, sending them the message.

In order to make it usable, I had to write a GUI: I chose Tk, because it comes with python and is quite simple; I knew there were better toolkits, but my GUI was meant to be basic enough to be doable in any toolkit. It occurred to me, then, that I wrote ~20 lines show-casing RPyC and ~100 lines of horrible GUI code, and that something must be really wrong here. And by here I mean everywhere.

Note: throughout the article, I’m using the word GUI to mean any interactive user interface, be it graphical (Qt, GTK, wxWidgets, …) or terminal-based (curses and the like). Basically, anything that doesn’t block on a single line of input, like shells.

GUI Designers

So you might say, “Dude, just use QtDesigner or something”. A GUI designer lets you visually place components and makes your life much easier – drag and drop your widgets and double-click on a button to write its action. Very easy indeed. But I would like to offer a different angle on the subject: just like the invention of the teacup has hindered the technological advance of China, so do GUI designers hinder us from developing better GUIs. These designers offer a local optimum which we fail to surpass, and this leaves us with the mediocre UIs and development tools we have today. And don’t get me started on XAML.

Think about it: you have to design a GUI. So yeah, it’s kind of simple, but doesn’t that break DRY? You have the code and you have the GUI – two faces of the same idea. Obviously, one should be derived from the other.

For the lion’s share of programs, the UI is highly deterministic – there’s some information that needs to be displayed to, or gotten from, the user, and the binding is trivial. Consider a login screen: you want to get a username and a password, use them somehow, and proceed to the next screen. This is a repeating task, and I’d guess that for ~80% of the programs in the world, it’s easy enough to automatically deduce how the UI should look, given the task at hand. And I’m not talking about machine learning algorithms or designing “families of tasks” – way simpler than that! Just define a mapping between programmatic primitives and their visual representation.

Deducing UI

The ultimate goal is to take “pure code”, unaware of UI, and by adding the necessary metadata, be able to automatically create (“deduce”) a GUI for it. In fact, I’d like to expose programmatic APIs to a human – completely interchangeable programmatic and human interfaces. Think how cool it could be to import Adobe Photoshop and run a directory full of pictures through its filters, instead of doing so through the UI… without Adobe having to write a separate GUI and programming toolkit.

The UI needn’t be eye-candy, at least at the beginning; it just has to be good enough. It won’t work for games or complex applications like Office, but it would be just fine for a chat client. Let’s assume the following mapping:

- An object is represented by a window
- Read-only instance attributes are represented as labels
- Writable instance attributes are represented as textboxes
- Methods are represented by buttons. If a method requires arguments, it would be preceded by textboxes

Of course we could change textboxes and labels to reflect the attribute’s or argument’s type – DateTime would be represented by a DatePicker, int could be represented by a number box with up/down arrows, etc. And of course the framework is free to change the mapping however it wants, to achieve better, more coherent representation. The mapping above is just a rough draft.

Now, instances of a class like the following:

class Person {
	public String firstName;
	public String lastName;
	public DateTime birthdate;

	public void dance() {...}

	public void eat(String foodstuff) {...}
}

would turn into

with just a little bit of binding in the form of:

class Main {
    static public void Main(String[] args) {
        Person p = new Person(...);
        guify(p);
    }
}

Straightforward, isn’t it? You can already begin to see the benefits. This framework would obviously require some sort of decoration (annotations in Java, attributes in C#, …) on which classes and which class members are to be exposed, and perhaps some extra metadata, like a picture to show instead of a method’s name, or some layout information – but it’s perfectly doable. And we can make it better looking (unlike my beautiful ASCII art example), by using better UI primitives and a better mapping between objects and their representation; but let’s leave it for now.
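The mapping itself is mostly a matter of introspection. Here is a rough Python 3 sketch of the "deduce" step only, without any toolkit behind it: deduce_widgets() is a name I made up, it returns plain tuples instead of real widgets, and it treats a property without a setter as a read-only attribute. The Person class is a Python rendition of the one above, with a full_name property added for the sake of the example.

```python
import inspect


def deduce_widgets(obj):
    """Return (kind, name, argument_names) tuples describing a window for obj."""
    widgets = []
    # Plain instance attributes are writable, so they map to textboxes.
    for name in sorted(vars(obj)):
        if not name.startswith("_"):
            widgets.append(("textbox", name, []))
    # Properties: no setter means read-only, which maps to a label.
    for name, attr in sorted(vars(type(obj)).items()):
        if isinstance(attr, property) and not name.startswith("_"):
            kind = "label" if attr.fset is None else "textbox"
            widgets.append((kind, name, []))
    # Public methods map to buttons, preceded by one textbox per argument.
    for name, method in inspect.getmembers(obj, inspect.ismethod):
        if not name.startswith("_"):
            args = list(inspect.signature(method).parameters)
            widgets.append(("button", name, args))
    return widgets


class Person(object):
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    @property
    def full_name(self):
        return self.first_name + " " + self.last_name

    def dance(self):
        pass

    def eat(self, foodstuff):
        pass


for widget in deduce_widgets(Person("John", "Doe")):
    print(widget)
```

A real guify() would hand these descriptions to a toolkit and wire the buttons to the bound methods; that part is where the hard problems described next begin.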

I wrote a simple prototype of this and lo and behold, it actually worked! But when you try to use it in real-life applications, the going gets tough: things are updated behind the scenes (not through our UI framework) and we need to reflect these changes in the UI. For instance, an element is added to a list via the list’s add() method - how can the GUI become aware of that? Well, we can use observable objects, which the GUI would observe; so instead of using an ArrayList<T>, you’d simply use an ObservableArrayList<T>.
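For illustration, an observable list could look roughly like the following Python sketch. ObservableList and observe() are hypothetical names, and only append() is instrumented here; a usable version would have to cover every mutating method.

```python
class ObservableList(list):
    """A list that notifies registered callbacks when an item is appended."""

    def __init__(self, *args):
        super(ObservableList, self).__init__(*args)
        self._observers = []

    def observe(self, callback):
        self._observers.append(callback)

    def append(self, item):
        super(ObservableList, self).append(item)
        # Tell every observer (e.g. a GUI list widget) what just happened.
        for callback in self._observers:
            callback("append", item)


events = []
names = ObservableList(["alice"])
names.observe(lambda action, item: events.append((action, item)))
names.append("bob")
print(events)
```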

But creating an observable counterpart for every class is a considerable effort on the framework’s side, and it breaks software modularity: the framework has to be aware of every 3rd party class that you wish to expose, or at least allow you to provide the means to expose them. Another downside of this scheme is that your code becomes aware of the GUI: if we’ve so far managed to keep our code clean of GUI primitives (we only required some metadata), all of a sudden you must replace your lists with GUI-observable-lists. Bummer.

Another issue is that using synchronous programming techniques (blocking operations) does not play well with this model: when does the GUI get its “runtime”? How can we keep it from freezing? Who’s providing the entry-point of the program? Does it run in a separate thread? If so, we risk polluting our code with GUI-related locks (which defeats the whole purpose); and besides, threads suck and the incurred complexity is never worth it. The only feasible option is asynchronous programming (via a reactor) – but that requires your code to be programmed this way, and it’s quite nasty to write such code without proper language support (e.g., when closures, coroutines, etc. are lacking).

“No Way”

As I said, I’ve been toying with this idea since 2006, and I always get the same response from fellow programmers: “it would never work”, “it won’t be good enough”, “no one would want to use it”, “users need their eye-candy”, “I need tight control over the layout”, and what not. Skeptics galore. My answer is always the same: you never know what your user wants – so who are you to decide? And besides, you’re always too lazy to support proper customization of UI, so your user must live with your decisions.

Sure, there are books and dissertations about UX, and you’ve read them all; but why not just provide good-enough defaults, and let the rest be customized by the user? Let the framework deduce a sane layout for your code – but let’s make everything movable/resizable/dockable. This way, if it makes more sense to place button X to the left of button Y, the user can do so himself. And let’s remember the user’s preferences in a file, which we’ll load each time the application runs. And by “user” I’m also talking about your UX expert – let him/her decide on a default look (i.e., the preferences file) for the application, which will be shipped with your product, but the end-user would still be able to move things around. Wouldn’t it be easier? And if you insist, here’s the place to stick some machine-learning magic, in order to deduce better UIs by default.

So anyhow, I had a working proof-of-concept somewhere, but I think I lost it. It wouldn’t be too hard to recreate it, but at the moment I’m more concerned with UI combinators, which I’ll cover in a future post. Fully deducing a UI is quite a challenge, as I detailed above, but it’s doable nonetheless, and the added-value is huge! I’ll get back to it some day, perhaps after I have better UI combinators… but in the meantime, is there anyone in the audience who’s willing to pick it up?

Regarding Infix Operators

2012-01-25T00:00:00+00:00

I got some reactions to the Infix Operators post, and wanted to point out some things. First of all, I’m not the one who came up with it – it’s a recipe from the Python Cookbook that was posted in 2005. I’m not taking credit for it or anything, I just said I loved the idea and I adapted the code a little.

Second, regarding coding style or the pythonicity of this scheme – let’s be clear, it’s a hack. In order for it to work, the two arguments must refuse to support __or__ on Infix objects, which means you can’t compose them properly:

>>> @Infix
... def dot(f, g):
...     return lambda *x: f(g(*x))
...
>>> @Infix
... def mul(x, y):
...     return x * y
...
>>> def double(x):
...     return x * 2
...
>>> f = double |dot| mul   # this works ok
>>>
>>> g = mul |dot| double   # but this won't
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in __or__
TypeError: mul() takes exactly 2 arguments (1 given)

It’s also magical and unpythonic by nature. You can read in the cookbook comments about a less-experienced programmer who complained he had to wrap his head around this. I’d say this feature is akin to metaclasses: they are useful (at times), but there are better, more pythonic ways to do the same without the magic.

So why is it useful? First of all, it’s an interesting pattern, worth knowing about more than actually using. But on the more practical side, it could be very useful in domain-specific languages (DSL), where it could increase your expressiveness. Consider something like Construct, where you define data structures declaratively:

ipaddr = Struct("ipaddr",
    UInt8("a"),
    UInt8("b"),
    UInt8("c"),
    UInt8("d"),
)

lenval = Struct("lenval",
    UInt8("len"),
    Bytes("val", lambda ctx: ctx.len),   # interdependencies: "val" is `len` bytes long
)

We could replace all sorts of built-in constructs by such “operators”, thus building “data expressions”. So here’s a very early sketch of what it could look like:

ipaddr = UInt8 |seq| UInt8 |seq| UInt8 |seq| UInt8

# or maybe just
ipaddr = UInt8 |repeat| 4

# binding names into the context
lenval = UInt8 |bind_seq("len")| Bytes(getctx("len"))

Beware: I just made this up, there’s no solid concept behind it.
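Just to show the idea can be made to run, here is a toy version built on the struct module instead of Construct. Everything in it is hypothetical: Field, seq, repeat and parse are names I invented for this sketch, it only handles fixed-size fields, and it leaves out the context-binding part (bind_seq/getctx) entirely.

```python
import struct
from functools import partial


class Infix(object):
    def __init__(self, func):
        self.func = func
    def __or__(self, other):
        return self.func(other)
    def __ror__(self, other):
        return Infix(partial(self.func, other))


class Field(object):
    """A fixed-size field, described by a struct format string."""
    def __init__(self, fmt):
        self.fmt = fmt


UInt8 = Field("B")


@Infix
def seq(a, b):
    # Two fields one after the other: concatenate their formats.
    return Field(a.fmt + b.fmt)


@Infix
def repeat(a, count):
    return Field(a.fmt * count)


def parse(field, data):
    return struct.unpack(">" + field.fmt, data)


ipaddr = UInt8 |seq| UInt8 |seq| UInt8 |seq| UInt8
same_thing = UInt8 |repeat| 4
print(parse(ipaddr, b"\x7f\x00\x00\x01"))
```

It works here only because Field does not define __or__ itself, which is exactly the limitation mentioned above.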

All Systems are Go

2012-01-23T00:00:00+00:00

At last, I finished migrating the old drupal site to github pages. Everything is now fully revisioned and statically-generated (using Disqus for comments). Jekyll is so cool! I wrote all the HTML and forged the stylesheets myself… hope you like it.

Anyhow, I’m happy with the design now, and I’ll start blogging more regularly. I still have plenty of pages to complete (projects, about-page, etc.), and a few partially-written blog posts to polish up, but generally speaking, we’re up.

If you have any feedback about the design/site, please let me know in the comments below or via email. Thanks!

Foxx0rz

2012-01-22T00:00:00+00:00

This is Foxx0rz, our course’s mascot. He’s written in C (compiles under MSVC++ 6 to be exact), and we had him printed on T-shirts… ‘twas fun.

Here’s a downloadable version

And this is the output:

Infix Operators in Python

2012-01-22T00:00:00+00:00

As you may already know, there are 3 kinds of operator calling-notations: prefix (+ 3 5), infix (3 + 5), and postfix (3 5 +). Prefix (as well as postfix) operators are used in languages like LISP/Scheme, and have the nice property of not requiring parentheses – there’s only one way to read an expression like 3 5 + 2 *, unlike 3 + 5 * 2. On the other hand, they reduce code readability and the locality of operators and their arguments. This is why we all love infix operators.
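To see why postfix needs no parentheses or precedence rules, here is a tiny stack-based evaluator. It's only a sketch: eval_postfix and the OPS table are names I picked, and it handles nothing but integers and three binary operators.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_postfix(expr):
    stack = []
    for token in expr.split():
        if token in OPS:
            # An operator always consumes the two most recent values.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(int(token))
    return stack.pop()


print(eval_postfix("3 5 + 2 *"))   # (3 + 5) * 2
print(eval_postfix("3 5 2 * +"))   # 3 + (5 * 2)
```

The grouping is encoded purely in the order of the tokens, so the two readings of 3 + 5 * 2 become two different token sequences.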

Now imagine I have a function, add(x,y), and I have an expression like add(add(add(5,6),7),8)… wouldn’t it be cool if I could use infix notation here? Sadly though, Python won’t allow you to define new operators or change how functions take their arguments… but that doesn’t mean we have to give up!

Haskell, for instance, allows you to define custom operators and set their precedence, as well as invoking “normal” functions as infix operators. Suppose you have a function f(x,y) – you can invoke it like f 5 6 or 5 `f` 6 (using backticks). This allows us to turn our previous expression, add(add(add(5,6),7),8), into 5 `add` 6 `add` 7 `add` 8, which is much more readable. But how can we do this in Python?

Well, there’s this Cookbook recipe that provides a very nice way of achieving the same functionality in Python (adapted a little by me):

from functools import partial

class Infix(object):
    def __init__(self, func):
        self.func = func
    def __or__(self, other):
        return self.func(other)
    def __ror__(self, other):
        return Infix(partial(self.func, other))
    def __call__(self, v1, v2):
        return self.func(v1, v2)

Using instances of this peculiar class, we can now use a new “syntax” for calling functions as infix operators:

>>> @Infix
... def add(x, y):
...     return x + y
...
>>> 5 |add| 6
11

Surrounding decorated functions with pipes (bitwise ORs) allows them to take their parameters infix-ly. Using this, we can do all sorts of cool things:

>>> instanceof = Infix(isinstance)
>>>
>>> if 5 |instanceof| int:
...     print "yes"
...
yes

And even curry functions:

>>> curry = Infix(partial)
>>>
>>> def f(x, y, z):
...     return x + y + z
...
>>> f |curry| 3
<functools.partial object at 0xb7733dec>
>>> g = f |curry| 3 |curry| 4 |curry| 5
>>> g()
12

Ain’t that cool?

A Collection of Teacher Quotes

2011-12-01T00:00:00+00:00

A collection of quotes I gathered during high school.

Link to the PDF

RPyC Moves to a New Site

2011-08-29T00:00:00+00:00

RPyC is in the process of migrating from http://rpyc.wikidot.com to its new (and hopefully final) location at http://rpyc.sourceforge.net. Wikidot had served us well, and was easy to maintain, but they started displaying way too many ads and didn’t support rsync or SSH access, which meant I couldn’t upload the generated API reference automatically.

The new site is written entirely in ReST using sphinx and large parts of it are auto-generated from docstrings in the source code. It’s all now part of the git repository, and I only have to run make upload to upload it all.

Learning Me a Haskell

2011-08-17T00:00:00+00:00

Phew! Finally the semester’s over (just submitted my last project), and it’s time to clean up my ever-so-long backlog. Here goes nothing: I’ll start by posting something here, after this long while of neglect.

As I’m sure you already know, I’m a long-time Pythonista, and I’m confident enough in calling myself a “native speaker” of that language. Although python is my expertise, I’d say I’m fluent in most other prominent programming languages (say, C, C++, Java, C#, VB), and have adequate knowledge of many more. It may seem like a good set of skills, but I have to admit this brings one to a point of stagnation. There comes a time where everything just looks the same: you take a glimpse at a new language/platform/other technology and sigh, “oh well, on the surface it’s different, but underneath it’s all the same sh*t”. You come to the conclusion that people mostly change the looks-and-feel, but nothing radical could ever happen. And then you skip to the next article.

Then came Haskell: a statically-typed, type-inferred, highly-expressive functional language. I’m not new to functional programming – I’ve programmed in Scheme, and I even wrote an interpreter for a toy functional-language that I made up in order to investigate nested-scopes and evaluation models – but heck, this one’s different. A friend of mine had been nagging me to learn Haskell for quite some time, but I was deterred by Haskell’s ugly syntax (it’s not Python a’right) and I kept pushing it off to “when I have time”. It always seemed too academic to me, too impractical.

I think what provoked me into seriously learning Haskell is a pattern called continuation passing style – I was just amazed by the extra power, design simplicity, and greater expressiveness that you get with it. Of course you can implement it in almost any other language, and in fact it was invented in Scheme (call/cc), but Haskell, with its purely-mathematical state of mind, just brings the best out of things. Haskell has also taught me that expressiveness and conciseness are not virtues to be underestimated: if you can do something in one line instead of four, you must be generalizing on some deeper concept that you’ve previously missed. It’s not just looks-and-feel, this time.
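If you haven't met continuation passing style before, here is the gist of it in Python (Haskell makes it far prettier, but let's stick to the language we all speak). The function names are made up for the example: instead of returning a value, every function receives an extra argument k, "the rest of the computation", and hands its result to it.

```python
def add_cps(x, y, k):
    return k(x + y)


def square_cps(x, k):
    return k(x * x)


def pythagoras_cps(a, b, k):
    # a*a + b*b, with every intermediate result passed on to a continuation
    return square_cps(a, lambda a2:
           square_cps(b, lambda b2:
           add_cps(a2, b2, k)))


print(pythagoras_cps(3, 4, lambda result: result))
```

The order of evaluation, which is normally implicit, is now spelled out in the code itself, and that is where the extra power comes from.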

But let’s not go astray. I think the most prominent and well-known feature of Haskell is its excellent type system. I’m sure you’ve heard of static languages, like java or C++, where every expression has a compile-time type; and I’m sure you’ve heard of dynamic (duck-typed) languages, where there are only run-time types and no checks are (or can be) performed prior to running the code. In the first case, you curse the compiler for limiting what you can do (even when it makes perfect sense) and for being bluntly stupid, requiring you to repeat yourself over and over (FooBar x = new FooBar()). And because most type systems are so weak, you find yourself escaping to “duck-typing” techniques like void * or Object, and relying on run-time casts. In the latter case, you find yourself trying to cover the infinite space of type permutations that your functions have to handle, and soon you pray for some sort of validation or restrictiveness.

When you start learning Haskell, you suddenly realize how weak most type systems are, and that there are (much) better alternatives. It’s like you’ve been suffering from a blurry vision your whole life, and suddenly you realize you can wear glasses. Has it ever occurred to you that type systems are equivalent to proof systems (the Curry-Howard isomorphism)? And that a compiler basically tries to prove your code? You can think of compilation errors as finding a counter-example to your claim! “You said F takes an integer, but I see you try to call it with a String”… so of course this is a trivial example, one that even a C compiler catches, but you can make more complicated claims. And then you realize that type systems and compilers (“provers”) do matter – a smarter compiler that employs a more powerful type system can prove or disprove more complicated claims! That’s a non-trivial conclusion that most programmers missed… type systems don’t have to suck, and they can actually work for you, unlike Java.

But what do I know, I’m just learning Haskell now, and besides – this post is but a teaser. So do yourself a favor and Learn You a Haskell!

Hooking Imports for Fun and Profit

2011-06-17T00:00:00+00:00

I really love Python… it’s so hackable that it just calls for hacking, inspiring your imagination to find ways to stretch its boundaries. This time I decided to look into import hooks, to add some missing functionality I wanted to have.

As you probably know, Python uses a flat namespace for packages, which works on a “first found first served” basis. Packages are simply searched in linear order, as they appear in sys.path: if two directories contain a package named foo, importing foo will fetch the package in the first directory. This is normally the desired behavior (as it allows you to override some modules by changing PYTHONPATH), but it’s also quite limiting.

Consider the nested package namespace used by Java and various other languages (e.g., Haskell), where packages are normally “deeply nested”, as in com.sun.foo.bar or com.ibm.spam.ham. In Haskell, for instance, packages are normally placed under their appropriate “categories”, e.g., Data.Vector or Control.Monad. When you write a new data type, you’ll probably put it under Data, as in Data.Vector.UberVector.

If we were to use something like that in Python, it would require us to have a single com/ directory, under which lots of sub-packages must be placed. This might be possible, but it’s certainly not the “right way”; and besides, it implies that code written by Sun and IBM is somehow related (after all, they share the __init__.py file in com/), which means one may affect the other (for better or worse :)).

It would make much more sense to have separately-installed packages, e.g., site-packages/com.ibm.spam.ham and site-packages/com.sun.foo.bar, where each package is independent of the other. Sure, they might share a common com prefix – but that’s all. This is especially useful in corporate environments, where multiple teams share common packages (which usually get very unoriginal names, say, common), and name collisions are very likely. It’s not a joke: at my work-place, we’re now reorganizing our code after such problems. Also, from a marketing point-of-view, it might make more sense for your customers to import mycompany.foobar than just import foobar.

But most importantly – it’s composable. Nested packages allow you to “inject” your package into another namespace. Take twisted for instance: it’s become so large that it made more sense to split it up into sub-packages (twisted.conch, twisted.news, …), and allow end users to choose which of them they wish to install. However, since Python wouldn’t let you have site-packages/twisted and site-packages/twisted-conch, they resorted to hacking distutils into doing what they want. If nested packages were supported, you would have a core twisted package, with separate add-on packages like twisted-conch. So why not, really?

Enter nimp (“nested imports”). Without going into too many technical details, nimp is a meta-import hook – it modifies the way import statements work. Specifically, it scans sys.path and “merges” packages that begin with a common prefix into “logical packages”. For instance, if you have com-ibm-foo and com-sun-bar on your sys.path, nimp will create namespace packages for com, com.sun and com.ibm. This would allow code like import com.ibm.foo or from com.sun.bar import vodka to work transparently. All you need to do is run import nimp; nimp.install() (you can also put it in your site.py, so it would happen every time you run a Python process), and you’re ready to go.
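The "merging" part is easier to picture with a bit of code. The following is not nimp's actual implementation, only a sketch of the naming logic: logical_packages() is a hypothetical helper that, given the dash-separated directory names found on sys.path, works out which namespace packages would have to be synthesized.

```python
def logical_packages(dirnames):
    """Return the dotted prefixes that need a synthetic namespace package."""
    namespaces = set()
    for dirname in dirnames:
        parts = dirname.split("-")
        # Every proper prefix of com-ibm-foo (com, com.ibm) is a namespace;
        # the full name (com.ibm.foo) is the real package on disk.
        for i in range(1, len(parts)):
            namespaces.add(".".join(parts[:i]))
    return sorted(namespaces)


print(logical_packages(["com-ibm-foo", "com-sun-bar"]))
```

The import hook then only has to answer for these synthetic names and map the full dotted name back to its directory.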

This was the first time I wrote an import hook, and I really liked how easy it was to change the import mechanism. So today I had another idea – lazy imports. Of course there’s PEAK’s lazyModule and this quite complicated recipe, but I thought, why not combine the two. Writing code like from peak... import lazyModule; foo = lazyModule("foo") is cumbersome, while the recipe attempts to make everything lazy.

Instead, I created a module called __lazy__ that, when imported, installs a meta-import hook. This import hook handles only modules that begin with __lazy__, so instead of importing them, it returns an “on-demand-loaded module” (i.e., one that is loaded when you try to access one of its attributes). Using it is really simple:

>>> from __lazy__ import telnetlib
>>> telnetlib
<OnDemandModule 'telnetlib'>
>>> telnetlib.Telnet  # forces loading
<class telnetlib.Telnet at 0x015B08B8>
>>> telnetlib
<module 'telnetlib' from 'C:\Python27\lib\telnetlib.pyc'>

>>> from __lazy__.xml.dom import minidom
>>> minidom
<OnDemandModule 'xml.dom.minidom'>
>>> minidom.parseString   # forces loading
<function parseString at 0x01659CF0>
>>> minidom
<module 'xml.dom.minidom' from 'C:\Python27\lib\xml\dom\minidom.pyc'>

Note, though, that using from __lazy__.x.y import z forces the loading of x, since we use the dot operator on it. Only the from __lazy__ import foo form is “truly lazy”.

You can get the code of __lazy__ here.
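The on-demand part, without the import hook around it, boils down to a proxy object. Here is a simplified sketch and not the actual __lazy__ code: this OnDemandModule is my own stripped-down version that imports the real module on first attribute access, using importlib.

```python
import importlib


class OnDemandModule(object):
    def __init__(self, name):
        self._name = name
        self._module = None      # nothing is imported yet

    def __getattr__(self, attr):
        # Only called when normal lookup fails, i.e. for the module's own names.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)

    def __repr__(self):
        if self._module is None:
            return "<OnDemandModule %r>" % (self._name,)
        return repr(self._module)


json = OnDemandModule("json")
print(repr(json))            # still the proxy
print(json.dumps([1, 2]))    # forces loading
```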

Property Classes

2011-05-14T00:00:00+00:00

Tired of creating properties the old way? Python 2.6 and Python 3 bring an improvement in the form of multi-stage properties, i.e.,

@property
def foo(self):
    ... # getter

@foo.setter
def foo(self, value):
    ... # setter

It still feels very awkward. I can’t say my solution is pure elegance, but I find it cleaner.

Code

import types

def property_class(cls):
    getter = getattr(cls, "get", None)
    if isinstance(getter, types.UnboundMethodType):
        getter = getter.im_func
    setter = getattr(cls, "set", None)
    if isinstance(setter, types.UnboundMethodType):
        setter = setter.im_func
    return property(getter, setter)

Example

>>> class Person(object):
...     def __init__(self):
...         self._age = 17
...     @property_class
...     class age:
...         def get(self):
...             return self._age
...         def set(self, value):
...             self._age = value
...
>>> p = Person()
>>> p.age
17
>>> p.age=19
>>> p.age
19
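The code above relies on types.UnboundMethodType and im_func, which exist only in Python 2. In Python 3, functions looked up on a class are plain functions, so an adaptation (a sketch of mine, same idea) gets even shorter:

```python
def property_class(cls):
    # In Python 3, cls.get and cls.set are already plain functions,
    # so they can be handed to property() as-is.
    return property(getattr(cls, "get", None), getattr(cls, "set", None))


class Person(object):
    def __init__(self):
        self._age = 17

    @property_class
    class age:
        def get(self):
            return self._age

        def set(self, value):
            self._age = value


p = Person()
p.age = 19
print(p.age)
```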

Python is Messy

2011-05-07T00:00:00+00:00

A couple of days ago, Rudiger, a user of RPyC, found a rather surprising bug that in turn revealed just how gruesome python’s inner workings are. Rudiger was working with two machines, one 32 bit and the other 64 bit, and one machine had a netref to a remote list. He then tried to execute something as simple as mylist[1:], which to everyone’s surprise threw a very peculiar exception: OverflowError: Python int too large to convert to C long.

At first, it seemed that the exception originated from the server side, but further investigation showed it actually originated from the client side, propagated to the server side, and then back to the client side: very weird indeed. I recreated the scenario with two of my machines, and popped up wireshark to see exactly what was going on there. The last packet before the “resonating” exception seemed to be invoking __getslice__ on the client-side list object, passing it 1 as the start index and 9223372036854775807 as the stop index. Where the heck is that number coming from? I added lots of debug prints and what not, but that strange number kept appearing there, and it was obviously not my code that placed it there.

A day later, the answer finally struck me: 9223372036854775807 is actually 0x7fffffffffffffff, which is sys.maxint on 64-bit machines. From that point on, the solution was simple, but it had revealed a nasty implementation-detail of CPython 2.x. You see, when getting a slice of an object, two methods come into play. The first, deprecated, method is __getslice__, which simply takes two arguments for start and stop. The second, recommended method, is __getitem__ which accepts a slice object instead of an integer. Sadly, <type list> has both, which are reflected on the netref proxy, which causes this rather surprising behavior:

>>> class Foo(object):
...     def __getitem__(self, x):
...             print "getitem", x
...     def __getslice__(self, *args):
...             print "getslice", args
...
>>>
>>> y=Foo()
>>> y[7]
getitem 7
>>> y[7:8]
getslice (7, 8)
>>> y[7:]
getslice (7, 2147483647)
>>> y[7:None]
getitem slice(7, None, None)

As you can see y[7:] invokes __getslice__ with sys.maxint, while y[7:None] (which is equivalent) invokes __getitem__ with a slice object… how lame! So when the server (64-bit) side code attempts to execute mylist[1:] it invokes __getslice__ on the client (32-bit) side, passing it the server’s “version” of sys.maxint, which goes into the C implementation and blows up. Monkeyballs!

So the simple (and only) solution to this issue is using mylist[1:None] when working with different-width platforms… sorry.
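For what it's worth, Python 3 removed __getslice__ altogether, so there every slicing form goes through __getitem__ with a slice object and the open end stays None. A quick check (Foo here is just a throwaway class that returns whatever index it receives):

```python
class Foo(object):
    def __getitem__(self, index):
        return index


y = Foo()
print(y[7])
print(y[7:8])
print(y[7:])       # the missing stop stays None instead of becoming a huge int
print(y[7:None])
```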

ctypes - Pointer from Address

2011-04-27T00:00:00+00:00

There are times you need to construct a ctypes pointer from an integer address you have, say the id of a python object. I scratched my head for quite a while until I found out how to do it properly (with some help from the stackoverflow guys). Here’s what I got:

import ctypes

def deref(addr, typ):
    return ctypes.cast(addr, ctypes.POINTER(typ)).contents

Example

# get the ref count of an object (in a very nasty way :)
>>> x="hello world"
>>> deref(id(x), ctypes.c_int)
c_long(1)
>>> y=x
>>> z=x
>>> deref(id(x), ctypes.c_int)
c_long(3)

Some words of caution:

- I’m relying here on id returning the address of an object. This is a weak assumption, but it holds (and is likely to continue to hold) for CPython.
- There are APIs for what I showed in the example above… use them instead!

I’m using this code to dig into the vicious OVERLAPPED structure (http://msdn.microsoft.com/en-us/library/ms684342(v=vs.85).aspx) that’s held inside PyOVERLAPPED for really low-level hacking… Anyway, if anyone finds this recipe useful, feel free to use it.
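A tamer way to try deref out, one that doesn't depend on CPython's object layout, is to point it at memory that ctypes itself owns. In this sketch, counter and view are just names for the demonstration; the address comes from ctypes.addressof rather than id.

```python
import ctypes


def deref(addr, typ):
    return ctypes.cast(addr, ctypes.POINTER(typ)).contents


counter = ctypes.c_int(42)
view = deref(ctypes.addressof(counter), ctypes.c_int)
print(view.value)

# view shares counter's memory, so writing through it changes counter too
view.value = 7
print(counter.value)
```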

Copy Function Defaults

2007-02-06T00:00:00+00:00

Default arguments to functions are evaluated when the function is created, and are stored in the function object. This causes irritating problems when the default values are in fact mutable, as only a single instance exists:

>>> def f(x = []):
...     x.append(5)
...     print x
...
>>> f()
[5]
>>> f()
[5, 5]
>>> f()
[5, 5, 5]

Sometimes it’s the desired behavior, but mostly it’s a bug. To solve that bug, we use

>>> def f(x = None):
...     if x is None:
...         x = []
...     x.append(5)
...     print x
...
>>> f()
[5]
>>> f()
[5]
>>> f()
[5]

But this idiom adds lots of boilerplate code into functions. The following little decorator solves that problem elegantly.

Code

from copy import deepcopy

def copydefaults(func):
    defaults = func.func_defaults
    def wrapper(*args, **kwargs):
        func.func_defaults = deepcopy(defaults)
        return func(*args, **kwargs)
    return wrapper

Example

>>> @copydefaults
... def f(x = []):
...     x.append(5)
...     print x
...
>>>
>>> f()
[5]
>>> f()
[5]
>>> f()
[5]
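In Python 3, func_defaults was renamed to __defaults__. Here is a sketch of the same decorator adapted accordingly; functools.wraps is my addition, so that the wrapper keeps the original name and docstring, and keyword-only defaults (__kwdefaults__) are left alone for brevity.

```python
from copy import deepcopy
from functools import wraps


def copydefaults(func):
    defaults = func.__defaults__      # pristine copies, captured once
    @wraps(func)
    def wrapper(*args, **kwargs):
        # Hand the function a fresh copy of its defaults before every call.
        func.__defaults__ = deepcopy(defaults)
        return func(*args, **kwargs)
    return wrapper


@copydefaults
def f(x=[]):
    x.append(5)
    return x


print(f())
print(f())
```

Like the original, this swaps the defaults on a shared function object, so it isn't safe to call from several threads at once.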