The Future of Construct
May 16, 2012

It's been a long while since I've put time into Construct. I gave up on it somewhere in 2007, right after the release of v2.0... I think I just got bored, and felt like the library was complete and extensible enough to survive on its own. Of course I was wrong there, and code-rot had spread all over.

Luckily for us, Corbin Simpson took up the project in January of 2011 and has been maintaining it since then. He migrated it to GitHub, changed the project to use a proper directory structure, fixed lots of bugs, and wrote extensive documentation. Since then, Construct has been building a solid community and has reached quite a remarkable number of downloads on PyPI.

All this time, I've been busy with my other projects, but I kept toying with the idea of Construct 3. I got some sketches, wrote some early drafts, cleaned up the implementation of Construct's core... but it's remained a dream. A couple of months ago I decided I'd back-port a nice feature from Construct 3, called this expressions, and it has rekindled my interest in the library.

This Expressions

One of the goals of Construct 3 was to generate efficient (C/Python) code from construct definitions. It even worked, to some extent: for instance, see this snippet that automatically generates this C-code. I had no recollection of this until I discovered it today. Funny.

Anyway, Construct 2 uses lambda functions to represent dependencies between constructs, e.g.

s = Struct("LV",
    UBInt8("length"),
    Bytes("value", lambda ctx: ctx["length"]),
)

and since this is the case, it's impossible to translate dependencies to C. So I've created the this object, which essentially builds an expression tree from native Python expressions. In order to evaluate the expression, you simply invoke it with a context:

>>> from construct import this
>>> this.x * 2 + 3
((this.x * 2) + 3)
>>> (this.x * 2 + 3)({"x":7})
17
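To give a feel for how such an object can work, here is a minimal sketch of the idea using operator overloading. It is not Construct's actual implementation: the class names (Expr, Path, BinExpr, This) are made up for illustration, the local this is a stand-in rather than the one imported from construct, and only + and * are supported.

```python
import operator


class Expr:
    """Base class: arithmetic operators build tree nodes instead of computing."""

    def __add__(self, other):
        return BinExpr("+", operator.add, self, other)

    def __mul__(self, other):
        return BinExpr("*", operator.mul, self, other)


class Path(Expr):
    """A named lookup in the context, e.g. this.x (hypothetical name)."""

    def __init__(self, name):
        self.name = name

    def __call__(self, ctx):
        return ctx[self.name]

    def __repr__(self):
        return "this.%s" % self.name


class BinExpr(Expr):
    """A binary operation whose operands are sub-expressions or constants."""

    def __init__(self, symbol, func, lhs, rhs):
        self.symbol, self.func, self.lhs, self.rhs = symbol, func, lhs, rhs

    def __call__(self, ctx):
        # Evaluate sub-expressions against the context; constants pass through.
        lhs = self.lhs(ctx) if isinstance(self.lhs, Expr) else self.lhs
        rhs = self.rhs(ctx) if isinstance(self.rhs, Expr) else self.rhs
        return self.func(lhs, rhs)

    def __repr__(self):
        return "(%r %s %r)" % (self.lhs, self.symbol, self.rhs)


class This:
    """Attribute access on the singleton starts a new expression."""

    def __getattr__(self, name):
        if name.startswith("__"):
            raise AttributeError(name)
        return Path(name)


this = This()  # local stand-in, not the object shipped with construct

expr = this.x * 2 + 3
print(expr)            # ((this.x * 2) + 3)
print(expr({"x": 7}))  # 17
```

The point is that nothing is computed when the expression is written; the operators only record what should happen, and the work is deferred until a context is supplied.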

So now we can replace all lambda ctx: ctx["foo"] with the more succinct and readable this.foo -- the benefits are visually clear -- but they go even deeper than this: since we're no longer dealing with black-box lambda functions, we can drill down into them and generate the appropriate (static) code.
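As a rough illustration of what "drilling down" could look like, the sketch below walks a tiny expression tree and prints it as a C expression. Everything here is an assumption made for the example: the node classes (Node, Field, Op), the to_c helper, and the ctx->field naming are hypothetical and are not how Construct 3 emits code.

```python
class Node:
    """Operators record the operation as a tree node (hypothetical classes)."""

    def __add__(self, other):
        return Op("+", self, other)

    def __sub__(self, other):
        return Op("-", self, other)

    def __mul__(self, other):
        return Op("*", self, other)


class Field(Node):
    """Reference to a previously parsed field, like this.length."""

    def __init__(self, name):
        self.name = name


class Op(Node):
    """Binary operation over two operands."""

    def __init__(self, symbol, lhs, rhs):
        self.symbol, self.lhs, self.rhs = symbol, lhs, rhs


def to_c(node, ctx_var="ctx"):
    """Render an expression tree as a C expression string.

    Assumes the generated C code keeps parsed fields in a struct
    reachable through a pointer named ctx_var.
    """
    if isinstance(node, Field):
        return "%s->%s" % (ctx_var, node.name)
    if isinstance(node, Op):
        return "(%s %s %s)" % (
            to_c(node.lhs, ctx_var), node.symbol, to_c(node.rhs, ctx_var))
    if isinstance(node, int) and not isinstance(node, bool):
        return str(node)
    raise TypeError("cannot translate %r to C" % (node,))


length_expr = Field("length") * 2 + 3
print(to_c(length_expr))  # ((ctx->length * 2) + 3)
```

A lambda offers nothing comparable: all you can do with it is call it, so a translator like to_c has nothing to walk.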

Construct 2.5

I had to do some Construct work recently, and I missed the conciseness of this expressions, so I took the time to back-port them to Construct 2. I sent a pull request to Corbin, but he's a bit too busy to maintain the library on a regular basis now, so he's created a GitHub organization and moved the repository there; this is where Construct will be developed from now on.

Tinkering with the old code again got me sentimental, and I started to do some long-awaited maintenance. I plan to release version 2.5 (note the dramatic shift from 2.06 to 2.5) in the summer (say August), and here's the list of planned changes:

Construct 3

The next big thing(TM), Construct 3, is still far away. I've got lots of cool ideas, but time is too short (as is my ability to concentrate on one thing). Generally, the guiding thought is to modernize the library and make it even more compact and efficient, while removing magic along the way. For instance, because Structs require their sub-elements to have a name, and because keyword arguments in Python are unordered, all constructs ended up taking a name argument (even though it's usually meaningless to them, as in UBInt8("length")). This has given birth to all sorts of bastards like Rename and Alias; from now on, it'll be simpler:

s = Struct(
    Member("length", UBInt8),
    Member("value", Bytes(this.length)),
)

A second goal is laying the groundwork for code generation: converting all dependencies to use this expressions, and perhaps even limiting the power of Adapters, or at least making a clear distinction between the constructs that can be turned into code and those that can't.

And last but not least, I want Construct 3 to come with a designer, where you would drag-and-drop constructs, group them in "boxes", connect them to each other (instead of this.length, you'd connect the Bytes' count field to the source construct), etc. And most importantly, you could try it out live on a data sample, and see how the constructs break the data up. I made this sketch here (click to download the PowerPoint slides) to demonstrate how it might look:

I think it would be very powerful.

Closing Words

I'm mainly writing this post to inform everyone of Construct's new repository, and to ask for feedback on the plans for v2.5... Any thoughts, requests, or comments would be appreciated. Also, if anyone wants to join in (especially with Construct 3's ambitious plans) -- feel free to contact me.