D for the Win August 21, 2014
I’m a convert! I’ve seen the light!
By the way, be sure to read part 2 as well.
You see, Python is nice and all and it excels in so many domains, but it was not crafted for the ever growing demands of the industry. Sure, you can build large-scale projects in Python (and I have built), but you take it out of the lab and into the real world, the price you pay is just too high. Literally. In terms of work per CPU cycle, you can’t do worst.
The C10M problem is a reiteration of the
C10K problem. In short, today’s commodity hardware can handle millions
of packets per second, but in reality you hardly ever reach such numbers. For example,
I worked a short while at a company that used AWS and had tens of twisted
-based Python servers
accepting and logging requests (not doing any actual work). They managed to squeeze ~500 requests/sec
out of this setup (per machine), which escalated in cost rather quickly. Moving to PyPy
(not without trouble) did triple
the numbers or so, but still, the cost simply didn’t scale.
Python, I love you, but you help instill Gate’s law –
“The speed of software halves every 18 months”. In the end, we pay for our CPU cycles and we want to
maximize our profit. It’s not you, Guido, it’s me. I’ve moved on to the C10M world, and for that
I’d need a programming language that’s designed for system programming with a strong and modern
type system (after all, I love duck typing). I need to interface with external systems,
so a C ABI is desirable (no foreign function interface), and meta-programming is a huge plus (so I
won’t need to incorporate cumbersome code-generation in my build system). Not to mention that
mission-critical code can’t allow for the occasional NameError
or NoneType has no member __len__
exceptions. The code must compile.
I’ve looked into rust (nice, but will require a couple of years to mature enough for a large-scale project) and go (Google must be joking if they actually consider it for system programming), but as strange as it may sound, I’ve finally found what I’ve been looking for with D.
Dlang Dlang Über Alles
System programming is a vast ocean of specifics, technicalities and constraints, imposed by your specific needs. Instead of boring you to death with that, I thought it would be much more intriguing to compare D and Python. In other words, I’ll try to show how D speaks fluent Python.
But first things first. In (the probable) case you don’t know much D – imagine it’s what C++ would have
dreamed to be. It offers cleaner syntax, much shorter compilation time, (optional) garbage collection, highly
expressive templates and type inference, Pythonic operator overloading (implemented as rewriting),
object-oriented and functional capabilities (multi-paradigm like Python), intermingles high-level constructs
(like closures) with low-level ones (naked functions in inline assembly) to produces efficient code, has
strong compile-time introspection capabilities and some extra cools features in the domain of code generation:
mixin
– which evaluates an arbitrary string of D code at compile time, and CTFE
– compile-time
function execution. Whoa, that was long.
In general, D follows Python’s duck-typed (or protocol-oriented) spirit. If a type provides the
necessary interface (“protocol”) for an operation, it will just work, but you can also test for
compliance at compile time. For example, ranges are a generalization of generators in Python.
All you need to do in order to be an InputRange
is implement bool empty()
, void popFront()
and auto front()
, and you can use isInputRange!T
to test whether T
adheres the protocol.
By the way, the exclamation point (!
), which we’ll soon get acquainted with, distinguishes compile-time
arguments from runtime ones.
For brevity’s sake, I’m not going to demonstrate all the properties I listed up there. Instead, I’ll show why Python programmers ought to love D.
Case Study #1: Generating HTML
In an old blog post I outlined my vision of HTML templating languages: kill them all. I argued they are all but crippled-down forms of Python with an ugly syntax, so just give me Python and an easy way to programmatically manipulate the DOM.
I’ve later extended the sketch into a library in its own right, named srcgen. You can use it to generate HTML, C-like languages and Python/Cython code. I used it in many of my commercial projects when I needed to generate code.
So here’s an excerpt of how’s it done in srcgen
:
And here’s how it’s done in D:
You can find the source code on github, just keep in mind it’s a sketch I wrote for this blog post, not a feature-complete library.
The funny thing is, Python’s with
and D’s with
are not even remotely related! The Python implementation builds
a stack of context managers, while with
in D merely alters symbol lookup. But lo and behold!
The two versions are practically identical, modulo curly braces. You get the same expressive power in both.
Case Study #2: Construct
But the pinnacle is clearly my D version of Construct.
You see, I’ve been struggling for many years
to create a compiled version of Construct. Generating efficient, static code from declarative constructs
would make the library capable of handling real-world data, like packet sniffing or processing of large files.
In other words, you won’t have to write a toy parser in Construct and then rewrite it (by hand) in C++
.
The issues with my C version of Construct were numerous, but they basically boiled down to the fact I needed a stronger object model to represent strings, dynamic arrays, etc., and adapters. The real power of Construct comes from adapters, which operate at the representational (“DOM”) level of the data, rather on its binary form. That required lambdas, closures and other higher-level concepts that C lacks. I even tried writing a Haskell version, given that Haskell is so high-level and functional, but my colleague and I had given hope after a while.
Last week, it struck me that D could be the perfect candidate: it has all the necessary high-level concepts while being able to generate efficient code with meta-programming. I began fiddling with a D version, which proved extremely promising. So without further ado, I present dconstruct – an initial sketch of the library.
This is the canonical PascalString
declaration in Python:
And here’s how it’s done in D:
Through the use of meta-programming (and assuming inlining and optimizations), that code snippet there actually boils down to something like
Which is as efficient as it gets.
But wait, there’s more! The real beauty here is how we handle the
context. In Python, Construct builds a dictionary
that travels along the parsing/building process, allowing constructs to refer to previously seen objects.
This is possible in D too, of course, but it’s highly inefficient (and not type safe). Instead, dconstruct
uses a trick that’s commonly found in template-enabled languages – creating types on demand:
The strange alias _curr this
is a lovely feature of D known as subtyping
. It basically means that
any property that doesn’t exist at the struct’s scope will we forwarded to _curr
, e.g.,
when I write myCtx.foo
and myCtx
has no member named foo
, the code is rewritten as myCtx._curr.foo
.
As we travel along constructs, we link the current context with its ancestor (_
). This means that for each
combination of constructs, and at each nesting level, we get a uniquely-typed context. At runtime, this context is
nothing more than a pair of pointers, but at compile time it keeps us type-safe. In other words, you can’t reference
a nonexistent field and expect the program to compile.
A more interesting example would thus be
When we unpack MyStruct
(which recursively unpacks YourStruct
), a new context ctx
will be created
with ctx._curr=&ms.child
and ctx._=&ms
. When YourStruct
refers to "_.length"
, the string
is implanted into ctx
, yielding ctx._.length
. If we refered to the wrong path or misspelled anything,
it would simply not compile. That, and you don’t need dictionary lookups at runtime – it’s all resolved
during compilation.
So again, this is a very preliminary version of Construct, miles away from production grade, but you can already see where it’s going.
By the way, you can try out D online at dpaste and even play around with my
demo version of dconstruct
over there.
In Short
Python will always have a special corner in my heart, but as surprising as it may be (for a guy who’s made
his career over Python), this rather unknown, rapidly-evolving language, D, has become my new language of choice.
It’s expressive, concise and powerful, offers short compilation times (as opposed to C++
) and makes
programming both fun and efficient. It’s the language for the C10M age.