Introducing Plumbum - Shell Combinators
May 12, 2012

It's been a while since I last blogged... sorry! Had a midterm exam, a seminar project to deliver (an O(n^3) parser for Tree Insertion Grammar), the routine family festivities of Passover, and this new thing, Plumbum, that has been keeping my mind overclocked while I should have been studying for exams and writing seminar papers. But now that it's out, it's about time I write a little on it.

So Plumbum is something I've toyed with for quite some time now. In almost any sort of project, there comes a time when you have to write this really simple, 5-liner shell script, just to build your project's artifacts, maybe also upload them to PyPI or sourceforge (using rsync), and while you're at it, why not build the project's documentation as well. Oh, and don't forget to run regression tests before all that, and it would be wise to also handle some basic command-line options, as you might want to skip uploading to PyPI at times, or perhaps build on different versions of python... and then you end up with this monstrous shell script that you wrote (or worse, some former employee wrote) and never want to lay your eyes on again... At this point to begin hating yourself for not doing it in python, to begin with.

But then again, how do you translate find . -name "*.pyc" | xargs rm or cp */*.py /tmp to python-speak? That would take quite a few lines and require importing several modules. Shell scripts tend to be so short and enchanting... How do we bridge the gap?

Plumbum was born to fill this very gap: on the one hand, be pythonic (and be backed by strong libraries), and on the other, make it all as easy and one-liner-ish by nature: use a real programming language, with a well-behaved object model and high level concepts, to achieve what you'd normally do in a shell script, retaining the same expressive power and wrist-handiness of the shell. I call it shell combinators, as it's a pythonic way to mimic shell syntax.

The library is actually a collection of utilities that I wrote for several separate projects, and had never got to polish them. Plumbum consolidates them into a single, production-grade framework. The library provides local and remote program execution, with support for piping and IO redirection; local and remote file-system paths abstraction; a programmatic command-line interface (CLI) toolkit, and numerous other utilities.

For instance,

from plumbum import local
from plumbum.cmd import wc, ls, echo, grep
from plumbum.utils import copy, delete

delete(local.cwd // "*/*.pyc")

for fn in local.cwd / "src" // "*.py":
    print wc("-l", fn)

num_of_src_lines = (ls["-l"] | grep["\\.py"] | wc["-l"])()

(echo["1"] > "/proc/sys/net/ipv4/ip_forward")()

There's a short cheat-sheet as well as extensive documentation on the project's site, but at this point I'd like to elaborate a bit on the CLI toolkit, as I think it deserves some more attention.

The approach of optparse / argparse and similar libraries is to build a parser object and populate it with options in an imperative manner, which I dislike. Plumbum's CLI toolkit offers a more declarative yet programmatic alternative: An application is defined as a class that derives from cli.Application; it may define a main() method, which serves as the "entry point" of the application, and any number of switch-methods, meaning, methods that are invokable from the command-line.

Switch methods are normal methods, decorated by @switch, that may take either no arguments or a single one; for each switch given on the command line, the toolkit will invoke the switch method that binds it. Similarly, the main() method is invoked after all switches have been processed, and it takes all the positional arguments (i.e., non-switch arguments) that were given. And last but not least, there are switch attributes, which are in fact just specialized versions of switch functions, that store an argument given to the switch in an instance attribute.

So I've probably only gotten you confused by the terminology at this point, but actually it's much simpler!

from plumbum import cli

class MyHttpServer(cli.Application):
    log_to_file = cli.SwitchAttr("--log-to-file", str)
    verbose = cli.Flag("-v")
    mode = cli.SwitchAttr("--mode", cli.Set("TCP", "UDP"), default = "TCP")
    port = cli.SwitchAttr("--port", cli.Range(1024, 65535), default = 8080)
    
    @switch(["-l", "--load-config"], cli.ExistingFile)
    def load_config(self, filename):
        """Loads the given config file"""
        f = open(filename, "r")
        self._parse_config(f.read())
    
    def main(self, src, dst):
        if self.log_to_file:
            logger.addHandler(FileHandler(self.log_to_file))
        logger.setLevel(logging.DEBUG if self.verbose else logging.WARNING)


if __name__ == "__main__":
    MyHttpServer.run()

There, I think it's a lovely example of the expressive power of the CLI toolkit and Plumbum in general. I hope you'll give it a try, and may you never have to write shell scripts again!