Javaism, Exceptions, and Logging: Part 1
July 03, 2012

These days I’m refactoring a large Python codebase at my workplace, and I wanted to share some of my insights on three aspects of large-scale projects: exceptions, logging, and a bit of coding style. Due to its length, I decided to split the post into three installments. This first part covers Javaism and an introduction to the issues of working with exceptions. Part 2 suggests “best practices” concerning exceptions in Python, and part 3 will cover logging (when, how, and how much).

Javaism

From my long experience in the programming world, I get the feeling that many programmers (even those fluent in Pythonspeak) come from a rich Java/.NET background, where they’ve acquired their programming skills and mind-set. Now converted-Pythonists, they are still the “speakers” of a second language, and they can’t deny their mother tongue. In the context of this post, I’ll refer to this as Javaism, or thinking Java in Python. Of course it might as well be C# or C++, but Java is the umbrella term.

You don’t have to go far to see examples of it, for Javaism didn’t skip Python’s standard library: modules/packages such as logging, unittest and threading were ported almost isomorphically from Java. On the surface, you might encounter camelCase names (getLogger), but the verbosity and over-complicated nature of Java and its inheritance methodology can be seen everywhere. For instance, recall the complexity of setting up a logger (I have to look it up every time), or the threading.Thread class… I really don’t wish to digress here, but I feel that a concrete example would make this point clear:
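As one possible illustration (a sketch only; SquareThread and square are names made up for this example), compare the Java-style way of running code in a thread, subclassing threading.Thread and overriding run(), with simply handing a function to the constructor:

```python
import threading

results = []


# Java-style: subclass Thread and override run(), much like implementing Runnable.
class SquareThread(threading.Thread):
    def __init__(self, value):
        super().__init__()
        self.value = value

    def run(self):
        results.append(("subclass", self.value ** 2))


# Pythonic: functions are first-class objects, so just pass one as the target.
def square(value):
    results.append(("function", value ** 2))


t1 = SquareThread(7)
t2 = threading.Thread(target=square, args=(7,))
for t in (t1, t2):
    t.start()
for t in (t1, t2):
    t.join()
```

Both threads do the same work. The subclassing route makes sense in a language where you cannot pass a function around, which was the case for Java at the time; Python never had that limitation, yet the API carries it along.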

It’s funny how the limitations of Java were ported to Python as well, and made the implementation ugly – it’s a classic case of “I don’t want to think, let’s just copy an existing solution”. Luckily, it seems to be a thing of the past – these modules entered stdlib in the old days, before the community had a clear notion of what being pythonic meant. But still, Javaism of all degrees is widespread, especially in corporate-developed large-scale projects (Zope and Twisted, to name two, but naturally closed-source corporate-internal projects are even more susceptible).

Exceptions

Java had a good insight (that they most likely stole from some other language) in that exceptions are part of a function’s signature. Just like a function takes an argument of type T1 and returns a result of type T2, it also has “side channels” through which it can produce results, known as exceptions, which should be documented and checked as well.
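Python has no throws clause, so the closest equivalent is documentation. Here is a minimal sketch of treating exceptions as part of a signature, using a hypothetical parse_port function invented for this example:

```python
def parse_port(text):
    """Convert text to a TCP port number.

    Args:
        text: a string such as "8080".
    Returns:
        The port as an int in the range 1..65535.
    Raises:
        ValueError: if text is not an integer or is out of range.
    """
    port = int(text)  # int() itself raises ValueError on non-numeric strings
    if not 1 <= port <= 65535:
        raise ValueError("port out of range: %d" % port)
    return port
```

Nothing enforces the docstring, of course: parse_port(None) raises TypeError, which the documented “signature” never mentions.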

The problem is, trying to foresee everything that might ever go wrong is futile, and even Java itself makes two exceptions (no pun intended) that the compiler does not check: Error, for unrecoverable exceptions (such as VirtualMachineError), and RuntimeException, for exceptions that may occur almost anywhere (such as NullPointerException).

The fundamental idea is correct, but it was bound to fail: first, because people are lazy, but most importantly, because trying to predict all unexpected, future edge cases is absurd. For instance, suppose you’re implementing an interface that stores data (say, in files), so you might find yourself implementing a signature such as void write(byte[] data) throws IOException. Now suppose your implementation uses a third-party database engine that throws MySQLException. For obvious reasons, MySQLException does not derive from IOException, and there’s nothing you can do about it, as both the interface and the DB engine are given to you. You’re now faced with three options:

  1. Translate MySQLExceptions into IOExceptions
  2. When designing interfaces, always declare the most general exception in throws clauses
  3. When implementing libraries, always derive your exceptions from an unchecked exception (RuntimeException)

In short – you need to find a workaround and bypass the compiler’s checking. This essentially means that throws should have served documentation purposes only, where the compiler might produce (suppressible) warnings should you not follow conventions. It’s more of a semantic property, like idempotence or thread-safety… you may state it in your Javadoc, but you wouldn’t expect the compiler to enforce that (not in a language like Java, anyway).
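To make the first option concrete, here is roughly what exception translation looks like when carried over to Python. This is only a sketch under stated assumptions: DatabaseError, FakeEngine and DatabaseStorage are hypothetical stand-ins for the third-party engine and the storage interface, and the raise … from syntax requires Python 3.

```python
class DatabaseError(Exception):
    """Stand-in for a third-party engine's exception (hypothetical)."""


class FakeEngine:
    """Stand-in for a third-party DB engine (hypothetical); it always fails."""

    def insert(self, data):
        raise DatabaseError("connection lost")


class DatabaseStorage:
    """Implements an interface whose write() is documented to raise IOError."""

    def __init__(self, engine):
        self.engine = engine

    def write(self, data):
        try:
            self.engine.insert(data)
        except DatabaseError as exc:
            # Option 1: translate into the exception the interface promises.
            # "from exc" keeps the original exception as __cause__.
            raise IOError("write failed: %s" % exc) from exc


try:
    DatabaseStorage(FakeEngine()).write(b"payload")
except IOError as err:
    caught = err
```

The caller only ever sees IOError; the original DatabaseError survives as caught.__cause__, for whoever thinks to look there.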

I’d guess most people agree that the second and third options are “inherently bad”, but opinions diverge on the first one. I will try to show that exception-wrapping (translating exceptions) is just as bad – at least when it comes to Python. We’ll cover this in part 2.