Easy Syntax for Representing Trees
March 07, 2012

I’m working on a parser for Tree Adjoining Grammar (TAG) for this seminar I’m taking. TAG is an extension of context-free grammar (CFG) that’s more powerful while still being polynomially-parsable. Anyhow, TAG makes use of “tree production rules” instead of the “linear” production rules of CFG: instead of S -> NP VP, you’d have a small tree, the root of which being S, having NP and VP as its children. Of course these trees can be more than two-level deep, and they go all sorts of operations such as substitution and adjunction, but that’s for the parser.

So I needed a compact and (hopefully) readable way to express such trees in my code. At first I used lots of parenthesis, which was ugly and cumbersome, but then I devised this:

class NonTerminal(object):
    def __init__(self, name):
        self.name = name
    def __sub__(self, children):
        return Tree(self, children)
    def __pos__(self):
        return Foot(self)

class Foot(object):
    def __init__(self, nonterm):
        self.nonterm = nonterm

class Tree(object):
    def __init__(self, root, children):
        self.root = root
        self.children = children

And here’s how you use it:

S = NonTerminal("S")

t1 = S-["e"]
t2 = S-["a", S-["c", +S, "d"], "b"]

# Which represents the following two trees:
#
# t1:                t2:
#      S                  S
#      |                / | \
#      |               /  |  \
#      e              a   S   b
#                       / | \
#                      /  |  \
#                     c   S*  d
#

The peculiar +S is a way to mark that a leaf node is a foot, which is part of the semantics TAG (that’s where adjunction takes place). It’s represented in the diagram by the more conventional S*, but I had to resort to a unary operator in the code. Anyway, I’m not sure if it’s a recipe or just a nice trick, but I thought I’d share this.

By the way, you can use both __sub__ and __neg__ to achieve things like S-----X (i.e., more than a single - sign, to allow for better padding), but I tried to avoid too much ASCII art. I’d love to hear about other such ideas!

Easy Syntax for Representing Trees March 07, 2012

Easy Syntax for Representing Trees
March 07, 2012