Justifying the Visitor Pattern for ASTs (in Java anyway)

(I mean, let’s take a second and appreciate how dumb this looks)

I’ve started writing a relational DBMS to better understand how they work. Any good relational DBMS uses SQL, and therefore requires parsing a SQL AST. So I find myself implementing a basic interpreter as well.

I turned to Robert Nystrom’s book Crafting Interpreters to help with this task. Nystrom’s book does a great job of explaining the different parts of an interpreter and creating intuition for the design choices made. However, I ran into trouble once it came time to parse an AST into a query plan for my database execution engine. Nystrom introduces the popular Visitor Pattern to handle parsing AST nodes. As we’ll see, the Visitor Pattern is complex and Nystrom skips over certain leaps in logic to justify the use of the pattern. These omissions allow the book to remain at a high theoretical level at the cost of a proper explanation on why the visitor pattern is optimal for parsing an AST.

If I’m going to introduce the additional complexity into my code, I want a good reason for doing so. So I set out with a simple task: either justify why the Visitor pattern is the best way of solving this problem, or find a better way of solving the problem.

Note: Because this article builds on Nystrom’s book (or any similar textbook), it presupposes a general understanding of Abstract Syntax Trees (ASTs), OOP principles, and the problems that the Visitor pattern is trying to solve. If you don’t have this background, feel free to read through anyway, but certain statements I take to be true may seem unjustified without the context. Alternatively, you can read the section here and then continue.

The problem

The fundamental question is: given an AST, how do we assign behaviors to these classes?

Since this post is based on the Crafting Interpreters book, I’ll just quote from that

We have a family of classes and we need to associate a chunk of behavior with each one. The natural solution in an object-oriented language like Java is to put those behaviors into methods on the classes themselves. We could add an abstract intepret() method on Expr which each subclass whould then implement to interpret itself.

This works alright for tiny projects, but it scales poorly. ¹ Like I noted before, these tree classes span a few domains. At the very least, both the parser and interpreter will mess with them. As you’ll see later, we need to do name resolution on them. If our language was statically typed, we’d have a type checking pass.

If we added instance methods to the expression classes for every one of those operations, that would smush a bunch of different domains together. That violates separation of concerns and leads to hard-to-maintain code.

Nystrom then goes on to define this problem as the expression problem. The way he does so, including drawing distinction between object oriented and functional languages, is simple and brilliant. The book is worth a read for those 1.5 pages alone. However, after he explains why we need to separate along the “function grain”, he jumps straight to the Visitor pattern.

The Visitor pattern involves multiple layers of indirect polymorphism and certain unintuitive verbage. I thought with enough time, the details would fall away and leave a beautiful simple structure. However, even after multiple rereads of the section and reading other articles, I was convinced the visitor pattern was less something you understood so much as you accepted. So what’s going on?

AST Code

Let’s start with a more rigorous definition of our AST example classes. The details aren’t important, but I included them because I personally have trouble believing anything that leaves out code. I’d recommend reading through once to understand the general patterns, then re-reading and following the code in your head to verify everything works.

abstract class Expr {
    static class Binary extends Expr {
        Expr left;
        Token operator;
        Expr right;
    }

    static class Literal extends Expr {
        Token value;
        Type type;
    }

    // Other Expr...
}

Solution 1: Chain of type tests

Nystrom’s first proposal is a class that would create a chain of instanceof checks to determine what subtype of object it is. Like so:

class Printer {
    void printExpr(Expr expr) {
        if (expr instanceof Expr.Binary) {
            
            Expr.Binary bin = (Expr.Binary)expr;
            System.out.println("BinaryExp");
        
        } else if (expr instanceof Expr.Literal) {
            
            Expr.Literal lit = (Expr.Literal)
            System.out.println("Literal");
        
            // Other cases can be easily added...
        } else {
            throw new IllegalArgumentException("Unexpected Expression type");
        }
    }

}

which would accomplish what we want. His main critique is:

But all of those sequential type tests are slow. Expression types whose names are alphabetically later would take longer to execute because they’d fall through more if cases before finding the right type. That’s not my idea of an elegant solution.

Honestly, I don’t mind this approach that much. It’s a little clunky, but with compiler optimizations I can’t imagine it’d be THAT slow. It gets bonus points for simplicity, which I think is underrated. But he does have a point that Java does not allow switching on pure Class types.

Java fact 1: You can’t switch on class types

In Java, a switch statement is usually implemented by the compiler as a hash table or a similar O(1) lookup type data structure. Our conditional chain would instead take O(n) time, where n is the number of subclasses we have. Which raises the question in my mind:

Wait, is there really no way to switch on a class type?

I was suspicious of this statement. On face value, it’s true. Java does not support switch statements on arbitrary classes. But if we’re using instanceof in each conditional, surely Java is retrieving some property and checking it against each class type. Is there really no way to store this property and use an O(1) lookup algorithm instead?

Solution 2a: Map of classnames

Well, we can grab the name of the class object, which is a String, and Java supports String switch statements

class Printer {
    void printExpr(Expr expr) {
        String className = expr.getClass().getSimpleName();

        switch (className) {
            case "Binary":
                Expr.Binary bin = (Expr.Binary)expr;
                print("BinaryExpr");

                break;
            case "Literal":
                Expr.Literal lit = (Expr.Literal)
                print("LiteralExpr");

                break;
            
            // ...
            default:
                System.out.println("Unhandled expression type: " + className);
                break;
        }
    }

}

However, in doing so we leave the orderly world of of the Java class system and enter the shadow realm of string comparisons. Our code will still throw an error if we pass an unexpected type, but otherwise we forfeit the nice guarantees of Java typing. This approach is passable for a small use case, but prior experience with class name strings quickly become an anti-pattern.

Is there a way we can perform the same logic while staying in the bounds of the Java class system?

Solution 2b: Map of Handlers

Well, if Java won’t create a Hash table for our Class statement, let’s try making our own!

We can use a Consumer, which is a Java feature for functional programming. They were originally implemented to easily run functions on collections and iterators. This approach allows us to construct a map of [class -> function]. As Nystrom himself notes in the book,

The visitor pattern is really all about approximating the functional style within an OOP language.

so really we’re just returning to the problem’s functional roots.

public class ExprHandler {
    private final Map<Class<? extends Expr>, Consumer<Expr>> handlerMap;

    public ExprHandler() {
        handlerMap = Map.of(
            BinaryExpr.class, this::handleBinary,
            LiteralExpr.class, this::handleLiteral
            // ...
        );
    }

    public void handle(Expr expr) {
        Consumer<Expr> handler = handlerMap.get(expr.getClass());
        if (handler != null) {
            handler.accept(expr);
        } else {
            throw new IllegalArgumentException("Unknown expression type: " + expr.getClass());
        }
    }

    private void handleBinary(Expr expr) {
        System.out.println("BinaryExpr");
    }

    private void handleLiteral(Expr expr) {
        System.out.println("LiteralExpr");
    }

}

Let’s not get too excited yet. There’s one main drawback on Consumers: they cannot return values. So our example above works fine because System.out.print() is a side-effect, but we would not be able to use this approach to return an actual value from our AST node: say, the token value. Which is a major no-go for us.

As an aside, it’s notable that we’re having to dig into Java aesoterica in order to try and accomplish our goal. Unless you’re a consistent Java user, I’d be surprised if you’ve heard of Consumers before.

Solution 3: The Visitor pattern

The visitor pattern is more understandable when it’s laid out side-by-side, because it involves ping-ponging between the two classes. Here’s a summary of the visitor pattern:

abstract class Expr {                   interface Visitor {
    void accept();                          void visit();
}                                       }

class Binary extends Expr {             class PrintVisitor implements Visitor {
    void accept(Visitor v) {                void visit(Expr.Binary expr) {
        v.visit(this);                          System.out.println("BinaryNode");
    }                                       }
}
                                            void print(Expr expr) {
                                                expr.accept();
                                            }
                                        }


void main() {
    Expr root = new Expr.Binary();
    Printer p = new PrintVisitor();

    p.print(root);
}

and here it is in action

abstract class Expr {                   interface Visitor {
    void accept();                          void visit();
}                                       }

class Binary extends Expr {             class PrintVisitor implements Visitor {
    void accept(Visitor v) { -------------> void visit(Expr.Binary expr) {
        v.visit(this);   ^----------.           System.out.println("BinaryNode");
    }                               |       }
}                                   |
                                    +------ void print(Expr expr) {
                                    ,------^    expr.accept();
                                    |       }
                                    |   }
                                    |
                                    |
void main() {                       |
    Expr root = new Expr.Binary();  |
    Printer p = new PrintVisitor(); |
                                    |
    p.print(root); -----------------+
}

And if we abstract away some of the extra detail, we’re left with something like

 Main            PrintVisitor            Expr.Binary
------          --------------          -------------
  main()   ---->   print()   -------->    accept()
                                              |
                                              |
                   visit() <------------------+
                      |
                      V
                System.out.println()

 Main            PrintVisitor            Expr.Binary
------          --------------          -------------
  main()
   |
   V
  p.print() ----> print()
                     |
                     V
                 expr.accept() ------------> accept()
                                              |
                                              V
                   visit() <------------- v.visit()
                        |
                        V
                System.out.println()

So why can’t we do this?

class Printer {
    void print(Expr.Binary expr) {} // <- We want to call this one
    void print(Expr.Literal expr) {}
}

main() {
    Expr expr = new Expr.Binary();
    Printer printer = new Printer();

    printer.print(expr);
}

which would correspond to the much simpler call diagram of

 Main            PrintVisitor            Expr.Binary
------          --------------          -------------
 main()    ----->   print()
                      |
                      V
            System.out.println()

 Main            PrintVisitor            Expr.Binary
------          --------------          -------------

 p.print(expr) -----> print(Expr.Binary)
                           |
                           V
                   System.out.println()

This is exactly what we want, except Java will throw an error if we try it. To understand why we this won’t work, we need to understand that in Java, method overloading resolution is static.

Java fact 2: Method overloading resolution is static

That means the Java compiler will decide the type of expr in Expr expr = new Expr.Binary() will be Expr and not Expr.Binary. It will then try to find a function with a signature of print(Expr expr) and will throw an error, because if you look at the Printer class above, no such function exists. If we wanted to define such a general function, we’d end up circling back to Solution 1.

Okay, but how does visitor pattern get around this?

Java fact 3: Method overloading resolution is static, but method dispatch is dynamic.

Here’s a simple explanation of the two:

Method overloading resolution determines which function to call based on the argument type, and happens at compiletime

// Method overloading
Expr expr = new Expr.Binary();
print(expr);

void print(Expr expr) {} // <- Will call this one
void print(Expr.Binary expr) {}

Method dispath determines which object should handle the function call, and happens at runtime

// Method dispatch
Expr expr = new Expr.Binary(); 
expr.printSelf()

Expr {
    void printSelf();
}

Expr.Binary {
    void printSelf(); // <- Will call this one
}

Notice in the first example, the function corresponding to the parent class was invoked, but in the second example, the function for the child class was invoked. That’s the key of the visitor pattern: use two method dispatches to basically convert the objects static class to a dynamic class, so we can use method overloading resolution on the dynamic class. Confusing? Yup. But once you understand this aspect of Java, you understand why the visitor pattern is required.

We can see this if we add types to our call diagram above

 Main            PrintVisitor            Expr.Binary
------          --------------          -------------
  main()
   |
   V
  p.print() ----> print(Expr) // Note type is Expr (parent class)
                     |
                     V
               expr.accept(this) --------> accept(Visitor)
                                                |
                                                V
              visit(Expr.Binary) <-------- v.visit(this)  // Note: `this` passes the dynamic type, not static type
                        |                                 // so type is now Expr.Binary
                        V
             System.out.println()

So they key reason we use the visitor pattern is because we prepend two method dispatches to determine the dynamic class path before we do a static method overloading resolution. The term for this is double dispatch.

Which seems like a lot of work just to pass the dynamic class instead of the static class. But at least we know what we’re trading off when we do so.

Conclusion

Here’s a summary of the journey:

Java fact 1: You can’t switch on class types
Java fact 2: Method overloading resolution is static
Java fact 3: Method overloading resolution is static, but method dispatch is dynamic.

Is the Visitor pattern really the best way to parse ASTs? Maybe. Now I understand that to use method overloading with dynamic types in Java, you have to either use 1) a direct solution that hacks around the Java class system or 2) a roundabout solution that fits well within the class system.

Truthfully, in my personal DBMS project, I’m leaning towards the string approach, because it offers the least complexity in the part of my project I’m not focused on. There may be some drawbacks to this approach that I haven’t foreseen, but that’s part of learning! Now I can rest easy now that I know why Nystrom (and others) advocate for the Visitor pattern :)

While the book overall does a great job at justifying its design decisions, there are a few statements that fall into the “trust me” category. Nystrom is almost certainly right, but I still have issues trusting this statement. I’ll need to read up on the theoretical basis of OOP. Perhaps its an inherent part of learning is to want to challenge such statements and learn the hard way. ↩︎