Visitor (add new operations over a tree without editing nodes)

When to use

  • You have a tree/AST (e.g., SQL filter) and need multiple operations on it (render SQL, validate, collect columns).
  • You want to add new operations easily while keeping node classes stable.
  • Useful for rule engines, SQL/JSON expression trees, lineage walkers.

Avoid when your structure is tiny—match/if isinstance or singledispatch may be simpler.

Diagram (text)

Expr (node protocol) ── accept(visitor)
  ├─ Col(name)
  ├─ Val(value)
  ├─ Eq(left, right)
  └─ And(items...)

Visitor
  └─ methods: col(), val(), eq(), conj()   ← new visitors = new operations

Python example (≤40 lines, type-hinted)

Render a predicate to parameterized SQL and collect params.

from __future__ import annotations
from dataclasses import dataclass
from typing import Protocol, TypeVar, Generic, List, Iterable, Any

T = TypeVar("T")

class Expr(Protocol):
    def accept(self, v: "Visitor[T]") -> T: ...

class Visitor(Protocol, Generic[T]):
    def col(self, n: "Col") -> T: ...
    def val(self, n: "Val") -> T: ...
    def eq(self, n: "Eq") -> T: ...
    def conj(self, n: "And") -> T: ...

@dataclass(frozen=True)
class Col(Expr): name: str
    def accept(self, v: Visitor[T]) -> T: return v.col(self)

@dataclass(frozen=True)
class Val(Expr): value: Any
    def accept(self, v: Visitor[T]) -> T: return v.val(self)

@dataclass(frozen=True)
class Eq(Expr): left: Expr; right: Expr
    def accept(self, v: Visitor[T]) -> T: return v.eq(self)

@dataclass(frozen=True)
class And(Expr): items: List[Expr]
    def accept(self, v: Visitor[T]) -> T: return v.conj(self)

@dataclass
class Render(Visitor[str]):
    params: List[Any]
    def col(self, n: Col) -> str: return n.name
    def val(self, n: Val) -> str: self.params.append(n.value); return "?"
    def eq(self, n: Eq) -> str: return f"({n.left.accept(self)} = {n.right.accept(self)})"
    def conj(self, n: And) -> str: return " AND ".join(x.accept(self) for x in n.items)

Usage

expr = And([Eq(Col("tenant_id"), Val(42)), Eq(Col("event_time"), Val("2025-11-06"))])
params: List[Any] = []
sql = expr.accept(Render(params))
# sql: "(tenant_id = ?) AND (event_time = ?)"
# params: [42, "2025-11-06"]

Tiny pytest (cements it)

def test_visitor_renders_sql_and_params_order():
    expr = And([Eq(Col("a"), Val(1)), Eq(Col("b"), Val(2))])
    params: list = []
    sql = expr.accept(Render(params))
    assert sql == "(a = ?) AND (b = ?)" and params == [1, 2]

Trade-offs & pitfalls

  • Pros: Easy to add new operations (new Visitor classes); keeps nodes stable; clean separation of concerns.
  • Cons: Adding a new node type forces changes in all visitors (double-dispatch cost).
  • Pitfalls:
    • Visitor methods drifting out of sync with nodes—keep names aligned (col/val/eq/conj).
    • Leaking side effects inside visitors—prefer pure visitors that return values (like Render).
    • Deep recursion—very large trees may need iterative visitors.

Pythonic alternatives

  • functools.singledispatch / singledispatchmethod to dispatch on node type without accept.
  • Structural pattern matching (match node:) for small sets of nodes in Python 3.10+.
  • Dataclass functions: keep nodes dumb, implement separate pure functions per operation.
  • For SQL, consider SQLAlchemy Core or a purpose-built AST instead of hand-rolling.

Mini exercise

Add an Or(items: list[Expr]) node. Update Visitor with disj(self, n: Or), extend Render to join with " OR ", and write a test for: (a = ? OR b = ?) AND (c = ?), params [1, 2, 3].

Checks (quick checklist)

  • Nodes expose accept(visitor); visitors implement one method per node type.
  • New operation = new Visitor class; nodes unchanged.
  • Visitors are pure (ideally) and tested.
  • If you add a node, update all visitors (or provide safe defaults).
  • Consider singledispatch/match when the tree is small.