Consider a programming environment that represents programs as abstract syntax trees. It will need to perform various interpretations on the abstract syntax tree, such as type-checking, code generation, and pretty-printing. Figure 1 depicts two sample transformations.
The result of a mapping (dashed arrows in figure 1) depends on both the interpretation (e.g., compilation) and the concrete node type (e.g., assign) involved. One could put all of the interpretations (type-check, pretty-print, etc.) into the node interface and rely on dynamic binding. However, this is often not a good idea:
The last two arguments of the above list apply especially to data structures other than abstract syntax trees. Consider a data structure that represents the logical structure of a building. It is probably only well after designing the interface to that structure that one wishes to perform some interpretation, such as computing the total rent income. In this context, it is useful to differentiate between intrinsic properties (nodes have descendants) and extrinsic properties (pretty-print). There is no end to extrinsic properties, and it does not make sense to lump all of them into one interface.
Now, if we provide interpretations as external features, we face a problem with an implementation language that provides single-dispatch only². As already mentioned, the code to be executed for each node when we traverse an abstract syntax tree depends on two variabilities:
exec-code(node-type, interpretation)
Note that we already rejected node-type.interpretation with the argumentation above. The reverse, interpretation.node-type, does not make sense, since, unlike the interpretation type, the node type always changes during tree traversal; that is, dispatch isn't required for the receiver but for the argument.
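As an illustration only, the dependence on both variabilities can be sketched in Python as a lookup keyed by the pair (node type, interpretation). The node classes `Assign` and `IfThen`, the interpretation names, and the table `EXEC_CODE` are hypothetical names chosen for this sketch; it is not the mechanism adopted in this paper, merely a picture of what has to be selected.

```python
class Assign:
    """Hypothetical node type for an assignment."""


class IfThen:
    """Hypothetical node type for a conditional."""


# Each (node type, interpretation) pair selects its own piece of code.
EXEC_CODE = {
    (Assign, "pretty-print"): lambda node: "pretty-print assign",
    (Assign, "compile"): lambda node: "compile assign",
    (IfThen, "pretty-print"): lambda node: "pretty-print if-then",
    (IfThen, "compile"): lambda node: "compile if-then",
}


def exec_code(node, interpretation):
    """Select the code to run from both the node's type and the interpretation."""
    return EXEC_CODE[(type(node), interpretation)](node)


# During a traversal the interpretation stays fixed while the node type varies.
results = [exec_code(node, "compile") for node in (Assign(), IfThen())]
```

With the interpretation held at "compile", the selection inside the loop is driven entirely by the type of the argument node, which is the case single-dispatch on the receiver does not cover.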
What we need is double-dispatch on both node-type and interpretation. Fortunately, there are ways to emulate double-dispatch, and its generalization multi-dispatch, in a single-dispatch language. We opt for a solution that can be characterized as external polymorphism (see section 2.10 for Visitor-type double-dispatch). Unlike Cleeland et al., however, we do not use a combination of C++ templates, Adapter, and Decorator [2]. We simply use generic functions [6].
When a generic function object is applied to a node, it determines the node's type, creates the corresponding specialized function object, and returns the result of applying the specialized function object to the node.
Figure 2 depicts how concrete element types (IF_THEN) induce the creation of their corresponding specialized functions. A specialized function knows the exact type of its argument and, therefore, can appropriately exploit the argument's full interface.
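The application of a generic function described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation behind figure 2: the class `GenericFunction`, its `specialize` decorator, the node classes `Var`, `Assign`, and `IfThen`, and the output format are all hypothetical names introduced for illustration.

```python
class Var:
    def __init__(self, name):
        self.name = name


class Assign:
    def __init__(self, target, value):
        self.target = target  # name of the assigned variable (a string)
        self.value = value    # expression node


class IfThen:
    def __init__(self, condition, body):
        self.condition = condition
        self.body = body


class GenericFunction:
    """Hypothetical generic function: maps node types to specialized function classes."""

    def __init__(self):
        self._specializations = {}

    def specialize(self, node_type):
        """Class decorator registering a specialized function for node_type."""
        def register(function_class):
            self._specializations[node_type] = function_class
            return function_class
        return register

    def __call__(self, node):
        function_class = self._specializations[type(node)]  # determine the node's type
        specialized = function_class(self)                  # create the specialized function object
        return specialized(node)                            # apply it to the node


pretty_print = GenericFunction()


class Specialized:
    """Base for specialized functions; keeps the generic function for recursion."""

    def __init__(self, generic):
        self.generic = generic


@pretty_print.specialize(Var)
class PrintVar(Specialized):
    def __call__(self, node):
        return node.name


@pretty_print.specialize(Assign)
class PrintAssign(Specialized):
    def __call__(self, node):
        # The exact argument type is known here, so Assign's full interface is usable.
        return f"{node.target} := {self.generic(node.value)}"


@pretty_print.specialize(IfThen)
class PrintIfThen(Specialized):
    def __call__(self, node):
        return f"if {self.generic(node.condition)} then {self.generic(node.body)} end"


text = pretty_print(IfThen(Var("flag"), Assign("x", Var("y"))))
```

Each specialized function recurses through the generic function rather than calling another specialized function directly, so the type of every child node is determined anew at each step of the traversal.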
Note that it is not only natural to use generic functions to achieve double-dispatch, but also very natural to employ functions for translations. Denotational semantics, an approach to formally defining the semantics of a programming language, is entirely based on semantic functions, i.e., functions that transform phrases into denotations [13].
Figure 3 shows the structure diagram that corresponds to the domains used in figure 1. Only relationships relevant to Translator have been included. For instance, language nodes like TY_IF will typically have an aggregation relation with TOY_LANG. Exclamation marks denote places of possible extension (see section 2.7 Extensibility).
Class LANGUAGE in figure 3 is not required in general (see figure 4). Also, it is not required that TY_IF, TY_ASS, etc. have a common ancestor (like TOY_LANG). Hence, one can define semantics on heterogeneous collections where element types may come from different libraries.
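The point about heterogeneous collections can be illustrated with Python's standard `functools.singledispatch`, which provides a generic function that selects an implementation by the type of its first argument. This is a sketch under assumptions, borrowing the building example from above: the classes `Apartment` and `ParkingLot` and the function `rent_income` are hypothetical, and the two classes deliberately share no common ancestor (other than Python's implicit `object`), as if they came from different libraries.

```python
from functools import singledispatch


class Apartment:
    """Hypothetical element type, imagined as coming from one library."""

    def __init__(self, monthly_rent):
        self.monthly_rent = monthly_rent


class ParkingLot:
    """Hypothetical element type, imagined as coming from another library."""

    def __init__(self, spaces, fee_per_space):
        self.spaces = spaces
        self.fee_per_space = fee_per_space


@singledispatch
def rent_income(element):
    """Extrinsic interpretation defined outside the element classes."""
    raise TypeError(f"no rent_income defined for {type(element).__name__}")


@rent_income.register(Apartment)
def _(element):
    return element.monthly_rent


@rent_income.register(ParkingLot)
def _(element):
    return element.spaces * element.fee_per_space


# A heterogeneous collection: the element types are unrelated to each other.
building = [Apartment(900), ParkingLot(10, 50), Apartment(1100)]
total = sum(rent_income(element) for element in building)
```

Neither element class had to be modified or given a shared base class to take part in the interpretation; adding a further element type only requires registering another specialization.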