In this document we will describe Propertied Abstract Trees (PAT), as objects which can be used for representing form and contents of abstract trees. We want to use PAT to describe internal representation of program trees in the different phases of the compilation process. The abstract trees are not arbitrary but do contain allot of structure.

Trees are made out of nodes that are hierarchical connected with each other. With each node we thus can associate a (ordered,empty) set of nodes. With each node we will also associate a node type. It is possible to describe the extra structure in the trees we want to consider by putting restrictions on the form of a node that has a certain node type.

We also want to store information in the tree separate from its form. We do this by associating properties with the nodes. The properties with a node are represented as a set of (name,value)-pairs. All the properties together describe the contents of a tree in the same way as the form of the tree is described by the hierarchical ordering of the different nodes of the tree. Of course it is also possible to structure the sets of properties with a node depending on its node type.

In this section we will describe a universal representation for the trees we want to consider. In the following section we will explain we can filter out those trees that has the desired extra structure.

We can describe trees as a root node with a number (possible zero) of subtrees. With the above we come to the conclusion that we can describe a tree as a tree-tuple of a node type, the properties with the node, and a (possible empty) set of subtrees. Given a set of node types, properties and property values we can give a formal description of the universal set of trees:

dfall trees={(N,{(P_{1},V_{1}),..,(P_{m},V_{m})},(t_{1},..,t_{n}))|Ninnode typesand0<=mandfor all1<=i<=m(P_{i }inpropertiesandV_{i }inproperty values)and0<=nandfor all1<=i<=n(t_{i }inall trees)}

We will use one of the simplest ways of restricting the number of trees that we allow. For each node in the tree its node type shall restrict the node types of its sons and define the set of properties. Each node with a node type has a fixed number of sons, and the node type of each sons has to be in a fixed subset of the node types.

We can describe these restrictions by giving tree rules for each node type that specifies the number of son nodes and a so-called class. With class definitions we define which node types are in a certain class.

In the following chapter we will present a way to describe these restrictions.

In this chapter we describe the syntax of the grammar with which Propertied Abstract Trees can be described.

The grammar that specifies the tree rules with the node types and the class definitions with the classes is the following:

element:class;node type.rule:(class def;tree rule),..class def:class,=,{,node typeLIST,}. tree rule:node type,=>,(part name,:,element)LIST. class:identifier.node type:identifier.part name:identifier.

The properties are defined by the following grammar:

property definition:property name,:,type identifier,OF,elementLIST,..property name:identifier.type identifier:INTEGER;REAL;Boolean;CHAR;user defined type identifier.user defined type identifier:identifier.

In the most simple form only classes are allowed in the right hand side of the tree rules, is there no overlap between the node types in the different class definitions and are properties only allowed at node types. In this section we describe a number of extensions on this simple form. Because these extensions are considered to be useful, but can have extensive consequences, we present a mechanism for the user to tell which extensions he wishes. The idea behind this is that it is possible to make separate implementations which do not support some or all of the extensions. Each extensions has implications on the semantic restrictions on the rest of the abstract tree grammar description. We present the grammar of the extensions:

extension:node types in tree rule;class overlap;class properties.

In each of the following sub-sections we will describe each extension.

This extension allows node types to be valid in the right hand side of tree rules, besides the classes. This extension is useful in those cases where a class exists of a single node type, which does happen often at the leaves of the abstract tree.

This extension states that a node type can be in different class definitions, instead of just one. This extension is useful because it can happen that two almost identical constructs can occur in two different classes that one would like to put together for sake of briefness and clarity.

This extension says that instead of properties only being defined at node types, may also defined with classes (implying that they are now defined with all node types in that specific class). This extension is useful because in some cases one wants to define a property that says something of a specific class instead of the separate node types.

Finally we present the complete grammar with which Propertied Abstract Trees can be described. This grammar just consists of the elements described before, together with enumerations of the classes, node types and user defined type identifiers. And the so-called root is defined. The complete grammar is:

input grammar:(EXTENSIONS,extensionLIST,.)OPTION:(CLASSES,classLIST,.)OPTION,NODETYPES,node typeLIST,.,(TYPES,user defined type identifierLIST,.)OPTION,PROPERTIES,property definitionsSEQ,RULES,ruleSEQ,ROOT,element.

In this chapter we give a formal description of all the semantic restrictions that are imposed on the input grammar. We shall do so by extracting certain sets from parts of the input. With these sets we will describe the restriction. Further more we will derive other sets from these sets that describe certain semantic qualities of the given grammar.

The first thing we come along in the input are the extensions:

EXTENSIONSext_{1},...,ext_{n}.

We construct the set of given extensions, and define the Boolean constants that say whether a extension is given or not:

dfextensions={ext_{1},..,ext_{n}}dfnode types in tree rules=node types in tree rulesinextensionsdfclass overlap=class overlapinextensionsdfclass properties=class propertiesinextensions

The following thing we find is the enumeration of the classes and the node types. Given this part of the grammar:

CLASSESC_{1},...,C_{n}.NODETYPESN_{1},...,N_{m}.

We define the sets ` classes` and

dfclasses={C_{1},..,C_{n}}dfnodes={N_{1},..,N_{m}}

Because an identifier can not represent a class and a node type the sets have to be disjoint:

classesdisjointnodes

We also would like to define the set of ` elements` as the union of these two sets:

dfelements=classesunionnodes

The next thing we get in the grammar is the enumeration of the user defined type identifiers. First of all we define the four standard types:

dfstandardtypeidentifiers={INTEGER,REAL,Boolean,CHAR}

The next part of the grammar that gives the enumeration looks like:

TYPEST_{1},...,T_{n}.

The set *user**defined**type*` identifiers` is defined by:

dfuserdefinedtypeidentifiers={T_{1},..,T_{n}}

Because the standard type identifiers may not be redefined, we get the following restriction:

userdefinedtypeidentifiersdisjointstandardtypeidentifiers

Now we can define the set *type*` identifiers` as the union of the standard and user defined type identifiers:

dftypeidentifiers=standardtypeidentifiersunionuserdefinedtypeidentifiers

The following part of the grammar we come along is the definition of the properties, which is given by:

PROPERTYpropdef_{1}.... propdef_{n}.

The set *property*` definitions` is derived from the grammar in the following way:

dfpropertydefinitions={propdef_{1},..,propdef_{n}}

The definition of a property consists of two parts : a type definition which defines the type of the property and a set of elements at which the property is defined with. This set contains node types only when the class properties-extension is not given. This results in the following restrictions on the property definitions:

for all"P:TOFE_{1}.. E_{n}"inattributedefinitions(Tintypeidentifiersand{E_{1},..,E_{n}}subsetelementsandnotclass properties=>{E_{1},..,E_{n}}subsetnodes)

The set ` properties` is defined as the set of all property identifiers that occur in the property definitions:

dfproperties={P|"P:TOFE_{1}.. E_{n}"inpropertydefinitions}

The restrictions that property definition has to be unambiguous is give by:

for allPinproperties(exists!"P:TOFE_{1}.. E_{n}"inpropertydefinitions)

When this restriction is satisfied we can define the function *type*` of` by:

forall"P:TOFE_{1}.. E_{n}"inpropertydefinitionsdftypeof(P)=T

Finally we present the definition of the function *prop*` of` that returns the properties that are defined with an element:

forallEinelementsdfpropof(E)={P|"P:TOF.. E .."inpropertydefinitions}

The greatest part of the grammar description consists of the rules that describe the actual tree grammar. These rules can are divided into the class definitions and the tree rules. In the grammar description the two kinds of rules are intermixed, and the tree rules can be followed by their attribute assignments (which we will neglect for the time being). When the grammar looks like:

RULESrule_{1}.... rule_{n}.

The following set can be defined:

dfrules={rule_{1},..,rule_{n}}dfclassdefinitions={"C={N_{1},..,N_{n}}"inrules}dftreerules={"N=>P_{1}:E_{1},..,P_{n}:E_{n}"inrules}

First the two kinds of rules will be dealt with separately in two successive sections. In the following sections overall consistency restrictions will be described.

Class definitions can only be defined with classes and have to be defined as a set of node types. Which is described by:

for all"C={N_{1},..,N_{n}}"inclassdefinition(Cinclassesand{N_{1},..,N_{n}}subsetnodes)

For each class that has to be exactly one class definition. resulting in the restriction:

for allCinclasses(exists! "C={N_{1},..,N_{n}}"inclass definitions)

When this restriction is fulfilled we can define the function ` class` that returns the set of node types by which the class is defined:

forall"C={N_{1},..,N_{n}}"inclass definitionsdfclass(C)={N_{1},..,N_{n}}

We also define the set of all node types that do occur in a class definition:

dfallnode typesinclass={E|Einclass(C)andCinclasses}

At last we define the function ` parents of` that returns all the classes in which a node type is defined:

forallNinnode typesdfparentsof(N)={Cinclasses|Ninclass(C)}

The extra restriction which occurs when the extension class overlap is not given is represented by:

notclass overlap=>for allNinnode types in classes(parents of(N)={C})

Now we describe the restrictions on the set of tree rules. In each tree rule the part names have to be unique and the parts have to be elements, even classes when the extension node types in tree rules is not active. This is described by:

for all"N=>P_{1}:E_{1},..,P_{n}:E_{n}"intreerules(Einelementsand{E_{1},..,E_{n}}subsetelements)andnotnode types in tree rule=> {E_{1},..,E_{n}}subsetclasses)andfor all1<=i<j<=n(P_{i}!=P_{j}))

For each node type there is precisely one tree rule, which is expressed by:

for allNinnodes(exists! "N=>P_{1}:E_{1},..,P_{n}:E_{n}"intreerules)

When this restriction is fulfilled we (again) can define function, called ` tree`, to bring the information on higher abstraction level:

forallEinnodesdftree(N)=((P_{1},E_{1}),..,(P_{n},E_{n}))where"N=>P_{1}:E_{1},..,P_{n}:E_{n}"intreerulesendwhere

Remark: This function returns a ordered set, so that the information of ordering of the parts is preserved for further uses.

The root element is defined in the following way in the grammar description:

ROOTroot

We define ` root `with:

dfroot=root

The only restriction on the root is that it has to an element:

rootinelements

In this section we will discuss some restrictions with respect to the properties and the class definitions. These restrictions are only necessary with the extension class properties, otherwise the restriction we will describe do not put extra restrictions on the grammar description. When we allow properties at classes we have to define how they are interpreted. We start with the definition of the function ` all prop of`, that returns all the properties assign to an element, with respect to the class definitions:

forallNinnodesdfallpropof(N)=prop of(N)unionunion{prop of(C)|Ninclass(C)}

It is not allowed to define a property with a class and a node type in this class in the grammar description. We call this a parent-child conflict. The extra restriction is described by:

for allCinclasses,Ninclass(C)(prop of(C)disjointpropof(N))

When a node type belongs to more than one class (this is only possible with the class overlap extension), the sets of properties defined with these classes should in the grammar description should be equal. If this is not the case we have a so-called parent-parent conflict. The extra restriction we get is described by:

for allNinelements,C_{1},C_{2}inparentsof(N)(prop of(C_{1})=prop of(C_{2}))

In this final chapter we shall give a function that with the grammar description of the previous chapters returns the set of all possible Properties Abstract Trees with this grammar.

First of all we define the function ` trees from` that returns for each element all the trees that start from that element:

for allCinclasses,Ninnode typesdftrees from(C)={trees from(N)|Ninclass(C)},trees from(N)={(N,{(P_{1},V_{1}),..,(P_{m},V_{m})},(t_{1},..,t_{n}))|{P_{1},..,P_{m}}=all prop of(N)andfor all1<=j<=m(V_{j}indomain of(P_{j}))andfor all1<=j<=n(t_{i}intrees from(E_{i}))where((P_{1},E_{1}),..,(P_{n},E_{n}))=tree(N)endwhere}

In this definition we make use of the function ` domain of` which we consider as given, and representing the value domain of a certain property.

Now we can define the function that returns all trees starting with the root:

dfall trees=trees from(root)

My life as a hacker | My home page