Jump to content

Regular tree grammar

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Jochen Burghardt (talk | contribs) at 22:58, 10 August 2013 (See also: typo). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In computer science, a regular tree grammar (RTG)[1] is a formal grammar that describes a set of directed trees.

Definition

A regular tree grammar is defined by the tuple

,

where

  • is a set of nonterminals,
  • is a ranked alphabet (i.e., an alphabet whose symbols have an associated arity) disjoint from ,
  • is the starting nonterminal, with , and
  • is a set of productions of the form , where , and , where is the associated term algebra, i.e. the set of all trees composed from symbols in according to their arities, where nonterminals are considered nullary.

Derivation of trees

The grammar implicitly defines a set of trees: any tree that can be derived from using the rule set is said to be described by . This set of trees is known as the language of . To express this more formally, we define the relation on the set as follows:

We say that can be derived in a single step into a tree (in short: ), if there is a context and a production such that:

  • , and
  • .

Here, a context means a tree with exactly one hole in it; if is such a context, we denote by the result of filling the tree into the hole of .

The tree language generated by is the language .

Here, denotes the set of all trees composed from symbols of , while denotes successive applications of .

A language generated by some regular tree grammar is called a regular tree language.

Examples

Example derivation from G1

Let , where

  • is our set of nonterminals,
  • is our ranked alphabet, arities indicated by dummy arguments (i.e. the symbol has arity 2),
  • is our starting nonterminal, and
  • the set consists of the following productions:

An example derivation of the term from the grammar is shown in the image.

The tree language generated by is the set of all finite lists of boolean values, that is, happens to equal . The grammar corresponds to the algebraic data type declarations

  datatype Bool
    = false
    | true
  datatype BList
    = nil
    | cons of Bool * BList

in the Standard ML programming language: every member of corresponds to a Standard-ML value of type BList.

For another example, let , using the nonterminal set and the alphabet from above, but extending the production set by , consisting of the following productions:

The language is the set of all finite lists of boolean values that contain at least once. The set has no datatype counterpart in Standard ML, nor in any other functional language. It is a proper subset of . The above example term happens to be in , too, as the following derivation shows:

Language properties

If both are regular tree languages, then the tree sets , , and are also regular tree languages, and it is decidible whether , and whether .

Alternative characterizations and relation to other formal languages

As shown by Rajeev Alur and Parthasarathy Madhusudan[2][3] the class of regular tree languages coincides with nested words and visibly pushdown languages.

The regular tree languages are also[4] the languages recognized by bottom-up tree automata and nondeterministic top-down tree automata.

Regular tree grammars are a generalization of regular word grammars.

See also

  • Tree-adjoining grammar
  • A book devoted to tree grammars: Nivat, Maurice; Podelski, Andreas (1992). Tree Automata and Languages. Studies in Computer Science and Artificial Intelligence. Vol. 10. North-Holland.
  • Regular tree grammars were already described in 1968 by:
    • Brainerd, W.S. (1968). "The Minimalization of Tree Automata". Information and Control. 13: 484–491.
    • Thatcher, J.W.; Wright, J.B. (1968). "Generalized Finite Automata Theory with an Application to a Decision Problem of Second-Order Logic". Mathematical Systems Theory. 2 (1).
  • Applications of regular tree grammars include:
    • Instruction selection in compiler code generation:Emmelmann, Helmut (1991). "Code Selection by Regularly Controlled Term Rewriting". Code Generation - Concepts, Tools, Techniques. Workshops in Computing. Springer. pp. 3–29. {{cite conference}}: Cite has empty unknown parameter: |1= (help); Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
    • A decision procedure for the first-order logic theory of formulas over equality () and set membership () as the only predicates: Comon, Hubert (1990). "Equational Formulas in Order-Sorted Algebras". Proc. ICALP. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
    • Solving constraints about mathematical sets: Gilleron, R.; Tison, S.; Tommasi, M. (1993). "Solving Systems of Set Constraints using Tree Automata". 10th Annual Symposium on Theoretical Aspects of Computer Science. LNCS. Vol. 665. Springer. pp. 505–514. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
    • The set of all truths expressible in first-order logic about a finite algebra is always a regular tree language: Burghardt, Jochen (2002). "Axiomatization of Finite Algebras" (PDF). Advances in Artificial Intelligence. LNAI. Vol. 2479. Springer. pp. 222–234. ISBN 3-540-44185-9. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
  • Algorithms on regular tree grammars are discussed from an efficiency-oriented view in: Aiken, A.; Murphy, B. (1991). "Implementing Regular Tree Expressions". ACM Conference on Functional Programming Languages and Computer Architecture. pp. 427–447. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
  • Given a mapping from trees to weights, Knuth's generalization of Dijkstra's shortest-path algorithm can be applied to a regular tree grammar to compute for each nonterminal the minimum weight of a derivable tree: Knuth, D.E. (1977). "A Generalization of Dijkstra's Algorithm". Information Processing Letters. 6 (1): 1–5.. Based on this information, it is straight-forward to enumerate its language in increasing weight order. In particular, any nonterminal with infinite minimum weight produces the empty language.
  • Regular tree automata have been generalized to admit equality tests between sibling nodes in trees: Bogaert, B.; Tison, Sophie (1992). "Equality and Disequality Constraints on Direct Subterms in Tree Automata". Proc. 9th STACS. LNCS. Vol. 577. Springer. pp. 161–172. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
  • Allowing equality tests between deeper nodes leads to undecidibility: Tommasi, M. (1991), Automates d'Arbres avec Tests d'Égalités entre Cousins Germains, LIFL-IT

References

  1. ^ "Regular tree grammars as a formalism for scope underspecification". CiteSeerx10.1.1.164.5484. {{cite journal}}: Cite journal requires |journal= (help)
  2. ^ Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1145/1007352.1007390, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1145/1007352.1007390 instead.
  3. ^ Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1145/1516512.1516518, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1145/1516512.1516518 instead.
  4. ^ Comon et al, Tree Automata Techniques and Applications, 1997