Jump to content

Regular tree grammar: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
converted ((math)) to unicode as far as possible, to speed up page rendering
Derivation of trees: removed "we"; boldfaced def.d notions
Line 15: Line 15:


==Derivation of trees==
==Derivation of trees==
The grammar ''G'' implicitly defines a set of trees: any
The grammar ''G'' implicitly defines a set of trees: any tree that can be derived from ''Z'' using the rule set ''P'' is said to be '''described''' by ''G''.
This set of trees is known as the '''[[Formal language|language]]''' of ''G''.
tree that can be derived from ''Z'' using the rule set ''P'' is said to be described by ''G''.
More formally, the relation ⇒<sub>''G''</sub> on the set ''T''<sub>Σ</sub>(''N'') is defined as follows:
This set of trees is known as the [[Formal language|language]] of ''G''.
To express this more formally,
we define the relation ⇒<sub>''G''</sub> on the set ''T''<sub>Σ</sub>(''N'') as follows:


We say that ''t''<sub>1</sub>∈ ''T''<sub>Σ</sub>(''N'') can be derived in a single step into a tree ''t''<sub>2</sub> ∈ ''T''<sub>Σ</sub>(''N'') (in short: ''t''<sub>1</sub> ⇒<sub>''G''</sub> ''t''<sub>2</sub>), if there is a context''S'' and a production (''A'' ''t'') ∈ ''P'' such that:
A tree ''t''<sub>1</sub>∈ ''T''<sub>Σ</sub>(''N'') can be '''derived in a single step''' into a tree ''t''<sub>2</sub> ∈ ''T''<sub>Σ</sub>(''N'')
(in short: ''t''<sub>1</sub> ⇒<sub>''G''</sub> ''t''<sub>2</sub>), if there is a context ''S'' and a production (''A''→''t'') ∈ ''P'' such that:


* ''t''<sub>1</sub> = ''S''[''A''], and
* ''t''<sub>1</sub> = ''S''[''A''], and
* ''t''<sub>2</sub> = ''S''[''t''].
* ''t''<sub>2</sub> = ''S''[''t''].


Here, a ''context'' means a tree with exactly one hole in it; if ''S'' is such a context, we denote by ''S''[''t''] the result of filling the tree ''t'' into the hole of''S''.
Here, a ''context'' means a tree with exactly one hole in it; if ''S'' is such a context, ''S''[''t''] denotes the result of filling the tree ''t'' into the hole of ''S''.


The tree language generated by ''G'' is the language ''L''(''G'') = { ''t'' ∈ ''T''<sub>Σ</sub> | ''Z'' ⇒<sub>''G''*</sub> ''t'' }.
The tree language generated by ''G'' is the language ''L''(''G'') = { ''t'' ∈ ''T''<sub>Σ</sub> | ''Z'' ⇒<sub>''G''*</sub> ''t'' }.
Line 32: Line 31:
Here, ''T''<sub>Σ</sub> denotes the set of all trees composed from symbols of Σ, while ⇒<sub>''G''*</sub> denotes successive applications of ⇒<sub>''G''</sub>.
Here, ''T''<sub>Σ</sub> denotes the set of all trees composed from symbols of Σ, while ⇒<sub>''G''*</sub> denotes successive applications of ⇒<sub>''G''</sub>.


A language generated by some regular tree grammar is called a regular [[tree language]].
A language generated by some regular tree grammar is called a '''regular [[tree language]]'''.


==Examples==
==Examples==

Revision as of 18:49, 17 October 2013

In computer science, a regular tree grammar (RTG)[1] is a formal grammar that describes a set of directed trees.

Definition

A regular tree grammar G is defined by the tuple

G = (N, Σ, Z, P),

where

  • N is a set of nonterminals,
  • Σ is a ranked alphabet (i.e., an alphabet whose symbols have an associated arity) disjoint from N,
  • Z is the starting nonterminal, with ZN, and
  • P is a set of productions of the form At, where AN, and tTΣ(N), where TΣ(N) is the associated term algebra, i.e. the set of all trees composed from symbols in Σ ∪ N according to their arities, where nonterminals are considered nullary.

Derivation of trees

The grammar G implicitly defines a set of trees: any tree that can be derived from Z using the rule set P is said to be described by G. This set of trees is known as the language of G. More formally, the relation ⇒G on the set TΣ(N) is defined as follows:

A tree t1TΣ(N) can be derived in a single step into a tree t2TΣ(N) (in short: t1G t2), if there is a context S and a production (At) ∈ P such that:

  • t1 = S[A], and
  • t2 = S[t].

Here, a context means a tree with exactly one hole in it; if S is such a context, S[t] denotes the result of filling the tree t into the hole of S.

The tree language generated by G is the language L(G) = { tTΣ | ZG* t }.

Here, TΣ denotes the set of all trees composed from symbols of Σ, while ⇒G* denotes successive applications of ⇒G.

A language generated by some regular tree grammar is called a regular tree language.

Examples

Example derivation from G1

Let G1 = (N11,Z1,P1), where

  • N1 = {Bool, BList } is our set of nonterminals,
  • Σ1 = { true, false, nil, cons(.,.) } is our ranked alphabet, arities indicated by dummy arguments (i.e. the symbol cons has arity 2),
  • Z1 = BList is our starting nonterminal, and
  • the set P1 consists of the following productions:
    • Boolfalse
    • Booltrue
    • BListnil
    • BListcons(Bool,BList)

An example derivation of the term cons(false,cons(true,nil)) from the grammar G1 is shown in the image.

The tree language generated by G1 is the set of all finite lists of boolean values, that is, L(G1) happens to equal TΣ1. The grammar G1 corresponds to the algebraic data type declarations

  datatype Bool
    = false
    | true
  datatype BList
    = nil
    | cons of Bool * BList

in the Standard ML programming language: every member of L(G1) corresponds to a Standard-ML value of type BList.

For another example, let G2 = (N11,BList1,P1P2), using the nonterminal set and the alphabet from above, but extending the production set by P2, consisting of the following productions:

  • BList1cons(true,BList)
  • BList1cons(false,BList1)

The language L(G2) is the set of all finite lists of boolean values that contain true at least once. The set L(G2) has no datatype counterpart in Standard ML, nor in any other functional language. It is a proper subset of L(G1). The above example term happens to be in L(G2), too, as the following derivation shows:

BList1cons(false,BList1) ⇒ cons(false,cons(true,BList)) ⇒ cons(false,cons(true,nil)).

Language properties

If L1, L2 both are regular tree languages, then the tree sets L1L2, L1L2, and L1 \ L2 are also regular tree languages, and it is decidible whether L1L2, and whether L1 = L2.

Alternative characterizations and relation to other formal languages

As shown by Rajeev Alur and Parthasarathy Madhusudan[2][3] the class of regular tree languages coincides with nested words and visibly pushdown languages.

The regular tree languages are also[4] the languages recognized by bottom-up tree automata and nondeterministic top-down tree automata.

Regular tree grammars are a generalization of regular word grammars.

See also

  • Tree-adjoining grammar
  • A book devoted to tree grammars: Nivat, Maurice; Podelski, Andreas (1992). Tree Automata and Languages. Studies in Computer Science and Artificial Intelligence. Vol. 10. North-Holland.
  • Regular tree grammars were already described in 1968 by:
    • Brainerd, W.S. (1968). "The Minimalization of Tree Automata". Information and Control. 13: 484–491.
    • Thatcher, J.W.; Wright, J.B. (1968). "Generalized Finite Automata Theory with an Application to a Decision Problem of Second-Order Logic". Mathematical Systems Theory. 2 (1).
  • Applications of regular tree grammars include:
    • Instruction selection in compiler code generation:Emmelmann, Helmut (1991). "Code Selection by Regularly Controlled Term Rewriting". Code Generation - Concepts, Tools, Techniques. Workshops in Computing. Springer. pp. 3–29. {{cite conference}}: Cite has empty unknown parameter: |1= (help); Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
    • A decision procedure for the first-order logic theory of formulas over equality (=) and set membership (∈) as the only predicates: Comon, Hubert (1990). "Equational Formulas in Order-Sorted Algebras". Proc. ICALP. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
    • Solving constraints about mathematical sets: Gilleron, R.; Tison, S.; Tommasi, M. (1993). "Solving Systems of Set Constraints using Tree Automata". 10th Annual Symposium on Theoretical Aspects of Computer Science. LNCS. Vol. 665. Springer. pp. 505–514. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
    • The set of all truths expressible in first-order logic about a finite algebra is always a regular tree language: Burghardt, Jochen (2002). "Axiomatization of Finite Algebras" (PDF). Advances in Artificial Intelligence. LNAI. Vol. 2479. Springer. pp. 222–234. ISBN 3-540-44185-9. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
  • Algorithms on regular tree grammars are discussed from an efficiency-oriented view in: Aiken, A.; Murphy, B. (1991). "Implementing Regular Tree Expressions". ACM Conference on Functional Programming Languages and Computer Architecture. pp. 427–447. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
  • Given a mapping from trees to weights, Knuth's generalization of Dijkstra's shortest-path algorithm can be applied to a regular tree grammar to compute for each nonterminal the minimum weight of a derivable tree: Knuth, D.E. (1977). "A Generalization of Dijkstra's Algorithm". Information Processing Letters. 6 (1): 1–5.. Based on this information, it is straight-forward to enumerate its language in increasing weight order. In particular, any nonterminal with infinite minimum weight produces the empty language.
  • Regular tree automata have been generalized to admit equality tests between sibling nodes in trees: Bogaert, B.; Tison, Sophie (1992). "Equality and Disequality Constraints on Direct Subterms in Tree Automata". Proc. 9th STACS. LNCS. Vol. 577. Springer. pp. 161–172. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
  • Allowing equality tests between deeper nodes leads to undecidibility: Tommasi, M. (1991), Automates d'Arbres avec Tests d'Égalités entre Cousins Germains, LIFL-IT

References

  1. ^ "Regular tree grammars as a formalism for scope underspecification". CiteSeerx10.1.1.164.5484. {{cite journal}}: Cite journal requires |journal= (help)
  2. ^ Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1145/1007352.1007390, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1145/1007352.1007390 instead.
  3. ^ Attention: This template ({{cite doi}}) is deprecated. To cite the publication identified by doi:10.1145/1516512.1516518, please use {{cite journal}} (if it was published in a bona fide academic journal, otherwise {{cite report}} with |doi=10.1145/1516512.1516518 instead.
  4. ^ Comon et al, Tree Automata Techniques and Applications, 1997