Tree contraction

Revision as of 01:36, 12 December 2015

In computer science, parallel tree contraction is a broadly applicable technique for solving a large number of tree problems in parallel, and it is used as a design technique for many parallel graph algorithms. Parallel tree contraction was introduced by Gary L. Miller and John H. Reif ^[1], and has subsequently been modified to improve efficiency by X. He and Y. Yesha ^[2], Hillel Gazit, Gary L. Miller and Shang-Hua Teng ^[3], and many others.

Tree contraction has been used in designing many efficient parallel algorithms, including expression evaluation, finding lowest common ancestors, tree isomorphism, graph isomorphism, maximal subtree isomorphism, common subexpression elimination, computing the 3-connected components of a graph, and finding an explicit planar embedding of a planar graph.

Building on this body of work, various algorithms have been proposed to improve the efficiency or simplicity of parallel tree contraction. This article focuses on one particular solution, a variant of the algorithm by Miller and Reif, and on its applications.

Introduction

Over the past several decades there has been significant research on deriving new parallel algorithms for a variety of problems, with the goal of designing highly parallel (polylogarithmic depth), work-efficient (total work linear in the sequential running time) algorithms ^[4]. For some problems, a tree turns out to be a convenient representation: we can sometimes expose more parallelism simply by representing the problem as a tree.

In the generic definition of a tree, there is a root vertex with several child vertices attached to it. The child vertices may have children themselves, and so on. Eventually every path ends at a leaf, which is a vertex with no children. Two special cases of this generic tree are of particular interest: (1) the balanced binary tree and (2) the linked list. In a balanced binary tree, every vertex except the leaves has exactly two children, which gives an O(log n) bound on the depth of the tree. A linked list is a tree in which every vertex has only one child; it can also be processed in O(log n) depth by using symmetry breaking.

For a general tree, we would like to keep the O(log n) bound regardless of whether the tree is unbalanced, list-like, or a mix of both. One way to address this is to apply the prefix sum algorithm together with the Euler tour technique. The Euler tour technique represents a tree as a flat sequence, so prefix sum can be applied to an arbitrary tree in this format. In fact, prefix sum can be used with any set of values and binary operation that form a group: the binary operation must be associative, every value must have an inverse, and there must be an identity value.
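As an illustration of the flat representation, here is a minimal sketch that computes node depths from an Euler tour and a prefix sum over +1/-1 entries. The function name `euler_tour_depths` and the dictionary-of-child-lists input are assumptions made for this example, and the tour is built sequentially here purely for readability; a parallel version would construct it differently.

```python
from itertools import accumulate


def euler_tour_depths(children, root):
    """Return the depth of every node via an Euler tour plus a prefix sum."""
    tour = []                      # (node, +1) when entering, (node, -1) when leaving
    stack = [(root, False)]
    while stack:
        node, leaving = stack.pop()
        if leaving:
            tour.append((node, -1))
            continue
        tour.append((node, +1))
        stack.append((node, True))
        for child in reversed(children.get(node, [])):
            stack.append((child, False))

    # Prefix sum over the flat +1/-1 sequence; integers under + form a group.
    prefix = list(accumulate(delta for _, delta in tour))

    depth = {}
    for (node, delta), total in zip(tour, prefix):
        if delta == +1:
            depth[node] = total - 1   # the root has depth 0
    return depth


tree = {"a": ["b", "c"], "b": ["d"], "d": ["e"]}
depths = euler_tour_depths(tree, "a")
# depths == {"a": 0, "b": 1, "d": 2, "e": 3, "c": 1}
```

The -1 entries are what make this work: leaving a subtree undoes the +1 of entering it, which is exactly the inverse that the group requirement asks for.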

With a bit of thought, we can find cases where prefix sum is inapplicable or inefficient. Consider multiplication when the set of values includes 0, which has no multiplicative inverse. Likewise, the commonly desired operations max() and min() do not have inverses. The goal is an algorithm that works on all trees in expected O(n) work and O(log n) depth. The following sections present a Rake/Compress algorithm that fulfills this goal.
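A small sketch of why the missing inverse matters, under the assumption that a subtree occupies a contiguous range of positions in the flattened tree. The values and the range `lo..hi` are made up for this example.

```python
from itertools import accumulate

values = [3, 9, 2, 5, 4]   # node values in flattened order (illustrative)
lo, hi = 2, 4              # positions covered by one subtree, inclusive

# With +, the prefix before the range can be "subtracted away".
prefix_sum = list(accumulate(values))              # [3, 12, 14, 19, 23]
range_sum = prefix_sum[hi] - prefix_sum[lo - 1]    # 23 - 12 = 11

# With max there is nothing to subtract: the large value 9 seen before the
# range dominates every later prefix, so the range maximum is not recoverable.
prefix_max = list(accumulate(values, max))         # [3, 9, 9, 9, 9]
true_range_max = max(values[lo:hi + 1])            # 5
```

Here `prefix_max[lo - 1]` and `prefix_max[hi]` are both 9, while the maximum inside the range is 5, so no combination of the two prefix values can yield the answer.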

Definitions

Before going into the algorithm itself, we first define a few terms that will be used later.

Rake – The rake is an operation which contracts the children of a node into the node, using the binary operation to combine them. It applies when the node has more than one child.
Compress – A node is considered a chain node if it has only one child. Compress shortens a chain by using the random-mate technique ^[5] drawn from symmetry-breaking algorithms ^[6] on linked lists.
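The two operations can be sketched on a tree whose nodes carry values. This is only an illustration under stated assumptions: the helper names `rake` and `splice_out` are hypothetical, the tree is a dictionary of child lists, the operation `op` is assumed to be associative and commutative (max is used below), and `splice_out` removes one chosen chain node deterministically instead of using random mate.

```python
def rake(children, value, root, op):
    """Fold each non-root leaf's value into its parent, then delete the leaf."""
    leaves = {v for v in value if v != root and not children.get(v)}
    new_value = {v: x for v, x in value.items() if v not in leaves}
    new_children = {}
    for v in new_value:
        new_children[v] = [c for c in children.get(v, []) if c not in leaves]
        for c in children.get(v, []):
            if c in leaves:
                new_value[v] = op(new_value[v], value[c])
    return new_children, new_value


def splice_out(children, value, v, op):
    """Compress one chain node v: fold its value into its parent and
    reconnect v's only child directly to that parent."""
    (child,) = children[v]                       # v must have exactly one child
    parent = next(p for p, cs in children.items() if v in cs)
    new_children = {u: list(cs) for u, cs in children.items() if u != v}
    new_children[parent] = [child if c == v else c for c in children[parent]]
    new_value = {u: x for u, x in value.items() if u != v}
    new_value[parent] = op(value[parent], value[v])
    return new_children, new_value


children = {"r": ["a"], "a": ["b"], "b": ["c", "d"], "c": [], "d": []}
value = {"r": 1, "a": 7, "b": 4, "c": 9, "d": 2}

children, value = splice_out(children, value, "a", max)   # r -> b, r holds 7
children, value = rake(children, value, "r", max)          # c, d folded into b
children, value = rake(children, value, "r", max)          # b folded into r
# value == {"r": 9}, the maximum over the whole tree
```

Folding a spliced node into its parent is a simplification that suits commutative operations such as max; other applications may need to carry more information along a compressed edge.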

Analysis

For the moment, let us assume that every node has at most two children, i.e. the tree is binary. The bounds hold more generally as long as the degree is bounded, but we analyze the binary case for simplicity. Of the two “degenerate” cases listed above, rake is the best tool for dealing with balanced binary trees, and compress is the best for linked lists. Arbitrary trees, however, require a combination of these operations. For this combination, we claim the following theorem.

Theorem: After O(log n) expected rake and compress steps, a tree is reduced to a single node.

The tree contraction algorithm can now be restated as follows:

Input: A binary tree rooted at r
Output: A single node
Operation: A sequence of contraction steps, each consisting of a rake operation and a compress operation (in any order). The rake operation removes all the leaf nodes in parallel. The compress operation finds an independent set of unary nodes and splices out the selected nodes.
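The contraction steps can be simulated sequentially to see how many of them a given tree needs. The following is a sketch only: the function name `contract` is hypothetical, each step rakes first and then compresses, the root is never spliced out, and the independent set is chosen by one particular coin-flip rule (a unary node is selected when it flips heads and its parent is not also a unary node that flipped heads). Other selection rules are possible.

```python
import random


def contract(child_lists, root, rng):
    """Run rake+compress steps until only the root remains; return the step count."""
    children = {v: list(cs) for v, cs in child_lists.items()}
    for cs in child_lists.values():
        for c in cs:
            children.setdefault(c, [])
    children.setdefault(root, [])

    steps = 0
    while len(children) > 1:
        # Rake: remove every non-root leaf (conceptually all at once).
        leaves = {v for v, cs in children.items() if not cs and v != root}
        children = {v: [c for c in cs if c not in leaves]
                    for v, cs in children.items() if v not in leaves}

        # Compress: pick an independent set of unary nodes by coin flips.
        parent = {c: p for p, cs in children.items() for c in cs}
        heads = {v: rng.random() < 0.5 for v in children}
        unary = {v for v, cs in children.items() if len(cs) == 1}
        selected = {v for v in unary
                    if v != root and heads[v]
                    and not (parent[v] in unary and heads[parent[v]])}

        # Splice out the selected nodes; no two of them are adjacent.
        for v in selected:
            (c,) = children[v]
            p = parent[v]
            children[p] = [c if x == v else x for x in children[p]]
        for v in selected:
            del children[v]
        steps += 1
    return steps


rng = random.Random(0)
complete = {i: [2 * i + 1, 2 * i + 2] for i in range(7)}   # 15 nodes, depth 3
chain = {i: [i + 1] for i in range(63)}                    # 64 nodes in a list
steps_complete = contract(complete, 0, rng)   # rake alone suffices: 3 steps
steps_chain = contract(chain, 0, rng)         # depends on the coin flips
```

On the complete binary tree no unary node ever appears, so the count equals the depth. On the chain, rake removes only one node per step, and any speed-up comes from compress.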

To approach the theorem, we first look at a property of binary trees. Given a binary tree T, we can partition the nodes of T into 3 groups: T0 contains all leaf nodes, T1 contains all nodes with 1 child, and T2 contains all nodes with 2 children. It is easy to see that $V(T)=T0\cup T1\cup T2$. Now we propose:

Claim: $|T0|=|T2|+1$

This claim can be proved by strong induction on the number of nodes. The base case n=1 holds trivially: the single node is a leaf, so |T0|=1 and |T2|=0. Assume the claim holds for every tree with at most n nodes, and consider a tree with n+1 nodes rooted at r. There are two cases: (1) If r has only one subtree, that subtree has n nodes, so the claim holds for it by the induction hypothesis. Since r is a unary node, it contributes to neither T0 nor T2, so the whole tree has the same number of leaf nodes and the same number of binary nodes as the subtree, and the claim holds for the whole tree. (2) If r has two subtrees, let T0^L and T2^L be the sets of leaf nodes and binary nodes in the left subtree, and define T0^R and T2^R in the same way for the right subtree. By the induction hypothesis, $|T0^{L}|=|T2^{L}|+1$ and $|T0^{R}|=|T2^{R}|+1$. The tree T has $|T0^{L}|+|T0^{R}|$ leaf nodes and, counting r itself, $|T2^{L}|+|T2^{R}|+1$ binary nodes. Thus, we can derive:

$|T0^{L}|+|T0^{R}|=|T2^{L}|+1+|T2^{R}|+1=(|T2^{L}|+|T2^{R}|+1)+1$

which proves the claim.
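The claim is easy to check empirically. The sketch below grows random binary trees and counts nodes by number of children; the generator `random_binary_tree` and the helper `degree_counts` are hypothetical names introduced for this check, and the way the trees are sampled is an arbitrary choice.

```python
import random


def random_binary_tree(n, rng):
    """Grow an n-node tree by attaching each new node under a random
    existing node that still has fewer than two children."""
    children = {0: []}
    for v in range(1, n):
        open_slots = [u for u, cs in children.items() if len(cs) < 2]
        children[rng.choice(open_slots)].append(v)
        children[v] = []
    return children


def degree_counts(children):
    """Return [|T0|, |T1|, |T2|]: nodes with zero, one, and two children."""
    counts = [0, 0, 0]
    for cs in children.values():
        counts[len(cs)] += 1
    return counts


rng = random.Random(1)
for n in range(1, 40):
    t0, t1, t2 = degree_counts(random_binary_tree(n, rng))
    assert t0 == t2 + 1   # the claim holds for every sampled tree
```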

Following the claim, we then prove a lemma, which leads us to the theorem.

Lemma: The number of nodes is reduced by a constant factor in expectation after each contraction step.

Let m be the number of nodes before a contraction step and m' the number after it. By definition, the rake operation deletes all nodes in T0, and the compress operation deletes at least 1/4 of the nodes in T1 in expectation. All nodes in T2 remain. Therefore:

$E[m']\leq |T2|+{\tfrac {3}{4}}|T1|\leq {\tfrac {3}{4}}+{\tfrac {3}{4}}|T1|+{\tfrac {3}{2}}|T2|={\tfrac {3}{4}}(1+|T1|+2|T2|)={\tfrac {3}{4}}(|T0|+|T1|+|T2|)={\tfrac {3}{4}}m$

Finally, based on this lemma, we can conclude that since the number of nodes is reduced by a constant factor in expectation in each iteration, only one node is left after O(log n) expected iterations.
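The arithmetic behind the O(log n) figure can be sketched as follows, under the simplifying assumption that the lemma's bound is applied at each of k successive steps. Here $m_{k}$ is a symbol introduced only for this sketch; it denotes the number of nodes after k steps, with $m_{0}=n$.

$E[m_{k}]\leq {\tfrac {3}{4}}E[m_{k-1}]\leq \left({\tfrac {3}{4}}\right)^{k}n$

The right-hand side drops to at most 1 once

$k\geq \log _{4/3}n={\frac {\ln n}{\ln(4/3)}}=O(\log n)$

This is a statement about expectations only; it is not by itself a bound that holds with high probability.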

Applications

Line Breaking

Expression Evaluation

Tree Isomorphism

References

[1] Gary L. Miller and John H. Reif, "Parallel Tree Contraction--Part I: Fundamentals", 1989
[2] X. He and Y. Yesha, "Binary tree algebraic computation and parallel algorithms for simple graphs", Journal of Algorithms, 1988, pp. 92-113
[3] Hillel Gazit, Gary L. Miller and Shang-Hua Teng, "Optimal tree contraction in the EREW model", Springer, 1988
[4] Karl Abrahamson et al., "A simple parallel tree contraction algorithm", Journal of Algorithms, 1989, pp. 287-302
[5] Andrew Goldberg, Serge Plotkin and Gregory Shannon, "Parallel symmetry-breaking in sparse graphs", Proceedings of the nineteenth annual ACM symposium on Theory of computing (ACM), 1987
[6] Johannes Schneider and Roger Wattenhofer, "A new technique for distributed symmetry breaking", Proceedings of the 29th ACM SIGACT-SIGOPS symposium on Principles of distributed computing (ACM), 2010

Donald Knuth. The Art of Computer Programming: Fundamental Algorithms, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89683-4 . Section 2.3: Trees, pp. 308–423.
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7 . Section 10.4: Representing rooted trees, pp. 214–217. Chapters 12–14 (Binary Search Trees, Red-Black Trees, Augmenting Data Structures), pp. 253–320.

External links

Applications of Parallel Tree Contraction by Samuel Yeom
6.851: Advanced Data Structures by Prof. Erik Demaine
Parallel Algorithms by Guy Blelloch
