Ternary search tree: Difference between revisions
Explanation for the number of nodes in a compressed TST reworded. |
Description of the search algorithm simplified. |
||
Line 1: | Line 1: | ||
In [[computer science]], a '''ternary search tree''' is a special [[Trie|trie data structure]] where the child nodes of a standard trie are ordered as a [[binary search tree]]. Searching for a string in a ternary search tree consists of a series of binary searches, one for each character in the string. The current character in the |
In [[computer science]], a '''ternary search tree''' is a special [[Trie|trie data structure]] where the child nodes of a standard trie are ordered as a [[binary search tree]]. Searching for a string in a ternary search tree consists of a series of binary searches, one for each character in the string. The current character in the string is thereby compared to the character at the current node: |
||
* if it is less, then the search goes to the left child node, |
* if it is less, then the search goes to the left child node, |
||
* if it is greater, then the search goes to the right child node, |
* if it is greater, then the search goes to the right child node, |
||
* if it is equal, then the search goes to the middle child node and proceeds to the next character in the |
* if it is equal, then the search goes to the middle child node and proceeds to the next character in the string. |
||
The figure below shows a ternary search tree with the strings "as", "at", "cup", "cute", "he", "i" and "us": |
The figure below shows a ternary search tree with the strings "as", "at", "cup", "cute", "he", "i" and "us": |
||
Line 16: | Line 16: | ||
As with other trie data structures, each node in a ternary search tree represents a prefix of the stored strings. All strings in the middle subtree of a node start with that prefix. |
As with other trie data structures, each node in a ternary search tree represents a prefix of the stored strings. All strings in the middle subtree of a node start with that prefix. |
||
Like a binary search tree, a ternary search tree can be balanced or unbalanced, depending on the order the strings are inserted into the tree. Searching for a string of length ''m'' in a balanced ternary search tree with ''n'' strings requires at most ''m'' + log<sub>2</sub>(''n'') character comparisons. Roughly speaking, each comparison either consumes one character of the |
Like a binary search tree, a ternary search tree can be balanced or unbalanced, depending on the order the strings are inserted into the tree. Searching for a string of length ''m'' in a balanced ternary search tree with ''n'' strings requires at most ''m'' + log<sub>2</sub>(''n'') character comparisons. Roughly speaking, each comparison either consumes one character of the string or cuts the search space in half. |
||
A '''compressed''' ternary search tree is a space-efficient variant where redundant nodes are removed. For example, in the figure above, the character sequences "cu", "te", "he" and "us" can be each compressed into one node. The number of nodes in such a tree is less than twice the number of strings: for each string that is inserted into the tree, at most one new node is added and at most one existing node is split into two nodes. |
A '''compressed''' ternary search tree is a space-efficient variant where redundant nodes are removed. For example, in the figure above, the character sequences "cu", "te", "he" and "us" can be each compressed into one node. The number of nodes in such a tree is less than twice the number of strings: for each string that is inserted into the tree, at most one new node is added and at most one existing node is split into two nodes. |
Revision as of 14:16, 17 November 2011
In computer science, a ternary search tree is a special trie data structure where the child nodes of a standard trie are ordered as a binary search tree. Searching for a string in a ternary search tree consists of a series of binary searches, one for each character in the string. The current character in the string is thereby compared to the character at the current node:
- if it is less, then the search goes to the left child node,
- if it is greater, then the search goes to the right child node,
- if it is equal, then the search goes to the middle child node and proceeds to the next character in the string.
The figure below shows a ternary search tree with the strings "as", "at", "cup", "cute", "he", "i" and "us":
c / | \ a u h | | | \ t t e u / / | / | s p e i s
As with other trie data structures, each node in a ternary search tree represents a prefix of the stored strings. All strings in the middle subtree of a node start with that prefix.
Like a binary search tree, a ternary search tree can be balanced or unbalanced, depending on the order the strings are inserted into the tree. Searching for a string of length m in a balanced ternary search tree with n strings requires at most m + log2(n) character comparisons. Roughly speaking, each comparison either consumes one character of the string or cuts the search space in half.
A compressed ternary search tree is a space-efficient variant where redundant nodes are removed. For example, in the figure above, the character sequences "cu", "te", "he" and "us" can be each compressed into one node. The number of nodes in such a tree is less than twice the number of strings: for each string that is inserted into the tree, at most one new node is added and at most one existing node is split into two nodes.
See also
References
External links
- Ternary Search Trees
- Tree::Ternary (Perl module)
- Ternary Search Tree code
- STL-compliant Ternary Search Tree in C++
- Ternary Search Tree in C++
- Ternary Search Tree in Ruby
- pytsh - C++ Ternary Search Tree implementation with Python bindings
- Algorithm for generating search strings given a Ternary Search Tree