Jump to content

Ternary search tree: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
Explanation for the number of nodes in a compressed TST reworded.
Description of the search algorithm simplified.
Line 1: Line 1:
In [[computer science]], a '''ternary search tree''' is a special [[Trie|trie data structure]] where the child nodes of a standard trie are ordered as a [[binary search tree]]. Searching for a string in a ternary search tree consists of a series of binary searches, one for each character in the string. The current character in the search string is thereby compared to the character at the current node:
In [[computer science]], a '''ternary search tree''' is a special [[Trie|trie data structure]] where the child nodes of a standard trie are ordered as a [[binary search tree]]. Searching for a string in a ternary search tree consists of a series of binary searches, one for each character in the string. The current character in the string is thereby compared to the character at the current node:
* if it is less, then the search goes to the left child node,
* if it is less, then the search goes to the left child node,
* if it is greater, then the search goes to the right child node,
* if it is greater, then the search goes to the right child node,
* if it is equal, then the search goes to the middle child node and proceeds to the next character in the search string.
* if it is equal, then the search goes to the middle child node and proceeds to the next character in the string.


The figure below shows a ternary search tree with the strings "as", "at", "cup", "cute", "he", "i" and "us":
The figure below shows a ternary search tree with the strings "as", "at", "cup", "cute", "he", "i" and "us":
Line 16: Line 16:
As with other trie data structures, each node in a ternary search tree represents a prefix of the stored strings. All strings in the middle subtree of a node start with that prefix.
As with other trie data structures, each node in a ternary search tree represents a prefix of the stored strings. All strings in the middle subtree of a node start with that prefix.


Like a binary search tree, a ternary search tree can be balanced or unbalanced, depending on the order the strings are inserted into the tree. Searching for a string of length ''m'' in a balanced ternary search tree with ''n'' strings requires at most ''m'' + log<sub>2</sub>(''n'') character comparisons. Roughly speaking, each comparison either consumes one character of the search string or cuts the search space in half.
Like a binary search tree, a ternary search tree can be balanced or unbalanced, depending on the order the strings are inserted into the tree. Searching for a string of length ''m'' in a balanced ternary search tree with ''n'' strings requires at most ''m'' + log<sub>2</sub>(''n'') character comparisons. Roughly speaking, each comparison either consumes one character of the string or cuts the search space in half.


A '''compressed''' ternary search tree is a space-efficient variant where redundant nodes are removed. For example, in the figure above, the character sequences "cu", "te", "he" and "us" can be each compressed into one node. The number of nodes in such a tree is less than twice the number of strings: for each string that is inserted into the tree, at most one new node is added and at most one existing node is split into two nodes.
A '''compressed''' ternary search tree is a space-efficient variant where redundant nodes are removed. For example, in the figure above, the character sequences "cu", "te", "he" and "us" can be each compressed into one node. The number of nodes in such a tree is less than twice the number of strings: for each string that is inserted into the tree, at most one new node is added and at most one existing node is split into two nodes.

Revision as of 14:16, 17 November 2011

In computer science, a ternary search tree is a special trie data structure where the child nodes of a standard trie are ordered as a binary search tree. Searching for a string in a ternary search tree consists of a series of binary searches, one for each character in the string. The current character in the string is thereby compared to the character at the current node:

  • if it is less, then the search goes to the left child node,
  • if it is greater, then the search goes to the right child node,
  • if it is equal, then the search goes to the middle child node and proceeds to the next character in the string.

The figure below shows a ternary search tree with the strings "as", "at", "cup", "cute", "he", "i" and "us":

          c
        / | \
       a  u  h
       |  |  | \
       t  t  e  u
     /  / |   / |
    s  p  e  i  s

As with other trie data structures, each node in a ternary search tree represents a prefix of the stored strings. All strings in the middle subtree of a node start with that prefix.

Like a binary search tree, a ternary search tree can be balanced or unbalanced, depending on the order the strings are inserted into the tree. Searching for a string of length m in a balanced ternary search tree with n strings requires at most m + log2(n) character comparisons. Roughly speaking, each comparison either consumes one character of the string or cuts the search space in half.

A compressed ternary search tree is a space-efficient variant where redundant nodes are removed. For example, in the figure above, the character sequences "cu", "te", "he" and "us" can be each compressed into one node. The number of nodes in such a tree is less than twice the number of strings: for each string that is inserted into the tree, at most one new node is added and at most one existing node is split into two nodes.

See also

References