Jump to content

Ternary search tree

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 89.204.136.55 (talk) at 18:03, 12 November 2011 (Introduction to compressed TSTs improved.). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In computer science, a ternary search tree is a special trie data structure where the child nodes of a standard trie are ordered as a binary search tree. Searching for a string in a ternary search tree consists of a series of binary searches, one for each character in the string. The current character in the search string is thereby compared to the character at the current node:

  • if it is less, then the search goes to the left child node,
  • if it is greater, then the search goes to the right child node,
  • if it is equal, then the search goes to the middle child node and proceeds to the next character in the search string.

The figure below shows a ternary search tree with the strings "as", "at", "cup", "cute", "he", "i" and "us":

          c
        / | \
       a  u  h
       |  |  | \
       t  t  e  u
     /  / |   / |
    s  p  e  i  s

As with other trie data structures, each node in a ternary search tree represents a prefix of the stored strings. All strings in the middle subtree of a node start with that prefix.

Like a binary search tree, a ternary search tree can be balanced or unbalanced, depending on the order the strings are inserted into the tree. Searching for a string of length m in a balanced ternary search tree with n strings requires at most m + log2(n) character comparisons. Roughly speaking, each comparison either consumes one character of the search string or cuts the search space in half.

A compressed ternary search tree is a space-efficient variant where redundant nodes are removed. For example, in the figure above, the character sequences "cu", "te", "he" and "us" can be each compressed into one node. The number of nodes in such a compressed tree is less than twice the number of strings. For each string, at most one new node is added and at most one existing node is split into two nodes.

See also

References