Hash array mapped trie
A hash array mapped trie[1] (HAMT) is an implementation of an associative array that combines the characteristics of a hash table and an array mapped trie.[2]
Operation
A HAMT is an array mapped trie where the keys are first hashed in order to ensure an even distribution of keys and to ensure a constant key length.
In a typical implementation of an array mapped trie, each node may branch to up to 256 other nodes. However, as allocating space for 256 pointers for each node would be enormously expensive, each node instead contains a bitmap which is 256 bits long where each bit indicates the presence of a path. This followed by an array of pointers equal in length to the Hamming weight of the bitmap.
Advantages of HAMTs
The hash array mapped trie achieves almost hash table-like speed, despite being a functional, persistent data structure.
Problems with HAMTs
Implementation of a HAMT involves the use of the population count function, which counts the number of ones in the binary representation of a number. This operation is available in many instruction set architectures (where it is sometimes called "CTPOP"), but it is only available in some high-level languages. Although population count can be implemented in software in O(1) time using a series of shift and add instructions, doing so may perform the operation an order of magnitude slower.
Implementations
The programming language Clojure uses a persistent variant of hash array mapped tries for its native hash map type.[3]
The programming language Scala also uses hash array mapped tries to implement a persistent map data type.
References
- ^ Bagwell, P. (2001) Ideal Hash Trees. Technical Report, 2001.
- ^ Bagwell, P. (2000) Fast And Space Efficient Trie Searches. Technical Report, 2000.
- ^ Java source file of Clojure's hash map type.