CityHash

This is an old revision of this page, as edited by Royanee (talk | contribs) at 19:58, 6 February 2017 (Cite source for FarmHash announcement.). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

CityHash is a family of non-cryptographic hash functions designed for fast hashing of strings. It has 32-, 64-, 128-, and 256-bit variants. CityHash has been referenced widely in academic papers.

Google developed the algorithm in-house starting in 2010.[1] The C++ source code for the reference implementation of the algorithm was released in 2011 under an MIT license, with credit to Geoff Pike and Jyrki Alakuijala.[2] The authors expect the algorithm to outperform previous work by a factor of 1.05 to 2.5, depending on the CPU and mix of string lengths being hashed.[3] CityHash is influenced by and partly based on MurmurHash.[4]

Some particularly fast CityHash functions depend on CRC32 instructions that are present in SSE4.2. However, most CityHash functions are designed to be portable, though they will run best on little-endian 32-bit or 64-bit CPUs.[3]
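Why byte order matters for portability can be illustrated with a minimal sketch. It is not taken from the CityHash source: the helper names HostIsLittleEndian, LoadLittleEndian64 and LoadNative64 are hypothetical and chosen only for illustration. The sketch assumes a hash function that consumes its input as 64-bit little-endian words. On a little-endian CPU a plain memory copy already yields that value, while other hosts need the bytes reassembled explicitly.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns true when the host stores the least significant byte first.
inline bool HostIsLittleEndian() {
  const std::uint32_t probe = 1;
  unsigned char first_byte;
  std::memcpy(&first_byte, &probe, 1);
  return first_byte == 1;
}

// Portable path: assembles 8 bytes into a little-endian 64-bit value,
// giving the same result on any host byte order.
inline std::uint64_t LoadLittleEndian64(const unsigned char* p) {
  std::uint64_t v = 0;
  for (int i = 7; i >= 0; --i) {
    v = (v << 8) | p[i];
  }
  return v;
}

// Fast path: a plain copy in host byte order. It agrees with
// LoadLittleEndian64 only on little-endian hosts.
inline std::uint64_t LoadNative64(const unsigned char* p) {
  std::uint64_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}
```

A hash built on the fast path alone would return different values for the same input on hosts with different byte orders, which is one reason such code needs separate testing on big-endian platforms.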

Google has announced FarmHash as the successor to CityHash.[5]

Concerns

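CityHash releases do not maintain backward compatibility with previous versions,[6] so the same input may hash to a different value after an upgrade. Users should therefore either avoid using CityHash values in persistent storage or keep using a single CityHash version once values have been stored.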
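One way to guard against this is to store a version tag next to every persisted hash value, so that stale values are rejected rather than silently compared. The following is a minimal sketch under stated assumptions: kHashVersion, StoredHash, MakeStoredHash and Matches are hypothetical names, and StandInHash is a 64-bit FNV-1a function used only as a self-contained placeholder. It is not CityHash and does not reflect the CityHash API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical tag identifying the hash implementation in use.
// It must be bumped by hand whenever the hash library is upgraded.
const std::uint32_t kHashVersion = 1;

// Placeholder hash (64-bit FNV-1a), NOT CityHash. Real code would call
// the hash library it is pinned to instead.
inline std::uint64_t StandInHash(const std::string& s) {
  std::uint64_t h = 0xcbf29ce484222325ULL;  // FNV-1a offset basis
  for (unsigned char c : s) {
    h ^= c;
    h *= 0x100000001b3ULL;  // FNV-1a prime
  }
  return h;
}

// A persisted record keeps the version alongside the hash value.
struct StoredHash {
  std::uint32_t version;
  std::uint64_t value;
};

inline StoredHash MakeStoredHash(const std::string& s) {
  return StoredHash{kHashVersion, StandInHash(s)};
}

// True only if the record was written by the current hash version
// and its value matches the hash of s.
inline bool Matches(const StoredHash& stored, const std::string& s) {
  return stored.version == kHashVersion && stored.value == StandInHash(s);
}
```

A record whose version differs from kHashVersion is treated as a mismatch, so the caller can recompute and rewrite it rather than trust a value produced by a different release.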

The README warns that CityHash has not been tested much on big-endian platforms.[3]

References