CityHash
CityHash is a family of non-cryptographic hash functions designed for fast hashing of strings. It has 32-, 64-, 128-, and 256-bit variants. CityHash has been referenced widely in academic papers.
Google developed the algorithm in-house starting in 2010.[1] The C++ source code for the reference implementation of the algorithm was released in 2011 under an MIT license, with credit to Geoff Pike and Jyrki Alakuijala.[2] The authors expect the algorithm to outperform previous work by a factor of 1.05 to 2.5, depending on the CPU and mix of string lengths being hashed.[3] CityHash is influenced by and partly based on MurmurHash.[4]
Some particularly fast CityHash functions depend on CRC32 instructions that are present in SSE4.2. However, most CityHash functions are designed to be portable, though they will run best on little-endian 32-bit or 64-bit CPUs.[3]
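The preference for little-endian CPUs can be illustrated with a short C++ sketch. It assumes, as one plausible explanation rather than something the README states, that a hash defined over little-endian words can read its input with a plain memory copy on a little-endian host, while other hosts must reorder bytes first. The helper names LoadLE64, LoadNative64, and HostIsLittleEndian are hypothetical and are not part of the CityHash API.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from CityHash): assemble 8 bytes into a
// 64-bit value in little-endian order, independent of the host CPU.
inline std::uint64_t LoadLE64(const unsigned char* p) {
  assert(p != nullptr);
  std::uint64_t v = 0;
  for (int i = 0; i < 8; ++i) {
    v |= static_cast<std::uint64_t>(p[i]) << (8 * i);
  }
  return v;
}

// Hypothetical helper: copy 8 bytes exactly as they sit in memory.
// The result equals LoadLE64 only on a little-endian host.
inline std::uint64_t LoadNative64(const unsigned char* p) {
  assert(p != nullptr);
  std::uint64_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

// Hypothetical helper: detect host byte order at run time by checking
// which byte of a 16-bit value holds the low-order bits.
inline bool HostIsLittleEndian() {
  const std::uint16_t probe = 1;
  unsigned char first;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}
```

On a little-endian host the two loads agree, so the cheaper native copy suffices. On a big-endian host, code that wants identical hash values across platforms would need the byte-by-byte form or a byte swap.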
Google has announced FarmHash as the successor to CityHash.[5]
Concerns
CityHash releases do not maintain backward compatibility with previous versions.[6] Users who store hash values persistently should therefore either avoid CityHash or stay on a single CityHash version and never upgrade it.
The README warns that CityHash has not been tested much on big-endian platforms.[3]
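One way to guard against the backward-compatibility concern is to store a version tag next to each persisted hash value, so that values produced by a different release are detected rather than silently compared. The following C++ sketch shows the idea under stated assumptions: kHashVersion, StoredHash, MakeStoredHash, and Check are hypothetical names, and PlaceholderHash is 64-bit FNV-1a standing in for whichever hash function is actually used. It is not CityHash.

```cpp
#include <cstdint>
#include <string>

// Hypothetical tag: bump whenever the underlying hash library changes.
constexpr std::uint32_t kHashVersion = 1;

// A persisted hash together with the version that produced it.
struct StoredHash {
  std::uint32_t version;
  std::uint64_t value;
};

enum class CheckResult { kMatch, kMismatch, kStaleVersion };

// Placeholder only: 64-bit FNV-1a, used so the sketch is self-contained.
// A real program would call its chosen hash function here instead.
inline std::uint64_t PlaceholderHash(const std::string& s) {
  std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ULL;  // FNV prime
  }
  return h;
}

// Hash a key and record the version that produced the value.
inline StoredHash MakeStoredHash(const std::string& key) {
  return StoredHash{kHashVersion, PlaceholderHash(key)};
}

// Compare a stored hash against a key, refusing to compare values
// that were produced under a different version tag.
inline CheckResult Check(const StoredHash& stored, const std::string& key) {
  if (stored.version != kHashVersion) {
    return CheckResult::kStaleVersion;  // caller must rehash the data
  }
  return stored.value == PlaceholderHash(key) ? CheckResult::kMatch
                                              : CheckResult::kMismatch;
}
```

A caller that receives kStaleVersion would recompute the hash from the original data instead of trusting the stored value.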
References
- [1] https://code.google.com/p/cityhash/
- [2] http://google-opensource.blogspot.com/2011/04/introducing-cityhash.html
- [3] https://github.com/google/cityhash/blob/master/README
- [4] https://github.com/google/cityhash/blob/master/src/city.cc
- [5] https://opensource.googleblog.com/2014/03/introducing-farmhash.html
- [6] https://github.com/google/cityhash/blob/master/NEWS
External links
- Official site which redirects to an export on GitHub
- Introducing CityHash, Announcement by Google
- Slides from Geoff Pike's talk at Stanford University