API Reference¶
The MurmurHash3 algorithm has three variants:
MurmurHash3_x86_32: Generates 32-bit hashes using 32-bit arithmetic.
MurmurHash3_x64_128: Generates 128-bit hashes using 64-bit arithmetic.
MurmurHash3_x86_128: Generates 128-bit hashes using 32-bit arithmetic.
The mmh3 library provides functions and classes for each variant.
Although this API reference is comprehensive, you may find the following functions particularly useful:
mmh3.hash(): Uses the 32-bit variant as its backend and accepts bytes or str as input (strings are UTF-8 encoded). This function is slower than the x64_128 variant in 64-bit environments but is portable across different architectures. It can also be used to calculate favicon hash footprints for platforms like Shodan and ZoomEye.
mmh3.mmh3_x64_128_digest(): Uses the x64_128 variant as its backend. This function accepts a buffer (e.g., bytes, bytearray, memoryview, and numpy arrays) and returns a 128-bit hash as a bytes object, similar to the hashlib module in the Python Standard Library. It performs faster than the 32-bit variant on 64-bit machines.
Note that mmh3 is endian-neutral, while the original C++ library is
endian-sensitive (see also
Known Issues).
This feature of mmh3 is essential when portability across different
architectures is required, such as when calculating hash footprints for web
services.
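To make the endianness point concrete, here is a minimal pure-Python sketch of MurmurHash3_x86_32 that decodes every 4-byte block explicitly as little-endian, so the result does not depend on the byte order of the machine running it. It is an illustration written for this page under that assumption, not mmh3's C implementation, and the function name murmur3_x86_32 is hypothetical.

```python
def murmur3_x86_32(data: bytes, seed: int = 0) -> int:
    """Return the unsigned 32-bit MurmurHash3_x86_32 value of data."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    mask = 0xFFFFFFFF

    def rotl32(x: int, r: int) -> int:
        # Rotate a 32-bit value left by r bits.
        return ((x << r) | (x >> (32 - r))) & mask

    def mix_block(k: int) -> int:
        # Scramble one block before it is folded into the state.
        k = (k * c1) & mask
        k = rotl32(k, 15)
        return (k * c2) & mask

    h = seed & mask
    n_blocks = len(data) // 4

    # Body: each full block is read as a little-endian 32-bit integer,
    # regardless of the host's native byte order.
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        h ^= mix_block(k)
        h = rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & mask

    # Tail: the remaining 1 to 3 bytes, also read as little-endian.
    tail = data[4 * n_blocks:]
    if tail:
        h ^= mix_block(int.from_bytes(tail, "little"))

    # Finalization: fold in the length, then apply the avalanche steps.
    h ^= len(data) & mask
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h
```

Because the byte order is fixed in the calls to int.from_bytes(), the same input bytes produce the same integer on any platform, which is the property that matters when hash values are exchanged between machines.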
Basic Hash Functions¶
The following functions are used to hash immutable types, specifically
bytes and str. String inputs are automatically converted to bytes using
UTF-8 encoding before hashing.
Although hash128(), hash64(), and mmh3.hash_bytes() are provided for
compatibility with previous versions and are not marked for deprecation,
the buffer-accepting hash functions
introduced in version 5.0.0 are recommended for new code.
- mmh3.hash(key, seed=0, signed=True) int¶
Return a hash as a 32-bit integer.
Calculated by the MurmurHash3_x86_32 algorithm.
- Parameters:
key (bytes | str) – The input data to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
signed (Any) – If True, return a signed integer. Otherwise, return an unsigned integer.
- Returns:
The hash value as a 32-bit integer.
- Return type:
int
Changed in version 5.0.0: The seed argument is now strictly checked for valid range. The type of the signed argument has been changed from bool to Any. Performance improvements have been made.
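The signed and unsigned results describe the same bit pattern read in two ways. The following sketch shows the two's-complement relationship in plain Python; the helper names to_signed and to_unsigned are hypothetical and are not part of mmh3.

```python
def to_signed(value: int, bits: int) -> int:
    """Reinterpret an unsigned integer of the given width as two's complement."""
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def to_unsigned(value: int, bits: int) -> int:
    """Reinterpret a signed integer of the given width as unsigned."""
    return value & ((1 << bits) - 1)


# 0xF6A5C420 has its top bit set, so its signed 32-bit reading is negative.
unsigned_value = 0xF6A5C420
signed_value = to_signed(unsigned_value, 32)
round_trip = to_unsigned(signed_value, 32)
```

The same helpers apply to 64-bit and 128-bit values by changing the bits argument.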
- mmh3.hash128(key, seed=0, x64arch=True, signed=False) int¶
Return a hash as a 128-bit integer.
Calculated by the MurmurHash3_x{64, 86}_128 algorithm.
- Parameters:
key (bytes | str) – The input data to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
x64arch (Any) – If True, use an algorithm optimized for 64-bit architecture. Otherwise, use one optimized for 32-bit architecture.
signed (Any) – If True, return a signed integer. Otherwise, return an unsigned integer.
- Returns:
The hash value as a 128-bit integer.
- Return type:
int
Changed in version 5.0.0: The seed argument is now strictly checked for valid range. The types of the x64arch and signed arguments have been changed from bool to Any.
- mmh3.hash64(key, seed=0, x64arch=True, signed=True) tuple[int, int]¶
Return a hash as a tuple of two 64-bit integers.
Calculated by the MurmurHash3_x{64, 86}_128 algorithm.
- Parameters:
key (bytes | str) – The input data to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
x64arch (Any) – If True, use an algorithm optimized for 64-bit architecture. Otherwise, use one optimized for 32-bit architecture.
signed (Any) – If True, return a signed integer. Otherwise, return an unsigned integer.
- Returns:
The hash value as a tuple of two 64-bit integers.
- Return type:
tuple[int, int]
Changed in version 5.0.0: The seed argument is now strictly checked for valid range. The types of the x64arch and signed arguments have been changed from bool to Any.
- mmh3.hash_bytes(key, seed=0, x64arch=True) bytes¶
Return a 16-byte hash of the bytes type.
- Parameters:
key (bytes | str) – The input data to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
x64arch (Any) – If True, use an algorithm optimized for 64-bit architecture. Otherwise, use one optimized for 32-bit architecture.
- Returns:
The hash value as the bytes type with a length of 16 bytes (128 bits).
- Return type:
bytes
Changed in version 5.0.0: The seed argument is now strictly checked for valid range. The type of the x64arch argument has been changed from bool to Any.
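Every function above requires a seed in the range [0, 0xFFFFFFFF]. If an application derives seeds from arbitrary integers, it may help to validate or fold them before calling into the library. The sketch below is one way to do that; check_seed and fold_seed are hypothetical helpers, and raising ValueError is an assumption of this sketch, since this reference does not state which exception mmh3 raises for an out-of-range seed.

```python
MAX_SEED = 0xFFFFFFFF  # largest value of an unsigned 32-bit integer


def check_seed(seed: int) -> int:
    """Return seed unchanged if it fits in an unsigned 32-bit integer."""
    if not 0 <= seed <= MAX_SEED:
        # ValueError is this sketch's choice, not documented mmh3 behavior.
        raise ValueError(f"seed must be in [0, {MAX_SEED:#x}], got {seed}")
    return seed


def fold_seed(value: int) -> int:
    """Map an arbitrary integer into the valid range by keeping its low 32 bits."""
    return value & MAX_SEED
```

Folding changes the seed value, so it only suits cases where any deterministic seed is acceptable; where a specific seed is expected, validation is the safer choice.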
Buffer-Accepting Hash Functions¶
The following functions are used to hash types that implement the buffer
protocol such as bytes, bytearray, memoryview, and numpy arrays.
See also
The buffer protocol,
originally implemented as a part of Python/C API,
was formally defined as a Python-level API in
PEP 688
in 2022 and its corresponding type hint was introduced as
collections.abc.Buffer
in Python 3.12. For earlier Python versions, mmh3 uses a type alias for the
type hint
_typeshed.ReadableBuffer,
which is itself an alias for
typing_extensions.Buffer,
the backported type hint for collections.abc.Buffer.
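As a brief illustration of what "implementing the buffer protocol" means, the following standard-library sketch shows objects that expose their memory without copying. It does not call mmh3; it only shows the kinds of objects a Buffer parameter refers to.

```python
import array

# A bytearray is mutable and exposes its memory through the buffer protocol.
raw = bytearray(b"hello world")

# Slicing a memoryview yields a view of the last five bytes, not a copy.
view = memoryview(raw)[6:]
before = view.tobytes()

# A change to the underlying bytearray is visible through the view.
raw[6] = ord("W")
after = view.tobytes()

# Typed arrays are buffers too; cast("B") exposes their raw bytes.
values = array.array("I", [1, 2, 3])
raw_bytes = memoryview(values).cast("B")
size_in_bytes = raw_bytes.nbytes
```

Note that the raw bytes of multi-byte array items follow the platform's native byte order, so two machines can hold the same numbers as different byte sequences.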
- mmh3.hash_from_buffer(key, seed=0, signed=True) int¶
Return a hash for the buffer as a 32-bit integer.
Calculated by the MurmurHash3_x86_32 algorithm. Designed for large memory-views such as numpy arrays.
- Parameters:
key (Buffer | str) – The buffer to hash. String inputs are also supported and are automatically converted to bytes using UTF-8 encoding before hashing.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
signed (Any) – If True, return a signed integer. Otherwise, return an unsigned integer.
- Returns:
The hash value as a 32-bit integer.
- Return type:
int
Deprecated since version 5.0.0: Use mmh3_32_sintdigest() or mmh3_32_uintdigest() instead.
Changed in version 5.0.0: The seed argument is now strictly checked for valid range. The type of the signed argument has been changed from bool to Any.
- mmh3.mmh3_32_digest(key, seed=0, /) bytes¶
Return a 4-byte hash of the bytes type for the buffer.
Calculated by the MurmurHash3_x86_32 algorithm.
- Parameters:
key (Buffer) – The input buffer to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
- Returns:
The hash value as the bytes type with a length of 4 bytes (32 bits).
- Return type:
bytes
Added in version 5.0.0.
- mmh3.mmh3_32_sintdigest(key, seed=0, /) int¶
Return a hash for the buffer as a 32-bit signed integer.
Calculated by the MurmurHash3_x86_32 algorithm.
- Parameters:
key (Buffer) – The input buffer to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
- Returns:
The hash value as a 32-bit signed integer.
- Return type:
int
Added in version 5.0.0.
- mmh3.mmh3_32_uintdigest(key, seed=0, /) int¶
Return a hash for the buffer as a 32-bit unsigned integer.
Calculated by the MurmurHash3_x86_32 algorithm.
- Parameters:
key (Buffer) – The input buffer to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
- Returns:
The hash value as a 32-bit unsigned integer.
- Return type:
int
Added in version 5.0.0.
- mmh3.mmh3_x64_128_digest(key, seed=0, /) bytes¶
Return a 16-byte hash of the bytes type for the buffer.
Calculated by the MurmurHash3_x64_128 algorithm.
- Parameters:
key (Buffer) – The input buffer to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
- Returns:
The hash value as the bytes type with a length of 16 bytes (128 bits).
- Return type:
bytes
Added in version 5.0.0.
- mmh3.mmh3_x64_128_sintdigest(key, seed=0, /) int¶
Return a hash for the buffer as a 128-bit signed integer.
Calculated by the MurmurHash3_x64_128 algorithm.
- Parameters:
key (Buffer) – The input buffer to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
- Returns:
The hash value as a 128-bit signed integer.
- Return type:
int
Added in version 5.0.0.
- mmh3.mmh3_x64_128_stupledigest(key, seed=0, /) tuple[int, int]¶
Return a hash for the buffer as a tuple of two 64-bit signed integers.
Calculated by the MurmurHash3_x64_128 algorithm.
- Parameters:
key (Buffer) – The input buffer to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
- Returns:
The hash value as a tuple of two 64-bit signed integers.
- Return type:
tuple[int, int]
Added in version 5.0.0.
- mmh3.mmh3_x64_128_uintdigest(key, seed=0, /) int¶
Return a hash for the buffer as a 128-bit unsigned integer.
Calculated by the MurmurHash3_x64_128 algorithm.
- Parameters:
key (Buffer) – The input buffer to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
- Returns:
The hash value as a 128-bit unsigned integer.
- Return type:
int
Added in version 5.0.0.
- mmh3.mmh3_x64_128_utupledigest(key, seed=0, /) tuple[int, int]¶
Return a hash for the buffer as a tuple of two 64-bit unsigned integers.
Calculated by the MurmurHash3_x64_128 algorithm.
- Parameters:
key (Buffer) – The input buffer to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
- Returns:
The hash value as a tuple of two 64-bit unsigned integers.
- Return type:
tuple[int, int]
Added in version 5.0.0.
- mmh3.mmh3_x86_128_digest(key, seed=0, /) bytes¶
Return a 16-byte hash of the bytes type for the buffer.
Calculated by the MurmurHash3_x86_128 algorithm.
- Parameters:
key (Buffer) – The input buffer to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
- Returns:
The hash value as the bytes type with a length of 16 bytes (128 bits).
- Return type:
bytes
Added in version 5.0.0.
- mmh3.mmh3_x86_128_sintdigest(key, seed=0, /) int¶
Return a hash for the buffer as a 128-bit signed integer.
Calculated by the MurmurHash3_x86_128 algorithm.
- Parameters:
key (Buffer) – The input buffer to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
- Returns:
The hash value as a 128-bit signed integer.
- Return type:
int
Added in version 5.0.0.
- mmh3.mmh3_x86_128_stupledigest(key, seed=0, /) tuple[int, int]¶
Return a hash for the buffer as a tuple of two 64-bit signed integers.
Calculated by the MurmurHash3_x86_128 algorithm.
- Parameters:
key (Buffer) – The input buffer to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
- Returns:
The hash value as a tuple of two 64-bit signed integers.
- Return type:
tuple[int, int]
Added in version 5.0.0.
- mmh3.mmh3_x86_128_uintdigest(key, seed=0, /) int¶
Return a hash for the buffer as a 128-bit unsigned integer.
Calculated by the MurmurHash3_x86_128 algorithm.
- Parameters:
key (Buffer) – The input buffer to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
- Returns:
The hash value as a 128-bit unsigned integer.
- Return type:
int
Added in version 5.0.0.
- mmh3.mmh3_x86_128_utupledigest(key, seed=0, /) tuple[int, int]¶
Return a hash for the buffer as a tuple of two 64-bit unsigned integers.
Calculated by the MurmurHash3_x86_128 algorithm.
- Parameters:
key (Buffer) – The input buffer to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
- Returns:
The hash value as a tuple of two 64-bit unsigned integers.
- Return type:
tuple[int, int]
Added in version 5.0.0.
Hasher Classes¶
mmh3 implements hashers with interfaces similar to those in hashlib from
the standard library: mmh3_32() for 32-bit hashing, mmh3_x64_128() for
128-bit hashing optimized for x64 architectures, and mmh3_x86_128() for
128-bit hashing optimized for x86 architectures.
In addition to the standard digest() method, each hasher provides
sintdigest(), which returns a signed integer, and uintdigest(), which
returns an unsigned integer. The 128-bit hashers also include stupledigest()
and utupledigest(), which return tuples of two signed and two unsigned 64-bit integers, respectively.
Please note that as of version 5.0.0, the implementation is still experimental,
and performance may be unsatisfactory (particularly mmh3_x86_128()).
Additionally, hexdigest() is not supported; use digest().hex() instead.
>>> import mmh3
>>> hasher = mmh3.mmh3_x64_128(b"foo", 42) # seed=42
>>> hasher.update(b"bar")
>>> hasher.digest()
b'\x82_n\xdd \xac\xb6j\xef\x99\xb1e\xc4\n\xc9\xfd'
>>> hasher.sintdigest() # 128 bit signed int
-2943813934500665152301506963178627198
>>> hasher.uintdigest() # 128 bit unsigned int
337338552986437798311073100468589584258
>>> hasher.stupledigest() # two 64 bit signed ints
(7689522670935629698, -159584473158936081)
>>> hasher.utupledigest() # two 64 bit unsigned ints
(7689522670935629698, 18287159600550615535)
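The five outputs in the example above are different views of the same 16 bytes. The following sketch reproduces the integer and tuple forms from the digest bytes using only built-in operations. The little-endian layout is inferred from the example values shown above rather than taken from the mmh3 source, so treat it as an observation about this example.

```python
# Digest bytes copied from the example above.
digest = b"\x82_n\xdd \xac\xb6j\xef\x99\xb1e\xc4\n\xc9\xfd"

# The whole digest read as one little-endian 128-bit integer.
uint128 = int.from_bytes(digest, "little")
sint128 = int.from_bytes(digest, "little", signed=True)

# Each 8-byte half read as a little-endian 64-bit integer.
utuple = (
    int.from_bytes(digest[:8], "little"),
    int.from_bytes(digest[8:], "little"),
)
stuple = (
    int.from_bytes(digest[:8], "little", signed=True),
    int.from_bytes(digest[8:], "little", signed=True),
)

# Hexadecimal form, the counterpart of hashlib's hexdigest().
hex_form = digest.hex()
```

In this example the first tuple element holds the low 64 bits of the 128-bit integer and the second holds the high 64 bits.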
- class mmh3.mmh3_32(data=None, seed=0)¶
Hasher for incrementally calculating the murmurhash3_x86_32 hash.
- Parameters:
data (Buffer | None) – The initial data to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
Changed in version 5.0.0: Added the optional data parameter as the first argument. The seed argument is now strictly checked for valid range.
- copy() mmh3_32¶
Return a copy of the hash object.
- Returns:
A copy of this hash object.
- Return type:
mmh3_32
- digest() bytes¶
Return the digest value as a bytes object.
- Returns:
The digest value.
- Return type:
bytes
- sintdigest() int¶
Return the digest value as a signed integer.
- Returns:
The digest value as a signed integer.
- Return type:
int
- uintdigest() int¶
Return the digest value as an unsigned integer.
- Returns:
The digest value as an unsigned integer.
- Return type:
int
- update(data)¶
Update this hash object’s state with the provided bytes-like object.
- Parameters:
data (Buffer) – The buffer to hash.
- block_size¶
Number of bytes of the internal block of this algorithm.
- Type:
int
- digest_size¶
Number of bytes in this hash's output.
- Type:
int
- name¶
The hash algorithm being used by this object.
- Type:
str
- class mmh3.mmh3_x64_128(data=None, seed=0)¶
Hasher for incrementally calculating the murmurhash3_x64_128 hash.
- Parameters:
data (Buffer | None) – The initial data to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
Changed in version 5.0.0: Added the optional data parameter as the first argument. The seed argument is now strictly checked for valid range.
- copy() mmh3_128x64¶
Return a copy of the hash object.
- Returns:
A copy of this hash object.
- Return type:
mmh3_128x64
- digest() bytes¶
Return the digest value as a bytes object.
- Returns:
The digest value.
- Return type:
bytes
- sintdigest() int¶
Return the digest value as a signed integer.
- Returns:
The digest value as a signed integer.
- Return type:
int
- stupledigest() tuple[int, int]¶
Return the digest value as a tuple of two signed integers.
- Returns:
The digest value as a tuple of two signed integers.
- Return type:
tuple[int, int]
- uintdigest() int¶
Return the digest value as an unsigned integer.
- Returns:
The digest value as an unsigned integer.
- Return type:
int
- update(data)¶
Update this hash object’s state with the provided bytes-like object.
- Parameters:
data (Buffer) – The buffer to hash.
- utupledigest() tuple[int, int]¶
Return the digest value as a tuple of two unsigned integers.
- Returns:
The digest value as a tuple of two unsigned integers.
- Return type:
tuple[int, int]
- block_size¶
Number of bytes of the internal block of this algorithm.
- Type:
int
- digest_size¶
Number of bytes in this hash's output.
- Type:
int
- name¶
The hash algorithm being used by this object.
- Type:
str
- class mmh3.mmh3_x86_128(data=None, seed=0)¶
Hasher for incrementally calculating the murmurhash3_x86_128 hash.
- Parameters:
data (Buffer | None) – The initial data to hash.
seed (int) – The seed value. Must be an integer in the range [0, 0xFFFFFFFF].
Changed in version 5.0.0: Added the optional data parameter as the first argument. The seed argument is now strictly checked for valid range.
- copy() mmh3_128x86¶
Return a copy of the hash object.
- Returns:
A copy of this hash object.
- Return type:
mmh3_128x86
- digest() bytes¶
Return the digest value as a bytes object.
- Returns:
The digest value.
- Return type:
bytes
- sintdigest() int¶
Return the digest value as a signed integer.
- Returns:
The digest value as a signed integer.
- Return type:
int
- stupledigest() tuple[int, int]¶
Return the digest value as a tuple of two signed integers.
- Returns:
The digest value as a tuple of two signed integers.
- Return type:
tuple[int, int]
- uintdigest() int¶
Return the digest value as an unsigned integer.
- Returns:
The digest value as an unsigned integer.
- Return type:
int
- update(data)¶
Update this hash object’s state with the provided bytes-like object.
- Parameters:
data (Buffer) – The buffer to hash.
- utupledigest() tuple[int, int]¶
Return the digest value as a tuple of two unsigned integers.
- Returns:
The digest value as a tuple of two unsigned integers.
- Return type:
tuple[int, int]
- block_size¶
Number of bytes of the internal block of this algorithm.
- Type:
int
- digest_size¶
Number of bytes in this hash's output.
- Type:
int
- name¶
The hash algorithm being used by this object.
- Type:
str
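The hasher classes above are described as having interfaces similar to those in hashlib. As a reminder of that calling pattern, the sketch below uses hashlib.sha256 purely as a stand-in: it shows the shape of update(), copy(), and digest().hex(), and says nothing about mmh3's actual output values.

```python
import hashlib

# hashlib.sha256 stands in for a hasher class; only the calling pattern matters.
hasher = hashlib.sha256(b"foo")

# copy() captures the state after b"foo"; later updates do not affect it.
snapshot = hasher.copy()
hasher.update(b"bar")

# Feeding data in pieces gives the same digest as hashing it in one call.
incremental = hasher.digest()
one_shot = hashlib.sha256(b"foobar").digest()
snapshot_digest = snapshot.digest()

# digest().hex() is the spelling to use where hexdigest() is unavailable.
hex_form = hasher.digest().hex()
```

With an mmh3 hasher, the constructor would additionally take the seed as its second argument, as in the class signatures above.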