Commit

Merge pull request #84 from bits-and-blooms/dlemire/issue68
Improve support of big endian systems
lemire authored Aug 18, 2022
2 parents bf91282 + f4f0bcc commit cb9965b
Showing 5 changed files with 5 additions and 8 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -108,7 +108,7 @@ make qa

A Bloom filter has two parameters: _m_, the number of bits used in storage, and _k_, the number of hashing functions on elements of the set. (The actual hashing functions are important, too, but this is not a parameter for this implementation). A Bloom filter is backed by a [BitSet](https://github.com/bits-and-blooms/bitset); a key is represented in the filter by setting the bits at each value of the hashing functions (modulo _m_). Set membership is done by _testing_ whether the bits at each value of the hashing functions (again, modulo _m_) are set. If so, the item is in the set. If the item is actually in the set, a Bloom filter will never fail (the true positive rate is 1.0); but it is susceptible to false positives. The art is to choose _k_ and _m_ correctly.

-In this implementation, the hashing functions used is [murmurhash](https://github.com/spaolacci/murmur3), a non-cryptographic hashing function.
+In this implementation, the hashing functions used is [murmurhash](github.com/twmb/murmur3), a non-cryptographic hashing function.


Given the particular hashing scheme, it's best to be empirical about this. Note
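The README paragraph in this hunk describes setting and testing _k_ bit positions modulo _m_. As a rough illustration only, the sketch below shows that mechanism in Go. It is not this library's API: the `toyBloom` type, its methods, and the `baseHashes` helper are hypothetical names, a plain `[]uint64` stands in for the BitSet, and FNV-1a from the standard library stands in for murmur3.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// toyBloom is a hypothetical, minimal Bloom filter with m bits and k probes.
type toyBloom struct {
	m, k uint64
	bits []uint64 // stand-in for a real bitset
}

func newToyBloom(m, k uint64) *toyBloom {
	return &toyBloom{m: m, k: k, bits: make([]uint64, (m+63)/64)}
}

// baseHashes derives two 64-bit values from data using FNV-1a.
// Assumption: the real library uses murmur3; FNV keeps this sketch stdlib-only.
func baseHashes(data []byte) (h1, h2 uint64) {
	h := fnv.New64a()
	h.Write(data)
	h1 = h.Sum64()
	h.Write([]byte{0xff}) // extend the input to obtain a second value
	h2 = h.Sum64() | 1    // force a non-zero step between probes
	return h1, h2
}

// location returns the i-th probe position, reduced modulo m.
func (f *toyBloom) location(h1, h2, i uint64) uint64 {
	return (h1 + i*h2) % f.m
}

// Add sets the k bits selected by the key.
func (f *toyBloom) Add(data []byte) {
	h1, h2 := baseHashes(data)
	for i := uint64(0); i < f.k; i++ {
		p := f.location(h1, h2, i)
		f.bits[p/64] |= 1 << (p % 64)
	}
}

// Test reports whether all k bits selected by the key are set.
func (f *toyBloom) Test(data []byte) bool {
	h1, h2 := baseHashes(data)
	for i := uint64(0); i < f.k; i++ {
		p := f.location(h1, h2, i)
		if f.bits[p/64]&(1<<(p%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	f := newToyBloom(1024, 3)
	f.Add([]byte("apple"))
	fmt.Println(f.Test([]byte("apple")))  // added keys always test positive
	fmt.Println(f.Test([]byte("cherry"))) // usually false; a false positive is possible
}
```

An added key can never test negative because `Add` and `Test` visit the same positions; an absent key tests positive only if other keys happened to set all of its _k_ bits.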
2 changes: 1 addition & 1 deletion go.mod
@@ -4,5 +4,5 @@ go 1.14

require (
github.com/bits-and-blooms/bitset v1.2.0
-github.com/spaolacci/murmur3 v1.1.0
+github.com/twmb/murmur3 v1.1.6
)
4 changes: 2 additions & 2 deletions go.sum
@@ -1,4 +1,4 @@
github.com/bits-and-blooms/bitset v1.2.0 h1:Kn4yilvwNtMACtf1eYDlG8H77R07mZSPbMjLyS07ChA=
github.com/bits-and-blooms/bitset v1.2.0/go.mod h1:gIdJ4wp64HaoK2YrL1Q5/N7Y16edYb8uY+O0FJTyyDA=
-github.com/spaolacci/murmur3 v1.1.0 h1:7c1g84S4BPRrfL5Xrdp6fOJ206sU9y293DDHaoy0bLI=
-github.com/spaolacci/murmur3 v1.1.0/go.mod h1:JwIasOWyU6f++ZhiEuf87xNszmSA2myDM2Kzu9HwQUA=
+github.com/twmb/murmur3 v1.1.6 h1:mqrRot1BRxm+Yct+vavLMou2/iJt0tNVTTC0QoIjaZg=
+github.com/twmb/murmur3 v1.1.6/go.mod h1:Qq/R7NUyOfr65zD+6Q5IHKsJLwP7exErjN6lyyq3OSQ=
3 changes: 0 additions & 3 deletions murmur.go
@@ -267,9 +267,6 @@ func (d *digest128) sum256(data []byte) (hash1, hash2, hash3, hash4 uint64) {
// we do not want to append to an actual array!!!
if tail_length+1 == block_size {
// We are left with no tail!!!
-// Note that murmur3 is sensitive to endianess and so are we.
-// We assume a little endian system. Go effectively never run
-// on big endian systems so this is fine.
word1 := *(*uint64)(unsafe.Pointer(&tail[0]))
word2 := uint64(*(*uint32)(unsafe.Pointer(&tail[8])))
word2 = word2 | (uint64(tail[12]) << 32) | (uint64(tail[13]) << 40) | (uint64(tail[14]) << 48)
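The context lines in this hunk read the tail through `unsafe.Pointer` casts, which yield host-order words. For comparison, here is a minimal sketch of a byte-order-independent way to assemble the same two words from a 15-byte tail with `encoding/binary`. The function name `loadTail15` is hypothetical, and this is not a change made by the commit, which only removes the three comment lines above.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// loadTail15 is a hypothetical helper that builds two 64-bit words from a
// 15-byte tail, always interpreting the bytes as little endian, whatever
// the byte order of the host CPU.
func loadTail15(tail []byte) (word1, word2 uint64) {
	_ = tail[14] // panic early if the slice holds fewer than 15 bytes
	word1 = binary.LittleEndian.Uint64(tail[0:8])
	word2 = uint64(binary.LittleEndian.Uint32(tail[8:12]))
	word2 |= uint64(tail[12])<<32 | uint64(tail[13])<<40 | uint64(tail[14])<<48
	return word1, word2
}

func main() {
	tail := []byte{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
	w1, w2 := loadTail15(tail)
	fmt.Printf("%#016x %#016x\n", w1, w2)
}
```

On a little endian host this produces the same words as the pointer casts; on a big endian host only the `encoding/binary` version keeps the little endian interpretation.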
2 changes: 1 addition & 1 deletion murmur_test.go
@@ -4,7 +4,7 @@ import (
"math/rand"
"testing"

-"github.com/spaolacci/murmur3"
+"github.com/twmb/murmur3"
)

// We want to preserve backward compatibility
