Commit 2f65b948 authored by Tim McNamara

Add hash small strings

parent e0c73a8d
I'm not sure if this is a problem in reality, but it feels like a fun thing
to think about. Is it possible to do better than current approaches for hash
functions when dealing with small strings? Think of the columns of databases
holding business data, such as domain names, email addresses, city names and
other place names.
### xor/shift hash
```rust
use std::convert::TryInto;

/// XOR-folds the input eight bytes at a time, then mixes in its length.
fn shift_hash(short_string: &[u8]) -> u64 {
    let mut hash: u64 = 0;

    // XOR every full 8-byte chunk into the hash. `from_ne_bytes` uses the
    // platform's native byte order, so results differ across endiannesses.
    let mut chunks = short_string.chunks_exact(8);
    for chunk in chunks.by_ref() {
        let word: [u8; 8] = chunk.try_into().unwrap();
        hash ^= u64::from_ne_bytes(word);
    }

    // Zero-pad the trailing 0..=7 bytes, shifting each byte left by its
    // index. The shift drops the top `i` bits of the byte at index `i`.
    let mut tail = [0u8; 8];
    for (i, byte) in chunks.remainder().iter().enumerate() {
        tail[i] = *byte << i;
    }
    hash ^= u64::from_ne_bytes(tail);

    // Mix in the total input length.
    hash ^ short_string.len() as u64
}
```
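
A quick way to get a feel for the function is to poke at it with a few inputs. The sketch below restates `shift_hash` compactly so that it compiles on its own, then checks some properties that follow from the XOR folding and the per-byte shift. The helpers `collides` and `check_properties` and all of the sample strings are made up for illustration; they are not part of the commit.

```rust
/// Compact restatement of `shift_hash` from above (same behaviour,
/// written without `TryInto` so it builds on any edition).
fn shift_hash(short_string: &[u8]) -> u64 {
    let mut hash: u64 = 0;
    let mut chunks = short_string.chunks_exact(8);
    for chunk in chunks.by_ref() {
        let mut word = [0u8; 8];
        word.copy_from_slice(chunk);
        hash ^= u64::from_ne_bytes(word);
    }
    let mut tail = [0u8; 8];
    for (i, byte) in chunks.remainder().iter().enumerate() {
        tail[i] = *byte << i;
    }
    hash ^= u64::from_ne_bytes(tail);
    hash ^ short_string.len() as u64
}

/// Hypothetical helper: true when two inputs hash to the same value.
fn collides(a: &[u8], b: &[u8]) -> bool {
    shift_hash(a) == shift_hash(b)
}

/// Hypothetical checks on made-up inputs.
fn check_properties() {
    // The empty input has no chunks, a zero tail and length 0.
    assert_eq!(shift_hash(b""), 0);

    // XOR is commutative, so swapping two full 8-byte chunks collides.
    assert!(collides(b"abcdefgh12345678", b"12345678abcdefgh"));

    // At tail index 2 the shift drops the top two bits of the byte,
    // so b'a' (0x61) and b'!' (0x21) both become 0x84.
    assert!(collides(b"xxa", b"xx!"));

    // Inputs that differ only in the low bits of the first tail byte
    // still hash differently.
    assert!(!collides(b"a", b"b"));
}
```

Calling `check_properties()` from a `main` function or a unit test should pass silently. The two collisions suggest where a better small-string hash could improve on this one: make chunk position matter, and keep all eight bits of each tail byte.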