Thread: Deep Hash
View Single Post
Old 2012-10-05, 14:46   #1
diep's Avatar
Sep 2006
The Netherlands

36 Posts
Default Deep Hash


Hope i post in correct forum...

I'm trying to design a hash for huge files that can be computed in parallel and/or at GPU to big extend. I baptized it 'deep hash' and giving it a name for a first attempt is probably a bit early. Of course the first 4 letters of my surname/family name translates to the English: 'deep'.

Prior to coding it up i would like some comments on it and how useful it is to avoid easy to generate collissions. Here is what i crabbled up:

Definitions: in all cases the least significant bit is bit 0 in that specific byte or double word (32 bits unsigned). Everything gets done using unsigned math.

a) 64 bits: number of total bits of file
b) 32 bits: number of ones (mod 2^32)
c) 64 bits: summation of squared bytes (mod 2^64)
d) 32 bits: summation over all doublewords (32 bits datatype summation mod 2^32)
e) we group the file in packets of n bits and xor all packets
we do this for several n: 17,19,23,29,31,37,41,43,47,53,59,61
this is 460 bits together.

so far 460 + 192 = 652 bits

f) 32 bits: rotate each doubleword clockwise
n bits with n being 1,2,.. ..30,31... and calculate
the sum of it modulo 2^32, so we don't rotate 0 bits,
always rotation is a number from domain [1..31]
and each next doubleword we take the next number and after
31 we start again at 1.
g) 32 bits: same as f, but now we rotate in the other direction

652+64 = 716 bits so far..

h) 32 bits: for each byte we calculate how many bits each bit
of the byte have a 1. So how many 0th bits we have ... 7th bit.
we then AND the result with 15 giving us 8 numbers of
4 bits.

748 bits so far

We use a sieve to calculate prime numbers.
for each bit set to 1 we calculate the prime number
belonging to it. We start with prime number 65537 for bit 0 in the first byte
then our sieve calculates the next prime number that comes after that. In this case it starts with:

65537 for bit 0
65539 for bit 1
65543 for bit 2
65551 for bit 3
65557 for bit 4
65563 for bit 5
65579 for bit 6
65581 for bit 7
65587 for bit 8
65599 for bit 9

If a bit is set we grab the prime number. Now we do 3 things with the
selected prime numbers:

i1) 64 bits: we xor all selected primes and store the result modulo 2^64
i2) 64 bits: we add all selected primes up and store the result modulo 2^64
i3) 127 bits: we multiply all selected primes modulo 2^127 - 1,

total: 748 + 255 = 1003 bits so far

j4) 17 bits: we split up file in 16 bits words and multiply those modulo 2^17 - 1

total 1003 + 17 = 1020 bits so far

We calculate a 4 bits CRC of the 1020 bits hash and
store that as the last 4 bits

1020 + 4 = 1024 bits hashkey in total

Last fiddled with by diep on 2012-10-05 at 14:55
diep is offline   Reply With Quote