20121005, 14:46  #1 
Sep 2006
The Netherlands
3×233 Posts 
Deep Hash
hi,
Hope i post in correct forum... I'm trying to design a hash for huge files that can be computed in parallel and/or at GPU to big extend. I baptized it 'deep hash' and giving it a name for a first attempt is probably a bit early. Of course the first 4 letters of my surname/family name translates to the English: 'deep'. Prior to coding it up i would like some comments on it and how useful it is to avoid easy to generate collissions. Here is what i crabbled up: Definitions: in all cases the least significant bit is bit 0 in that specific byte or double word (32 bits unsigned). Everything gets done using unsigned math. a) 64 bits: number of total bits of file b) 32 bits: number of ones (mod 2^32) c) 64 bits: summation of squared bytes (mod 2^64) d) 32 bits: summation over all doublewords (32 bits datatype summation mod 2^32) e) we group the file in packets of n bits and xor all packets we do this for several n: 17,19,23,29,31,37,41,43,47,53,59,61 this is 460 bits together. so far 460 + 192 = 652 bits f) 32 bits: rotate each doubleword clockwise n bits with n being 1,2,.. ..30,31... and calculate the sum of it modulo 2^32, so we don't rotate 0 bits, always rotation is a number from domain [1..31] and each next doubleword we take the next number and after 31 we start again at 1. g) 32 bits: same as f, but now we rotate in the other direction 652+64 = 716 bits so far.. h) 32 bits: for each byte we calculate how many bits each bit of the byte have a 1. So how many 0th bits we have ... 7th bit. we then AND the result with 15 giving us 8 numbers of 4 bits. 748 bits so far We use a sieve to calculate prime numbers. for each bit set to 1 we calculate the prime number belonging to it. We start with prime number 65537 for bit 0 in the first byte then our sieve calculates the next prime number that comes after that. In this case it starts with: 65537 for bit 0 65539 for bit 1 65543 for bit 2 65551 for bit 3 65557 for bit 4 65563 for bit 5 65579 for bit 6 65581 for bit 7 65587 for bit 8 65599 for bit 9 etc If a bit is set we grab the prime number. Now we do 3 things with the selected prime numbers: i1) 64 bits: we xor all selected primes and store the result modulo 2^64 i2) 64 bits: we add all selected primes up and store the result modulo 2^64 i3) 127 bits: we multiply all selected primes modulo 2^127  1, total: 748 + 255 = 1003 bits so far j4) 17 bits: we split up file in 16 bits words and multiply those modulo 2^17  1 total 1003 + 17 = 1020 bits so far We calculate a 4 bits CRC of the 1020 bits hash and store that as the last 4 bits 1020 + 4 = 1024 bits hashkey in total Last fiddled with by diep on 20121005 at 14:55 
20121005, 15:44  #2 
Undefined
"The unspeakable one"
Jun 2006
My evil lair
5·1,201 Posts 
Many of the SHA3 candidates support parallel computation, and an output of up to 1024 bits.
Why are you designing your own hash? Some particular reason? 
20121005, 16:32  #3  
Sep 2006
The Netherlands
3×233 Posts 
Quote:
If that takes overdesign then that's not a problem  it's about the illusion :) 

20121005, 17:14  #4  
If I May
"Chris Halsall"
Sep 2002
Barbados
2·4,703 Posts 
Quote:
I challenge you to change a publicly known file, let's say CentOS6.3i386binDVD1.iso... ...and add the string "diep was here" anywhere in the file without changing the MD5 Hash of "0285160d8ba3cfc720ea55e98e464eac" nor the size of the file. You are allowed to use as many rainbow tables and algorithms as you like in this challenge. 

20121005, 17:26  #5  
Sep 2006
The Netherlands
3·233 Posts 
Quote:


20121005, 17:44  #6  
If I May
"Chris Halsall"
Sep 2002
Barbados
2·4,703 Posts 
Quote:
I rely on those who have demonstrated a deep knowledge in the domain to design publicly used hash functions. And similar others to peer review the functions to ensure there are no bugs, or to show where there might be exposure. I am simply a user of such functions. But I have reason to trust them. 

Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Array vs Hash Table vs Map for QS  Sam Kennedy  Programming  1  20121225 23:25 
FNV hash challenges  Rich  Puzzles  2  20111223 18:17 
SHA1 Crypto Hash weakened  plandon  Lounge  0  20090616 13:55 
Pari hash table lookup  Joshua2  Software  5  20090223 22:59 
NASA's Deep Impact...  ixfd64  Lounge  5  20050706 13:46 