mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > Math

Reply
 
Thread Tools
Old 2012-10-05, 14:46   #1
diep
 
diep's Avatar
 
Sep 2006
The Netherlands

32·79 Posts
Default Deep Hash

hi,

Hope i post in correct forum...

I'm trying to design a hash for huge files that can be computed in parallel and/or at GPU to big extend. I baptized it 'deep hash' and giving it a name for a first attempt is probably a bit early. Of course the first 4 letters of my surname/family name translates to the English: 'deep'.

Prior to coding it up i would like some comments on it and how useful it is to avoid easy to generate collissions. Here is what i crabbled up:

Definitions: in all cases the least significant bit is bit 0 in that specific byte or double word (32 bits unsigned). Everything gets done using unsigned math.

a) 64 bits: number of total bits of file
b) 32 bits: number of ones (mod 2^32)
c) 64 bits: summation of squared bytes (mod 2^64)
d) 32 bits: summation over all doublewords (32 bits datatype summation mod 2^32)
e) we group the file in packets of n bits and xor all packets
we do this for several n: 17,19,23,29,31,37,41,43,47,53,59,61
this is 460 bits together.

so far 460 + 192 = 652 bits

f) 32 bits: rotate each doubleword clockwise
n bits with n being 1,2,.. ..30,31... and calculate
the sum of it modulo 2^32, so we don't rotate 0 bits,
always rotation is a number from domain [1..31]
and each next doubleword we take the next number and after
31 we start again at 1.
g) 32 bits: same as f, but now we rotate in the other direction

652+64 = 716 bits so far..

h) 32 bits: for each byte we calculate how many bits each bit
of the byte have a 1. So how many 0th bits we have ... 7th bit.
we then AND the result with 15 giving us 8 numbers of
4 bits.

748 bits so far

We use a sieve to calculate prime numbers.
for each bit set to 1 we calculate the prime number
belonging to it. We start with prime number 65537 for bit 0 in the first byte
then our sieve calculates the next prime number that comes after that. In this case it starts with:

65537 for bit 0
65539 for bit 1
65543 for bit 2
65551 for bit 3
65557 for bit 4
65563 for bit 5
65579 for bit 6
65581 for bit 7
65587 for bit 8
65599 for bit 9
etc

If a bit is set we grab the prime number. Now we do 3 things with the
selected prime numbers:

i1) 64 bits: we xor all selected primes and store the result modulo 2^64
i2) 64 bits: we add all selected primes up and store the result modulo 2^64
i3) 127 bits: we multiply all selected primes modulo 2^127 - 1,

total: 748 + 255 = 1003 bits so far

j4) 17 bits: we split up file in 16 bits words and multiply those modulo 2^17 - 1

total 1003 + 17 = 1020 bits so far

We calculate a 4 bits CRC of the 1020 bits hash and
store that as the last 4 bits

1020 + 4 = 1024 bits hashkey in total

Last fiddled with by diep on 2012-10-05 at 14:55
diep is offline   Reply With Quote
Old 2012-10-05, 15:44   #2
retina
Undefined
 
retina's Avatar
 
"The unspeakable one"
Jun 2006
My evil lair

3·112·17 Posts
Default

Many of the SHA-3 candidates support parallel computation, and an output of up to 1024 bits.

Why are you designing your own hash? Some particular reason?
retina is online now   Reply With Quote
Old 2012-10-05, 16:32   #3
diep
 
diep's Avatar
 
Sep 2006
The Netherlands

32·79 Posts
Default

Quote:
Originally Posted by retina View Post
Many of the SHA-3 candidates support parallel computation, and an output of up to 1024 bits.

Why are you designing your own hash? Some particular reason?
Well i'd like to design something that can work in parallel and work on gpu's and manycores and multicores which doesn't spread the illusion (be it true or false) that a few rainbow tables in combination with a 10 line genetic algorithm for a person handy with algorithms without being a cryptographer, can already have anyone add any trojan to any big file, say a linux distro image...

If that takes overdesign then that's not a problem - it's about the illusion :)
diep is offline   Reply With Quote
Old 2012-10-05, 17:14   #4
chalsall
If I May
 
chalsall's Avatar
 
"Chris Halsall"
Sep 2002
Barbados

2·7·691 Posts
Default

Quote:
Originally Posted by diep View Post
Well i'd like to design something that can work in parallel and work on gpu's and manycores and multicores which doesn't spread the illusion (be it true or false) that a few rainbow tables in combination with a 10 line genetic algorithm for a person handy with algorithms without being a cryptographer, can already have anyone add any trojan to any big file, say a linux distro image...
This statement alone suggests you are not the right person to design a new hash function accepted for public use.

I challenge you to change a publicly known file, let's say CentOS-6.3-i386-bin-DVD1.iso...

...and add the string "diep was here" anywhere in the file without changing the MD5 Hash of "0285160d8ba3cfc720ea55e98e464eac" nor the size of the file.

You are allowed to use as many rainbow tables and algorithms as you like in this challenge.
chalsall is online now   Reply With Quote
Old 2012-10-05, 17:26   #5
diep
 
diep's Avatar
 
Sep 2006
The Netherlands

32·79 Posts
Default

Quote:
Originally Posted by chalsall View Post
This statement alone suggests you are not the right person to design a new hash function accepted for public use.

I challenge you to change a publicly known file, let's say CentOS-6.3-i386-bin-DVD1.iso...

...and add the string "diep was here" anywhere in the file without changing the MD5 Hash of "0285160d8ba3cfc720ea55e98e464eac" nor the size of the file.

You are allowed to use as many rainbow tables and algorithms as you like in this challenge.
I'm asking for comments on the combination of algorithms, do you have something to say there?
diep is offline   Reply With Quote
Old 2012-10-05, 17:44   #6
chalsall
If I May
 
chalsall's Avatar
 
"Chris Halsall"
Sep 2002
Barbados

2×7×691 Posts
Default

Quote:
Originally Posted by diep View Post
I'm asking for comments on the combination of algorithms, do you have something to say there?
No.

I rely on those who have demonstrated a deep knowledge in the domain to design publicly used hash functions. And similar others to peer review the functions to ensure there are no bugs, or to show where there might be exposure.

I am simply a user of such functions. But I have reason to trust them.
chalsall is online now   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Array vs Hash Table vs Map for QS Sam Kennedy Programming 1 2012-12-25 23:25
FNV hash challenges Rich Puzzles 2 2011-12-23 18:17
SHA-1 Crypto Hash weakened plandon Lounge 0 2009-06-16 13:55
Pari hash table lookup Joshua2 Software 5 2009-02-23 22:59
NASA's Deep Impact... ixfd64 Lounge 5 2005-07-06 13:46

All times are UTC. The time now is 13:06.

Mon Jun 14 13:06:36 UTC 2021 up 17 days, 10:53, 0 users, load averages: 1.42, 1.53, 1.63

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.