mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > Math

Reply
 
Thread Tools
Old 2020-06-18, 05:17   #133
patnashev
 
"Pavel Atnashev"
Mar 2020

22×11 Posts
Default

It's rather "easy" to cheat Pietrzak VDF. Just set all u_i to 1, then the cheating boils down to finding y = x^hash(y). If you replace hash(y) with any predetermined hash value, then it's trivial to find the solution. And it can be extended to random u_i, the rough formula looks like y = x^hash(y) / (all u_i with appropriate exponents).

The idea is to construct the result in such a way that it cancels all proof data and essentially becomes the first intermediate point (r1) with the exponent computed from predetermined values. It is possible only if the exponent does not depend on the result. Pietrzak himself writes in his paper that it's important to include the result in the hash. He also provides a simple example how to cheat if it's not done. So, the security of the VDF critically depends on using hash(y). Curiously, it doesn't depend on cryptographic strength of the hash function. I suspect that even non-cryptographic hash functions may work securely.
patnashev is online now   Reply With Quote
Old 2020-06-18, 06:10   #134
preda
 
preda's Avatar
 
"Mihai Preda"
Apr 2015

55B16 Posts
Default

Quote:
Originally Posted by patnashev View Post
It's rather "easy" to cheat Pietrzak VDF. Just set all u_i to 1, then the cheating boils down to finding y = x^hash(y). If you replace hash(y) with any predetermined hash value, then it's trivial to find the solution. And it can be extended to random u_i, the rough formula looks like y = x^hash(y) / (all u_i with appropriate exponents).

The idea is to construct the result in such a way that it cancels all proof data and essentially becomes the first intermediate point (r1) with the exponent computed from predetermined values. It is possible only if the exponent does not depend on the result. Pietrzak himself writes in his paper that it's important to include the result in the hash. He also provides a simple example how to cheat if it's not done. So, the security of the VDF critically depends on using hash(y). Curiously, it doesn't depend on cryptographic strength of the hash function. I suspect that even non-cryptographic hash functions may work securely.
Could you answer George's question (which became also my question when I realized I can't answer it), reproduced here for convenience:
Quote:
Originally Posted by Prime95 View Post
OK, I see that making h dependent on M does not help.

A primer on potential cheating might be useful.

For example, if we make h0 dependent on B, but make h1...hn known at startup, does that make cheating easy or just a little less difficult. If this were safe, I can get by with 3*power temporaries -- much better than 6GB at power==9.

Last fiddled with by preda on 2020-06-18 at 06:11
preda is offline   Reply With Quote
Old 2020-06-18, 07:07   #135
patnashev
 
"Pavel Atnashev"
Mar 2020

22·11 Posts
Default

Short answer: no, you can't do that without making cheating easy.

Long answer:
The fundamental problem of computing proofs on the user side is that every time you compute a product it's hiding its content. No simple way to check if the product contains 2 operands or 4 operands or just a random value. The most secure way is to upload ALL intermediate points and compute the product on the server (using random exponents). But Pietrzak found a way to "look" inside the product. It is possible because every point added in a Pietrzak iteration ends between two points known to exist from the previous iterations. Every iteration can be compared to increasing resolution of a picture. You get the general idea already after a few iterations, and then the picture just becomes more detailed by inserting pixels between known pixels.

If only h0 depends on the result, and all other hashes are predetermined, then it's possible to construct a proof that will prove only two distances of iterations. Let distance be the number of iterations between x_t and y_t at the highest depth. Then you need only to calculate x, x^distance, x^2distance. Then using the hashes you construct a proof consisting of 1's or randoms which cancel each other at depths > 1.

Another way to say it is that every iteration has to use hash(y_i) of that iteration, otherwise u_i can be used to construct y_i. The easiest way to do it is by setting u_i=1 (but not the only way).

Btw, you may already noticed that the last intermediate point has exponent=1 in x_i. It's "purpose" is to become the result during verification.
patnashev is online now   Reply With Quote
Old 2020-06-18, 09:59   #136
patnashev
 
"Pavel Atnashev"
Mar 2020

1011002 Posts
Default

As an exercise, I constructed a solution for depth 2 in case h1 is predetermined. Only two intermediate points is needed, not four. The solution passes the check but is wrong, because it's only half of the work.

Code:
x = r0
y = x^(distance2*h1*h1)
h0 = hash(y)

u1 = x^(distance*h1)
x_1 = x^h0 * x^(distance*h1)
y_1 = x^(distance*h0*h1) * x^(distance2*h1*h1)

u2 = 1
x_2 = x^(h0*h1) * x^(distance*h1*h1)
y_2 = x^(distance*h0*h1) * x^(distance2*h1*h1)

x_2^distance == y_2
And this is how it should be if h1 is computed in the iteration. All 4 points necessary.

Code:
x = r0
y = x^(distance4)
h0 = hash(y)

u1 = x^(distance2)
x_1 = x^h0 * x^(distance2)
y_1 = x^(distance2*h0) * x^(distance4)

h1 = hash(y_1)
u2 = x^(distance*h0) * x^(distance3)
x_2 = x^(h0*h1) * x^(distance2*h1) * x^(distance*h0) * x^(distance3)
y_2 = x^(distance*h0*h1) * x^(distance3*h1) * x^(distance2*h0) * x^(distance4)

x_2^distance == y_2
patnashev is online now   Reply With Quote
Old 2020-06-18, 12:03   #137
preda
 
preda's Avatar
 
"Mihai Preda"
Apr 2015

3·457 Posts
Default

Quote:
Originally Posted by patnashev View Post
As an exercise, I constructed a solution for depth 2 in case h1 is predetermined. Only two intermediate points is needed, not four. The solution passes the check but is wrong, because it's only half of the work.

Code:
x = r0
y = x^(distance2*h1*h1)
h0 = hash(y)

u1 = x^(distance*h1)
x_1 = x^h0 * x^(distance*h1)
y_1 = x^(distance*h0*h1) * x^(distance2*h1*h1)

u2 = 1
x_2 = x^(h0*h1) * x^(distance*h1*h1)
y_2 = x^(distance*h0*h1) * x^(distance2*h1*h1)

x_2^distance == y_2
And this is how it should be if h1 is computed in the iteration. All 4 points necessary.

Code:
x = r0
y = x^(distance4)
h0 = hash(y)

u1 = x^(distance2)
x_1 = x^h0 * x^(distance2)
y_1 = x^(distance2*h0) * x^(distance4)

h1 = hash(y_1)
u2 = x^(distance*h0) * x^(distance3)
x_2 = x^(h0*h1) * x^(distance2*h1) * x^(distance*h0) * x^(distance3)
y_2 = x^(distance*h0*h1) * x^(distance3*h1) * x^(distance2*h0) * x^(distance4)

x_2^distance == y_2
Thanks -- it looks good to me. So I think this shows that every hash must be "backwards-dependent" on the result; doing that only for the first hash (h0) does not work.
preda is offline   Reply With Quote
Old 2020-06-18, 12:05   #138
axn
 
axn's Avatar
 
Jun 2003

5,051 Posts
Default

Quote:
Originally Posted by patnashev View Post
You do not need to trust the verifier, he makes no decisions. He only computes x_t^distance. It's the server who compares that value with the stored y_t.
How do you know the verifier and author are not the same person or that the verifier is not working together with the author? You are "trusting" that the verifier is independent and does not have access to the original y_t submitted to the server.
axn is online now   Reply With Quote
Old 2020-06-18, 12:20   #139
patnashev
 
"Pavel Atnashev"
Mar 2020

22×11 Posts
Default

Quote:
Originally Posted by axn View Post
How do you know the verifier and author are not the same person or that the verifier is not working together with the author? You are "trusting" that the verifier is independent and does not have access to the original y_t submitted to the server.
Yes, this is exactly the scenario that has to be dealt with. Solution is simple: provide the verifier with x_t^random and expect y_t^random from him. Even if the verifier knows (x_t, y_t), he has to do discrete logarithm to obtain the random exponent and cheat. Not a small problem, to say the least. Computing x_t^random^distance is much easier.

This way the tester can securely verify his own test, which is a useful feature for projects with low participation.
patnashev is online now   Reply With Quote
Old 2020-06-19, 00:28   #140
Prime95
P90 years forever!
 
Prime95's Avatar
 
Aug 2002
Yeehaw, FL

2·53·71 Posts
Default

Quote:
Originally Posted by patnashev View Post
Short answer: no, you can't do that without making cheating easy..
Thanks for joining the discussion. Your insights are invaluable.

So if I understand correctly, knowing h1...hn in advance allows cheating with 1/2 PRP test effort. My extension where h1 depends on the middle value can be foiled with 3/4 PRP test effort.

This does not mean it is a bad idea. It is hard to imagine someone wanting to submit a bogus result for that much CPU effort. If the temporary file space issue is insurmountable for a particular user, or for us to use a higher power to reduce cost of a single AWS verifier solution, this might still make sense. It is important for us to understand the tradeoff we are making.

We could even support 2 different proof file hashing functions (full and weakened). It is easy to implement 2 different hash strategies in the verifier program.
Prime95 is online now   Reply With Quote
Old 2020-06-19, 00:59   #141
Prime95
P90 years forever!
 
Prime95's Avatar
 
Aug 2002
Yeehaw, FL

2·53·71 Posts
Default

George's very, very rough AWS verifier price guess:

Using AWS Lambda (https://aws.amazon.com/lambda/pricing/)

My KabyLake 4-core CPU does about 8ms / iteration using all 4 cores.

Thus, for a 100M test, KabyLake = 8ms * 100M iterations
Assume proof power=10
Verify cost is about 1/1000 of a full test = 8ms * 100K = 800 sec.
Assume an AWS Lambdainstance is as powerful as my KabyLake
Assume verifier program uses 512MB
Assume no cost for bandwidth used.
totalCompute (GB-s) = 800 * 512MB/1024 = 400 GB-s
cost = 400 * $0.00001667 = $0.007
Sounds great except.... times 900 PRP tests/day * 30 days/month = $180/month
Prime95 is online now   Reply With Quote
Old 2020-06-19, 01:30   #142
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2·7·383 Posts
Default

800 sec / proof x 900 / day * 1 day / 86400 sec = 8.3+ KabyLakes.
Isn't that about 1 RadeonVII? $600 amortized over 3 years = $200/year
Bandwidth, system to house it, power and cooling not included.
GIMPS server has bandwidth.
Direct power ~$1-2/watt-year x 500 watts with system and big added drives ~$500-1000/year power.
System components amortization $500/year?
Cooling cost depends on utility rates and geography.
Getting close to your rough AWS number without the personnel that keep that running.
How do Google Compute or MS Azure or other offerings compare to AWS?
kriesel is online now   Reply With Quote
Old 2020-06-19, 01:53   #143
Prime95
P90 years forever!
 
Prime95's Avatar
 
Aug 2002
Yeehaw, FL

2·53·71 Posts
Default

Quote:
Originally Posted by kriesel View Post
How do Google Compute or MS Azure or other offerings compare to AWS?
A bit premature to compare as my very, very, rough estimate could easily be off by a factor of 10.
Prime95 is online now   Reply With Quote
Reply



Similar Threads
Thread Thread Starter Forum Replies Last Post
phi function rula Homework Help 3 2017-01-18 01:41
delay in crediting? ixfd64 PrimeNet 7 2008-10-20 20:45
Why delay between posts? JHagerson Forum Feedback 1 2006-05-13 21:30
Minimum delay between server connections vaughan ElevenSmooth 5 2005-09-08 17:17
Stats delay ltd Prime Sierpinski Project 10 2005-08-08 13:38

All times are UTC. The time now is 18:04.


Fri Jul 16 18:04:55 UTC 2021 up 49 days, 15:52, 1 user, load averages: 2.06, 1.87, 1.63

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.