It's rather "easy" to cheat the Pietrzak VDF. Just set all u_i to 1; then the cheating boils down to finding y = x^hash(y). If you replace hash(y) with any predetermined hash value, it's trivial to find the solution. And it can be extended to random u_i; the rough formula looks like y = x^hash(y) / (all u_i with appropriate exponents).
The idea is to construct the result in such a way that it cancels all proof data and essentially becomes the first intermediate point (r1) with the exponent computed from predetermined values. This is possible only if the exponent does not depend on the result. Pietrzak himself writes in his paper that it's important to include the result in the hash, and he provides a simple example of how to cheat if it's not done. So the security of the VDF critically depends on using hash(y). Curiously, it doesn't depend on the cryptographic strength of the hash function. I suspect that even non-cryptographic hash functions may work securely.
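To see the difference concretely, here is a minimal sketch (toy modulus and made-up values, my own illustration, not code from the thread): finding a fixed point y = x^hash(y) by guessing essentially never works, but if the hash value is predetermined, a forged y takes one modular exponentiation.

```python
import hashlib

# Toy parameters (assumption: small illustrative modulus, NOT a secure group).
N = 1000003 * 999983
x = 5

def H(y):
    # Stand-in for the protocol's hash; any function of y works for the demo.
    return int.from_bytes(hashlib.sha256(str(y).encode()).digest()[:8], "big")

# With h = hash(y), a cheater must find a fixed point y = x^hash(y) mod N;
# a guessed y essentially never satisfies it.
y_guess = 123456
is_fixed_point = pow(x, H(y_guess), N) == y_guess

# With a predetermined h (independent of y), forging is one exponentiation:
h_fixed = 12345678  # hypothetical hash value agreed in advance
y_forged = pow(x, h_fixed, N)
forged_passes = pow(x, h_fixed, N) == y_forged
```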
[QUOTE=patnashev;548340]It's rather "easy" to cheat the Pietrzak VDF. Just set all u_i to 1; then the cheating boils down to finding y = x^hash(y). ... So the security of the VDF critically depends on using hash(y).[/QUOTE] Could you answer George's question (which became also my question when I realized I can't answer it), reproduced here for convenience: [QUOTE=Prime95;548018]OK, I see that making h dependent on M does not help. A primer on potential cheating might be useful. For example, if we make h0 dependent on B, but make h1...hn known at startup, does that make cheating easy or just a little less difficult? If this were safe, I can get by with 3*power temporaries -- much better than 6GB at power==9.[/QUOTE]
Short answer: no, you can't do that without making cheating easy.
Long answer: The fundamental problem of computing proofs on the user side is that every product you compute hides its content. There is no simple way to check whether the product contains 2 operands, 4 operands, or just a random value. The most secure approach is to upload ALL intermediate points and compute the product on the server (using random exponents). But Pietrzak found a way to "look" inside the product. It is possible because every point added in a Pietrzak iteration ends up between two points known to exist from the previous iterations. Every iteration can be compared to increasing the resolution of a picture: you get the general idea after a few iterations, and then the picture just becomes more detailed by inserting pixels between known pixels.

If only h0 depends on the result, and all other hashes are predetermined, then it's possible to construct a proof that proves only two distances' worth of iterations. Let distance be the number of iterations between x_t and y_t at the highest depth. Then you only need to calculate x, x^distance, and x^(2*distance). Using the hashes, you construct a proof consisting of 1's or random values which cancel each other at depths > 1.

Another way to say it: every iteration has to use hash(y_i) of that iteration, otherwise u_i can be used to construct y_i. The easiest way to do it is by setting u_i = 1 (but not the only way). Btw, you may have already noticed that the last intermediate point has exponent = 1 in x_i. Its "purpose" is to become the result during verification.
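For readers following along, here is a minimal sketch of the honest halving protocol in the thread's own notation (x_{i+1} = x_i^h * u, y_{i+1} = u^h * y_i, final check x_t^distance == y_t). All parameters are toy values of my choosing; a real VDF works in a group of unknown order with sequential squarings.

```python
import hashlib
import math

# Toy modulus (assumption: illustrative only, not a secure group).
N = 1000003 * 999983

def H(*vals):
    # Fiat-Shamir challenge; crucially it hashes the CURRENT y_i, which is
    # exactly the property the thread says the security depends on.
    s = "|".join(map(str, vals)).encode()
    return int.from_bytes(hashlib.sha256(s).digest()[:8], "big")

def prove(x, d, depth):
    """Honest prover: y = x^(d^(2^depth)) plus midpoints u_1..u_depth."""
    e = d ** (2 ** depth)
    y = pow(x, e, N)
    us, xi, yi = [], x, y
    while e > d:
        root = math.isqrt(e)            # e is a perfect square by construction
        u = pow(xi, root, N)            # midpoint between x_i and y_i
        h = H(xi, yi, u)                # challenge depends on y_i
        xi = pow(xi, h, N) * u % N      # x_{i+1} = x_i^h * u
        yi = pow(u, h, N) * yi % N      # y_{i+1} = u^h * y_i
        us.append(u)
        e = root
    return y, us

def verify(x, d, depth, y, us):
    """Verifier replays the folds and checks x_t^distance == y_t."""
    xi, yi = x, y
    for u in us:
        h = H(xi, yi, u)
        xi = pow(xi, h, N) * u % N
        yi = pow(u, h, N) * yi % N
    return pow(xi, d, N) == yi

y, us = prove(3, 5, 3)  # claim: y = 3^(5^8) mod N, proven with 3 midpoints
```

Since each challenge hashes the current y_i, tampering with y changes every subsequent fold and the final check fails.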
As an exercise, I constructed a solution for depth 2 in case h1 is predetermined. Only two intermediate points are needed, not four. The solution passes the check but is wrong, because it's only half of the work.
[CODE]x = r0
y = x^(distance^2*h1*h1)
h0 = hash(y)
u1 = x^(distance*h1)
x_1 = x^h0 * x^(distance*h1)
y_1 = x^(distance*h0*h1) * x^(distance^2*h1*h1)
u2 = 1
x_2 = x^(h0*h1) * x^(distance*h1*h1)
y_2 = x^(distance*h0*h1) * x^(distance^2*h1*h1)
x_2^distance == y_2[/CODE]
And this is how it should be if h1 is computed in the iteration. All 4 intermediate points are necessary.
[CODE]x = r0
y = x^(distance^4)
h0 = hash(y)
u1 = x^(distance^2)
x_1 = x^h0 * x^(distance^2)
y_1 = x^(distance^2*h0) * x^(distance^4)
h1 = hash(y_1)
u2 = x^(distance*h0) * x^(distance^3)
x_2 = x^(h0*h1) * x^(distance^2*h1) * x^(distance*h0) * x^(distance^3)
y_2 = x^(distance*h0*h1) * x^(distance^3*h1) * x^(distance^2*h0) * x^(distance^4)
x_2^distance == y_2[/CODE]
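The forged depth-2 transcript can be checked numerically. The sketch below (toy modulus and parameter values are my own assumptions) builds the points from a predetermined h1, with h0 standing in for hash(y), and confirms the final check passes.

```python
# Numeric check of the forged depth-2 transcript: h1 is predetermined,
# y is constructed from it, and h0 plays the role of hash(y).
# Toy values (assumption: small illustrative modulus, not a real group).
N = 1000003 * 999983
x = 7
d = 16            # "distance"
h1 = 4660         # predetermined challenge -- the flaw being exploited
h0 = 22136        # stands in for hash(y); its value doesn't matter here

y  = pow(x, d*d * h1*h1, N)        # forged result: distance^2-scale work, not distance^4
u1 = pow(x, d*h1, N)
x1 = pow(x, h0, N) * u1 % N        # x_1 = x^h0 * u1
y1 = pow(u1, h0, N) * y % N        # y_1 = u1^h0 * y
u2 = 1
x2 = pow(x1, h1, N) * u2 % N       # x_2 = x_1^h1 * u2
y2 = pow(u2, h1, N) * y1 % N       # y_2 = u2^h1 * y_1

final_check = pow(x2, d, N) == y2  # passes despite only half the work
```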
[QUOTE=patnashev;548359]As an exercise, I constructed a solution for depth 2 in case h1 is predetermined. Only two intermediate points are needed, not four. The solution passes the check but is wrong, because it's only half of the work.[/QUOTE] Thanks -- it looks good to me. So I think this shows that every hash must be "backwards-dependent" on the result; doing that only for the first hash (h0) does not work.
[QUOTE=patnashev;548339]You do not need to trust the verifier, he makes no decisions. He only computes x_t^distance. It's the server who compares that value with the stored y_t.[/QUOTE]
How do you know the verifier and author are not the same person or that the verifier is not working together with the author? You are "trusting" that the verifier is independent and does not have access to the original y_t submitted to the server.
[QUOTE=axn;548371]How do you know the verifier and author are not the same person or that the verifier is not working together with the author? You are "trusting" that the verifier is independent and does not have access to the original y_t submitted to the server.[/QUOTE]
Yes, this is exactly the scenario that has to be dealt with. The solution is simple: provide the verifier with x_t^random and expect y_t^random from him. Even if the verifier knows (x_t, y_t), he has to solve a discrete logarithm to obtain the random exponent and cheat. Not a small problem, to say the least. Computing (x_t^random)^distance is much easier. This way the tester can securely verify his own test, which is a useful feature for projects with low participation.
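The blinding idea can be sketched in a few lines (toy group and values, my own assumptions): it works because (x_t^r)^distance = (x_t^distance)^r = y_t^r, so the server can compare without revealing r.

```python
import secrets

# Toy parameters (assumption: illustrative modulus and workload, not real values).
N = 1000003 * 999983
distance = 1 << 10

x_t = 12345
y_t = pow(x_t, distance, N)       # stored on the server alongside x_t

# Server side: blind the challenge with a secret random exponent r.
r = secrets.randbelow(N - 2) + 2
challenge = pow(x_t, r, N)        # verifier sees only x_t^r, never r

# Verifier side: do the distance work on the blinded point.
response = pow(challenge, distance, N)

# Server side: accept iff the response matches the blinded stored result.
accepted = response == pow(y_t, r, N)
```

A colluding verifier who knows (x_t, y_t) would have to recover r from x_t^r, i.e. solve a discrete logarithm, to forge a response without doing the work.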
[QUOTE=patnashev;548344]Short answer: no, you can't do that without making cheating easy.[/QUOTE]
Thanks for joining the discussion. Your insights are invaluable. So if I understand correctly, knowing h1...hn in advance allows cheating with 1/2 the PRP test effort. My extension where h1 depends on the middle value can be foiled with 3/4 the PRP test effort.

This does not mean it is a bad idea. It is hard to imagine someone wanting to submit a bogus result for that much CPU effort. If the temporary file space issue is insurmountable for a particular user, or if we want to use a higher power to reduce the cost of a single AWS verifier solution, this might still make sense. It is important for us to understand the tradeoff we are making. We could even support 2 different proof file hashing functions (full and weakened). It is easy to implement 2 different hash strategies in the verifier program.
George's very, very rough AWS verifier price guess:
Using AWS Lambda ([url]https://aws.amazon.com/lambda/pricing/[/url]):
My KabyLake 4-core CPU does about 8ms / iteration using all 4 cores.
Thus, for a 100M test, KabyLake = 8ms * 100M iterations.
Assume proof power=10.
Verify cost is about 1/1000 of a full test = 8ms * 100K = 800 sec.
Assume an AWS Lambda instance is as powerful as my KabyLake.
Assume the verifier program uses 512MB.
Assume no cost for bandwidth used.
totalCompute (GB-s) = 800 * 512MB/1024 = 400 GB-s
cost = 400 * $0.00001667 = $0.007
Sounds great except.... times 900 PRP tests/day * 30 days/month = $180/month
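The arithmetic above checks out; here it is as a short script (all inputs are George's stated assumptions, including the GB-second price taken from the Lambda pricing page at the time of the post):

```python
# Reproducing the rough Lambda cost estimate; every input is an assumption
# from the post above, not a measured or current price.
verify_seconds = 0.008 * 100_000          # 8 ms/iter * 100K iterations = 800 s
gb_seconds = verify_seconds * 512 / 1024  # 512 MB instance -> 400 GB-s
cost_per_proof = gb_seconds * 0.00001667  # $/GB-s from the pricing page
monthly = cost_per_proof * 900 * 30       # 900 PRP tests/day * 30 days/month
```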
800 sec/proof x 900 proofs/day x 1 day/86400 sec = 8.3+ KabyLakes.
Isn't that about 1 Radeon VII?
$600 amortized over 3 years = $200/year. Bandwidth, system to house it, power and cooling not included. GIMPS server has bandwidth.
Direct power ~$1-2/watt-year x 500 watts with system and big added drives = ~$500-1000/year for power. System components amortization ~$500/year? Cooling cost depends on utility rates and geography.
Getting close to your rough AWS number, without the personnel that keep that running.
How do Google Compute or MS Azure or other offerings compare to AWS?
[QUOTE=kriesel;548471]How do Google Compute or MS Azure or other offerings compare to AWS?[/QUOTE]
A bit premature to compare, as my very, very rough estimate could easily be off by a factor of 10.