mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Number Theory Discussion Group (https://www.mersenneforum.org/forumdisplay.php?f=132)
-   -   Efficient Proth/PRP Test Proof Scheme (https://www.mersenneforum.org/showthread.php?t=25323)

 patnashev 2020-03-01 14:04

Efficient Proth/PRP Test Proof Scheme

1 Attachment(s)
[SIZE="3"]Executive Summary[/SIZE]

Double checking is long considered necessary for coordinated search for prime numbers. Indeed, a prime missed due to hardware error, software misconfiguration or credit cheating can mean years, decades or even centuries of additional work. But the price is high, the resources needed to test a candidate are doubled. Recent developments in hardware error detection (Gerbicz check) allow more certainty in the performance of hardware, but search coordinators still can't be sure that all participants run their software properly, with hardware checks enabled and no malicious intent present.

Proposed scheme does not eliminate the double check, but allows more efficient validation of the original result. Instead of redoing the full test, only ~5% of work is need to validate the result. This comes at a price of additional amounts of data needed to be transmitted from the tester to the coordinator, about 10 MB per million decimal digits of tested number. But this data doesn't need to be stored, after a relatively fast processing it is turned into a certificate the size of the number, which is stored and transmitted to a double checker for validation.

[SIZE="3"]Implementation Summary[/SIZE]

The scheme borrows heavily from the [URL="https://www.mersenneforum.org/showthread.php?t=22510"]Gerbicz check[/URL]. It's the same idea of verifying a sequence by transforming each member into the next one by a single operation. The coordinator receives [i]s[/i] checkpoints at regular interval [i]M[/i] from the tester. It is convenient (and reliable) to save the checkpoints right after [i]L^2[/i] Gerbicz check, so it's advised to make [i]M[/i] a multiple of [i]L^2[/i]. Naturally, we can compute the result by [i]M/2[/i] iterations on average starting from the last checkpoint, but we need to prove the correctness of the last checkpoint first. Instead of proving each checkpoint in the sequence individually, we tag each of them with a random number and multiply all of them into one. The resulting certificate can be proven by [i]M[/i] iterations, which requries [i]1/s[/i] of original work. Combined with computing the result, the verification work averages to [i]1.5/s[/i] of total. By keeping [i]s[/i] in 20..40 range, we can ensure the verification to be less than 10% while keeping the amount of data transmitted at reasonable level.

Note that the certificate is salted with a sequence of random numbers, which makes the information in it scrambled and unusable for any other purpose but verification. Even if the double checker is the same person as the original tester, it gives him no advantage because without knowing the salt one can neither speed up the calculation nor tamper with its outcome. The result of the verification process is sent to the coordinator who compares it with the result obtained during construction of the certificate.

Also note that due to checkpoints being saved after the Gerbicz check, no hardware error can cause the verification to fail. A failure of the verification is a strong indicator of cheating and should result in permanent ban in most cases.

I've implemented the scheme (and the Gerbicz check) using LLR source code as a base. My implementation can be downloaded here:
[url]https://www.dropbox.com/s/1uidmmbw5oclmh7/Proof.zip?dl=1[/url]
It contains Windows and Linux binaries, bat files with all the steps as well as readme file which describes each step in detail. Note that for [i]b!=2[/i] the test is ~17% slower than the original LLR.

[SIZE="3"]Theory Summary[/SIZE]

The scheme requires the test to be an exponentiation to a power which is itself a power of a relatively small number. So, it works for Proth and Fermat tests of numbers of the form [i]N = k*b^n+c[/i]. Both of these tests require exponentiation of [i]a^k[/i] to [i]b^n[/i] power.
[TEX]\text{ 1. Let } M = L^2, \, r_0 = a^k, \, r_1 = {r_0}^{2^M}, ..., \, r_s = {r_{s-1}}^{2^M} = {r_0}^{2^{s \cdot M}}=a^{k \cdot 2^{\lfloor n/M \rfloor \cdot M}} \\ \\ \text{ 2. Let } x_i = random(1,2^{64}), \, P_D = {\prod_{0}^{s-1} {r_i}^{x_i}}, \, P_S = {\prod_{1}^{s} {r_i}^{x_{i-1}}} \\ \\ \text{ 3. Compute } P_{DC} = {P_D}^{2^M} \\ \\ \text{ 4. } P_S = P_{DC} \text{ only if all } r_i \text{ are correct }[/TEX]

Step 1 is performed by the tester, Step 2&4 by the coordinator, Step 3 by a double checker. [i]2[/i]'s in the formulas can be replaced with [i]b[/i], [i]a^k[/i] is double checked in Step 2, [i]x_i[/i] do not need to be stored, and a residue of [i]P_S[/i] is enough for Step 4.

While proving the correctness of the scheme rigorously is challenging for me, I've run some simulations. For [i]N=5*2^11+1, a=3, s=3, M=3,[/i] 10 sets of random [i]x_i[/i], exhaustive search of [i](Z/NZ)^s[/i] yielded only one solution, the correct one. This proves that the general solution is unique. Probabilistic simulations with [i]r_0 = a^k, r_1 = ... = r_{s-1} = 1, r_s = (a^k)^(2^m)[/i] yielded the false positive rate very close to [i]1/X_MAX[/i]. For [i]X_MAX = 2^64[/i] the probability of cheating the scheme is the same as guessing RES64 for a regular test.

 R.D. Silverman 2020-03-01 16:51

[QUOTE=patnashev;538633]

[SIZE="3"]Theory Summary[/SIZE]

The scheme requires the test to be an exponentiation to a power which is itself a power of a relatively small number. So, it works for Proth and Fermat tests of numbers of the form [i]N = k*b^n+c[/i]. Both of these tests require exponentiation of [i]a^k[/i] to [i]b^n[/i] power.
[TEX]\text{ 1. Let } M = L^2, \, r_0 = a^k, [/TEX]
[/QUOTE]

 patnashev 2020-03-01 17:00

L is the constant of Gerbicz check. Can be selected to keep s in 20..40 range. If L = 1024, M is about a million.

a is the parameter of Fermat and Proth tests.Usually a = 3 for big numbers.

 R.D. Silverman 2020-03-01 17:22

[QUOTE=patnashev;538643]L is the constant of Gerbicz check. Can be selected to keep s in 20..40 range. If L = 1024, M is about a million.

a is the parameter of Fermat and Proth tests.Usually a = 3 for big numbers.[/QUOTE]

I have not looked deeply, but the scheme seems workable.

 patnashev 2020-04-20 11:49

Setting [i]L = sqrt(n/s)[/i] moves the last checkpoint almost to the end of the test. It allows to save on the calculation of the result, and makes [i]s[/i] (the number of uploaded checkpoints) the main configurable parameter of the scheme. If [i]s[/i] = 16, the verification of a certificate requires 6% of the work, upload size is 6MB per 1M decimal digits.

 preda 2020-04-21 00:44

Please have a look at using VDFs for PRP verification:

 patnashev 2020-04-21 05:17

preda, Pietrzak's VDF looks interesting. A similiar idea, using similiar tools, but more complicated proof calculation. Worth researching.

 patnashev 2020-04-28 17:40

We started testing this scheme, as well as Gerbicz check implementation, on the private GFN server.
[url]http://boincvm.proxyma.ru:30080/test4vm/[/url]

Invitation code "PrimeGrid", select "LLR2 testing" and "Run test applications" in the settings.

Source code for LLR2:
[url]https://github.com/patnashev/llr2[/url]

 patnashev 2020-04-30 19:01

I'm very excited about Pietrzak VDF. It has potential to decrease the upload to 2MB per million decimal digits for 6% validation work.

I've implemented it in LLR2, pushed to GitHub already. Not tested yet though.
My implementation looks like this:
[CODE] int[] Prove(int t, int[] r) // t - depth, r - checkpoints
{
int i, j, k, e;
int[] h = new int[t]; // hashes
int[] u = new int[t]; // products
int s = 1 << t; // number of points

int PS = r[s]; // the result
u[0] = r[s/2];

for (i = 1; i < t; i++)
{
h[i] = hash(PS);
PS = mul(PS, pow(u[i - 1], h[i]));

u[i] = 1;
for (j = 0; j < (1 << i); j++)
{
e = 1;
for (k = 0; k < i; k++)
if ((j & (1 << k)) == 0)
e *= h[i - k];
u[i] = mul(u[i], pow(r[(1 + j*2) << (t - i - 1)], e));
}
}

return u;
}

Cert BuildCert(int t, int r0, int rs, int[] u)
{
int i, h;
int PD = r0; // a^k
int PS = rs; // the result

for (i = 0; i < t; i++)
{
h = hash(PS);
PD = mul(u[i], pow(PD, h));
PS = mul(PS, pow(u[i], h));
}

h = random();
PD = pow(PD, h);
PS = pow(PS, h);

if (PS == 0 || gcd(PS, N) != 1)
throw new Exception();

return new Cert(PD, PS);
}
[/CODE]

 patnashev 2020-05-13 08:09

Pietrzak VDF testing started on the private GFN server. Typical task looks like this:

[CODE]<core_client_version>7.14.2</core_client_version>
<![CDATA[
<stderr_txt>
BOINC PrimeGrid wrapper 2.01 (May 8 2020 09:10:05)
running ../../projects/boincvm.proxyma.ru_30080_test4vm/llr2_win64_200511.exe -v
LLR2 Program - Version 0.9.3, using Gwnum Library Version 29.8
running ../../projects/boincvm.proxyma.ru_30080_test4vm/llr2_win64_200511.exe -oGerbicz=1 -oProofName=proof -oProofCount=8 -oProductName=prod -oPietrzak=1 -oCachePoints=1 -pSavePoints -q19*2^3588874+1 -d -oDiskWriteTime=1
Starting Proth prime test of 19*2^3588874+1
Using all-complex AVX FFT length 200K, Pass1=640, Pass2=320, clm=2, a = 3, L2 = 899*499, M = 448601
Compressed 8 points to 3 products. Time : 22.330 sec.
Testing complete.
07:28:01 (4572): called boinc_finish(0)

</stderr_txt>
]]>
[/CODE]

 patnashev 2020-05-13 08:17

2 Attachment(s)
And I made infographic explaining how the scheme works. This is my original scheme:
[ATTACH]22292[/ATTACH]

And this is Pietrzak VDF:
[ATTACH]22293[/ATTACH]

All times are UTC. The time now is 11:52.