mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GpuOwl (https://www.mersenneforum.org/forumdisplay.php?f=171)
-   -   gpuOwL: an OpenCL program for Mersenne primality testing (https://www.mersenneforum.org/showthread.php?t=22204)

R. Gerbicz 2017-09-22 20:30

[QUOTE=Prime95;468378]
5: Fermat PRP cofactor variant, N/d is PRP if a^(N-1) = a^d mod N/d
[/QUOTE]

In 5 it should be a^N==a^d mod (N/d)
or you can divide it by "a", if you want that:
a^(N-1)==a^(d-1) mod (N/d).
Ofcourse this 2 are equivalent forms since in all these cases gcd(a,N)=1.

Btw in some cases even if only the truncated RES64 kept, say in 1 and 3 we can do a quick comparison that decides with high probability if these are the same (so would be the same for the same exponent):
Let r1 and r2 the two RES64 remainders where these comes from the exponents differing by one:
r1=(a^e mod N) mod (2^64)
r2=(a^(e+1) mod N) mod (2^64)
then r2==a*r1-i*N mod (2^64) for some 0<=i<a.

if the difference is 2 between the exponents, then
r3==a^2*r1-i*N mod (2^64) and 0<=i<a^2 is the range for "i" etc.
In other words there are only at most 9 different possible values for r3=(a^(e+2) mod N) mod (2^64) if you know r1 and a=3.

It was a small trick, but good to know.

axn 2017-09-23 02:04

[QUOTE=Prime95;468378]2) The residue remains the same as more and more factors are found for a Mersenne number - no double-check needed.[/quote]
How? Since ...
[quote]N/d is PRP if a^(N-1) = a^d mod [B]N/d[/B] [/quote]
The modulus will be different -- N/d1, N/d2, etc... -- unless you are really only doing (mod N).

[QUOTE=Prime95;468378] 3) If a user decides to archive his PRP save files he can retest with a new known factor quite quickly.[/QUOTE]
It would be good if the server itself saves the residue -- then no need to send it out for subsequent tests. IIRC, there is a trick where you only need to store a few hundred MSBs of the full residue to do this quick verification (there was some thread in the forum sometime back on this topic -- might even be the Suyama trick that R.Gerbicz mentioned upthread).

But assume that this will not be done. Are you saying that you'll build the intelligence into P95 so that if the final save file is available (calculated using method 5), and a new test with additional factor is done, it will do the short circuit evaluation?

axn 2017-09-23 02:13

[QUOTE=R. Gerbicz;468381]Btw in some cases even if only the truncated RES64 kept, say in 1 and 3 we can do a quick comparison that decides with high probability if these are the same (so would be the same for the same exponent):
Let r1 and r2 the two RES64 remainders where these comes from the exponents differing by one:
r1=(a^e mod N) mod (2^64)
r2=(a^(e+1) mod N) mod (2^64)
then r2==a*r1-i*N mod (2^64) for some 0<=i<a.

if the difference is 2 between the exponents, then
r3==a^2*r1-i*N mod (2^64) and 0<=i<a^2 is the range for "i" etc.
In other words there are only at most 9 different possible values for r3=(a^(e+2) mod N) mod (2^64) if you know r1 and a=3.

It was a small trick, but good to know.[/QUOTE]

That [B]is[/B] a good trick. It means, George doesn't have to add additional logic in the client to "divide by 3" to get "normal" residue. Instead, if another program produces "normal" residues, we'll still be able to compare.

Prime95 2017-09-23 02:45

[QUOTE=axn;468392]How? Since ...

The modulus will be different -- N/d1, N/d2, etc... -- unless you are really only doing (mod N).
[/quote]

Yes, prime95 calculates a^N mod N. This is the residue to send to the server and it should match previous PRP tests with 0 or more known factors. The result is then further reduced to a^N mod N/d.

[quote]But assume that this will not be done. Are you saying that you'll build the intelligence into P95 so that if the final save file is available (calculated using method 5), and a new test with additional factor is done, it will do the short circuit evaluation?[/QUOTE]

I'm saying I'll have the option to implement that feature (either now or in the future).

R. Gerbicz 2017-09-23 10:53

[QUOTE=Prime95;468378]
For first-time PRP assignments, the server will not specify a PRP base or residue type.
[/QUOTE]
Working with any base is pretty dangerous, for Mersenne numbers base=2 or any power of two is NOT usable (why?, little homework), maybe main programs allow only odd base(?), but it is important to know that there are "bad" bases. And for Proth numbers I would choose a quadratic nonresidue as base, because in that case we are in the Proth test, so we have a true primality test.

[QUOTE=Prime95;468378]
Interim residues are application dependent. However, it is strongly encouraged to report Fermat PRP residues using a standard binary powering algorithm. That is, iteration 0 is a, iteration 1 is a^(top 2 bits of N-1), etc.[/QUOTE]
I see you points (what axn has also written), it would be good to print the same interim residues. (to match the same residue at the same iteration(!)). There could be some issue with your count: the top 3 bits of a Mersenne number is 7 and after 2 iterations you still only know a^(2^2)=a^4 mod N, so not a^8, where you could get easily a^7.


For k!=1 there is some change in the test:
we test N=k*2^n+c (fixed b=2).
[CODE]
u0=a^k mod N
u[i]=u0^(2^i) mod N
d[t]=prod(i=0,t,u[L*i]) mod N
d[t]=d[t-1]*u[L*t] mod N
d[t]=u0*d[t-1]^(2^L) mod N (to do the error check)
[/CODE]
(this was also described first at [url]http://www.mersenneforum.org/showthread.php?p=465629[/url])
the main change that at the error check we multiple by a general number, u0. This is not a timing issue, as we would do probably even less than 50 error checks at the main wavefront of Riesel/Sierpinski numbers (more checks for Mersenne/Fermat numbers, but there u0=3 smallish). So the slowdown is less than 1 sec. per test.

Save in file u0 or recompute(?) it at each restart, not an issue since k is small.
Also a change that there is absolutely no error checks while you don't get u0=a^k mod N.

Ofcourse as mentioned in some cases you have to do some mults at interim file save to get the Fermat Prp interim residue because the top i bits of N=k*2^n+c could be different from the top i bit of k*2^n. This also means that you should need to do some mults to get back to the error check's algorithm's u[i]=u0^(2^i) mod N residue if you have an interim file save's Fermat prp residue. The error check's algorithm is doing n iterations: iteration=0 to iteration=n the final residue is u[n]=u0^(2^n) mod N, and this iteration count is independent from k and c.

An alternative way to handle this is that to delay the k-th powermod to the end, and before at each interim file save do a k-th powermod, but in this case you have to save also the original residue(!) to be able to continue the algorithm at restart (doing k-th root is hard).

R. Gerbicz 2017-09-23 11:23

For Proth numbers if you stop at iteration=n-1 at the error check algorithm then you have the Proth test (if you would go further then you are in the weaker Fermat test).

Prime95 2017-09-23 14:48

[QUOTE=R. Gerbicz;468417]Working with any base is pretty dangerous[/quote]

Yes, Primenet assumes the client is smart. Both gpuOwl and prime95 default to base=3.

[quote]
I see your points (what axn has also written), it would be good to print the same interim residues. (to match the same residue at the same iteration(!)). There could be some issue with your count: the top 3 bits of a Mersenne number is 7 and after 2 iterations you still only know a^(2^2)=a^4 mod N, so not a^8, where you could get easily a^7.[/quote]

Yes, prime95 is aware of this off-by-one-iteration problem when PRPing a Mersenne number. If you ask for the residue after 2 iterations, prime95 will do 3 squarings to get a^8. The interim residue output code then does a mul by a^-1 to output a^7. The good news is that interim residues are not output by default and probably only used as a debugging aid.

Note this mul by a^-1 is not done for interim save files. These save files are not meant to be portable, so I just save whatever I need for a quick restart.

BTW, thank you for the error-checking idea, clear explanations, and development feedback. Job well done!

preda 2017-09-27 08:03

[QUOTE=Prime95;468378]Here is my PRP residue proposal:

There are (at least) 5 PRP residue types:
1; Fermat PRP. N is PRP if a^(N-1) = 1 mod N
2: SPRP variant, N is PRP if a^((N-1)/2) = +/-1 mod N
3: Fermat PRP variant. N is PRP if a^(N+1) = a^2 mod N
4: SPRP variant. N is PRP if a^((N+1)/2) = +/-a mod N
5: Fermat PRP cofactor variant, N/d is PRP if a^(N-1) = a^d mod N/d

I encourage programs to return type 1 residues as that has been the standard for prime95, PFGW, LLR for many years.

I think Prime95 v29.4 will have options to support all of the above residue types for doing double-checks. PrimeNet will store residue type along with the residue -- need another JSON argument. When issuing PRP double-check assignments, the server will send the PRP base and desired residue type to the client. For first-time PRP assignments, the server will not specify a PRP base or residue type.

Interim residues are application dependent. However, it is strongly encouraged to report Fermat PRP residues using a standard binary powering algorithm. That is, iteration 0 is a, iteration 1 is a^(top 2 bits of N-1), etc.

The server will soon have about 50 type-4 residues from gpuOwl, which prime95 should be able to verify someday. While I'd prefer gpuOwl converts to type-1 residues, it is not a requirement.

The server also has several thousand type-1 PRP cofactor residues. I'm thinking of making type-5 the default for v29.4 as it has some advantages. 1) It allows Gerbicz error checking, 2) The residue remains the same as more and more factors are found for a Mersenne number - no double-check needed. 3) If a user decides to archive his PRP save files he can retest with a new known factor quite quickly.[/QUOTE]

If I understand correctly, the "div-by-3" operation needed for moving the residue "one 3 behind" for mersenne numbers requires one full modMul operation. In addition to that, one modSquare is needed to align the iteration numbers between 2^k and (2^(k+1))-1. Because of this, I think it's not worth for gpuOwl to switch to outputting intermediate residues in the "type-1" form. I would rather de-emphasize the intermediate residues, for example by not displaying them by default (same as Prime95).

OTOH for the final residue, I can trivially change from type-4 to type-3 if that's preferred, and probably without too much trouble change to type-1 (by doing a "div-by-9" from type-3).

So, I'm going to submit the existing gpuOwl residues as type-4, and consider switching in the future to type-3 or type-1. But I don't see enough reason to "massage" the intermediate residues to change their format yet.

R. Gerbicz 2017-09-27 10:57

[QUOTE=preda;468636]If I understand correctly, the "div-by-3" operation needed for moving the residue "one 3 behind" for mersenne numbers requires one full modMul operation. [/QUOTE]

Modular division by a small "d" number takes only linear time: say you want to get x/d mod N, where d is coprime to N. Let
r=x mod d
s=N mod d
then x/d==(x+k*N)/d mod N for k==(-r/s) mod d
and there is an exact(!) division here: d divides x+k*N

the idea behind this was to get the good k multiplier:
k is good iff x+k*N==0 mod d, so k==-x/N==-r/s mod d.

axn 2017-09-27 14:37

[QUOTE=R. Gerbicz;468640]Modular division by a small "d" number takes only linear time: say you want to get x/d mod N, where d is coprime to N. Let
r=x mod d
s=N mod d
then x/d==(x+k*N)/d mod N for k==(-r/s) mod d
and there is an exact(!) division here: d divides x+k*N

the idea behind this was to get the good k multiplier:
k is good iff x+k*N==0 mod d, so k==-x/N==-r/s mod d.[/QUOTE]

Furthermore, in the particular case of N=Mp, d=3, s will always be 1.

GP2 2017-09-27 18:01

[QUOTE=preda;468636]So, I'm going to submit the existing gpuOwl residues as type-4, and consider switching in the future to type-3 or type-1. But I don't see enough reason to "massage" the intermediate residues to change their format yet.[/QUOTE]

I don't really understand all the details being discussed here, but...

I think once you submit a truncated (64-bit or whatever) residue of a given type, and throw away the remaining bits, it's no longer possible to convert that truncated residue from one type to another, no?

And for the sake of being able to double-check residues in the database that were returned by different programs, what matters is simply that the 64-bit hex strings match, regardless of how they were calculated.

So switching from one type to another in the future would be quite problematic, and it would be quite important for all software authors to agree on a type, right from the start. Or am I missing something?


All times are UTC. The time now is 21:16.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.