mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > Hardware > GPU Computing > GpuOwl

Reply
 
Thread Tools
Old 2017-09-22, 20:30   #210
R. Gerbicz
 
R. Gerbicz's Avatar
 
"Robert Gerbicz"
Oct 2005
Hungary

2×743 Posts
Default

Quote:
Originally Posted by Prime95 View Post
5: Fermat PRP cofactor variant, N/d is PRP if a^(N-1) = a^d mod N/d
In 5 it should be a^N==a^d mod (N/d)
or you can divide it by "a", if you want that:
a^(N-1)==a^(d-1) mod (N/d).
Ofcourse this 2 are equivalent forms since in all these cases gcd(a,N)=1.

Btw in some cases even if only the truncated RES64 kept, say in 1 and 3 we can do a quick comparison that decides with high probability if these are the same (so would be the same for the same exponent):
Let r1 and r2 the two RES64 remainders where these comes from the exponents differing by one:
r1=(a^e mod N) mod (2^64)
r2=(a^(e+1) mod N) mod (2^64)
then r2==a*r1-i*N mod (2^64) for some 0<=i<a.

if the difference is 2 between the exponents, then
r3==a^2*r1-i*N mod (2^64) and 0<=i<a^2 is the range for "i" etc.
In other words there are only at most 9 different possible values for r3=(a^(e+2) mod N) mod (2^64) if you know r1 and a=3.

It was a small trick, but good to know.

Last fiddled with by R. Gerbicz on 2017-09-22 at 20:35 Reason: small typos
R. Gerbicz is offline   Reply With Quote
Old 2017-09-23, 02:04   #211
axn
 
axn's Avatar
 
Jun 2003

13DA16 Posts
Default

Quote:
Originally Posted by Prime95 View Post
2) The residue remains the same as more and more factors are found for a Mersenne number - no double-check needed.
How? Since ...
Quote:
N/d is PRP if a^(N-1) = a^d mod N/d
The modulus will be different -- N/d1, N/d2, etc... -- unless you are really only doing (mod N).

Quote:
Originally Posted by Prime95 View Post
3) If a user decides to archive his PRP save files he can retest with a new known factor quite quickly.
It would be good if the server itself saves the residue -- then no need to send it out for subsequent tests. IIRC, there is a trick where you only need to store a few hundred MSBs of the full residue to do this quick verification (there was some thread in the forum sometime back on this topic -- might even be the Suyama trick that R.Gerbicz mentioned upthread).

But assume that this will not be done. Are you saying that you'll build the intelligence into P95 so that if the final save file is available (calculated using method 5), and a new test with additional factor is done, it will do the short circuit evaluation?

Last fiddled with by axn on 2017-09-23 at 02:06 Reason: quote
axn is offline   Reply With Quote
Old 2017-09-23, 02:13   #212
axn
 
axn's Avatar
 
Jun 2003

2×3×7×112 Posts
Default

Quote:
Originally Posted by R. Gerbicz View Post
Btw in some cases even if only the truncated RES64 kept, say in 1 and 3 we can do a quick comparison that decides with high probability if these are the same (so would be the same for the same exponent):
Let r1 and r2 the two RES64 remainders where these comes from the exponents differing by one:
r1=(a^e mod N) mod (2^64)
r2=(a^(e+1) mod N) mod (2^64)
then r2==a*r1-i*N mod (2^64) for some 0<=i<a.

if the difference is 2 between the exponents, then
r3==a^2*r1-i*N mod (2^64) and 0<=i<a^2 is the range for "i" etc.
In other words there are only at most 9 different possible values for r3=(a^(e+2) mod N) mod (2^64) if you know r1 and a=3.

It was a small trick, but good to know.
That is a good trick. It means, George doesn't have to add additional logic in the client to "divide by 3" to get "normal" residue. Instead, if another program produces "normal" residues, we'll still be able to compare.
axn is offline   Reply With Quote
Old 2017-09-23, 02:45   #213
Prime95
P90 years forever!
 
Prime95's Avatar
 
Aug 2002
Yeehaw, FL

165618 Posts
Default

Quote:
Originally Posted by axn View Post
How? Since ...

The modulus will be different -- N/d1, N/d2, etc... -- unless you are really only doing (mod N).
Yes, prime95 calculates a^N mod N. This is the residue to send to the server and it should match previous PRP tests with 0 or more known factors. The result is then further reduced to a^N mod N/d.

Quote:
But assume that this will not be done. Are you saying that you'll build the intelligence into P95 so that if the final save file is available (calculated using method 5), and a new test with additional factor is done, it will do the short circuit evaluation?
I'm saying I'll have the option to implement that feature (either now or in the future).
Prime95 is offline   Reply With Quote
Old 2017-09-23, 10:53   #214
R. Gerbicz
 
R. Gerbicz's Avatar
 
"Robert Gerbicz"
Oct 2005
Hungary

2×743 Posts
Default

Quote:
Originally Posted by Prime95 View Post
For first-time PRP assignments, the server will not specify a PRP base or residue type.
Working with any base is pretty dangerous, for Mersenne numbers base=2 or any power of two is NOT usable (why?, little homework), maybe main programs allow only odd base(?), but it is important to know that there are "bad" bases. And for Proth numbers I would choose a quadratic nonresidue as base, because in that case we are in the Proth test, so we have a true primality test.

Quote:
Originally Posted by Prime95 View Post
Interim residues are application dependent. However, it is strongly encouraged to report Fermat PRP residues using a standard binary powering algorithm. That is, iteration 0 is a, iteration 1 is a^(top 2 bits of N-1), etc.
I see you points (what axn has also written), it would be good to print the same interim residues. (to match the same residue at the same iteration(!)). There could be some issue with your count: the top 3 bits of a Mersenne number is 7 and after 2 iterations you still only know a^(2^2)=a^4 mod N, so not a^8, where you could get easily a^7.


For k!=1 there is some change in the test:
we test N=k*2^n+c (fixed b=2).
Code:
u0=a^k mod N
u[i]=u0^(2^i) mod N
d[t]=prod(i=0,t,u[L*i]) mod N
d[t]=d[t-1]*u[L*t] mod N
d[t]=u0*d[t-1]^(2^L) mod N  (to do the error check)
(this was also described first at http://www.mersenneforum.org/showthread.php?p=465629)
the main change that at the error check we multiple by a general number, u0. This is not a timing issue, as we would do probably even less than 50 error checks at the main wavefront of Riesel/Sierpinski numbers (more checks for Mersenne/Fermat numbers, but there u0=3 smallish). So the slowdown is less than 1 sec. per test.

Save in file u0 or recompute(?) it at each restart, not an issue since k is small.
Also a change that there is absolutely no error checks while you don't get u0=a^k mod N.

Ofcourse as mentioned in some cases you have to do some mults at interim file save to get the Fermat Prp interim residue because the top i bits of N=k*2^n+c could be different from the top i bit of k*2^n. This also means that you should need to do some mults to get back to the error check's algorithm's u[i]=u0^(2^i) mod N residue if you have an interim file save's Fermat prp residue. The error check's algorithm is doing n iterations: iteration=0 to iteration=n the final residue is u[n]=u0^(2^n) mod N, and this iteration count is independent from k and c.

An alternative way to handle this is that to delay the k-th powermod to the end, and before at each interim file save do a k-th powermod, but in this case you have to save also the original residue(!) to be able to continue the algorithm at restart (doing k-th root is hard).
R. Gerbicz is offline   Reply With Quote
Old 2017-09-23, 11:23   #215
R. Gerbicz
 
R. Gerbicz's Avatar
 
"Robert Gerbicz"
Oct 2005
Hungary

27168 Posts
Default

For Proth numbers if you stop at iteration=n-1 at the error check algorithm then you have the Proth test (if you would go further then you are in the weaker Fermat test).
R. Gerbicz is offline   Reply With Quote
Old 2017-09-23, 14:48   #216
Prime95
P90 years forever!
 
Prime95's Avatar
 
Aug 2002
Yeehaw, FL

7,537 Posts
Default

Quote:
Originally Posted by R. Gerbicz View Post
Working with any base is pretty dangerous
Yes, Primenet assumes the client is smart. Both gpuOwl and prime95 default to base=3.

Quote:
I see your points (what axn has also written), it would be good to print the same interim residues. (to match the same residue at the same iteration(!)). There could be some issue with your count: the top 3 bits of a Mersenne number is 7 and after 2 iterations you still only know a^(2^2)=a^4 mod N, so not a^8, where you could get easily a^7.
Yes, prime95 is aware of this off-by-one-iteration problem when PRPing a Mersenne number. If you ask for the residue after 2 iterations, prime95 will do 3 squarings to get a^8. The interim residue output code then does a mul by a^-1 to output a^7. The good news is that interim residues are not output by default and probably only used as a debugging aid.

Note this mul by a^-1 is not done for interim save files. These save files are not meant to be portable, so I just save whatever I need for a quick restart.

BTW, thank you for the error-checking idea, clear explanations, and development feedback. Job well done!
Prime95 is offline   Reply With Quote
Old 2017-09-27, 08:03   #217
preda
 
preda's Avatar
 
"Mihai Preda"
Apr 2015

3×457 Posts
Default

Quote:
Originally Posted by Prime95 View Post
Here is my PRP residue proposal:

There are (at least) 5 PRP residue types:
1; Fermat PRP. N is PRP if a^(N-1) = 1 mod N
2: SPRP variant, N is PRP if a^((N-1)/2) = +/-1 mod N
3: Fermat PRP variant. N is PRP if a^(N+1) = a^2 mod N
4: SPRP variant. N is PRP if a^((N+1)/2) = +/-a mod N
5: Fermat PRP cofactor variant, N/d is PRP if a^(N-1) = a^d mod N/d

I encourage programs to return type 1 residues as that has been the standard for prime95, PFGW, LLR for many years.

I think Prime95 v29.4 will have options to support all of the above residue types for doing double-checks. PrimeNet will store residue type along with the residue -- need another JSON argument. When issuing PRP double-check assignments, the server will send the PRP base and desired residue type to the client. For first-time PRP assignments, the server will not specify a PRP base or residue type.

Interim residues are application dependent. However, it is strongly encouraged to report Fermat PRP residues using a standard binary powering algorithm. That is, iteration 0 is a, iteration 1 is a^(top 2 bits of N-1), etc.

The server will soon have about 50 type-4 residues from gpuOwl, which prime95 should be able to verify someday. While I'd prefer gpuOwl converts to type-1 residues, it is not a requirement.

The server also has several thousand type-1 PRP cofactor residues. I'm thinking of making type-5 the default for v29.4 as it has some advantages. 1) It allows Gerbicz error checking, 2) The residue remains the same as more and more factors are found for a Mersenne number - no double-check needed. 3) If a user decides to archive his PRP save files he can retest with a new known factor quite quickly.
If I understand correctly, the "div-by-3" operation needed for moving the residue "one 3 behind" for mersenne numbers requires one full modMul operation. In addition to that, one modSquare is needed to align the iteration numbers between 2^k and (2^(k+1))-1. Because of this, I think it's not worth for gpuOwl to switch to outputting intermediate residues in the "type-1" form. I would rather de-emphasize the intermediate residues, for example by not displaying them by default (same as Prime95).

OTOH for the final residue, I can trivially change from type-4 to type-3 if that's preferred, and probably without too much trouble change to type-1 (by doing a "div-by-9" from type-3).

So, I'm going to submit the existing gpuOwl residues as type-4, and consider switching in the future to type-3 or type-1. But I don't see enough reason to "massage" the intermediate residues to change their format yet.
preda is offline   Reply With Quote
Old 2017-09-27, 10:57   #218
R. Gerbicz
 
R. Gerbicz's Avatar
 
"Robert Gerbicz"
Oct 2005
Hungary

2×743 Posts
Default

Quote:
Originally Posted by preda View Post
If I understand correctly, the "div-by-3" operation needed for moving the residue "one 3 behind" for mersenne numbers requires one full modMul operation.
Modular division by a small "d" number takes only linear time: say you want to get x/d mod N, where d is coprime to N. Let
r=x mod d
s=N mod d
then x/d==(x+k*N)/d mod N for k==(-r/s) mod d
and there is an exact(!) division here: d divides x+k*N

the idea behind this was to get the good k multiplier:
k is good iff x+k*N==0 mod d, so k==-x/N==-r/s mod d.
R. Gerbicz is offline   Reply With Quote
Old 2017-09-27, 14:37   #219
axn
 
axn's Avatar
 
Jun 2003

508210 Posts
Default

Quote:
Originally Posted by R. Gerbicz View Post
Modular division by a small "d" number takes only linear time: say you want to get x/d mod N, where d is coprime to N. Let
r=x mod d
s=N mod d
then x/d==(x+k*N)/d mod N for k==(-r/s) mod d
and there is an exact(!) division here: d divides x+k*N

the idea behind this was to get the good k multiplier:
k is good iff x+k*N==0 mod d, so k==-x/N==-r/s mod d.
Furthermore, in the particular case of N=Mp, d=3, s will always be 1.
axn is offline   Reply With Quote
Old 2017-09-27, 18:01   #220
GP2
 
GP2's Avatar
 
Sep 2003

5×11×47 Posts
Default

Quote:
Originally Posted by preda View Post
So, I'm going to submit the existing gpuOwl residues as type-4, and consider switching in the future to type-3 or type-1. But I don't see enough reason to "massage" the intermediate residues to change their format yet.
I don't really understand all the details being discussed here, but...

I think once you submit a truncated (64-bit or whatever) residue of a given type, and throw away the remaining bits, it's no longer possible to convert that truncated residue from one type to another, no?

And for the sake of being able to double-check residues in the database that were returned by different programs, what matters is simply that the 64-bit hex strings match, regardless of how they were calculated.

So switching from one type to another in the future would be quite problematic, and it would be quite important for all software authors to agree on a type, right from the start. Or am I missing something?
GP2 is offline   Reply With Quote
Reply



Similar Threads
Thread Thread Starter Forum Replies Last Post
mfakto: an OpenCL program for Mersenne prefactoring Bdot GPU Computing 1676 2021-06-30 21:23
GPUOWL AMD Windows OpenCL issues xx005fs GpuOwl 0 2019-07-26 21:37
Testing an expression for primality 1260 Software 17 2015-08-28 01:35
Testing Mersenne cofactors for primality? CRGreathouse Computer Science & Computational Number Theory 18 2013-06-08 19:12
Primality-testing program with multiple types of moduli (PFGW-related) Unregistered Information & Answers 4 2006-10-04 22:38

All times are UTC. The time now is 16:56.


Mon Aug 2 16:56:54 UTC 2021 up 10 days, 11:25, 0 users, load averages: 2.53, 2.38, 2.23

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.