mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > Software

Reply
 
Thread Tools
Old 2018-06-25, 03:13   #1
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2·43·47 Posts
Default Application of Jacobi check to P-1 factoring (paging owftheevil and other coding wizards)

The following is primarily about feasibility. There may well be errors. Hopefully any errors are not fatal to it. There probably are optimizations and impacts I haven't seen.

https://www.mersenne.org/various/math.php says, about prime95/mprime's P-1 factoring:
"The P-1 method is quite simple. In stage 1 we pick a bound B1. P-1 factoring will find the factor q as long as all factors of k are less than B1 (k is called B1-smooth). We compute E - the product of all primes less than B1. Then we compute x = 3^(E*2*P). Finally, we check the GCD (x-1, 2^P-1) to see if a factor was found."

If the above is a current and accurate description of the prime95 P1 computation process, or even if it is not in certain ways, it appears to offer some opportunities to employ the Jacobi symbol check in part of the prime95 P-1 factoring effectively.

A quick review of parts of the prime95 v29.4b7 source code indicates the description might be somewhat more completely stated as:
"We compute most of 2*p*E - the product of powers of all odd primes less than B1 and 2 and exponent P, up to a limit. Then we compute x= 3^(E*2*P) and in the process include the rest of the computation of E implicitly." So the full E or E*2*p term may be not available for a separate Jacobi check of it. I think most or all of it is precomputed for
usual PrimeNet assignments. For example, for a B1 bound of 800,000, there are 63,951 primes smaller than the bound.

These primes (except for 2?) raised to a power to produce a value just under B1 will have up to 20 bits each. Current P-1 assignment exponents ~88 million occupy 27 bits, and the factor 2 occupies one. So an estimate of number of bits
in E*2*p is 63,950 * 20 + 28 = ~ 1,279,028 bits. This is Nbits / B1 = ~1.599 bits. For B1 1,000,000 and exponent 91 million, there are 78498 primes smaller than the B1 bound. 91 million also occupies 27 bits; 2 one, so E*2*p ~ 78497 * 20 + 28 =~ 1569968 bits; Nbits/B1 =~ 1.57.
The prime95 v29.4b7 source says, in part:
/* First restart point. Compute the big exponent (a multiple of small */
/* primes). Then compute 3^exponent. The exponent always contains 2*p. */
/* We only compute 1.5 * B bits (up to about 20 million bits). The rest of the */
/* exponentiation will be done one prime at a time in the second part of stage 1. */
So, about 15/16 (93.75%) - 15/15.7 (95.5%) of the bit total of a usual (E*2*P) is precomputed. (This will change somewhat as GIMPS progresses on exponents.) It's close enough to 100% that a small code change could lift it to 100% for some time to come for P-1 assignments being assigned just ahead of first-time primality checks.

In CUDAPm1, stage one is also raising 3 to a power of the product of the primes below B1 (http://www.mersenneforum.org/showpos...1&postcount=20) Since it's doing the same math and was somewhat derived from prime95's approach, the following should also apply to it more or less. Unless CUDAPm1's math sufficiently deviates in the details to require a separate treatment, the following applies to CUDAPm1. (CUDAPm1 is hard to read code, at least for me; P-1 math plus CUDA plus not as many comments.)

Any other software (gmp-ecm P-1 factoring, for instance http://www.mersenneforum.org/showthread.php?t=4087) may also be somewhat checkable with Jacobi symbols. Haven't looked at it at all for feasibility or whether it's already there.

1) Jacobi symbols are multiplicative if either top or bottom are constant. So (a/n) (b/n) = (ab/n)

2) From 1), the value of an even power of Jacobi symbol is +1 or 0 not -1; (a/n)(a/n)= (a^2/n) = 1 whether (a/n) is -1 or +1 (or it's 0 if there's a factor f>1 in common for a and n, including if a=n, n>1).

3) x = 3^(E*2*P) is always an even power of 3.

4) 3) and 2) in combination, give a Jacobi symbol (x/Mp) = 1 (or 0 if there's a factor of Mp in there), but not -1.
Let a=3^(E*p), n=Mp. (a/n)(a/n) = 1 = (a^2/n) = ( ((3^(E*p))^2 / Mp)
= ( (3^(E*2*p) / Mp) = ( x / Mp )

5) x appears to be computed modulo Mp, for speed and size. So a natural denominator for the Jacobi is also Mp. Jacobi check time for (x/Mp) should be similar to what Preda has determined for the LL form of gpuOwL and similar size exponents which was under a minute for up to 78M bits for different model cpus. This is also consistent with Prime95
LL test Jacobi check times.

6) If E was fully computed, we could also try to determine the correct Jacobi (E/n) or ((E*2*p)/n) for some well chosen fixed n (n= M31 or perhaps n=M61, since it would fit in registers and be relatively prime to the smaller primes and exponent p; B1<=B2 which typically <<M31 for routine GIMPS P-1 factoring assignments). Doing so appears to require of order 80,000 individual small Jacobi computations for current P-1 Primenet assignments (for the small primes < B1~10^6), and combining them via 1). The speed of this computation would be high because the operands are word size, and speed could be improved by use of precomputed partial results and saving extensions to higher B1 as the program operates, since n is constant. That's only one side of the check on E; the other side is to compute (E/n) or ((E*2*p)/n) and compare. This should also be faster than Preda's timed Jacobi evaluations, since the size of E is ~1.6 million bits for current exponents, much shorter than the ~78 million of the PRP residue corresponding to Preda's timing. A Jacobi symbol check could be applied up to the limit computed explicitly. I'm not sure if this check on computation of E (or most of E or E*2*P) is worth doing.

Returning to considering the more comprehensive Jacobi of 5):
If (x/Mp) yields 1, it's consistent with the computation being correct.

If (x/Mp) is value 0 instead of 1, there's a factor, or the Jacobi computation erred. In the 0 case, the Jacobi could be quickly rechecked. The gcd should reveal a factor. It should be of the correct forms; 1 or 7 mod 8, 2kp+1. If it does not find a factor, or it claims to have found a factor but the claimed factor does not pass the form checks, the gcd has erred. I've seen this happen sometimes in CUDAPm1. (The gcd could be retried immediately by the factoring program, provided it has retained a copy of the input data, since its computation is considerably shorter than the preceding computations. The retry could be attempted by a different library's gcd function than the first time, since a gcd computation error might be a software issue. Or the gcd's input could be saved to disk for later attempts by other means, or inspection for debugging.)

If (x/Mp) yields Jacobi value -1, the P-1 calculation has erred, or the Jacobi check has. In the -1 case the Jacobi could be quickly retried. If the Jacobi repeatedly produced -1, the P-1 attempt or underlying hardware could be regarded as suspect.

But. Does the Jacobi change due to the modulo Mp operation?
Apparently not, since a modulo is part of calculating the Jacobi symbol, as the first step states. https://en.wikipedia.org/wiki/Jacobi..._Jacobi_symbol
So, it appears stage one P-1 computation could be checked (before the -1 or gcd) with as little as a single Jacobi evaluation taking less than a minute on a decent cpu.

If we could compute the Jacobi check on the cpu, and the bulk of the P-1 test on the gpu, in CUDAPm1, it would separate the computations in a way that somewhat guards against some system memory errors, gpu memory errors, cpu and gpu microcode bugs, etc. affecting both the P-1 and the Jacobi check.

The above covers stage 1. Stage 1 for Mersenne numbers ~85 million exponent take ~3.25 hours (nearly 200 minutes) on a gtx1070, or around 12 hours on an i7-7500u chip (both cores). Checking it once or twice occupying half to one minute per check seems like a good trade. Especially when someone is wondering whether a dearth of factors in their gpu run history is due to statistics or due to error. The check could be made a user selectable option, to preserve current speed in the usual case, and employed when there is question about the reliability of the software or hardware, or a higher reliability P-1 result is desired.

Stage 2 seems to be a different situation.

The terms ( H^q - 1 ) in https://en.wikipedia.org/wiki/Pollar...92_1_algorithm are not even powers. The Jacobis are multiplicative but not additive. Therefore as far as I can tell, the correct Jacobi of the stage 2 product is not fixed as stage 1's (or the LL sequence) was, and would need to be determined for each bound and exponent. This would entail as many Jacobi evaluations as there were primes between B1 and B2, to determine a correct Jacobi product.
Since B1 =~ 7*10^5 and B2 =~17*10^6, that's of order 10^6 Jacobi evaluations, one per prime between B1 and B2 (https://en.wikipedia.org/wiki/Prime-counting_function) at about .5 to 1 minute each, a prohibitively large computation cost at nearly a year to imperfectly check a P-1 computation of perhaps a day duration. So it seems a P-1
stage 2 Jacobi check is out of the question. (Unless there's some way to much more swiftly determine what the correct Jacobi would be for the stage 2 product.)

However, checking at a late point in the stage 1 calculation, an intermediate value which is used also in the stage 2 calculation, helps safeguard accuracy of an aspect of stage 2's computation, at no additional computing cost beyond the stage 1 check.
kriesel is offline   Reply With Quote
Old 2018-06-28, 17:08   #2
R. Gerbicz
 
R. Gerbicz's Avatar
 
"Robert Gerbicz"
Oct 2005
Hungary

55216 Posts
Default

Quote:
Originally Posted by kriesel View Post
The following is primarily about feasibility. There may well be errors. Hopefully any errors are not fatal to it. There probably are optimizations and impacts I haven't seen.
That is really working, this could be new (or it is even possible that it is already known and in p95, don't know)
rewording a little (but you had the discovery):

making an error check that detects all errors with (roughly) 50% chance in the first stage of the Pollard's (P-1) method !
we calculate in that stage res==a^E mod n, where E is the product of all prime powers <= B1, so if B1>1 and gcd(a,n)=1
(trivial condition), then (res/n)=1, because E is even, where here we used the Jacobi symbol.
If at any powmod operation we make an error at the multiplication then by roughly 50% chance we get (res2/n)=-1,
and suppose that we have done this mistake not at the first prime power (double check the a^(2^K) mod n calculation)
, then the remaining primepowers are all odd, so when you calculate res2^E2 mod n and you won't do further mistake then
you won't change the Jacobi symbol, and so like at the Jacobi test for LLT you can catch all errors with (roughly) 50% chance.


A quick and dirty demo in PARI-Gp:

Code:
perrortest(n,emult,B,perr)={r=Mod(3,n)^emult;forprime(p=2,B,q=p;while(q<=B/p,q*=p);r=r^q;if(p==perr,r=Mod(random(n),n)));
if(kronecker(lift(r),n)!=1,return(0),return(1))}

sum(i=1,1000,perrortest(2^113-1,2*113,10000,3511))
sum(i=1,1000,perrortest(1+2*3*5*7*11*13*17*19*random(2^100),1,10000,3511))
(18:58) gp > sum(i=1,1000,perrortest(2^113-1,2*113,10000,3511))
%95 = 507
(18:58) gp > sum(i=1,1000,perrortest(1+2*3*5*7*11*13*17*19*random(2^100),1,10000,3511))
%96 = 475
(18:58) gp >
so catched roughly half of all errors in Mersenne number case and for testing random numbers.

[we used kroncker test, because that is available in PARI, an extension of Jacobi, but Jacobi would be enough]
in perrortest we use the fixed base=3, using emult if we know that all/some factor of (p-1) has a factor of emult,
using B=B1 and making a random error at p=perr prime [note that at test perr should be an odd prime<=B, to
make a real error]. We return by 0 if an error detected in the test, otherwise we return by 1.
R. Gerbicz is offline   Reply With Quote
Old 2018-06-28, 17:22   #3
R. Gerbicz
 
R. Gerbicz's Avatar
 
"Robert Gerbicz"
Oct 2005
Hungary

136210 Posts
Default

Quote:
Originally Posted by R. Gerbicz View Post
making an error check that detects all errors with (roughly) 50% chance in the first stage of the Pollard's (P-1) method !
Retracting.
This depends also on your method how you get powmod, say if you calculate a^17 mod n:
a->a^2->a^4->a^8->a^16->a^17 (mod n eveything)
and you make error in a^2 mod n, then you won't reuse it at multiplication, just only its square,
so the (possible) Jacobi error will be corrected at a^17 mod n in every case.
The successfull error detection rate could be somewhere(?) at 10-20% for large Mersenne numbers.
R. Gerbicz is offline   Reply With Quote
Old 2018-06-28, 19:51   #4
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2×43×47 Posts
Default

I am confident that the Jacobi test is not yet applied in the prime95 program to P-1 factoring, as of v29.4b7, since I both reviewed the source code and asked the author what checks were included.

I'm rather confident the Jacobi symbol error check is not yet applied in the CUDAPm1 program, since error's post about using it for LL test checking occurred years after the last code development and maintenance for CUDAPm1, and there was no obvious sign of it in either the forum thread describing CUDAPm1's development, or in the CUDAPm1 source code.

In other software, such as gmp-ecm, I have no idea if it's there or not.

Thanks for reviewing it.

I'll need more time to understand your retraction. However, even a 10-20% chance of error detection _might_ be worthwhile. Particularly in the case of a GIMPS participant questioning whether his installation is reliable, detecting a Jacobi error occasionally, at such low probabilities, is still higher than the probability of finding a factor, and the detection probability should occur whether there is a factor found or to be found by P-1, or not. (0.9 ^90 is a small number)

For larger GIMPS exponents, P-1 factoring run times can be quite long. Hundred-megadigit exponents take days each on a GTX1070 for modest bounds, and run time scales roughly as >p^2, paralleling primality test run-times.

Last fiddled with by kriesel on 2018-06-28 at 19:59
kriesel is offline   Reply With Quote
Old 2018-06-29, 13:47   #5
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2·43·47 Posts
Default Relative probability of finding factors in stage 1 and stage 2

A check of all my P-1 results on GIMPS for the past year shows the following:
605 no-factor-found

9 stage 1 factor (as determined by B1=B2 in the report)
Of these, one was selected to run knowing it already had a known factor, as a reliability test, so should be excluded. Stage 1 factors found 8/(8+9+605) = 1.29%

9 stage 2 factor (as determined by B2 !=B1 in the report)
Stage 2 only runs if stage 1 does not find a factor.
Stage 2 factors found in 9/(605+9)=1.47%

Ratio found 1.47/1.29=1.14
So odds appear about equal stage 1 vs stage 2, and ~2.73% overall (17/622)

Ratios are given to digits far exceeding what's justified, to avoid round-off error only. The statistics are very poor with only 8 or 9 factor-found samples per stage. Ten or a hundred times as many would be better.

Last fiddled with by kriesel on 2018-06-29 at 13:49
kriesel is offline   Reply With Quote
Old 2018-06-29, 13:52   #6
R. Gerbicz
 
R. Gerbicz's Avatar
 
"Robert Gerbicz"
Oct 2005
Hungary

2·3·227 Posts
Default

Now I understand why it isn't in the main programs (p95/gmp-ecm), the reason is that they are using a top-bottom way at powmod using sliding window technique (you can see that it is used in gmp also at powmod). And with this you're using only squaring in a row, then a single mulmod (using even a precalculated table), so the (possible) Jacobi error will be corrected in the next squaring.

You could do the powermod in the other direction, but in that case you'd lost in speed, maybe we'd use only your method if we can't store 3 (or 2) residues in memory at the precalculated table and base is large.
R. Gerbicz is offline   Reply With Quote
Old 2018-07-02, 19:29   #7
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2·43·47 Posts
Default

Quote:
Originally Posted by R. Gerbicz View Post
Now I understand why it isn't in the main programs (p95/gmp-ecm), the reason is that they are using a top-bottom way at powmod using sliding window technique (you can see that it is used in gmp also at powmod). And with this you're using only squaring in a row, then a single mulmod (using even a precalculated table), so the (possible) Jacobi error will be corrected in the next squaring.

You could do the powermod in the other direction, but in that case you'd lost in speed, maybe we'd use only your method if we can't store 3 (or 2) residues in memory at the precalculated table and base is large.
It's gradually making more sense to me. A couple of orthogonal ways of thinking about it:

a) the information in the Jacobi result is less than two bits since there are only 3 values. That's not enough to encode success or failure of several prior squarings or multiplies as well as the last operation (squaring or mul by a).

b) even a rather significant error in the first squaring to kth squaring, where a^k+error effect < Mp, doesn't hurt the final P-2 factoring outcome. It's equivalent to starting with a different a, which is allowed. That's good; the math is inherently resistant to early errors. So there's no need, or payoff, to check for error very early in the powmod.

Error in the power to which the base a is raised, however, is deadly in stage 1. E x 2 x p x random may be ok, but what I chose to test was E x 2 x p + random signed 31-bit value (such as a memory error).

Last fiddled with by kriesel on 2018-07-02 at 19:43
kriesel is offline   Reply With Quote
Old 2018-09-11, 13:32   #8
preda
 
preda's Avatar
 
"Mihai Preda"
Apr 2015

21058 Posts
Default

It is my impression that when computing 3^x (mod Mp) through left-to-right exponentiation (i.e. starting with the most significant bit), that the Jacobi check is not so useful, for this reason:

every step (every bit of "x") does a square followed sometimes by mul-by-3,
and the square basically wipes out any previous error that would show in Jacobi.

So the error, instead of persisting, is erased at every iteration.
preda is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
error during Jacobi check on 330,000,000+ exponent evanh Hardware 5 2018-02-20 03:46
Endlessly Running Jacobi error check on v29.3 emiller Software 10 2017-11-14 10:26
LL testing: Jacobi symbol of the (interim or final) residue minus 2 as error check GP2 Number Theory Discussion Group 33 2017-08-21 22:14
Yet another new factoring algorithm\primality test:Digital Coding ?? tServo Miscellaneous Math 3 2014-04-10 18:52
Paging whoever is P-1 trial-factoring 3.05M - 3.13M GP2 Data 0 2003-11-04 08:43

All times are UTC. The time now is 23:44.

Fri Jul 3 23:44:51 UTC 2020 up 100 days, 21:17, 1 user, load averages: 1.25, 1.26, 1.20

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.