mersenneforum.org  

Go Back   mersenneforum.org > Math Stuff > Computer Science & Computational Number Theory

Reply
 
Thread Tools
Old 2020-01-25, 20:52   #78
R. Gerbicz
 
R. Gerbicz's Avatar
 
"Robert Gerbicz"
Oct 2005
Hungary

1,493 Posts
Default

Quote:
Originally Posted by ewmayer View Post
@Jeppe: This is simply the basic Pépin residue ... (And also one wants the 2nd double-check run at the slightly larger FFT length to finish and confirm the Pépin-test results.)
Have you checked that:
(3^(2^(2^m)) mod F_n) mod q = 3^(2^(2^m)) mod q
is true for the stored interim residues?

Since you have 3^(2^(2^m)) mod F_n then to get the left side is trivial and you can get quickly the right side, becuase q is small , the check is cheap. Though it is not that very-very strong check, but the two known factors could find a mismatch with probability = 1-(1e-10) which is still quite impressive. Note that we don't get the strength of 1/(q-1) or 1/znorder(Mod(3,q)) because the order is divisible by a "large" power of two, and that will be lost in the iterated squarings of a wrong residue.

Last fiddled with by R. Gerbicz on 2020-01-25 at 20:54 Reason: small typo
R. Gerbicz is offline   Reply With Quote
Old 2020-01-27, 21:41   #79
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

2×3×29×67 Posts
Default

Quote:
Originally Posted by R. Gerbicz View Post
Have you checked that:
(3^(2^(2^m)) mod F_n) mod q = 3^(2^(2^m)) mod q
is true for the stored interim residues?

Since you have 3^(2^(2^m)) mod F_n then to get the left side is trivial and you can get quickly the right side, becuase q is small , the check is cheap. Though it is not that very-very strong check, but the two known factors could find a mismatch with probability = 1-(1e-10) which is still quite impressive. Note that we don't get the strength of 1/(q-1) or 1/znorder(Mod(3,q)) because the order is divisible by a "large" power of two, and that will be lost in the iterated squarings of a wrong residue.
ITYM (3^(2^(2^m - 1)) mod Fm) mod q = 3^(2^(2^m - 1)) mod q, since the Pépin test is just an Euler-pseudoprime test (which happens to prove rigorous in the special case of the Fm), i.e. computes 3^(2^((Fm - 1)/2)) mod Fm = 3^(2^((2^m)/2)) mod Fm. But in any event, that's a good suggestion - fiddled my code to load the final Pépin-test residue R from the savefile and for the 2 known small prime factors p=640126220763137 and q=1095981164658689 computed

R == 367500396407933 (mod p) and
R == 95971534699540 (mod q).

I then separately computed 3^(2^(2^m - 1)) mod p and q via (2^m-1) = 1073741823 iterated squarings modulo p and q, separately, and the results match, which anyone can confirm on even quite modest hardware. So we have quite good confidence in the result, on top of the 2 runs agreeing through ~740 Miter (as of the latest 64M-FFT run update from Ryan), and said runs being done on high-end server ssytems with ECC memory.

For F31 I will also be using the now-famous periodic whole-residue check named in honor of the above-quoted forum member. :)

Last fiddled with by ewmayer on 2020-01-27 at 21:43
ewmayer is offline   Reply With Quote
Old 2020-01-28, 08:49   #80
R. Gerbicz
 
R. Gerbicz's Avatar
 
"Robert Gerbicz"
Oct 2005
Hungary

101110101012 Posts
Default

Quote:
Originally Posted by ewmayer View Post
I then separately computed 3^(2^(2^m - 1)) mod p and q via (2^m-1) = 1073741823 iterated squarings modulo p and q, separately, and the results match, which anyone can confirm on even quite modest hardware.
Trivial note: it is faster to reduce the exponent mod (p-1) [use Fermat's little theorem].
R. Gerbicz is offline   Reply With Quote
Old 2021-07-02, 18:57   #81
ewmayer
2ω=0
 
ewmayer's Avatar
 
Sep 2002
República de California

101101100010102 Posts
Default Fermat-modmul with circularly shifted residues

I spent an appreciable amount of time last year wrestling with basic concepts and implementation-in-code of Fermat-mod squaring chains in the presence of residue shift. This proves rather more interesting (a polite euphemism for "pain in the butt") than for the Mersenne-mod case. The Gerbicz check renders it less important, but it may still prove useful in some contexts, so figured I'd post a summary.

Let s0 = initial shift, i.e. instead of starting the Pépin test of Fm with initial seed x0, we start with (x0 << s0) (mod Fm).

[1] Data layout: for Mersenne-mod IBDWT, we have 2 different wordsizes differing by one bit, so need to loop over the elements in our variable-wordsize array until we find the one containing the (s0)th bit, then shift x0 by the appropriate small number of bits to place it on the proper bit boundary within that target word. For Fermat-mod, the easy case is that of power-of-2 FFT length, for which we have a fixed power-of-2 wordsize. More generally we will have a non-power-of-2 FFT length optimized for the target modulus - e.g. for F24 we can use 896 Kdoubles, and for F30 I did one run at 60M and the other (now 93% complete) at 64M. In both cases we use a special acyclic-transform-effecting DWT which multiplies our N input words by the first N (2N)th complex roots of unity, i.e. the (j)th input gets multiplied by DWT weight w[j] = exp(I*Pi*j/N). This means multiplying our original real input word by a complex number, but we notice that w[j+N/2] = I*w[j], making it natural to group the (j)th and (j+N/2)th input words together in memory and treat them as a single input to a complex FFT. This is the so-called "right-angle transform" trick, and is detailed in the original Crandall/Fagin 1994 DWT paper. In the Mersenne-mod IBDWT case we can also treat adjacent even/odd-indexed imput elements as a single input to a complex FFT, but as the Mersenne-mod IBDWT weights are strictly real, these are simply pairs of adjacent input words.

For Fermat-mod arithmetic using a non-power-of-2-length transform we need to layer a Mersenne-style dual-wordlength base representation and similarly defined IBDWT weights atop our acyclic-effecting DWT. One simplification relative to the Mersenne case is that for FFT length [odd]*2^k, the Fermat-mod wordsizes and IBDWT weights recur in repeating length-[odd] patterns.

In the context of a right-angle Fermat-mod acyclic transform, finding the target word for a given initial-seed shift means looping first over the even-indexed elements of our residue vector (considered here as a length-N real array), then if the accumulated wordsizes are still less than the target shift s0, looping over the odd-indexed elements until we reach the target word. For the Fermat-mod case we need only do this for the initial seed of the Pépin test; for the Mersenne-mod case, specifically the Lucas-Lehmer test, we must also repeat the exercise on each subsequent iteration in order to find the proper bit offset for injection of the -2 term added to the outout of each mod-squaring output. This can be done very efficiently - specifically in O(1) time - by initializing a suitable bitmap encoding the variable wordsizes of the Mersenne-mod IBDWT and using that to effect a fast lookup scheme.

[2] For a basic squaring-chain-with-shift scheme as we do for Mersenne number M(p), each mod-squaring doubles the shift count (mod p). That means that for any s0 != 0 (mod p) the shift will remain nonzero and nonrepeating during the LL/PRP-test, since at squaring i, our shift s = s0.2^i (mod p) and the multiplicative group (mod p) is of maximal order p-1.

OTOH for squaring chains modulo a Fermat number Fm, each subsequent squaring doubles the shift count (mod 2^m), so we have the problem that any nonzero initial shift s0 will lead to a shift s = 0 after precisely m square-mods, and the shift will remain 0 throughout the remainder of the squaring chain. One way to get around this problem is to define an auxiliary pseudorandom-bit array, and on the (i)th square-mod, if bit[i] = 1, do a further mod-doubling of the residue, thus yielding the shift-count update s[i] = 2.s[i-1] + bit[i]. For a pair of Pépin tests of Fm such as are typically done to allow for real-time cross-validation of interim residues, it is also advisable to seed the bit-arrays differently for each run.

[3] For the random-bit-offset scheme in [2], we further need a postprocessing step to determine the correct sign to apply to the final residue; this relates to the shift-doubling modulo the binary exponent which occurs on each square-mod. For Mp, squaring a shifted residue R.2^s gives R^2.2^2s (mod Mp) and 2^2s == 2^(2s-p) (mod Mp), so at each iteration we simply double the shift count (mod p). For arithmetic modulo Fm, by way of contrast, each time the doubled shift count incurs a reduction (mod 2^m) the shifted residue undergoes a change of sign: R^2.2^2s (mod Fm) == -R^2.2^(2s-2^m) (mod Fm), so a shifted-residue scheme must keep track of both the shift count and the sign of the shifted residue, relative to the true unshifted one.

However: since - unlike LL - this PRP test involves just repeated mod-squaring, we don't care if the sign of the residue on any particular intermediate iteration is flipped, since the subsequent mod-squaring will flip it back to +. The only thing we care about is whether the sign of the final residue (either for the Pépin test or for a particular checkpoint subinterval thereof) has flipped w.r.to that of its immediate predecessor, the penultimate residue, since such a final-squaring sign-flip has no ensuing mod-squaring to autocorrect it. Thus, if there is a final-squaring sign flip we need to unflip the sign of the resulting residue manually. But that is not all, since generating a shift-removed final residue also involves a modular multiply or divide by a power of 2. I've always found it conceptually easier to multiply-mod rather to divide-mod, so for both that and less-code-to-maintain reasons, my mi64 multiword int library has low-level code only for leftward circular shift, and effects s-bit rightward circular shift of a b-bit number by a (b-s)-bit leftward one. That is all that is needed working modulo a Mersenne number, as these are of form 2^p-1 and thus any bits off-shifted to the left get rewrapped from the right with + sign. Once again, Fermat moduli are trickier. Modulo Fm, a leftward circular shift must feed the bits off-shifted to the left back into the right end with a - sign, and the aforementioned shrc-done-as-shlc substitution implies a negation (mod Fm) of the resulting shift-removed residue.

Thus, if at end of our current checkpoint interval or Pépin test, our residue R needed a sign-flip due the doubled-shift-count needing a reduction (mod 2^m), our shrc-done-as-shlc already took care of it. If not, the residue needs an explicit negation. Rather than doing an explicit Fm - R with the attendant digitwise negations and borrows, one can simply do a bitwise-complement of the residue vector and further += 2 of limb 0 in order to recover the unshifted residue.

Last fiddled with by ewmayer on 2021-08-04 at 00:42 Reason: ficks speelink
ewmayer is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
P-1/P+1 on Fermat numbers ATH Operazione Doppi Mersennes 2 2015-01-25 06:27
What are the Primality Tests ( not factoring! ) for Fermat Numbers? Erasmus Math 46 2014-08-08 20:05
LLT numbers, linkd with Mersenne and Fermat numbers T.Rex Math 4 2005-05-07 08:25
Two Primality tests for Fermat numbers T.Rex Math 2 2004-09-11 07:26
Fermat Numbers devarajkandadai Math 8 2004-07-27 12:27

All times are UTC. The time now is 10:06.


Sat Oct 16 10:06:11 UTC 2021 up 85 days, 4:35, 0 users, load averages: 1.22, 1.03, 0.97

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.