mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Prime Gap Searches (https://www.mersenneforum.org/forumdisplay.php?f=131)
-   -   4e18-5e18 (https://www.mersenneforum.org/showthread.php?t=22187)

CRGreathouse 2017-06-05 21:58

[QUOTE=ATH;460572]Where can I find any information on the Euler-Plumb/Plump test?[/QUOTE]

Hmm. It looks like maybe it originated here
[url]http://philzimmermann.com/EN/bnlib/[/url]
where the author, Colin Plumb, writes
[code] /*
* Now, check that bn is prime. If it passes to the base 2,
* it's prime beyond all reasonable doubt, and everything else
* is just gravy, but it gives people warm fuzzies to do it.
*
* This starts with verifying Euler's criterion for a base of 2.
* This is the fastest pseudoprimality test that I know of,
* saving a modular squaring over a Fermat test, as well as
* being stronger. 7/8 of the time, it's as strong as a strong
* pseudoprimality test, too. (The exception being when bn ==
* 1 mod 8 and 2 is a quartic residue, i.e. bn is of the form
* a^2 + (8*b)^2.) The precise series of tricks used here is
* not documented anywhere, so here's an explanation.
* Euler's criterion states that if p is prime then a^((p-1)/2)
* is congruent to Jacobi(a,p), modulo p. Jacobi(a,p) is
* a function which is +1 if a is a square modulo p, and -1 if
* it is not. For a = 2, this is particularly simple. It's
* +1 if p == +/-1 (mod 8), and -1 if m == +/-3 (mod 8).
* If p == 3 mod 4, then all a strong test does is compute
* 2^((p-1)/2). and see if it's +1 or -1. (Euler's criterion
* says *which* it should be.) If p == 5 (mod 8), then
* 2^((p-1)/2) is -1, so the initial step in a strong test,
* looking at 2^((p-1)/4), is wasted - you're not going to
* find a +/-1 before then if it *is* prime, and it shouldn't
* have either of those values if it isn't. So don't bother.
*
* The remaining case is p == 1 (mod 8). In this case, we
* expect 2^((p-1)/2) == 1 (mod p), so we expect that the
* square root of this, 2^((p-1)/4), will be +/-1 (mod p).
* Evaluating this saves us a modular squaring 1/4 of the time.
* If it's -1, a strong pseudoprimality test would call p
* prime as well. Only if the result is +1, indicating that
* 2 is not only a quadratic residue, but a quartic one as well,
* does a strong pseudoprimality test verify more things than
* this test does. Good enough.
*
* We could back that down another step, looking at 2^((p-1)/8)
* if there was a cheap way to determine if 2 were expected to
* be a quartic residue or not. Dirichlet proved that 2 is
* a quartic residue iff p is of the form a^2 + (8*b^2).
* All primes == 1 (mod 4) can be expressed as a^2 + (2*b)^2,
* but I see no cheap way to evaluate this condition.
*/[/code]

The code itself, if I've chosen the right pieces:
[code] if (bnCopy(e, bn) < 0)
return -1;
(void)bnSubQ(e, 1);
l = bnLSWord(e);

j = 1; /* Where to start in prime array for strong prime tests */

if (l & 7) {
bnRShift(e, 1);
if (bnTwoExpMod(a, e, bn) < 0)
return -1;
if ((l & 7) == 6) {
/* bn == 7 mod 8, expect +1 */
if (bnBits(a) != 1)
return 1; /* Not prime */
k = 1;
} else {
/* bn == 3 or 5 mod 8, expect -1 == bn-1 */
if (bnAddQ(a, 1) < 0)
return -1;
if (bnCmp(a, bn) != 0)
return 1; /* Not prime */
k = 1;
if (l & 4) {
/* bn == 5 mod 8, make odd for strong tests */
bnRShift(e, 1);
k = 2;
}
}
} else {
/* bn == 1 mod 8, expect 2^((bn-1)/4) == +/-1 mod bn */
bnRShift(e, 2);
if (bnTwoExpMod(a, e, bn) < 0)
return -1;
if (bnBits(a) == 1) {
j = 0; /* Re-do strong prime test to base 2 */
} else {
if (bnAddQ(a, 1) < 0)
return -1;
if (bnCmp(a, bn) != 0)
return 1; /* Not prime */
}
k = 2 + bnMakeOdd(e);
}[/code]

No doubt danaj could elucidate further. Edit: his version is [url=https://github.com/danaj/Math-Prime-Util/blob/4fc8396ad9886652d8d1b22899cb787d43abfcd2/primality.c#L119]here[/url].

R. Gerbicz 2017-06-05 22:01

[QUOTE=Antonio;460557]I am pleased to present (with the kind permission of Dana Jacobsen) a new release of the gap code.
[/QUOTE]
[QUOTE=Antonio;460557]
The value of delta will need to be tuned to a lower value to get best performance (it was while I was doing this that I found the bug in the original code), for my i5-3570k I needed to change from:
gap3_4 -n1 %1 -n2 %2 -gap 1250 -delta 200 -sb 24 -bs 16 -mem 12 -t 4
to:
gap5_d -n1 %1 -n2 %2 -gap 1250 -delta 155 -sb 24 -bs 16 -mem 12 -t 4

My throughput increased from around 20.9e9 n/s to around 26.3e9 n/s.
[/QUOTE]

Thanks! Good code, in my test the rate has been reached 27e9 n/sec.

You can get the modinverse roughly 4-5 times faster, but in the code it will be completely in the noise level, because the original is also fast. It is a known idea, the Hensel lifting:
[CODE]
static inline ui64 modular_inverse64(const ui64 a){

static const char mask[128] = {255,85,51,73,199,93,59,17,15,229,195,89,215,237,203,33,
31,117,83,105,231,125,91,49,47,5,227,121,247,13,235,65,63,149,115,137,7,157,123,81,79,
37,3,153,23,45,11,97,95,181,147,169,39,189,155,113,111,69,35,185,55,77,43,129,127,213,
179,201,71,221,187,145,143,101,67,217,87,109,75,161,159,245,211,233,103,253,219,177,175,
133,99,249,119,141,107,193,191,21,243,9,135,29,251,209,207,165,131,25,151,173,139,225,
223,53,19,41,167,61,27,241,239,197,163,57,183,205,171,1};

ui64 ret=mask[(a>>1)&127];
ret*=2+a*ret;
ret*=2+a*ret;
ret*=2+a*ret;
return ret;
}
[/CODE]
with only 3 lifts you get the inverse mod 2^64 from mod 2^8.

Antonio 2017-06-06 03:45

1 Attachment(s)
[QUOTE=R. Gerbicz;460577]Thanks! Good code, in my test the rate has been reached 27e9 n/sec.

You can get the modinverse roughly 4-5 times faster, but in the code it will be completely in the noise level, because the original is also fast. It is a known idea, the Hensel lifting:
[CODE]
static inline ui64 modular_inverse64(const ui64 a){

static const char mask[128] = {255,85,51,73,199,93,59,17,15,229,195,89,215,237,203,33,
31,117,83,105,231,125,91,49,47,5,227,121,247,13,235,65,63,149,115,137,7,157,123,81,79,
37,3,153,23,45,11,97,95,181,147,169,39,189,155,113,111,69,35,185,55,77,43,129,127,213,
179,201,71,221,187,145,143,101,67,217,87,109,75,161,159,245,211,233,103,253,219,177,175,
133,99,249,119,141,107,193,191,21,243,9,135,29,251,209,207,165,131,25,151,173,139,225,
223,53,19,41,167,61,27,241,239,197,163,57,183,205,171,1};

ui64 ret=mask[(a>>1)&127];
ret*=2+a*ret;
ret*=2+a*ret;
ret*=2+a*ret;
return ret;
}
[/CODE]with only 3 lifts you get the inverse mod 2^64 from mod 2^8.[/QUOTE]I added the Hensel lifting code and yes it looks like a speed up of less than 1%. However I know Dana was going to look at another possible speed up which could yield another 5% (or there-about) increase so it could be added in if that happens.
For people who want to compile their own version and try it I have attached a version 0.05.e of the source code.

Thomas11 2017-06-06 08:27

Thanks to all for this impressive speed-up! :bow:

Meanwhile I've got 4160-4172e15 completed (note the quartet!):
[CODE]
4160-4161e15 78 kgaps, largest 1142 @ 4160855564170607801
4161-4162e15 84 kgaps, largest 1380 @ 4161256227746743067
4162-4163e15 111 kgaps, largest 1188 @ 4162650991648045261
4163-4164e15 81 kgaps, largest 1140 @ 4163851461415885439
4164-4165e15 90 kgaps, largest 1204 @ 4164005813345056153
4165-4166e15 86 kgaps, largest 1202 @ 4165268264997138737
4166-4167e15 82 kgaps, largest 1132 @ 4166895731083193611
4167-4168e15 80 kgaps, largest 1200 @ 4167562342954892237
4168-4169e15 79 kgaps, largest 1200 @ 4168914553292110091
4169-4170e15 95 kgaps, largest 1126 @ 4169922460669612393
4170-4171e15 98 kgaps, largest 1234 @ 4170143483650450639
4171-4172e15 83 kgaps, largest 1110 @ 4171138343914429271 & 4171422212516359189 & 4171664786200787539 & 4171829679529702439[/CODE]

Reserving 4180-4200e15.
If you don't mind, I would also start from 4400e15 (reserving the first 10 blocks).

ET_ 2017-06-06 10:16

4012-4013 completed.

[code]
4012-4013e15 87 kgaps, largest 1196 @ 4012914149323655627
[/code]

Reserving 4070-4071 and 4071-4072

pinhodecarlos 2017-06-06 18:25

Is there away someone could update the reservation table with more frequency? TIA

danaj 2017-06-06 19:59

[QUOTE=ATH;460572]There are only 31,894,014 strong 2-prps under 2^64. Is that too slow to give any speed benefits?[/quote]

I did not find the Euler or Miller-Rabin test to be consistently faster than a Fermat test, while the "Euler-Plumb" test comes out a small amount faster in my testing. This may be a function of implementation. The benefit for this thread's purpose is in the speed rather than strength, so we'd only want to use M-R if it was consistently faster.

Background on my using the test. I have looked for some fast compositeness (nee primality) test that is discriminating but reasonable faster than a M-R test. No real luck yet. The Fermat test is really not much faster than a M-R test (assuming equal implementations), but the Euler-Plumb test seems to be just a bit faster for me. Not enough faster to make it useful as a pretest in this scheme:

1. Simple pretests, e.g. trial division / gcd
2. [I]? pretest ?[/I]
3. M-R base 2
4 (strong or AES or ES) Lucas test

3&4 being the BPSW test.

However it is useful in a few places as a "maybe prime" pretest in place of a Fermat or M-R test, and for 64-bit inputs we have a lot of room for modification as it is correct for all inputs. We can probably substitute it in #3 for a small performance gain (only for 64-bit!).


[quote]Where can I find any information on the Euler-Plumb/Plump test?[/QUOTE]

Charles gives a good reference, and it looks like most others (e.g. GNU Crypto) have copied straight from that. My code is completely different, but the concept is all his.


Fermat base 2: [$]2^{p-1} \equiv 1 \pmod p[/$]
119M 64-bit pseudoprimes


Euler (Euler-Jacobi) base 2: [$]2^{(p-1)/2} \equiv \left(\frac{2}{p}\right) \pmod p[/$]
64M 64-bit psuedoprimes, all of which are also Fermat base 2 pseudoprimes.

So the Euler test is more discriminating than the Fermat test and could arguably be a small bit faster.


Colin Plumb started there and did a couple tricks. The first notes that [FONT="Courier New"](2|p)[/FONT] simplifies to [FONT="Courier New"](p % 8 == 1 || p % 8 == 7) ? 1 : -1[/FONT]. The second is to examine what happens if we backed up a square and notes in the [FONT="Courier New"]p % 8 == 1[/FONT] case we can skip the last squaring and accept -1 or +1. This not only sometimes saves another modular squaring but also removes some pseudoprimes. None are added -- all "Euler-Plumb" pseudoprimes are also Euler base 2 pseudoprimes and also Fermat base 2 pseudoprimes. 48M 64-bit pseudoprimes.


The Miller-Rabin (strong pseudoprime) test is stronger yet, with all pseudoprimes being a subset of the Fermat, Euler, and Euler-Plumb tests. 32M 64-bit pseudoprimes.

mart_r 2017-06-06 20:23

1 Attachment(s)
[QUOTE=Antonio;459546]OK, so it seems there are 2 problems.
1. It appears that Perl doesn't have write permission in the folder you are using.
2. You have a later version of Term::ReadKey on your system which doesn't work well with windows.
Could you try the attached code please and let me know the result..

[/QUOTE]

See attachment...

The problems as described before are the same (need to press any key in order to keep the calculation going). Good thing is, there's a log file now.

pinhodecarlos 2017-06-06 20:45

Why don't you use the available windows binaries? Which processor do you have?

danaj 2017-06-06 21:12

While the Perl code should be portable, Robert G's C code is ~10x faster so it is definitely preferred.

While more-or-less similar sieving could be done in the Perl library, it would be hard to get it both reasonably generic and avoid overhead in the C/Perl interface. IMO it's best to leave it all in C. It is still a mildly interesting problem (how much partial sieving would improve results from the current Perl algorithm), but I suspect not really relevant to this thread any more.

It looks currently like we should get the 4e18 to 5e18 range covered in under 6 months, which is pretty incredible. Maybe not given Oliveira e Silva's project record, but previously I don't think anyone had come close to matching that performance.

Antonio 2017-06-06 21:43

1 Attachment(s)
[QUOTE=mart_r;460679]See attachment...

The problems as described before are the same (need to press any key in order to keep the calculation going). Good thing is, there's a log file now.[/QUOTE]
Sorry about the programming errors, the attached version should work ok for you now though.


All times are UTC. The time now is 22:04.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.