mersenneforum.org  

Old 2019-05-09, 17:07   #67
axn
 
 
Jun 2003

1001000010111₂ Posts

Quote:
Originally Posted by Xyzzy
For future runs, started from scratch, would Gerbicz error checking eliminate the need for parallel runs?
Yes. Even for the present run, you can start from a known good check point and continue with GEC.
Old 2019-05-09, 19:39   #68
ewmayer
2ω=0
 
 
Sep 2002
República de California

2×3×1,879 Posts

Quote:
Originally Posted by axn
Yes. Even for the present run, you can start from a known good check point and continue with GEC.
I'm not so sure about that - it's the same issue that's been debated in other threads hereabouts: "Does GEC mean GIMPS can skip double-checks?" GEC can certainly give the tester much greater confidence in the results of his run, but there remains the issue of "now you need to convince the rest of the world".

My sense is that for academic-style comp-NT research like Fermat testing you're always going to need at least 2 independent runs. The academic-CS types are already gonna look askance at the fact of 2 runs done using floating-point rather than integer math. I will post the complete set of 10Miter interim residue files online after my runs complete so anyone who desires to do so can do a parallel DC using whatever code/algorithm they like, but personally I've no interest in (IMO) wasting time on the kind of pure-integer parallel verification we did for F24.
Old 2019-05-09, 20:00   #69
Mysticial
 
 
Sep 2016

7·47 Posts

Personally, I'm part of the "independent run" camp because that limits the number of failure points that can lead to an incorrect announcement.

What's the probability that GEC fails to detect an error? How many GEC checks are there? What about implementation bugs? Is it true that a single missed error among all the checks will result in an undetected wrong result?

The rule that I usually follow is: "Can an adversary cause a misleading result by injecting a single error into a flawless computation at just the right place and time?"

I don't know how GEC is implemented. But let's say your save file encounters a bit flip between when you wrote it and when you read it. Will it catch that?
Old 2019-05-09, 22:04   #70
ewmayer
2ω=0
 
 
Sep 2002
República de California

26012₈ Posts

A bit of good news re. the missing data from my 60M-FFT F30 run ... that was running on a 32-core avx2 Xeon, and the last time I was able to access said machine was late February (Feb 21, as I can now precisely date). I knew that run was at iteration ~700M at that point, but my local copy of the run-status file was months older and only covered through iteration 531.4M. But! - and this is one of the reasons I keep my macbook classic's OS frozen at the wonderfully stable OS X 10.6.8, despite the attendant lack-of-support headache, and why I set my xterms to save an effectively unlimited command history - it turns out the xterm from which I would do my several-times-weekly run-status quick-checks has data going all the way back to the above file-download date, last October. And a check of when-did-I-last-need-to-reboot confirms:

MacBook:obj_sse2 ewmayer$ uptime
14:56 up 206 days, 15:51, 3 users, load averages: 0.22 0.24 0.39

So by simply scrolling down through said window's long captured history I was able to recover a nicely granular snapshot of the 'missing' run history. My regular run-status checks consisted of 'tail -2 f30.stat && date'; here is the last portion of the thusly-recovered history:
Code:
[Feb 19 15:10:28] F30 Iter# = 681200000 [63.44% complete] clocks = 01:36:54.030 [  0.0581 sec/iter] Res64: C120506BBDB97A13. AvgMaxErr = 0.212160316. MaxErr = 0.312500000.
[Feb 19 16:50:10] F30 Iter# = 681300000 [63.45% complete] clocks = 01:39:38.457 [  0.0598 sec/iter] Res64: 63EA4A0A58D2DCFA. AvgMaxErr = 0.212213476. MaxErr = 0.312500000.

[Feb 21 20:10:37] F30 Iter# = 684600000 [63.76% complete] clocks = 01:35:04.862 [  0.0570 sec/iter] Res64: D4F7D0654B95C2C3. AvgMaxErr = 0.212131626. MaxErr = 0.281250000.
[Feb 21 21:48:03] F30 Iter# = 684700000 [63.77% complete] clocks = 01:37:22.205 [  0.0584 sec/iter] Res64: 9E0B4E5DD7C1C0A5. AvgMaxErr = 0.212171292. MaxErr = 0.312500000.
My KNL run @64M went up through just a little past iteration 730M, so assuming the avx-512 Skylake-Xeon timings @60M are, say, ~10% faster than @64M, I'll probably have my new cycle donor restart the 60M run from the iteration 680M checkpoint file and the 64M run from the iteration 730M checkpoint file, and let the former slowly catch up with the latter.
Old 2019-05-09, 23:12   #71
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

2³×863 Posts

Quote:
Originally Posted by Mysticial
What's the probability that GEC fails to detect an error?
Infinitesimal to a very large power.

Quote:
How many GEC checks are there?
Irrelevant.

Quote:
What about implementation bugs?
None known in prime95 29.6 (29.4 had one). Of course, if they were known we would fix them.

Quote:
Is it true that a single missed error among all the checks will result in an undetected wrong result?
A proper GEC implementation won't have a single missed error.

Quote:
The rule that I usually follow is: "Can an adversary cause a misleading result by injecting a single error into a flawless computation at just the right place and time?"
I've tried to make sure that can't happen, but now that you mention it there may be a way. If the loop counter is corrupted right after a GEC check there might be an issue. I'll have to go read the code. I think stack corruption that controls program flow (or instruction pointer corruption) is the biggest vulnerability.

Quote:
I don't know how GEC is implemented. But let's say your save file encounters a bit flip between when you wrote it and when you read it. Will it catch that?
Yes. There are always two bignums in memory (and in the save file). If either is bit-flipped, a later GEC check will catch the error.
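
For readers who, like Mysticial, haven't seen how GEC works: below is a minimal Python sketch of the Gerbicz check as it is usually described for a chain of modular squarings. It is not prime95's code. Alongside the running residue x it keeps a second bignum d, the running product of every block-th residue, which is presumably what "two bignums" refers to. The function name, the block size, the stand-in modulus 2^127 - 1 and the simulated bit flip are all assumptions chosen for illustration.

```python
def gec_squarings(n, total, block, base=3, flip_at=None):
    """Return (base^(2^total) mod n, number of rollbacks).

    Every `block` squarings the product d absorbs the current residue x.
    Every block*block squarings (and at the end) the Gerbicz relation
        d == base * d_prev^(2^block)  (mod n)
    is tested; on failure the run rolls back to the last verified state.
    """
    assert total % block == 0, "sketch assumes total is a multiple of block"
    x = d = base % n
    d_prev = d
    saved = (0, x, d)               # last verified (iteration, x, d)
    rollbacks = 0
    i = 0
    while i < total:
        x = x * x % n
        i += 1
        if i == flip_at:
            x ^= 1 << 7             # simulated one-off hardware bit flip
            flip_at = None          # transient error: does not recur on retry
        if i % block == 0:
            d_prev, d = d, d * x % n
            if i % (block * block) == 0 or i == total:
                if d == base * pow(d_prev, 1 << block, n) % n:
                    saved = (i, x, d)       # both bignums verified
                else:
                    i, x, d = saved         # error detected: roll back
                    rollbacks += 1
    return x, rollbacks


if __name__ == "__main__":
    n = (1 << 127) - 1              # small stand-in modulus (assumption)
    reference = pow(3, 1 << 400, n)
    print(gec_squarings(n, 400, 10) == (reference, 0))
    print(gec_squarings(n, 400, 10, flip_at=137) == (reference, 1))
```

In the second call the flip at iteration 137 corrupts x, the check at iteration 200 fails, and the run resumes from the state verified at iteration 100, so the final residue still matches a direct pow(). Each check costs only `block` extra squarings, which is why real implementations can afford to run it throughout a long test.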


Taking all the above into account, I agree that double-checks are still necessary, albeit at a lower priority than LL double-checking.

Last fiddled with by Prime95 on 2019-05-09 at 23:15
Old 2019-05-10, 02:47   #72
axn
 
 
Jun 2003

1001000010111₂ Posts

Quote:
Originally Posted by ewmayer
I'm not so sure about that - it's the same issue that's been debated in other threads hereabouts: "Does GEC mean GIMPS can skip double-checks?" GEC can certainly give the tester much greater confidence in the results of his run, but there remains the issue of "now you need to convince the rest of the world".

My sense is that for academic-style comp-NT research like Fermat testing you're always going to need at least 2 independent runs. The academic-CS types are already gonna look askance at the fact of 2 runs done using floating-point rather than integer math. I will post the complete set of 10Miter interim residue files online after my runs complete so anyone who desires to do so can do a parallel DC using whatever code/algorithm they like, but personally I've no interest in (IMO) wasting time on the kind of pure-integer parallel verification we did for F24.
Unfortunately, convincing the rest of the world is more a matter of expectation management than an actual mathematical one. So you're probably right that an independent double check is necessary in these scenarios. However ...

A simple scheme, a single GEC run plus saving all the GEC-checked full residues, should constitute a verifiable proof. With these residues, you or anyone else can do an independent double check later. In fact, this could be a novel angle for an academic paper -- how you managed to positively verify a number with a single run using GEC!
/IMO
Old 2019-05-10, 06:30   #73
Mysticial
 
 
Sep 2016

7·47 Posts

Quote:
Originally Posted by Prime95
Taking all the above into account, I agree that double-checks are still necessary, albeit at a lower priority than LL double-checking.
I imagine that some kind of double-check is needed anyway to protect against Byzantine faults.
Old 2019-05-16, 19:06   #74
ewmayer
2ω=0
 
 
Sep 2002
República de California

2·3·1,879 Posts

Thanks to a generous donation of cycles on a 32-core avx512 machine by a forumite who wishes to remain anonymous, the 60M-FFT F30 run has been restarted from the 680Miter savefile. At that FFT length this machine gets 60 ms/iter using 16 cores, but the || scaling deteriorates beyond that: using all 32 cores we get 50 ms/iter. In contrast, David Stanfill's 32-core avx2 Xeon, which I had been using for that run, was in the 80-90 ms/iter range using 16 cores but continued to scale nicely up to the full 32, getting 50 ms/iter as its fastest timing. More typically, due to other system loads (mainly the GPU), I got ~55 ms/iter on that one.

I am still trying to find out whassup w/David and the GIMPS KNL; to that end I sent e-mail to several addresses (squirrelsresearch.com@domainsbyproxy.com, press@airsquirrels.com) last weekend, but have received no reply. I was hoping to arrange to have the KNL ground-shipped to a GIMPS officer (maybe Aaron Blosser, who admins the Primenet server, has room) and then restart my 64M run on it.
Old 2020-01-19, 21:37   #75
ewmayer
2ω=0
 
 
Sep 2002
República de California

2×3×1,879 Posts

My anonymous cycle donor's continuation of my 2/3-complete run of F30 @60M FFT length finished on Friday; he has given me permission to thank him by name: it is Ryan Propper, who also did some yeoman's mass-DC work for GIMPS last year. Here are the final residues; again, the (mod 2^36) member of the now-traditional Selfridge-Hurwitz residue triplet is simply a decimal-form recapitulation of the low 9 hexits of the GIMPS-style Res64:
Code:
[Jan 15 11:28:26] F30 Iter# = 1069900000 [99.64% complete] clocks = 01:34:00.016 [  0.0564 sec/iter] Res64: D4E1ADB4D56B90F5. AvgMaxErr = 0.184720270. MaxErr = 0.281250000.
[Jan 15 13:02:36] F30 Iter# = 1070000000 [99.65% complete] clocks = 01:34:05.574 [  0.0565 sec/iter] Res64: E0A3C552712AD603. AvgMaxErr = 0.184693172. MaxErr = 0.281250000.
[Jan 15 14:36:48] F30 Iter# = 1070100000 [99.66% complete] clocks = 01:34:05.895 [  0.0565 sec/iter] Res64: 44DE22333F576C20. AvgMaxErr = 0.184676185. MaxErr = 0.281250000.
[Jan 15 16:09:26] F30 Iter# = 1070200000 [99.67% complete] clocks = 01:32:33.138 [  0.0555 sec/iter] Res64: C92DEFA95553316A. AvgMaxErr = 0.184739228. MaxErr = 0.265625000.
...
[Jan 17 23:54:10] F30 Iter# = 1073741823 [100.00% complete] clocks = 00:38:54.202 [  0.0558 sec/iter] Res64: A70C2A3DB98D6D9D. AvgMaxErr = 0.184703967. MaxErr = 0.250000000.
F30 is not prime. Res64: A70C2A3DB98D6D9D. Program: E17.1
F30 mod 2^36     =          58947628445
F30 mod 2^35 - 1 =          26425548225
F30 mod 2^36 - 1 =          59810773698
It's kinda stunning to see those iteration counts over 1 billion in the stat-file checkpoint lines - one knows they're gonna occur, but still, seeing actual run data like that for the first time is a bit of a "wow" experience.
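
The remark about the (mod 2^36) residue can be checked directly against the log above. A quick Python sketch (the variable names are mine): masking the low 36 bits, i.e. the low 9 hex digits, of the final Res64 reproduces the decimal value printed on the "mod 2^36" line.

```python
res64 = 0xA70C2A3DB98D6D9D          # final Res64 from the stat file above
low36 = res64 & ((1 << 36) - 1)     # low 36 bits = low 9 hex digits
print(f"{low36:09X}", low36)        # DB98D6D9D 58947628445
```

The other two members of the triplet, mod 2^35 - 1 and mod 2^36 - 1, depend on the full residue and cannot be recovered from Res64 alone.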

These still await confirmation by way of completion of the second run @64M FFT. I had been doing both the 60M and 64M runs on hardware hosted by David Stanfill; the 64M one was on the GIMPS-crowdfunded Intel Knights Landing machine which we used for early avx-512 code development. David went AWOL last spring, and Ryan has kindly agreed to also finish the 64M-FFT run on hardware available to him. Restarting that one @iteration 730M, at the 60 ms/iter he is getting on the same 32-core AVX-512 Intel Skylake virtual machine he used to finish the 60M run, the ETC is 8 months from now.
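
The 8-month ETC follows from simple arithmetic, sketched here in Python. The inputs are the figures quoted above (restart at iteration 730M, 60 ms/iter) plus the final iteration count 2^30 - 1 seen in the 60M run's log; the 30.44-day average month is my own assumption.

```python
TOTAL_ITERS = 2**30 - 1        # final Iter# in the F30 stat file (1073741823)
RESTART_ITER = 730_000_000     # checkpoint the 64M-FFT run resumes from
SEC_PER_ITER = 0.060           # quoted timing on the Skylake VM

remaining = TOTAL_ITERS - RESTART_ITER
days = remaining * SEC_PER_ITER / 86_400
months = days / 30.44          # assumed average month length
print(f"{remaining} iterations, about {days:.0f} days ({months:.1f} months)")
```

That comes to a little under 240 days of uninterrupted crunching, consistent with the quoted estimate.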

Ryan also sent me the every-10Miter persistent residue savefiles (in the same kind of FFT-length-independent bytewise format Mlucas uses for Mersenne-test savefiles) for his run, so - again pending cross-confirmation via the 64M run - I have a complete chain of 107 such files (128 Mbyte each). These can be used for any future independent confirmation of the F30 computation in distributed-computation fashion by multiple machines, each crunching one such 10Miter subinterval and verifying the next link in the residue chain.
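
The distributed confirmation described above can be illustrated on a toy scale. The Python sketch below runs the Pépin squaring chain for a small Fermat number, saves a residue every `interval` iterations, and then re-verifies each link independently, which is the job each machine would do for one 10Miter subinterval. The function names and the choice of F5 with a 10-iteration interval are illustrative assumptions; this is not Mlucas code and it ignores the savefile format entirely.

```python
def pepin_chain(n, interval):
    """Pepin test of F_n = 2^(2^n) + 1, saving a residue every `interval` squarings."""
    F = (1 << (1 << n)) + 1
    total = (1 << n) - 1                 # 3^((F-1)/2) needs 2^n - 1 squarings of 3
    x = 3
    chain = [(0, x)]                     # list of (iteration, residue) checkpoints
    for i in range(1, total + 1):
        x = x * x % F
        if i % interval == 0 or i == total:
            chain.append((i, x))
    return F, chain


def verify_link(F, start, end):
    """Redo the squarings between two saved (iteration, residue) pairs."""
    (i0, x), (i1, expected) = start, end
    for _ in range(i1 - i0):
        x = x * x % F
    return x == expected


if __name__ == "__main__":
    F, chain = pepin_chain(5, 10)
    links = list(zip(chain, chain[1:]))  # each link could go to a different machine
    print(all(verify_link(F, a, b) for a, b in links))
    print("prime" if chain[-1][1] == F - 1 else "composite")
```

Because every link starts from a saved residue, the links are independent of one another and can be checked in any order; a single corrupted checkpoint makes the links on either side of it fail.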

Last fiddled with by ewmayer on 2020-01-19 at 21:38
Old 2020-01-25, 19:01   #76
JeppeSN
 
 
"Jeppe"
Jan 2016
Denmark

115₁₀ Posts

Does this simply say that F30 is composite, or does it help to determine whether the cofactor F30 / ((149041*2^32 + 1)*(127589*2^33 + 1)) is composite (Suyama)? /JeppeSN
Old 2020-01-25, 19:55   #77
ewmayer
2ω=0
 
 
Sep 2002
República de California

2·3·1,879 Posts

@Jeppe: This is simply the basic Pépin residue ... I have Suyama cofactor-PRP code based on my own custom GMP-style bigint library also in place, but the various short-length residues of said test for the smaller Fermats where others have done the Pépin+Suyama and published e.g. the Res64 for the latter do not agree with mine. It may simply be a difference in some convention re. generation of the residue, but I need to dig into things and determine its source before I make any public announcements re. the F30 cofactor. (And also one wants the 2nd double-check run at the slightly larger FFT length to finish and confirm the Pépin-test results.)
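
For context on Jeppe's question, here is a sketch of the Suyama cofactor test as it is commonly stated: with F the product of the known prime factors and C = F_n / F, compute A = 3^(F_n - 1) mod F_n (the square of the Pépin residue) and B = 3^(F - 1) mod F_n; C is a probable prime exactly when A ≡ B (mod C). This is a toy Python illustration on small Fermat numbers with known factors, not ewmayer's code, and it says nothing about which convention published Res64 values follow. The function name is hypothetical.

```python
def suyama_cofactor_is_prp(n, known_factors):
    """Suyama-style PRP test of the cofactor of F_n = 2^(2^n) + 1."""
    Fn = (1 << (1 << n)) + 1
    F = 1
    for p in known_factors:
        F *= p
    assert Fn % F == 0, "known_factors must divide F_n"
    C = Fn // F                            # cofactor under test
    pepin = pow(3, (Fn - 1) // 2, Fn)      # basic Pepin residue
    A = pepin * pepin % Fn                 # 3^(F_n - 1) mod F_n
    B = pow(3, F - 1, Fn)                  # 3^(F - 1) mod F_n
    return (A - B) % C == 0


if __name__ == "__main__":
    print(suyama_cofactor_is_prp(5, [641]))      # cofactor 6700417
    print(suyama_cofactor_is_prp(6, [274177]))   # cofactor 67280421310721
```

Both cofactors here are known primes, so the test reports them as probable primes. The point of the construction is that the expensive part, the Pépin residue, is reused as-is; only the cheap 3^(F - 1) term depends on the known factors.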

All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.