mersenneforum.org  

Old 2020-08-11, 02:33   #12
ATH
Einyen
 
Dec 2003
Denmark

2×1,489 Posts

Ok 30.3b2 worked to upload the proof.
Old 2020-08-11, 12:42   #13
Happy5214
 
"Alexander"
Nov 2008
The Alamo City

6278 Posts

I had an error when trying to certify a PRP-CF proof (v30.3b2, Ubuntu 20.04):

Code:
[Worker #1 Aug 11 07:40] Starting certification of M8608507 using FFT length 448K, Pass1=448, Pass2=1K, clm=4
[Comm thread Aug 11 07:40] CURL library error: 
[Comm thread Aug 11 07:40] CURL library error: 
[Worker #1 Aug 11 07:40] Error getting CERT starting value.
[Worker #1 Aug 11 07:40] Aborting processing of this work unit -- will try again later.
Old 2020-08-12, 03:31   #14
Aramis Wyler
 
"Bill Staffen"
Jan 2013
Pittsburgh, PA, USA

2×5×41 Posts

Running mprime for the first time on a new Ryzen 5 3600.


I used mostly defaults - 2 workers and 3 cores each - with work type 150 (First time Prime checks).


I'm posting because I'm getting an enormous number of possible roundoff errors on each worker. The build is new and the CPU could be defective, but I haven't seen any errors or heat issues other than these roundoff errors.


[Worker #2 Aug 11 23:22] Setting affinity to run helper thread 2 on CPU core #6
[Worker #2 Aug 11 23:22] M110534549 stage 1 is 1.17% complete.
[Worker #2 Aug 11 23:23] Possible roundoff error (0.5), backtracking to last save file.
[Worker #2 Aug 11 23:23] Setting affinity to run helper thread 1 on CPU core #5
[Worker #2 Aug 11 23:23] Using FMA3 FFT length 6M, Pass1=1536, Pass2=4K, clm=1, 3 threads
[Worker #2 Aug 11 23:23] Setting affinity to run helper thread 2 on CPU core #6
[Worker #2 Aug 11 23:23] M110534549 stage 1 is 1.22% complete.
[Worker #1 Aug 11 23:23] M110534311 stage 1 is 1.23% complete. Time: 112.608 sec.
[Worker #1 Aug 11 23:24] Possible roundoff error (0.5), backtracking to last save file.
[Worker #1 Aug 11 23:24] Setting affinity to run helper thread 1 on CPU core #2
[Worker #1 Aug 11 23:24] Using FMA3 FFT length 6M, Pass1=1536, Pass2=4K, clm=1, 3 threads
[Worker #1 Aug 11 23:24] Setting affinity to run helper thread 2 on CPU core #3
[Worker #1 Aug 11 23:24] M110534311 stage 1 is 0.14% complete.
[Worker #2 Aug 11 23:25] Possible roundoff error (0.5), backtracking to last save file.
[Worker #2 Aug 11 23:25] Setting affinity to run helper thread 1 on CPU core #5
[Worker #2 Aug 11 23:25] Setting affinity to run helper thread 2 on CPU core #6
[Worker #2 Aug 11 23:25] Using FMA3 FFT length 6M, Pass1=1536, Pass2=4K, clm=1, 3 threads
[Worker #2 Aug 11 23:25] M110534549 stage 1 is 1.40% complete.
[Worker #2 Aug 11 23:25] Possible roundoff error (0.5), backtracking to last save file.
[Worker #2 Aug 11 23:25] Setting affinity to run helper thread 2 on CPU core #6
[Worker #2 Aug 11 23:25] Setting affinity to run helper thread 1 on CPU core #5
[Worker #2 Aug 11 23:25] Using FMA3 FFT length 6M, Pass1=1536, Pass2=4K, clm=1, 3 threads
[Worker #2 Aug 11 23:25] M110534549 stage 1 is 1.40% complete.
[Worker #1 Aug 11 23:25] M110534311 stage 1 is 0.74% complete. Time: 112.937 sec.
[Worker #1 Aug 11 23:26] Possible roundoff error (0.5), backtracking to last save file.
[Worker #1 Aug 11 23:26] Using FMA3 FFT length 6M, Pass1=1536, Pass2=4K, clm=1, 3 threads
[Worker #1 Aug 11 23:26] Setting affinity to run helper thread 2 on CPU core #3
[Worker #1 Aug 11 23:26] Setting affinity to run helper thread 1 on CPU core #2
[Worker #1 Aug 11 23:26] M110534311 stage 1 is 0.14% complete.



EDIT: This is with v30.3 build 2 on 64 bit debian.

Last fiddled with by Aramis Wyler on 2020-08-12 at 03:34 Reason: Add Prime95 version.
Old 2020-08-12, 07:02   #15
intelfx
 
Jul 2020

13 Posts

Quote:
Originally Posted by Happy5214 View Post
I had an error when trying to certify a PRP-CF proof (v30.3b2, Ubuntu 20.04):

Code:
[Worker #1 Aug 11 07:40] Starting certification of M8608507 using FFT length 448K, Pass1=448, Pass2=1K, clm=4
[Comm thread Aug 11 07:40] CURL library error: 
[Comm thread Aug 11 07:40] CURL library error: 
[Worker #1 Aug 11 07:40] Error getting CERT starting value.
[Worker #1 Aug 11 07:40] Aborting processing of this work unit -- will try again later.
Same here.


Two issues:
  1. When I first started mprime today, mprime reported that it got CERT work, but then proceeded to work on previous assignments:
    Code:
    Aug 12 07:28:32 stratofortress.nexus.i.intelfx.name mprime[15264]: [Comm thread Aug 12 07:28] Sending expected completion date for M110701609: Aug 23 2020 
    Aug 12 07:28:33 stratofortress.nexus.i.intelfx.name mprime[15264]: [Work thread Aug 12 07:28] Running Jacobi error check.  [Aug 12 07:28] PrimeNet success code with additional info: 
    Aug 12 07:28:33 stratofortress.nexus.i.intelfx.name mprime[15264]: [Comm thread Aug 12 07:28] Server assigned CERT work. 
    Aug 12 07:28:33 stratofortress.nexus.i.intelfx.name mprime[15264]: [Comm thread Aug 12 07:28] Got assignment 005E9C2063038514CF0D0DD5E4DCFCAE: CERT M10447057 
    Aug 12 07:28:33 stratofortress.nexus.i.intelfx.name mprime[15264]: [Comm thread Aug 12 07:28] Done communicating with server. 
    Aug 12 07:28:58 stratofortress.nexus.i.intelfx.name mprime[15264]: [Work thread] Passed.  Time: 25.646 sec. 
    Aug 12 07:28:58 stratofortress.nexus.i.intelfx.name mprime[15264]: [Work thread Aug 12 07:28] Resuming primality test of M109983959 using FMA3 FFT length 6M, Pass1=1536, Pass2=4K, clm=1, 16 threads
  2. When I manually edited my worktodo.txt to place the cert work in front of the queue, mprime entered a failure loop:
    Code:
    Aug 12 07:30:06 stratofortress.nexus.i.intelfx.name mprime[16107]: [Work thread Aug 12 07:30] Starting certification of M10447057 using FMA3 FFT length 560K, Pass1=448, Pass2=1280, clm=1, 16 threads 
    Aug 12 07:30:06 stratofortress.nexus.i.intelfx.name mprime[16107]: [Comm thread Aug 12 07:30] CURL library error: 
    Aug 12 07:30:06 stratofortress.nexus.i.intelfx.name mprime[16107]: [Comm thread Aug 12 07:30] CURL library error: 
    Aug 12 07:30:06 stratofortress.nexus.i.intelfx.name mprime[16107]: [Work thread Aug 12 07:30] Error getting CERT starting value.  Will try again later. 
    Aug 12 07:30:06 stratofortress.nexus.i.intelfx.name mprime[16107]: [Work thread Aug 12 07:30] Aborting processing of this work unit.
    (mprime went on to repeat these messages indefinitely)
Are these bugs, server problems, or a misconfiguration on my part?


Edit: when I subsequently edited worktodo.txt to put the CERT assignment at the back of the queue and restarted mprime, it still attempted to pick up the CERT assignment first, even though it was at the end of the queue (which suggests CERT work gets priority). Hence I conclude that (1) is a bug.
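
For anyone who wants to repeat the reordering from issue 2 without hand-editing, here is a minimal sketch. It assumes CERT entries are lines beginning with `Cert=` and that the file has a single worker section; neither is confirmed in this thread, and the entries in the example are placeholders, not real assignments.

```python
def move_cert_first(lines, prefix="Cert="):
    """Return a copy of `lines` with entries starting with `prefix` moved to the front.

    Leading section headers (lines starting with "[") and blank lines stay
    where they are. The "Cert=" prefix is an assumption about the file format.
    """
    rest = list(lines)
    head = []
    # Keep any leading header or blank lines in place.
    while rest and (rest[0].startswith("[") or not rest[0].strip()):
        head.append(rest.pop(0))
    cert = [entry for entry in rest if entry.startswith(prefix)]
    other = [entry for entry in rest if not entry.startswith(prefix)]
    return head + cert + other


# Placeholder entries for illustration only.
queue = ["[Worker #1]", "PRP=placeholder-a", "PRP=placeholder-b", "Cert=placeholder-c"]
print(move_cert_first(queue))
```

Stop mprime before rewriting the file. As noted above, the position in the queue did not seem to change which assignment was picked up, so this is only useful for reproducing the report.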

Last fiddled with by intelfx on 2020-08-12 at 07:10
Old 2020-08-12, 10:35   #16
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2·3·1,193 Posts

Due to a server issue, Linux clients can neither get the CERT starting value, nor upload proofs. Aaron or I will have a fix today.
Old 2020-08-12, 10:38   #17
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2·3·1,193 Posts

Quote:
Originally Posted by Aramis Wyler View Post
Running mprime for the first time on a new Ryzen 5 3600.

I used mostly defaults - 2 workers and 3 cores each - with work type 150 (First time Prime checks).

I'm posting because I'm getting an enormous number of possible roundoff errors on each worker. The build is new and the CPU could be defective, but I haven't seen any errors or heat issues other than these roundoff errors.
Hardware issues. Try the standard remedies: lower the memory frequency or the CPU speed, or increase voltages. Find a combination that can pass the torture test.
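
While working through those remedies, it may help to count how often each worker reports a roundoff error, so that one set of settings can be compared against another. A minimal sketch, assuming the log lines look like the ones quoted in this thread; the function name and the sample list are illustrative only, and this does not replace the torture test.

```python
import re
from collections import Counter

# Matches lines such as:
# [Worker #2 Aug 11 23:23] Possible roundoff error (0.5), backtracking to last save file.
ROUNDOFF = re.compile(r"\[Worker #(\d+) [^\]]*\] Possible roundoff error")


def count_roundoff_errors(log_lines):
    """Return a Counter mapping worker number to the number of roundoff warnings."""
    counts = Counter()
    for line in log_lines:
        match = ROUNDOFF.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


sample = [
    "[Worker #2 Aug 11 23:23] Possible roundoff error (0.5), backtracking to last save file.",
    "[Worker #1 Aug 11 23:24] Possible roundoff error (0.5), backtracking to last save file.",
    "[Worker #2 Aug 11 23:25] Possible roundoff error (0.5), backtracking to last save file.",
    "[Worker #2 Aug 11 23:25] M110534549 stage 1 is 1.40% complete.",
]
print(dict(count_roundoff_errors(sample)))
```

To run it on a real log, pass an open file object instead of the sample list; the function accepts any iterable of lines.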
Old 2020-08-12, 19:03   #18
S485122
 
Sep 2006
Brussels, Belgium

33×59 Posts

Updated the software to the latest version.

Received a CERT work unit ... to certify a cofactor for the factored Mersenne number 10482449. The configured work preference is double-check primality testing.

By adjusting the configuration I will be spared from cofactor certifying (and any other CERT jobs). The result of the CERT work done shows only "n/a"; one has to go to the status of the number https://www.mersenne.org/report_expo...0482449&full=1 to see that the cofactor work has been certified OK (Verified). But it is not clear what the status of the exponent is: fully factored? Anyway, that type of work (cofactors) is absolutely not something I signed up for. Imposing that kind of work is at odds with the "let the user decide" philosophy of the project.

There obviously remains a wee bit of tuning to do on PrimeNet.

Jacob
Old 2020-08-12, 19:39   #19
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2·3·1,193 Posts

Quote:
Originally Posted by S485122 View Post
Updated the software to the latest version.

Received a CERT work unit ... to certify a cofactor for the factored Mersenne number 10482449. The configured work preference is double-check primality testing.

By adjusting the configuration I will be spared from cofactor certifying (and any other CERT jobs). The result of the CERT work done shows only "n/a"; one has to go to the status of the number https://www.mersenne.org/report_expo...0482449&full=1 to see that the cofactor work has been certified OK (Verified). But it is not clear what the status of the exponent is: fully factored? Anyway, that type of work (cofactors) is absolutely not something I signed up for. Imposing that kind of work is at odds with the "let the user decide" philosophy of the project.

There obviously remains a wee bit of tuning to do on PrimeNet.
CERT for PRP-CF is trivially quick work. I'm not sure why you found it so distasteful.

CERT for PRP is really a kind of PRP-DC. It is not a separate work preference choice as the server does not have much of that work type to hand out.

I'm glad you figured out how to disable CERT work. Kriesel also disabled CERT work because of the impact on his LL testing -- a Jacobi check to save his LL test and another Jacobi check on resume. I'm thinking the ability to turn off CERT work needs to be more prominent -- perhaps a checkbox at the bottom of the Worker Windows dialog box.

I agree, the server web pages need a lot of work due to proofs.
Old 2020-08-12, 20:52   #20
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

111338 Posts

To clarify, I set the download rate to 0 in prime95 on most of my systems but not all, down to a point where I think I'll still be doing my fair share. Doing an order of magnitude more CERTs than primality tests was a small drag on throughput and efficiency, and made my testing throughput unpredictable. There were others who were interested in doing more CERTs than they were being assigned, so throttling my CERT throughput down considerably from the initial disparity created a win-win. And I am appreciative of those who are doing CERTs on my PRP or PRPDC in 120M-200M. These runs are meant to detect any issues with FFT length cutoffs etc., well ahead of the wavefront. (https://www.mersenneforum.org/showpo...1&postcount=6; similar lower priority effort with LL/LLDC at https://www.mersenneforum.org/showpo...78&postcount=4)

GIMPS is going through a complicated transition currently, and more rapidly it seems than originally projected. Software bugs are being identified and dealt with, in server and client code. Good bug reports, and patience, are recommended.

It will take a long time to get the bulk of the clients updated. Early adopters of prime95/mprime v30.x are bearing the brunt of CERT for both mprime/prime95 and gpuowl production. (Either curtisc or Ben Delo updating a fraction of their fleet would help a lot. But like for everyone in this all-volunteer project, their kit, their call. And if they had started already, we wouldn't know without doing some checking.)

Last fiddled with by kriesel on 2020-08-12 at 20:56
Old 2020-08-12, 22:22   #21
ATH
Einyen
 
Dec 2003
Denmark

2×1,489 Posts

Quote:
Originally Posted by Prime95 View Post
CERT for PRP-CF is trivially quick work. I'm not sure why you found it so distasteful.
CERT for PRP-CF for a 10.48M exponent took 18 sec on 8 cores, so 2.5 min tops if run on a single core, and maybe 3-5 minutes at most if you have a very old CPU and are running it on 1 core.
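
The single-core figure is just the measured wall time multiplied by the core count. A minimal sketch of that arithmetic, assuming perfectly linear scaling across cores (multithreaded runs usually scale worse than that, so the real single-core time should come out at or below this bound); the function name is illustrative.

```python
def single_core_upper_bound(wall_seconds, cores):
    """Core-seconds used by the run; an upper bound on single-core time
    under the assumption that scaling across cores is at best linear."""
    return wall_seconds * cores


bound_seconds = single_core_upper_bound(18, 8)
print(f"{bound_seconds} s = {bound_seconds / 60:.1f} min")  # 144 s = 2.4 min
```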

Last fiddled with by ATH on 2020-08-12 at 22:23
Old 2020-08-12, 23:03   #22
Xyzzy
 
"Mike"
Aug 2002

11·709 Posts

Is there a worktype option to select proof work?

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.