mersenneforum.org gpuOwL: an OpenCL program for Mersenne primality testing

2020-01-11, 19:44   #1761
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

2·3·1,193 Posts

@nomead: You may want to try the four combinations of CARRY32/CARRY64 with and without OLD_CARRY_LAYOUT.

@everyone: Treat the new sin/cos and middlemul1 implementations (and now the new middlemul2 implementation) as test code. Clearly we need to do more analysis on the accuracy of these functions. For me, these new options yield a 25us (3.5%) improvement on Radeon VII. No errors in the last 12 hours, but I am not operating near the upper limit of the FFT size.

2020-01-11, 20:18   #1762
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

1BF6₁₆ Posts

It seems the Chebyshev method has accuracy issues. Until preda checks in the code that selects new defaults, "-use ORIGINAL_TWEAKED,ORIG_MIDDLEMUL2" is recommended.

2020-01-11, 20:27   #1763
PhilF

Feb 2005

225₁₆ Posts

Quote:
 Originally Posted by Prime95 It seems the Chebyshev method has accuracy issues. Until preda checks in the code that selects new defaults, "-use ORIGINAL_TWEAKED,ORIG_MIDDLEMUL2" is recommended.
Do "accuracy issues" reveal themselves as Gerbicz errors?

2020-01-11, 20:48   #1764
Prime95
P90 years forever!

Aug 2002
Yeehaw, FL

2·3·1,193 Posts

Quote:
 Originally Posted by PhilF Do "accuracy issues" reveal themselves as Gerbicz errors?
Yes.
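
For readers unfamiliar with the term: a Gerbicz error is a failed Gerbicz check, the consistency test that a PRP run performs on its chain of modular squarings. The sketch below is a toy Python illustration of the idea on a small Mersenne number, not gpuOwl's code; the function name, the block size of 8 and the bit flip used to simulate a bad iteration are all assumptions made for the example.

```python
def prp_with_gerbicz(p, block=8, corrupt_at=None):
    """Square 3 repeatedly mod 2^p - 1, checking each block of iterations.

    Returns (final_residue, all_checks_passed).
    """
    n = (1 << p) - 1   # Mersenne number 2^p - 1
    u0 = 3             # PRP base
    u = u0             # u_i = 3^(2^i) mod n
    d = u0             # running product u_0 * u_L * u_2L * ... (L = block)
    ok = True
    for i in range(1, p + 1):
        u = u * u % n
        if i == corrupt_at:
            u ^= 1     # simulate one inaccurate iteration
        if i % block == 0:
            d_prev = d
            d = d_prev * u % n
            # Raising d_prev to 2^L advances every factor by L squarings,
            # so an error-free block must satisfy d == u0 * d_prev^(2^L).
            if d != u0 * pow(d_prev, 1 << block, n) % n:
                ok = False
    return u, ok


if __name__ == "__main__":
    print(prp_with_gerbicz(31))                # clean run
    print(prp_with_gerbicz(31, corrupt_at=5))  # one corrupted squaring
```

With p = 31 the clean run passes every check, while flipping one bit at iteration 5 makes the check at iteration 8 fail. A real program would typically respond to a failed check by going back to the last verified state and retrying.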

2020-01-12, 00:41   #1765
nomead

"Sam Laur"
Dec 2018
Turku, Finland

2³×41 Posts

Quote:
 Originally Posted by Prime95 @nomead: You may want to try the four combinations of CARRY32/CARRY64 with and without OLD_CARRY_LAYOUT.
Ok, I tested that with several different FFT sizes, and at least on that card OLD_CARRY_LAYOUT made no difference either way: the performance is exactly the same to the microsecond level, with both CARRY64 and CARRY32.
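
For anyone who wants to repeat the comparison, the four combinations can be enumerated mechanically. This is only a sketch: it assumes the options are passed through the "-use" flag as a comma-separated list, as in the "-use ORIGINAL_TWEAKED,ORIG_MIDDLEMUL2" recommendation above, and that the binary is called gpuowl. The helper name is made up for the example.

```python
from itertools import product


def carry_combinations():
    """Return one command line per CARRY32/CARRY64 x layout combination."""
    commands = []
    for carry, layout in product(("CARRY32", "CARRY64"), ("", "OLD_CARRY_LAYOUT")):
        # Drop the empty string that stands for "without OLD_CARRY_LAYOUT".
        flags = ",".join(flag for flag in (carry, layout) if flag)
        commands.append(f"gpuowl -use {flags}")
    return commands


for command in carry_combinations():
    print(command)
```

Any other options in use (device, user and so on) would need to be appended to each line.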

Quote:
 Originally Posted by Prime95 @everyone: Treat the new sin/cos and middlemul1 implementations (and now the new middlemul2 implementation) as test code. Clearly we need to do more analysis on the accuracy of these functions. For me, these new options yield a 25us (3.5%) improvement on Radeon VII. No errors in the last 12 hours, but I am not operating near the upper limit of the FFT size.
Good hint there about the accuracy. I backed the exponent off a bit: going from 94000013 to 93000067 wasn't enough, errors still occurred but less often than before. But 92000059 was enough (still at FFT 5120K); now with NO_ASM,CARRY64,LESS_ACCURATE I reliably get that 3.463 ms / iter. So in total that's over 4% faster than where I started. I haven't built a version with the MiddleMul2 options yet.

2020-01-12, 01:09   #1766
nomead

"Sam Laur"
Dec 2018
Turku, Finland

328₁₀ Posts

Going from ORIG_MIDDLEMUL2 to CHEBYSHEV_MIDDLEMUL2 again improved the timing from 3.463 to 3.411 ms. Accuracy perhaps degraded a little bit more; now 93000067 exits immediately due to check errors (first check fails three times in a row) while with ORIG_MIDDLEMUL2 it got up to 30k iterations before failing. But 92000059 still works fine for at least 100k iterations.
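
To put the timings quoted so far on a common scale, here is a small worked calculation. It uses only numbers from the posts above (3.463 ms to 3.411 ms per iteration, and a 25 us gain described as 3.5%); the helper name is invented for the example.

```python
def percent_faster(old_time, new_time):
    """Relative reduction in time per iteration, in percent."""
    return (old_time - new_time) / old_time * 100.0


# ORIG_MIDDLEMUL2 -> CHEBYSHEV_MIDDLEMUL2, times in ms per iteration
print(f"{percent_faster(3.463, 3.411):.1f}%")   # prints 1.5%

# A 25 us gain quoted as 3.5% implies this baseline, in us per iteration
baseline_us = 25 / 0.035
print(f"{baseline_us:.0f} us")                  # prints 714 us
```

So the Chebyshev middlemul2 step is worth about 1.5% on its own here, at the cost of the accuracy margin described above.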
2020-01-12, 03:02   #1767
PhilF

Feb 2005

3²·61 Posts

Quote:
 Originally Posted by preda I added some untested code that is supposed to: 1. when a P-1 factor is found, all PRP entries from worktodo.txt for the same exponent are removed. No result is written (to results.txt) for these deleted tasks. 2. when a P-1 factor is found in the background (GCD) while a PRP test for the same exponent is ongoing, the PRP test is aborted early and the point 1. above is applied. I think this solution [in addition to bugs] has the problem of leaving PRP assignments "hanging" on primenet. Maybe the server could implement auto-release of the PRP assignments of a user when that user submits a factor for the same exponent (because, after a factor found, it does not make sense for the user that found the factor to pursue the PRP tests)
A possible bug:

I updated to the latest commit (267cc60). I was in the middle of a PRP test. I had previously completed a P-1 test on this exponent, so the p1.owl and p2.owl save files were already in the exponent's save folder, which may or may not be relevant. But what happened is the new gpuowl wiped out the save files and started the PRP test over. In the log file is the line:
Code:
 'worktodo.txt': Could not find the line 'PRP=<AID>,1,2,101949599,-1,76,2' to delete
So it looks like it thought there was a factor? Regardless, it appears to have wiped out the save files and started from scratch without checking whether or not the test already had some progress made.

2020-01-12, 04:29   #1768
preda

"Mihai Preda"
Apr 2015

10100110000₂ Posts

Quote:
 Originally Posted by PhilF A possible bug: I updated to the latest commit (267cc60). I was in the middle of a PRP test. I had previously completed a P-1 test on this exponent, so the p1.owl and p2.owl save files were already in the exponent's save folder, which may or may not be relevant. But what happened is the new gpuowl wiped out the save files and started the PRP test over. In the log file is the line: Code:  'worktodo.txt': Could not find the line 'PRP=<AID>,1,2,101949599,-1,76,2' to delete So it looks like it thought there was a factor? Regardless, it appears to have wiped out the save files and started from scratch without checking whether or not the test already had some progress made.
Was there a factor? (this would help diagnose the situation. If there was a factor it's mostly fine, if there wasn't it's more serious) -- you could look towards the end of your results.txt and see whether the P-1 reported a factor found ("F"). Also, you were running with -cleanup?

Until the problem is fixed (investigating), I'd recommend running without -cleanup; also make sure you have a newline on the last line of your worktodo.txt. What does your worktodo.txt look like now?

Last fiddled with by preda on 2020-01-12 at 04:32

2020-01-12, 04:39   #1769
PhilF

Feb 2005

3²×61 Posts

Quote:
 Originally Posted by preda Was there a factor? (this would help diagnose the situation. If there was a factor it's mostly fine, if there wasn't it's more serious) -- you could look towards the end of your results.txt and see whether the P-1 reported a factor found ("F"). Also, you were running with -cleanup?
No, there was no factor. The PRP test was about halfway finished.

I was not running with -cleanup. The command line I am using is:

gpuowl -device 0 -user pfrakes -cpu i7-4790 -B1 1000000 -B2 32000000

Worktodo.txt contained:

PRP=<aid redacted>,1,2,101949599,-1,76,0

I just realized that worktodo.txt contains a PFactor= line for this exponent which may have already been there when I updated gpuowl (or gpuowl added it, I'm not sure). If it was already there maybe it confused the program.

EDIT: There is not a newline at the end of the worktodo.txt file. Could that be the problem?
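
To illustrate why a missing final newline could matter when deleting an entry, here is a sketch in Python. It is not gpuOwl's actual code: both functions are hypothetical, and "AID" is a placeholder for the redacted assignment ID. The first function searches for the task text followed by a newline and misses a last line that lacks one; the second compares line by line and does not care.

```python
def naive_remove(text, task):
    """Delete the exact text 'task + newline'; fails on an unterminated last line."""
    target = task + "\n"
    if target not in text:
        return text, False
    return text.replace(target, "", 1), True


def robust_remove(text, task):
    """Compare line by line, ignoring line endings and surrounding spaces."""
    lines = text.splitlines()
    kept = [line for line in lines if line.strip() != task.strip()]
    removed = len(kept) != len(lines)
    # Rewrite the file content with a newline after every remaining line.
    return "".join(line + "\n" for line in kept), removed


task = "PRP=AID,1,2,101949599,-1,76,0"
worktodo = "PFactor=example\n" + task   # note: no newline after the last line

print(naive_remove(worktodo, task)[1])    # False: the line is "not found"
print(robust_remove(worktodo, task)[1])   # True
```

Separately from the newline question, the line named in the log message ends in ",76,2" while the entry quoted from worktodo.txt ends in ",76,0", so a comparison on the full text of the line would not match those two strings either.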

Last fiddled with by PhilF on 2020-01-12 at 04:42

2020-01-12, 08:04   #1770
preda

"Mihai Preda"
Apr 2015

2⁴×83 Posts

Could you double check whether you actually lost the PRP savefiles? That's highly surprising, because gpuOwl does not delete the content of the past exponents ever, except when using -cleanup (which you aren't using).

So, please track down the exponent whose PRP test was half-way done (from gpuowl.log). Next, look in the folder for that exponent; you should have the savefiles safely there -- not deleted and not lost.

What I think happened is this: you simply started a new exponent (a different one) from worktodo.txt. The order of worktodo entries changed, and the exponent you were 50% through is still there. Maybe it even has an entry in the worktodo.txt.

An extended excerpt of gpuowl.log would help with understanding what happened.

Quote:
 Originally Posted by PhilF No, there was no factor. The PRP test was about halfway finished. I was not running with -cleanup. The command line I am using is: gpuowl -device 0 -user pfrakes -cpu i7-4790 -B1 1000000 -B2 32000000 Worktodo.txt contained: PRP=<aid redacted>,1,2,101949599,-1,76,0 I just realized that worktodo.txt contains a PFactor= line for this exponent which may have already been there when I updated gpuowl (or gpuowl added it, I'm not sure). If it was already there maybe it confused the program. EDIT: There is not a newline at the end of the worktodo.txt file. Could that be the problem?

2020-01-12, 08:43   #1771
paulunderwood

Sep 2002
Database er0rr

110110101101₂ Posts

Quote:
 Originally Posted by paulunderwood I don't understand it. I git cloned gpuowl and compiled, and it runs slower than before 1240 us. vs. 750 us. What am I doing wrong?
The latest cloned version compiles just fine. I was running at 760 us and now it is 709 us -- another amazing speed-up. Plus I can do the P-1 pre-factoring.


All times are UTC.