mersenneforum.org  

Old 2020-05-20, 00:04   #2179
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,419 Posts
Default gpuowl-win v6.11-285 build

Here it is. See the commit descriptions on GitHub for what's new or changed: https://github.com/preda/gpuowl
Attached Files
File Type: txt build-log.txt (8.2 KB, 66 views)
File Type: 7z gpuowl-v6.11-285-gf25ecbd.7z (474.9 KB, 73 views)
Old 2020-05-20, 03:35   #2180
preda
 
"Mihai Preda"
Apr 2015

55B₁₆ Posts
Default

Quote:
Originally Posted by kriesel View Post
Another one, matching Roland Clarkson's first test but rejected by the server:
Code:
2020-05-15 00:28:04 asr2/radeonvii2 121642771 OK 121600000  99.96%; 2151 us/it; ETA 0d 00:02; f394cb39ecc84d04 (check 1.16s)

{"status":"C", "exponent":"121642771", "worktype":"PRP-3", "res64":"a3569f57e1792d__", "residue-type":"1", "errors":{"gerbicz":"0"}, "fft-length":"7340032", "program":{"name":"gpuowl", "version":"v6.11-278-ga39cc1a"}, "user":"kriesel", "computer":"asr2/radeonvii2", "timestamp":"2020-05-15 05:29:38 UTC"}
https://www.mersenne.org/report_expo...exp_hi=&full=1
How do you get the assignments? I understand that the manual assignment page is smart enough to only return DC assignments that had a non-zero shift initially. If the server does not do that (but I was under the impression it did), maybe you could also verify that the initial LL had non-zero shift before starting the gpuowl DC on it.
Old 2020-05-20, 04:46   #2181
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

7²×197 Posts
Default

Mihai, you miss the point. It doesn't matter how and where we got those assignments from. The tool doesn't work properly.

Assuming there is a yet-to-be-uncovered error in gpuOwl's multiplication routines, always starting with shift zero means always operating on the same data, and therefore always producing the same incorrect result, which we cannot detect.

This is most relevant for LL tests and for P-1, where there is no GC. A random (or user-specified) shift at startup is a must-have. It ensures the sanity of the tests, it allows re-testing/DCing/etc. of ANY former result (including results produced by gpuOwl itself), which adds utility to the tool, and it also ensures the sanity of the code itself. This is what we (and Ken) are barking about.
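To make the shift argument concrete, here is a minimal Python sketch of a shifted LL test. It assumes the usual scheme in which the working residue is kept multiplied by 2^shift modulo M_p and the shift doubles modulo p at every squaring. The name `ll_residue` is hypothetical, and the sketch uses plain big-integer arithmetic rather than an FFT, so it only illustrates the bookkeeping and is not how gpuOwl or any other program implements it.

```python
def ll_residue(p, shift=0):
    """Lucas-Lehmer residue of M_p = 2^p - 1, computed on shifted data.

    The working value is s * 2^shift (mod M_p). Because 2^p == 1 (mod M_p),
    the shift only matters modulo p.
    """
    m = (1 << p) - 1
    shift %= p
    s = (4 << shift) % m                    # s_0 = 4, stored shifted
    for _ in range(p - 2):
        shift = (2 * shift) % p             # squaring doubles the shift
        s = (s * s - (2 << shift)) % m      # the "- 2" must be shifted too
    # Undo the shift: multiply by 2^(p - shift), the inverse of 2^shift.
    return (s << ((p - shift) % p)) % m


if __name__ == "__main__":
    for p in (11, 13):
        residues = {ll_residue(p, k) for k in (0, 1, 5, 7)}
        print(p, residues)
```

Every starting shift gives the same final residue, but the intermediate values fed to the multiplication differ from run to run. That is why two runs with different shifts exercise the multiplication code on different data, while two zero-shift runs do not.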

This is by no means an attempt to undermine your work. You made an amazing effort to implement all the FFT and stuff by yourself from scratch, to make it faster, and to share it with the community, and we really appreciate it. As a programmer myself, I can testify to the huge effort and knowledge such a task needs. Now, why don't you want to make it... properly?

This should include shifts, proper file names, and keeping the history, as cudaLucas does. Until then, many of us will still prefer cudaLucas, even if they don't have the time/guts/whatever to say so publicly here.

Everybody wants to use gpuOwl because it is faster. But as it is now, its usefulness is quite limited: it can only be used to double-check old runs which were NOT done with a zero shift. We are too paranoid to use it for new tests. And assuming the happiest case, where more and more people start to use it, we will reach a point where gpuOwl users have to WAIT for other people to complete P95 tests in order to have something to DC. As an R7 is about 6-7 times faster at PRP than a 10-core i7 processor, it is enough that 1/7 of the users put their cards to work, and we are in the mud. This may seem far-fetched and far in the future, because many users don't have an R7, but they don't have 10-core CPUs either. The future may come sooner than most of us imagine. I already have a list of tests that were run as PRP and DC in parallel on two cards, and I could not report the DC because of the identical shift. I am not crying for credit or candies, but first of all this is a waste of resources, and it slows the project down in the long run, as somebody will have to redo in the future the work I already did and cannot report.

Also, every time gpuOwl produces a mismatch, we will still need to wait for the (slow) P95 run for the TC. And there are many other situations.

But this is not the worst. The worst is that, in spite of the fact that I did TWO RUNS in parallel, I am still not confident that the result is correct. I am only confident that there was no hardware error: both runs produced the same final residue, so my hardware is sane. But I cannot be sure (and I mean the general "I" here) that the FFT implementation is correct, because both instances started with the same shift, so they dealt with the same data throughout the test. If there is an error in the code, then both runs have the error. And my paranoia won't let me sleep... hehe...

Your job must be to offer an alternative to P95, not a secretary to it.

Last fiddled with by LaurV on 2020-05-20 at 05:27
Old 2020-05-20, 09:38   #2182
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,419 Posts
Default

For us hard-core GIMPSters, the numerical discrepancy between gpuowl and CPU throughput will become much larger than what LaurV has stated. One modest 4-core CPU can support several Radeon VIIs (or Radeon Pro VIIs when they come out), given a suitable power supply, motherboard and chassis. Something like that is what George and Ernst are now doing, and others too. The power/performance efficiency of the Radeon VII will drive it that way.
We need nonzero shift in gpuowl, both PRP and LL. You've done it before in LL. Please bring it back.
Other error detection measures would be very welcome too. (You've done the Jacobi check before too.)
Old 2020-05-20, 09:49   #2183
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1010100101011₂ Posts
Default

Quote:
Originally Posted by preda View Post
How do you get the assignments? I understand that the manual assignment page is smart enough to only return DC assignments that had a non-zero shift initially. If the server does not do that (but I was under the impression it did), maybe you could also verify that the initial LL had non-zero shift before starting the gpuowl DC on it.
I think you understand correctly. Where there is only one completed first test in a million-exponent-range bin, I will sometimes run a test without an assignment if I cannot get one. See https://mersenneforum.org/showpost.p...9&postcount=14 Some of that gets done when I'm too tired or not fully awake yet, and I've overlooked the 0-offset coincidence problem at times. (Chalsall's "Never send a man to do a computer's job" comes to mind here.) A pseudorandom offset in gpuowl, as in essentially all other GIMPS production primality testers, would make that a nonissue and make gpuowl their equal. These have it:
mprime/prime95;
mlucas;
cudalucas
(I think cllucas did not, and was not used much.)

Last fiddled with by kriesel on 2020-05-20 at 09:51
Old 2020-05-20, 11:08   #2184
axn
 
Jun 2003

2×3×7×11² Posts
Default

Quote:
Originally Posted by kriesel View Post
We need nonzero shift in gpuowl, both PRP and LL. You've done it before in LL. Please bring it back.
Other error detection measures would be very welcome too. (You've done the Jacobi check before too.)
LL is not the future. PRP is. Non-zero shift is a relic of LL days without effective error check. It is completely unnecessary with PRP/GEC.

IMO, it is high time we made PRP the default test type and started requiring everyone to use it instead of first-time LL tests. [Yes, I know why it can't happen -- damn older clients].
Old 2020-05-20, 11:59   #2185
Jan S
 
Oct 2018
Slovakia

2·3·11 Posts
Default

@axn:

But LL double/triple checking is still here. This morning I started a triple check with GPUowl, but when I read this thread, I stopped it.
Old 2020-05-20, 12:13   #2186
preda
 
"Mihai Preda"
Apr 2015

3×457 Posts
Default

Quote:
Originally Posted by LaurV View Post
Mihai, you miss the point. It doesn't matter how and where we got those assignments from. The tool doesn't work properly.
[...]
Your job must be to offer an alternative to P95, not a secretary to it.
Laur, I do consider all feature requests and I try not to reject dogmatically. (I also try to keep the scope, and the code-size, to a minimum.)

Talking about LL and PRP (i.e. ignoring P-1 for a moment), I think the offset is most useful for LL. With gpuowl, the focus of LL is on double-checking past LL results. The majority of past LL tests were done with mprime at a non-zero offset. Validating a non-zero-offset mprime result with gpuowl is very strong, stronger than validating with different offsets using a single program.

So, what is the use-case that is a pain point for you, that is not covered? Are you doing first-time LL with GPUs? If so, maybe you should try to do PRP instead of LL.

The number of first-time LL tests done with gpuowl should be a tiny minority, and that minority can be checked with mprime without any difficulty.
Old 2020-05-20, 12:30   #2187
preda
 
preda's Avatar
 
"Mihai Preda"
Apr 2015

3·457 Posts
Default Jacobi is back

Hi, in a recent commit https://github.com/preda/gpuowl/comm...a13478c192bb3d
I try to bring back the Jacobi check for LL. This is how it works and what changes for LL:

1. When: by default, a Jacobi check is done every 1M iterations. This can be configured with the -jacobi <step> command line argument, which takes a number of iterations. The check is rather slow (on the order of 1 minute) and takes up one CPU core, so I think it shouldn't be done too often (thus the default of 1M iterations).

2. Savefiles: an LL savefile is only ever written after a successful Jacobi check. There is no possibility of an LL save that did not pass Jacobi. Combined with the above point about the frequency of Jacobi, this means that saves are less frequent (by default every 1M its). The Jacobi check is also triggered on exit (Ctrl-C), so if the user is willing to wait the 1 min after Ctrl-C, the savefile will be up to date. OTOH, if there's a power cut, no luck.

3. Moving backwards: the check is done in the background on the CPU, while the LL test keeps advancing. If the background Jacobi check fails, the test should automatically resume from the most recent savepoint.

4. Logging: the log-lines for LL now contain these codes:
"LL": a simple not-checked log line of LL
"OK": an iteration that passed Jacobi
"EE": an iteration that failed Jacobi

There may be bugs, as usual.
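
The save-and-rollback policy in points 1 to 4 can be modelled with a toy Python sketch. This is not gpuowl's code. It assumes the standard LL Jacobi invariant: with s_0 = 4 and no shift, the Jacobi symbol (s_i - 2 | M_p) equals -1 for every i >= 1. The names `jacobi` and `ll_with_jacobi` and the `corrupt_at` parameter are hypothetical, and the check runs synchronously here instead of in the background. Keep in mind that a Jacobi check is generally expected to catch a random error only about half the time.

```python
def jacobi(a, n):
    """Jacobi symbol (a | n) for odd n > 0, standard binary algorithm."""
    assert n > 0 and n % 2 == 1
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                          # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def ll_with_jacobi(p, step, corrupt_at=None):
    """Toy LL loop: check every `step` iterations, save only after a pass.

    corrupt_at (illustrative only) flips one bit, once, at that iteration
    to simulate a transient hardware error.
    Returns (final residue, log of ("OK" | "EE", iteration)).
    """
    m = (1 << p) - 1
    s, i = 4, 0
    saved = (i, s)                           # last state that passed the check
    log = []
    injected = False
    while i < p - 2:
        s = (s * s - 2) % m
        i += 1
        if i == corrupt_at and not injected:
            s ^= 1
            injected = True
        if i % step == 0 or i == p - 2:
            if jacobi(s - 2, m) == -1:
                saved = (i, s)               # save only after a passed check
                log.append(("OK", i))
            else:
                log.append(("EE", i))
                i, s = saved                 # roll back to the last good save
    return s, log


if __name__ == "__main__":
    print(ll_with_jacobi(13, 4))
    print(ll_with_jacobi(5, 1, corrupt_at=2))
```

In the second call the injected error happens to flip the symbol, so the run logs an "EE", rolls back to the previous savepoint, and still finishes with the correct residue. An error that leaves the symbol at -1 would pass unnoticed, which is why this is a partial check only.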
Old 2020-05-20, 12:31   #2188
axn
 
Jun 2003

5082₁₀ Posts
Default

Quote:
Originally Posted by Jan S View Post
@axn:

But LL double/triple checking is still here. This morning I started a triple check with GPUowl, but when I read this thread, I stopped it.
Sure. But it doesn't make sense to invest time/effort in a dead end. There are cudalucas/cllucas/older versions of gpuowl, etc. for that purpose. Also, looking at the points preda made, it might be possible to double-check with zero shift if the original had a non-zero shift. I say "might" because I don't know whether the server will accept or reject it; it should accept, but I don't know.
Old 2020-05-20, 12:35   #2189
preda
 
"Mihai Preda"
Apr 2015

10101011011₂ Posts
Default

Quote:
Originally Posted by Jan S View Post
@axn:

But LL double/triple checking is still here. This morning I started a triple check with GPUowl, but when I read this thread, I stopped it.
I think LL double-checking with gpuowl is fine. It double-checks a previous LL that was done with a non-zero offset and with different software, so the zero-offset gpuowl check is as strong as can be.

The only problem appears when attempting to double-check a gpuowl LL with gpuowl; that is not a good idea.

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.