#2190 |
|
"Mihai Preda"
Apr 2015
55B₁₆ Posts |
#2191 | |
|
Romulan Interpreter
Jun 2011
Thailand
7²×197 Posts |
Quote:
Points 3 and 4 are already working like that in the version I run. I was never either pro or contra GC/JC, and personally it won't hurt or help me. (You will notice that I didn't make any comment related to GC/JC.) My "test case" has been, for years now, TWO cards running in parallel, checking the residues at every checkpoint. This way, no time is lost when a mismatch happens, and I consider that such a procedure can't be further optimized, no matter what other people think. I still do this with cudaLucas, but now it is your fault, because you made gpuOwl faster and I can't stop wishing to switch to it, and therefore I can't stop bothering you to make it to my liking.

What does one need for my "way"?

First, two identical cards. Or however many cards one has, they have to come in pairs. This I have.

Second, fast software. That is, for now, gpuOwl.

Third, because both copies of the software run in the same computer, they should operate on different sets of data. This will eliminate any bugs in the software: as long as both copies run on different data and get the same result, the software is "sane", the FFT squaring is "sane". Otherwise, we don't know, unless a P95 (or other) test is run and we can compare the results. GC is still prone to errors, and when you have a hundred million bits iterated a hundred million times, the errors are usually cleverer than us. The chance is negligible, but not zero. Additionally, how do you "convince" PrimeNet to accept your DCs? If they were both done in a "no-shift" test, they are not two different tests. This just ensures the hardware is sane, as I already said, but it says nothing about the sanity of the software.

Fourth, a history. Sometimes, against all our precautions, one card gets faster than the other, because we do other work in the computer, or play games, or watch videos, etc., and when a mismatch happens, it can be that one card is "more than two checkpoints" ahead of the other. At that moment, the only way to continue the test is to start both instances from scratch, wasting a lot of time, days or even weeks, because you don't know which one was wrong and the program only keeps the last two checkpoints. All checkpoints should be kept, the way cudaLucas does, and they should be deleted manually by the user at the end of the test. You can provide an option to delete them automatically, for the lazy users, but I strongly DO consider that manually checking your folders and cleaning your checkpoints once or twice per month (one 332M test takes about 17 days on R7, and about double on all the others) is not "too much" for the user. I mean, how lazy can one be?

You may say that such errors won't happen often, and that their chance of happening when one card is ahead of the other is slim, especially with GC active, but trust me, these things DO happen. They may be "extremely rare", but every time such a thing happens, losing a week or two on two cards would totally piss me off, and I love my monitor so much, I don't want to break it with my head. In fact, in such a situation, it would be better to let both tests finish and report both residues, in the hope that one of them is good and you didn't waste the time, at least for one card. But this is another can of worms: you will end up with everybody reporting two results, one correct and one fake, and claiming they ran two tests in parallel, when in fact they did only one (some people may do this, for credits, whatever). And I could argue like that endlessly...

The moral is that we need a history, i.e. instead of deleting old files, rename them "exponent.iteration.residue.whatever", where the first 3 fields are a MUST, so our comparison/resume tools work with minimal changes.
Last fiddled with by LaurV on 2020-05-20 at 13:33 |
#2192 |
|
Romulan Interpreter
Jun 2011
Thailand
7²·197 Posts |
Crosspost. I already replied, but to make it clear:
The file names should contain the exponent, the iteration, the residue. This is a must, for easy sorting, comparing, etc. Old files should not be deleted, but renamed properly and kept in the folder. The residue is needed in the name because (in case of shift/offset) the content of the files is different and cannot be used for comparison. We discussed this in the past, and you came up with the idea of putting a header inside the file. Which is very good, but why would I need to open all the >50MB files and "cat" them to get the residues?

If I have the same files in both folders, it means the test is running smoothly. If one folder has 55418387.12000000.adef1234cdeb9876.ll and the other folder has 55418387.12000000.def1234cdeb98765.ll instead, I know immediately that one card is in the weeds and I can stop and resume both from the last matching checkpoint. I don't need any tool for that, just sharp eyes and fast fingers.
Last fiddled with by LaurV on 2020-05-20 at 13:44 |
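For anyone whose eyes are less sharp, the folder comparison described above can be sketched in a few lines of Python. This is only an illustration, assuming checkpoint names of the form exponent.iteration.residue.ll as in the example; the function names (`parse_name`, `index_folder`, `compare_runs`) are made up here and are not part of gpuOwl or cudaLucas.

```python
from pathlib import Path


def parse_name(name):
    """Split 'exponent.iteration.residue.ext' into (exponent, iteration, residue).

    Returns None for names that do not follow that (assumed) convention.
    """
    parts = name.split(".")
    if len(parts) < 4:
        return None
    exponent, iteration, residue = parts[0], parts[1], parts[2].lower()
    if not (exponent.isdigit() and iteration.isdigit()):
        return None
    return int(exponent), int(iteration), residue


def index_folder(folder):
    """Map (exponent, iteration) -> residue for every checkpoint name in a folder."""
    table = {}
    for path in Path(folder).iterdir():
        parsed = parse_name(path.name)
        if parsed is not None:
            exponent, iteration, residue = parsed
            table[(exponent, iteration)] = residue
    return table


def compare_runs(folder_a, folder_b, exponent):
    """Return (last_good, first_bad) iterations for one exponent.

    last_good is the latest iteration, before any mismatch, at which both
    folders hold the same residue; first_bad is the first iteration where the
    residues differ. Either value is None if no such checkpoint exists.
    """
    a, b = index_folder(folder_a), index_folder(folder_b)
    common = sorted(it for (e, it) in a if e == exponent and (e, it) in b)
    last_good, first_bad = None, None
    for it in common:
        if a[(exponent, it)] == b[(exponent, it)]:
            last_good = it
        else:
            first_bad = it
            break
    return last_good, first_bad
```

Only the file names are read, so the >50MB checkpoint files never have to be opened. Both instances would then be resumed from `last_good`.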
#2193 | |
|
"Mihai Preda"
Apr 2015
3·457 Posts |
Quote:
Running PRP you'd detect the errors just as well, and you'd double the capacity. Give it a try -- if you succeed in producing a failure of the check (i.e. a non-detected error), as you suspect may be possible, that would be a momentous achievement (but also much more difficult than simply finding the next Mersenne prime IMO :) |
#2194 | |
|
Romulan Interpreter
Jun 2011
Thailand
22665₈ Posts |
Quote:
The problem is exactly THAT: you run two tests that will always match, as long as there are no glitches in the hardware, no matter what you do in the software, because you always do the same thing, applied to the same data. You don't know if they have an error, unless you have an etalon. Maybe that is why there was no error found up to now, and not because GC is so strong... (ranting here... I understand the math part). |
#2195 | |
|
"Mihai Preda"
Apr 2015
3×457 Posts |
Quote:
I'll think about doing something better about keeping those files around. It shouldn't be hard to make a script tool (bash, perl, etc.) that would rename the files, adding the residue (which is easily parsed from the first line) to the file name if desired. (But anyway, the proper fix is to switch to PRP.) |
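A rough sketch of such a rename tool, in Python rather than bash or perl. The header layout is an assumption made purely for illustration: the first line of the savefile is taken to be ASCII text ending in "exponent iteration 16-hex-digit-residue" (for example a hypothetical "OWL LL 1 55418387 12000000 adef1234cdeb9876"). The real header may differ, in which case `HEADER_RE` needs adjusting. The function names are likewise invented here.

```python
import re
from pathlib import Path

# ASSUMPTION: the first line ends with "<exponent> <iteration> <16 hex digits>".
# This is a guess at the layout, not the documented gpuOwl savefile format.
HEADER_RE = re.compile(rb"(\d+)\s+(\d+)\s+([0-9a-fA-F]{16})\s*$")


def read_header(path):
    """Read only the first line of a checkpoint and extract its three fields."""
    with open(path, "rb") as f:
        first_line = f.readline(256).rstrip(b"\r\n")
    match = HEADER_RE.search(first_line)
    if match is None:
        return None
    exponent, iteration, residue = (g.decode("ascii") for g in match.groups())
    return int(exponent), int(iteration), residue.lower()


def archive_checkpoint(path, suffix="ll"):
    """Rename a checkpoint to 'exponent.iteration.residue.suffix' in place."""
    path = Path(path)
    header = read_header(path)
    if header is None:
        raise ValueError(f"unrecognised header in {path}")
    exponent, iteration, residue = header
    target = path.with_name(f"{exponent}.{iteration}.{residue}.{suffix}")
    if target.exists():
        # Refuse to overwrite an older archived checkpoint with the same name.
        raise FileExistsError(target)
    path.rename(target)
    return target
```

Only the first line is read, so the bulk of the file is never touched. Renaming a file that the running program still expects to find would be a bad idea, so this is meant for checkpoints that would otherwise be deleted.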
#2196 | |
|
"Mihai Preda"
Apr 2015
3·457 Posts |
Quote:
(no, I don't actually recommend doing that, would be a waste of valuable resources) |
#2197 | |||
|
Romulan Interpreter
Jun 2011
Thailand
22665₈ Posts |
Quote:
Quote:
Quote:
, if you give me the shift and the history (to be able to resume efficiently and to convince PrimeNet to accept my DCs, otherwise I only get candies for half of the effort, and waste the other half, hehe). And when I catch you in Thailand, after all this craziness with corona ends, I will force you to drink all the beer I find in the fridge.
Last fiddled with by LaurV on 2020-05-20 at 14:14 |
#2198 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1010100101011₂ Posts |
Quote:
But the realities are that a great deal of LL is still being done by various programs. https://www.mersenne.org/assignments...&exp1=1&extf=1 is almost all LL first tests. A check I did several months ago showed a nearly 50/50 mix of PRP and LL in recent results. And a great deal of LL was done in the past.

Even in PRP, shift has advantages. Suppose that the exponent under test is close enough to the limit of an fft length that roundoff error becomes an issue. A different shift may avoid the case where roundoff error repeatedly generates a Gerbicz error. I think the case for preferring or requiring PRP first testing is stronger when pseudorandom or specifiable nonzero shift becomes available in gpuowl.

DC is still necessary for PRP for multiple reasons. (Errors have been observed outside the GEC check; users make manual reporting errors, and there is no PrimeNet API connection for gpu programs; there is no reliable built-in validation code to confirm actual work done; some rare few users submit falsified results intentionally.) Gpuowl can't DC gpuowl without differing shift. The result is not accepted by the server.

As I recall, Ernst opined that adding shift has little if any effect on performance (from his mlucas development). It may be that Mihai and George choose to spend their time now on obtaining further performance. (And we are very appreciative of their efforts and results in this area.) Diminishing returns will occur. Perhaps they'll add shift later. When they do, I hope it is to both LL and PRP in gpuowl. I think the ideal situation would be the default is pseudorandom shift, and the user could specify a specific shift for QA test purposes.
Last fiddled with by kriesel on 2020-05-20 at 16:13 |
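For readers unfamiliar with what "shift" means for LL, here is a toy sketch using Python big integers. It is not how gpuowl, mlucas or Prime95 implement it (they work on FFT data), and the function name `ll_residue` is invented for this example. The idea it shows: the working value is kept multiplied by a power of two modulo M_p = 2^p - 1, so the intermediate data differs from run to run, while the final residue after undoing the shift is the same.

```python
def ll_residue(p, shift=0):
    """Lucas-Lehmer residue of M_p = 2**p - 1, computed on shifted data.

    Invariant: x == s * 2**k (mod M_p), where s is the usual LL value
    (s_0 = 4, s_{i+1} = s_i**2 - 2) and k is the current shift count.
    Intended for odd prime exponents p.
    """
    m = (1 << p) - 1
    k = shift % p
    x = (4 << k) % m
    for _ in range(p - 2):
        # Squaring doubles the shift; 2**p == 1 (mod M_p), so reduce k mod p.
        k = (2 * k) % p
        # Subtract 2 shifted by the new k so the invariant still holds.
        x = (x * x - (2 << k)) % m
    # Undo the shift: multiplying by 2**(p - k) brings the factor to 2**p == 1.
    return (x << (p - k)) % m


if __name__ == "__main__":
    for p in (7, 11, 13):
        residues = {ll_residue(p, s) for s in range(p)}
        print(p, residues)  # one distinct residue per exponent; zero means M_p is prime
```

Two runs with different shifts square different numbers at every iteration, which is why a match between them says something about the software and not only about the hardware.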
#2199 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,419 Posts |
Two gpuowl instances running on the same gpu reportedly helps total testing throughput. But not always. Test what you run. GTX10x0 seems not to benefit in PRP.
A particularly severe case of lowered throughput I saw recently on a Radeon VII follows.
1 instance, 48M fft PRP: 10510 us/iter, 95.15 iter/sec
1 instance, 8M fft LL: 1382 us/iter, 723.6 iter/sec
These two run together:
8M fft LL: 6610 us/iter (151.3 iter/sec, 20.9% of solo throughput)
48M fft PRP: 52438 us/iter (19.07 iter/sec, 20.04% of solo throughput)
Combined, that is just 40.94% of solo throughput.
It's probably best to run same computation type, same fft size, or perhaps very similar size.
Last fiddled with by kriesel on 2020-05-20 at 16:14 |
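The arithmetic behind those percentages, as a small Python sketch for checking your own timings. The helper names are invented here; the timings are the ones quoted above. Throughput is 1,000,000 divided by the microseconds per iteration, and the share of solo throughput kept by an instance is its solo time divided by its shared time.

```python
def iters_per_sec(us_per_iter):
    """Convert microseconds per iteration into iterations per second."""
    return 1_000_000 / us_per_iter


def shared_fraction(solo_us, shared_us):
    """Fraction of its solo throughput an instance keeps when sharing the GPU."""
    return solo_us / shared_us


# (solo us/iter, shared us/iter), taken from the Radeon VII figures above.
runs = {
    "48M fft PRP": (10510, 52438),
    "8M fft LL": (1382, 6610),
}

combined = 0.0
for name, (solo_us, shared_us) in runs.items():
    fraction = shared_fraction(solo_us, shared_us)
    combined += fraction
    print(f"{name}: solo {iters_per_sec(solo_us):.2f} iter/sec, "
          f"shared {iters_per_sec(shared_us):.2f} iter/sec, "
          f"{100 * fraction:.2f}% of solo")
print(f"combined: {100 * combined:.2f}% of solo throughput")
```

A combined figure above 100% means running two instances on one GPU helps; anything below 100% means it hurts, as it clearly does in this case.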
#2200 |
|
Romulan Interpreter
Jun 2011
Thailand
9653₁₀ Posts |
My bad wording. Sorry. I meant 2 instances, each in its own card.