CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

Dubslow 2012-04-11 21:01

LaurV and flash and me etc. prefer to be notified, so that the TC is run on P95 (would probably happen anyways, but just to be sure). With the various changes and bad residues in 1.55-1.6x, they didn't trust the prog very much, though I think it's gotten better since 2.00. (And besides, if you look back to the original post, I definitely said it was overkill :razz:)

Batalov 2012-04-11 21:17

I am simply observing that this is a fractionism not very much dissimilar from rcv's: "I'll take the glue and the plywood and strings and I'll build my own Wright Stuff plane (or my own catapult) at home and I won't show you guys anything --::raspberry::-- until the school competition" ...and then at the school competition <left for the reader to fill in>.

It is a bit different from John Galsworthy/Groucho Marx's "I don’t care to belong to any club that will have me as a member". But I may be wrong, who knows.

frmky 2012-04-11 21:34

[QUOTE=Batalov;296169]All this double accounting is dubious. [/QUOTE]
I agree, there [URL="http://www.mersenne.org/report_exponent/?exp_lo=26556359&exp_hi=&B1=Get+status"]are[/URL] [URL="http://www.mersenne.org/report_exponent/?exp_lo=36000199&exp_hi=&B1=Get+status"]instances[/URL] where I've submitted a CUDALucas residue that didn't agree with a previous residue but was later confirmed, or where one of two previous differing residues was confirmed correct. Then there are a [URL="http://www.mersenne.org/report_exponent/?exp_lo=36500089&exp_hi=&B1=Get+status"]couple[/URL] [URL="http://www.mersenne.org/report_exponent/?exp_lo=36500119&exp_hi=&B1=Get+status"]where[/URL] the CUDALucas residue was wrong. I believe that if you've got a residue, submit it!

msft 2012-04-12 08:21

[QUOTE=Dubslow;296171]and bad residues in 1.55-1.6x[/QUOTE]
All causes were hardware.

LaurV 2012-04-12 12:28

1 Attachment(s)
[QUOTE=Batalov;296169]All this double accounting is dubious. Who is served by having some unobservable wrong or right residue? (Perhaps some misplaced pride? Well, in that case, it would be better served by tuning the card to work right, not just "look ma! no hands! 5GHz!!")

I've submitted a non-matching residue long ago and never thought twice about it but bookmarked the result to revisit later. [URL="http://www.mersenne.org/report_exponent/?exp_lo=27402559&exp_hi=10000&B1=Get+status"]Et voila[/URL] - CUDA was right. (I've looked back at the version, it was CUDALucas v1.48.)[/QUOTE]
It is not about pride (well... a little bit :P) but about altruism :smile:. If I submit a DC with CL which does not match a previous P95 FIRST check, the exponent will CONTINUE to be assigned to other [U]CudaLucas[/U] workers, they will NOT notify, and - in case [B]MY residue was correct[/B] - they will WASTE their time and resources. It is not about "setting the card right". If I overclock and my residue was wrong, then the only one wasting time is ME, because the third worker will have his TC [B]accepted[/B] (matching the original [U]P95[/U] residue). The real problem -- you as a TC worker, wasting your time -- is when I DO NOT overclock (or I use the Teslas, I have 2 of them) and my residues ARE correct. That is why we have (had) threads such as "do not DC them with CL" etc.: to avoid wasting the time of the TC-ers.

You don't know how many others wasted their time between you reporting the mismatched DC and the "et voila". Maybe 1, 2, 10 etc. others tried it with CL and reported, but their results were refused by the PrimeNet DB and are recorded nowhere. And yes, I also bookmark my mismatches, and always revisit them to update the bookmark list, and if one stays there too long I will queue it myself in P95 or a CL TC just to make sure. See the earlier discussions around here. And here is a screen snap to prove it:

bcp19 2012-04-12 14:44

[QUOTE=LaurV;296227]It is not about pride (well... a little bit :P) but about altruism :smile:. If I submit a DC with CL which does not match a previous P95 FIRST check, the exponent will CONTINUE to be assigned to other [U]CudaLucas[/U] workers, they will NOT notify, and - in case [B]MY residue was correct[/B] - they will WASTE their time and resources. It is not about "setting the card right". If I overclock and my residue was wrong, then the only one wasting time is ME, because the third worker will have his TC [B]accepted[/B] (matching the original [U]P95[/U] residue). The real problem -- you as a TC worker, wasting your time -- is when I DO NOT overclock (or I use the Teslas, I have 2 of them) and my residues ARE correct. That is why we have (had) threads such as "do not DC them with CL" etc.: to avoid wasting the time of the TC-ers.

You don't know how many others wasted their time between you reporting the mismatched DC and the "et voila". Maybe 1, 2, 10 etc. others tried it with CL and reported, but their results were refused by the PrimeNet DB and are recorded nowhere. And yes, I also bookmark my mismatches, and always revisit them to update the bookmark list, and if one stays there too long I will queue it myself in P95 or a CL TC just to make sure. See the earlier discussions around here. And here is a screen snap to prove it:[/QUOTE]

I'm pretty sure the server records all bad residues, example: [URL]http://www.mersenne.org/report_exponent/?exp_lo=22545883&exp_hi=22545883&B1=Get+status[/URL], where it took a QC (quadruple check) in order to get a match. So, turning in your result, whether it matches or not, is as Batalov says. After all, there really is no difference between overclocking and getting a bad residue, and not overclocking and getting a good residue while the original residue is bad: you either have LL=good, DC=bad, TC=good or LL=bad, DC=good, TC=good. If you do not turn it in, waiting for someone else to run it to 'double check you', you could run into: LL=bad, DC=good, TC=bad, QC=good or LL=good, DC=bad, TC=bad, QC=good or LL=bad, DC=bad, TC=good, QC=good, etc. If, in the example I pasted above, the original run was bad, you would have ended up with LL=bad, DC=bad, TC=bad, QC=good, QTC (quintuple check)=good. PrimeNet will hand the exponent out until it gets a match, and record all attempts. IMO, it's better to submit regardless of match/mismatch and let PrimeNet take care of it automatically, which is less time consuming.
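
The "hand the exponent out until it gets a match" rule can be sketched roughly as follows. This is only an illustration, not PrimeNet's actual code: the function name `runs_until_match` is hypothetical, and it assumes two bad residues never agree with each other and that every submitted residue is recorded regardless of which program produced it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not PrimeNet code: given 64-bit residues in
 * submission order, return the 1-based number of the run that first
 * matches an earlier residue, or 0 if no two residues agree yet. */
size_t runs_until_match(const uint64_t *residues, size_t count)
{
    assert(residues != NULL || count == 0);

    for (size_t i = 1; i < count; i++) {
        for (size_t j = 0; j < i; j++) {
            if (residues[i] == residues[j]) {
                return i + 1; /* run i+1 confirmed an earlier run */
            }
        }
    }
    return 0; /* still unverified, the exponent is handed out again */
}
```

With LL=bad, DC=bad, TC=good, QC=good (two different bad residues followed by two equal good ones), the function returns 4, i.e. the exponent is verified at the quadruple check.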

LaurV 2012-04-12 16:10

That is all gibberish. LL bad, DC (by CL) good, then all the other TC, QC, whatever, by CL, are a WASTE of time: they are either bad (which means a waste of time, but you get credit) or good (matching the DC), which means they are refused by the server as "same result by third party program", no credit is given, and the report is NOWHERE recorded. Try reporting the same CL result two times and see what happens, then talk about the subject when you know it.

bcp19 2012-04-12 17:47

[QUOTE=LaurV;296251]That is all gibberish. LL bad, DC (by CL) good, then all the other TC, QC, whatever, by CL, are a WASTE of time: they are either bad (which means a waste of time, but you get credit) or good (matching the DC), which means they are refused by the server as "same result by third party program", no credit is given, and the report is NOWHERE recorded. Try reporting the same CL result two times and see what happens, then talk about the subject when you know it.[/QUOTE]

Maybe I am misunderstanding you due to the differences in language understanding. You are either saying 1) you tested and submitted a CL result that was a mismatch, so you reran the test and resubmitted a matching result and the server refused to take it ("same result by third party program"), or 2) you tested and submitted a mismatch, and someone else using the same program tested it, submitted a result matching yours, and was told "same result by third party program".

If case 1 is correct, I would expect this from the server, since it should NOT accept the same result more than once from the same person, regardless of the program used. It would be too easy for a person to take the results from P95 and create a CL result and get double credit.

If case 2 is correct, then it sounds like a problem, because refusing matching results from 2 separate people who happen to use the same program would, to me anyway, not make sense.

frmky 2012-04-12 19:29

[QUOTE=LaurV;296251]That is all gibberish. LL bad, DC (by CL) good, then all the other TC, QC, whatever, by CL, are a WASTE of time: they are either bad (which means a waste of time, but you get credit) or good (matching the DC), which means they are refused by the server as "same result by third party program", no credit is given, and the report is NOWHERE recorded. Try reporting the same CL result two times and see what happens, then talk about the subject when you know it.[/QUOTE]

I still don't get the point. Let's assume the first run is by P95 and the second run is by CL and doesn't match. Now there are 4 cases, the CL run is submitted or not, and valid or not:

1. Don't submit and not valid: A TC can be done and submitted by either CL or P95.

2. Submit and not valid: Again, a TC can be done and submitted by either CL or P95.

So if the residue is not valid, it makes [B]no difference[/B] whether you submit or not.

3. Don't submit and valid: A TC must be done by P95. Although it can be run again with CL, the server will reject the TC when both valid CL runs are submitted.

4. Submit and valid: A TC must be done by P95. Although it can be run again with CL, the server will reject it.

So if the residue is valid, it makes [B]no difference[/B] whether you submit or not.

Therefore, always submit. :smile:

Now, the interesting point is that if a number has two non-matching residues, one done by CL, the number should not be run again with CL. It could be either case 2 or 4, so the CL run may be wasted. But if you have already completed a run with CL, there is no reason to not submit it.
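
For anyone who prefers to see the four cases as a table, here is a small sketch that encodes them and checks that the "submitted" flag never changes who can run the TC. The names (`tc_option`, `tc_table`, `submitting_changes_outcome`) are made up for illustration, and the table simply restates the cases above rather than anything the server is known to implement.

```c
#include <stdbool.h>

/* Who can produce a triple check (TC) that the server will accept. */
typedef enum { TC_EITHER_PROGRAM, TC_P95_ONLY } tc_option;

/* The four cases from the post, indexed as [submitted][valid]. */
static const tc_option tc_table[2][2] = {
    /* not submitted */ { TC_EITHER_PROGRAM /* case 1 */, TC_P95_ONLY /* case 3 */ },
    /* submitted     */ { TC_EITHER_PROGRAM /* case 2 */, TC_P95_ONLY /* case 4 */ },
};

/* Look up one case: was the CL run submitted, and was its residue valid? */
tc_option tc_option_for(bool submitted, bool valid)
{
    return tc_table[submitted][valid];
}

/* True if submitting the CL residue changes the outcome for any validity. */
bool submitting_changes_outcome(void)
{
    for (int valid = 0; valid <= 1; valid++) {
        if (tc_table[0][valid] != tc_table[1][valid]) {
            return true;
        }
    }
    return false;
}
```

`submitting_changes_outcome()` returns false for this table, which is the "no difference" conclusion: only the validity of the CL residue decides whether the TC has to come from P95.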

Greg

Dubslow 2012-04-14 09:05

In the vein of other mostly useless projects...
 
1 Attachment(s)
I hacked some mfaktc code so that CUDALucas can read "standard" GIMPS assignment lines ("Test=AID,exponent" and "DoubleCheck=..."), and remove each one from worktodo.txt as they're completed. What this really means is that I copied Christenson's (?) code, modified it slightly, and pasted it into CUDALucas.cu. At any rate, it works for me :smile:

It's now a few hundred lines longer, but *in theory* (in the vaguest sense of the phrase) it means that CUDALucas will be easier to automate, if Christenson ever decides to put his precious efforts towards that task :smile:.
It also means that you can now copy and paste work straight from PrimeNet/GPU272 without having to delete all the information except the exponent.
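
To give a rough idea of what reading such a line involves, here is a minimal sketch. It is NOT the code from mfaktc or the attached CUDALucas.cu; `parse_assignment` is a hypothetical name, and it assumes the exponent is the first field after the AID and that any further comma-separated fields can be ignored.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser, not the mfaktc/CUDALucas code. Returns the
 * exponent from a "Test=AID,exponent..." or "DoubleCheck=AID,exponent..."
 * line, or 0 if the line is not a valid assignment. */
unsigned long parse_assignment(const char *line)
{
    assert(line != NULL);

    const char *rest;
    if (strncmp(line, "Test=", 5) == 0) {
        rest = line + 5;
    } else if (strncmp(line, "DoubleCheck=", 12) == 0) {
        rest = line + 12;
    } else {
        return 0; /* doesn't begin with Test= or DoubleCheck= */
    }

    /* Skip the assignment ID; the exponent follows the first comma. */
    const char *comma = strchr(rest, ',');
    if (comma == NULL || !isdigit((unsigned char)comma[1])) {
        return 0; /* no exponent field after the AID */
    }

    char *end;
    unsigned long exponent = strtoul(comma + 1, &end, 10);

    /* Accept end of line or another field (assumed ignorable) after it. */
    if (*end != '\0' && *end != ',' && *end != '\n' && *end != '\r') {
        return 0;
    }
    return exponent;
}
```

For example, `parse_assignment("Test=AID,26273341")` gives 26273341, while a bare exponent such as `"26273341"` gives 0, which corresponds to the "doesn't begin with Test= or DoubleCheck=" case.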

On the other hand, it might require a re-licensing; mfaktc is under the GPL, though I'm not aware what the current CUDALucas license is (if any), or if anybody cares enough to bother :razz:. At the very least, I'm proud that it works, even if I hardly wrote anything :razz::razz: (thank you Christenson!)

It is compatible with 2.00, meaning that you call it with exactly the same command, and it'll resume just the same as before. It (temporarily) renders cudalucas.ini useless; this was the first half of my project to hack even more mfaktc code (:smile:) to get basic .ini functionality, and perhaps be able to specify FFT length in the worktodo line. (Note that the version string is "lol" at the moment :smile:)

I also modified some of the messages printed to be slightly more grammatically correct; that's the only liberty I took with the existing code, besides modifying main(). Fortunately for me, everything like resuming/writing checkpoints etc. was abstracted from main() (in check()) so I didn't mess with anything critical. The only change to main() was the part that reads in assignments. I did add a few declarations above main() (but below the rest of the existing code); all other additions are below main(), which is to say below all previously existing code. For convenience of anybody checking my hacking (it hardly qualifies as coding) all my comments are preceded by hashes, e.g. "//#" or "/*#" so that a Ctrl+F should be sufficient to find all my comments, and therefore all my changes. New CUDALucas.cu is attached; it should compile just fine with the old Makefile. (I can't test Windows compiling, but g++/nvcc didn't complain at me.)

(I did test that resuming 2.00 stuff works, because I'm running my current expo with this new version, while it was started with 2.00.)

[/more useless spam]

Edit: Here's a copy/paste of my "production" terminal:
[code]Iteration 16330000 M( 26273341 )C, 0x66b743a75bcbccea, n = 1474560, CUDALucas v2.00 err = 0.1162 (0:54 real, 5.4614 ms/iter, ETA 15:04:46)
Iteration 16340000 M( 26273341 )C, 0x780b400cb7e3ef0b, n = 1474560, CUDALucas v2.00 err = 0.1162 (0:56 real, 5.5237 ms/iter, ETA 15:14:10)
Iteration 16350000 M( 26273341 )C, 0xe1cba399ba32200e, n = 1474560, CUDALucas v2.00 err = 0.1162 (0:57 real, 5.7331 ms/iter, ETA 15:47:52)
^C^C caught. Writing checkpoint.
bill@Gravemind:~/CUDALucas∰∂ CUDALucas -c 10000 -f 1474560 -polite 64 worktodo.txt
WARNING: ignoring line 1 in "worktodo.txt"! Reason: doesn't begin with Test= or DoubleCheck=
WARNING: ignoring line 2 in "worktodo.txt"! Reason: doesn't begin with Test= or DoubleCheck=
WARNING: ignoring line 3 in "worktodo.txt"! Reason: doesn't begin with Test= or DoubleCheck=
WARNING: ignoring line 4 in "worktodo.txt"! Reason: doesn't begin with Test= or DoubleCheck=
WARNING: ignoring line 5 in "worktodo.txt"! Reason: doesn't begin with Test= or DoubleCheck=
WARNING: ignoring line 6 in "worktodo.txt"! Reason: doesn't begin with Test= or DoubleCheck=
No valid assignment found.
bill@Gravemind:~/CUDALucas∰∂ nano worktodo.txt
bill@Gravemind:~/CUDALucas∰∂ CUDALucas -c 10000 -f 1474560 -polite 64 worktodo.txt

continuing work from a partial result M26273341 fft length = 1474560 iteration = 16358145
Iteration 16360000 M( 26273341 )C, 0x523ba68f8a9962ce, n = 1474560, CUDALucas vlol err = 0.09326 (0:12 real, 1.1720 ms/iter, ETA 3:13:34)
Iteration 16370000 M( 26273341 )C, 0x2e8afa1230a7ce30, n = 1474560, CUDALucas vlol err = 0.09766 (0:56 real, 5.6254 ms/iter, ETA 15:28:11)
Iteration 16380000 M( 26273341 )C, 0x22b7d6757e8729a1, n = 1474560, CUDALucas vlol err = 0.1016 (0:55 real, 5.4859 ms/iter, ETA 15:04:15)[/code]
Where it says "ignoring line..." is where I forgot to take the list of exponents and convert them to proper GIMPS format :smile:

kladner 2012-04-14 16:05

@Dubslow- I'm impressed. You seem to have progressed a lot in code tweaking.

I may have missed something in the last few days, but what does "-polite 64" indicate? Does the '64' relate to '-polite', or not? :question:


All times are UTC.