The -polite 0 option seems to be the actual culprit.
[QUOTE=Dubslow;295890]Hmm, I've turned off -k but I'm still seeing 15-20% CPU usage. [code]LD_LIBRARY_PATH=~/CUDALucas/lib ~/CUDALucas/CUDALucas -c 10000 -f 1474560 -polite 0 worktodo.txt[/code][/QUOTE] |
[QUOTE=zs6nw;295931]The -polite 0 option seems to be the actual culprit.[/QUOTE]
Indeed, I can confirm this. msft, do you know why it does that? (Btw, if you DLed the script between 10 minutes before this post and ~1 hr after the original post, there was a typo that is now fixed that prevented it from working properly.) |
[QUOTE=Dubslow;295933]Indeed, I can confirm this. msft, do you know why it does that?[/QUOTE]
Side effect. |
[QUOTE=msft;295965]Side effect.[/QUOTE]
I suppose the better question is, is it possible to get aggressive (p=0) performance without adding the extra cpu time, or is that just the way it is? |
[QUOTE=Dubslow;295971]I suppose the better question is, is it possible to get aggressive (p=0) performance without adding the extra cpu time, or is that just the way it is?[/QUOTE]
I haven't looked carefully at the CUDALucas code. However, this sounds very similar to the standard NVIDIA problem of using spin-loops. @msft: I sent a patch to Cyril for the GPU version of gmp-ecm that cures this problem for his application. xilman acknowledged in another thread ( [URL]http://www.mersenneforum.org/showpost.php?p=295541&postcount=63[/URL] ) that it worked for him. It's only a dozen+ lines. Since you are using the CUFFT library, there may be some complications, but if you are interested and if you think it may be the NVIDIA spin-loops, PM me, and I'll send you the technique. |
[QUOTE=Dubslow;295971]I suppose the better question is, is it possible to get aggressive (p=0) performance without adding the extra cpu time, or is that just the way it is?[/QUOTE]
"-polite 64" Good balance on my linux box. |
[QUOTE=msft;295993]"-polite 64" Good balance on my linux box.[/QUOTE]
Well I'll be. What exactly does the 64 mean? I thought it was just a binary switch. |
1 Attachment(s)
Graph of GTX460 cufftbench timing data.
[QUOTE=Prime95;294530]Attached is my cufftbench for a GTX460. I flagged with a "Y" the FFT sizes that make sense.[/QUOTE] |
[QUOTE=Dubslow;295994]Well I'll be. What exactly does the 64 mean? I thought it was just a binary switch.[/QUOTE]
"-polite x" will be polite... every x iterations. It says in the help. :P There was an example with 100 earlier in this thread. So, the higher the number, the more aggressive it becomes. Use 0 for "infinite aggressive" (never do the wait loop). Use 1 for the most polite (do the wait loop every "1" iterations). |
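As a rough sketch of the rule described above (not CUDALucas's actual code, which is not quoted in this thread), the iterations at which the host would run its wait loop can be listed like this. The function name `polite_iterations` is hypothetical, and counting iterations from 1 is an assumption.

```python
def polite_iterations(total_iterations: int, polite: int) -> list[int]:
    """Return the iteration numbers at which the wait loop would run.

    polite == 0 means "never wait" (most aggressive);
    polite == 1 means "wait after every iteration" (most polite).
    """
    if polite <= 0:
        return []
    return [i for i in range(1, total_iterations + 1) if i % polite == 0]


if __name__ == "__main__":
    print(polite_iterations(200, 64))  # waits at iterations 64, 128, 192
    print(polite_iterations(5, 1))     # waits after every iteration
    print(polite_iterations(5, 0))     # never waits
```

With `-polite 64` the wait loop runs on only one iteration in 64, so a larger value means fewer waits and more aggressive behaviour.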
[QUOTE=Dubslow;295786]CuLu Spider...
If an exponent result is correctly parsed, it's passed into the "decide" function, which decides what's appropriate; it has the following logic: <snip> Unfortunately, due to a paucity of exponents, I have not tested every case; I do know that the "Verified LL" portion works as advertised (thanks msft for posting that 2.00 result you had :smile:), and that it works in the basic case of a match with no prior CuLu result, however the other scenarios remain untested (but *should* work).[/QUOTE]
I can now confirm that it will successfully notice a previous CUDALucas test, and will therefore not submit such an exponent. The underlined parts of the logic have now been tested at least once:
[code]
[U]if( 2*"Verified LL" is in the expo status page) {
    then submit anyways, just in case[/U]
}
[U]else if( there is a string of 14 lowercase hex digits [or all decimals] )[/U] {
    if(all decimals) {
        print warning; ask user to check exponent manually
    }
    [U]else {
        we know there's a CuLu test; exponent will not be submitted
        if(your residue matches another) {
            print "match"; do not submit
        }[/U]
        else {
            print no current matches, use Prime95! do not submit
        }
    }
}
else {[U]
    no previous CuLu, and not DCed
    if( there is a matching residue ) { submit! }
    else { print warning: no match; do not submit }[/U]
}
[/code]
Edit: Version 0.02 now available; fixed a logging bug; no change to major functionality. I have now also tested that a mismatch with no previous CuLu result is properly detected. Chart above updated as such.
([url]http://dubslow.tk/gimps/CuLuSpider.txt[/url] View in browser)
([url]http://dubslow.tk/gimps/CuLuSpider.py[/url] Download) |
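The decision logic in the post above can be sketched as a small Python function. This is an illustration only and is not taken from CuLuSpider.py: the `ExponentStatus` fields, the `decide` signature, and the returned action strings are all hypothetical, and how the script actually parses the status page is not shown here. Treating 2*"Verified LL" as "at least two occurrences" and treating the all-decimals case as a manual check without submission are both assumptions.

```python
from dataclasses import dataclass


@dataclass
class ExponentStatus:
    """Facts assumed to be already parsed from the exponent status page."""
    verified_ll_count: int      # occurrences of "Verified LL"
    has_14_digit_string: bool   # a 14-character hex (or all-decimal) string
    string_all_decimal: bool    # that string contains only decimal digits
    residue_matches: bool       # our residue matches one on the page


def decide(s: ExponentStatus) -> tuple[str, str]:
    """Return (action, message); action is 'submit', 'skip' or 'manual'."""
    if s.verified_ll_count >= 2:
        return ("submit", "already verified; submitting anyway, just in case")
    if s.has_14_digit_string:
        if s.string_all_decimal:
            # Ambiguous: cannot tell whether this is a CuLu residue.
            return ("manual", "warning: check exponent manually")
        # A previous CuLu test exists, so the exponent is never submitted.
        if s.residue_matches:
            return ("skip", "match")
        return ("skip", "no current matches; use Prime95")
    # No previous CuLu test, and not double-checked.
    if s.residue_matches:
        return ("submit", "match")
    return ("skip", "warning: no match")


if __name__ == "__main__":
    print(decide(ExponentStatus(0, False, False, True)))
```

Returning an action plus a message keeps the branching separate from the printing and submitting, so each of the cases listed in the post can be checked without a live exponent.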
All this double accounting is dubious. Who is served by having some unobservable wrong or right residue? (Perhaps some misplaced pride? Well, in that case, it would be better served by tuning the card to work right, not just "look ma! no hands! 5GHz!!")
I've submitted a non-matching residue long ago and never thought twice about it but bookmarked the result to revisit later. [URL="http://www.mersenne.org/report_exponent/?exp_lo=27402559&exp_hi=10000&B1=Get+status"]Et voila[/URL] - CUDA was right. (I've looked back at the version, it was CUDALucas v1.48.) |