|
1 Attachment(s)
OpenCLucas ?
HD7750: [QUOTE] $ time ./a.out 216091 Using device: Capeverde 1 16384 --- 215001 32768 216001 32768 M( 216091 )P, n = 32768, MacLucasFFTW v8.1 Ballester real 12m17.124s user 5m10.547s sys 3m3.279s 2048k FFT 94 msec [/QUOTE] |
Or CLLucas :smile:
I might install linux just for tinkering with it... :razz: |
OCLucas:smile:
|
I am having trouble with exponent 58715819. I have run it several times using p95v279win64 on different processors and got the following results.
[CODE]M58715819 is not prime. Res64: A25E1DD191A340D0. We4: 36299458,27352921,00000000, AID: C19BB90856CE3B67E66D58C6682991CB
M58715819 is not prime. Res64: A25E1DD191A340D0. We4: 3BB22B6F,53977644,01000200, AID: C19BB90856CE3B67E66D58C6682991CB
M58715819 is not prime. Res64: A25E1DD191A340D0. We4: 349B894F,51331660,00000000, AID: C19BB90856CE3B67E66D58C6682991CB[/CODE] One gave errors, which is why the multiple runs, but they all gave the same Res64. I ran CUDALucas 2.01 on a 590 (each half) and a 560Ti and got the following results. [CODE]M( 58715819 )C, 0x0ac290f866461586, n = 3670016, CUDALucas v2.01
M( 58715819 )C, 0x0ac290f866461586, n = 3670016, CUDALucas v2.01
M( 58715819 )C, 0x0ac290f866461586, n = 3670016, CUDALucas v2.01[/CODE] I thought I had found the solution by moving to CUDALucas 2.03 on the 590, but got a Res64 that did not match anything previously: [CODE]M( 58715819 )C, 0x0be4ee92a264fc3b, n = 3670016, CUDALucas v2.03[/CODE] What is going on? Here are the last few screen prints of the final run in case it helps. Very low err. [CODE]Iteration 58600000 M( 58715819 )C, 0xabbbf85c1135a70a, n = 3670016, CUDALucas v2.03 err = 0.0166 (12:55 real, 7.7443 ms/iter, ETA 12:54)
Iteration 58700000 M( 58715819 )C, 0x5b8436b841f162ef, n = 3670016, CUDALucas v2.03 err = 0.0166 (13:03 real, 7.8299 ms/iter, ETA 0:00)
M( 58715819 )C, 0x0be4ee92a264fc3b, n = 3670016, CUDALucas v2.03[/CODE] Best regards, David P.S. My Res64's have always matched before and after on other similar exponents between p95 and CUDALucas. |
Due to [url=http://mersenneforum.org/showthread.php?t=18443]this thread[/url], can someone please compile a 64 bit CUDA 5.5 binary for sm_35 arch?
|
1 Attachment(s)
[QUOTE=Karl M Johnson;348828]Due to [URL="http://mersenneforum.org/showthread.php?t=18443"]this thread[/URL], can someone please compile a 64 bit CUDA 5.5 binary for sm_35 arch?[/QUOTE]
Sure thing :smile: Presently using Win8 x64 compiled on MSVS 12 update 3, using CuLu 2.03 source. |
2.55 ms/iter vs 2.94 ms/iter(M47).
I'm using the latest and greatest WHQL of 320.49. It must be the toolkit, not the drivers. Neat, thanks! Notice the difference, btw: [CODE]26.09.2012 00:46 26,093,928 cufft64_50_35.dll
11.07.2013 14:06 74,730,784 cufft64_55.dll[/CODE] |
[QUOTE=Karl M Johnson;348836]2.55 ms/iter vs 2.94 ms/iter(M47).
I'm using the latest and greatest WHQL of 320.49. It must be the toolkit, not the drivers. Neat, thanks! Notice the difference, btw: [CODE]26.09.2012 00:46 26,093,928 cufft64_50_35.dll
11.07.2013 14:06 74,730,784 cufft64_55.dll[/CODE][/QUOTE] Yeah, I'm not that surprised that the latest CUDA toolkit's official support for arch sm_35 would be a factor in the increased throughput. :smile: |
There's a 327.23 Forceware available with a WHQL cert.
So far so good. |
Where did you download "cufft64_55.dll" ?
|
[QUOTE=Nipal;354280]Where did you download "cufft64_55.dll" ?[/QUOTE]They are contained in the [url=https://developer.nvidia.com/cuda-downloads]CUDA Toolkit[/url], but for end users the DLLs are more readily available from my site:
[url]http://download.mersenne.ca/CUDAPm1/[/url] |
Is it just me, or did LL iteration timings get very stable?
Check the several hundred thousand iterations from my latest assignment, where the difference between each ten thousand iterations is up to +-0.0002ms. Of course, it only happens when the computer is idle, but still hooked up to the monitor. [CODE]Iteration 100000 M( 64805113 )C, 0x18b0f00abb99d3f4, n = 3686400, CUDALucas v2.03 err = 0.1333 (0:30 real, 3.0452 ms/iter, ETA 54:43:44) Iteration 110000 M( 64805113 )C, 0x6f27dd99b13938b5, n = 3686400, CUDALucas v2.03 err = 0.1333 (0:31 real, 3.0452 ms/iter, ETA 54:43:12) Iteration 120000 M( 64805113 )C, 0xcf4a8317507eeaf3, n = 3686400, CUDALucas v2.03 err = 0.1333 (0:30 real, 3.0452 ms/iter, ETA 54:42:41) Iteration 130000 M( 64805113 )C, 0x86e6484ad79b494c, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:31 real, 3.0452 ms/iter, ETA 54:42:13) Iteration 140000 M( 64805113 )C, 0x19fed478e449b2e2, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:30 real, 3.0450 ms/iter, ETA 54:41:29) Iteration 150000 M( 64805113 )C, 0x9c5d589f19c5503d, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:31 real, 3.0453 ms/iter, ETA 54:41:17) Iteration 160000 M( 64805113 )C, 0x825888f60d078fcd, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:30 real, 3.0453 ms/iter, ETA 54:40:47) Iteration 170000 M( 64805113 )C, 0x37beacd26114c04d, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:31 real, 3.0453 ms/iter, ETA 54:40:16) Iteration 180000 M( 64805113 )C, 0x16f5a8d8c22484dc, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:30 real, 3.0454 ms/iter, ETA 54:39:52) Iteration 190000 M( 64805113 )C, 0x3fa0368e6bf8340a, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:31 real, 3.0453 ms/iter, ETA 54:39:17) Iteration 200000 M( 64805113 )C, 0x2d6de102809a2b23, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:30 real, 3.0454 ms/iter, ETA 54:38:52) Iteration 210000 M( 64805113 )C, 0x53d3437a65c5a3e4, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:31 real, 3.0454 ms/iter, ETA 54:38:19) Iteration 220000 M( 64805113 )C, 0x5faf1dab8c8b256c, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:30 real, 
3.0454 ms/iter, ETA 54:37:49) Iteration 230000 M( 64805113 )C, 0xdc1482a76e83f687, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:31 real, 3.0451 ms/iter, ETA 54:37:04) Iteration 240000 M( 64805113 )C, 0xec301d099bf46f2a, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:30 real, 3.0453 ms/iter, ETA 54:36:45) Iteration 250000 M( 64805113 )C, 0x02d98303e5aadc2f, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:30 real, 3.0454 ms/iter, ETA 54:36:20) Iteration 260000 M( 64805113 )C, 0xe09ece2eb63e9cbd, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:31 real, 3.0452 ms/iter, ETA 54:35:38) Iteration 270000 M( 64805113 )C, 0x2c62ce5814d75190, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:30 real, 3.0450 ms/iter, ETA 54:34:53) Iteration 280000 M( 64805113 )C, 0x1fc0351a4a9109a4, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:31 real, 3.0454 ms/iter, ETA 54:34:49) Iteration 290000 M( 64805113 )C, 0xc25a5b393753c4ff, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:30 real, 3.0453 ms/iter, ETA 54:34:13) Iteration 300000 M( 64805113 )C, 0xbfccde3394e09673, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:31 real, 3.0452 ms/iter, ETA 54:33:32) Iteration 310000 M( 64805113 )C, 0x7350af823bd9ed75, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:30 real, 3.0454 ms/iter, ETA 54:33:15) Iteration 320000 M( 64805113 )C, 0xcf8b1ba62275c510, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:31 real, 3.0454 ms/iter, ETA 54:32:48) Iteration 330000 M( 64805113 )C, 0xff0296223c6986f9, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:30 real, 3.0452 ms/iter, ETA 54:32:06) Iteration 340000 M( 64805113 )C, 0x4f8495853deb6417, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:31 real, 3.0452 ms/iter, ETA 54:31:31) Iteration 350000 M( 64805113 )C, 0xcdd6e1bd0ecef59d, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:30 real, 3.0453 ms/iter, ETA 54:31:06) Iteration 360000 M( 64805113 )C, 0xaea20be130c9dc7b, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:31 real, 3.0454 ms/iter, ETA 54:30:46) Iteration 370000 M( 64805113 )C, 
0x289389b1890ed2fa, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:30 real, 3.0455 ms/iter, ETA 54:30:21) Iteration 380000 M( 64805113 )C, 0xcace87ee23554ad5, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:31 real, 3.0454 ms/iter, ETA 54:29:46) Iteration 390000 M( 64805113 )C, 0xa06e9fc2bc3ab339, n = 3686400, CUDALucas v2.03 err = 0.1367 (0:30 real, 3.0452 ms/iter, ETA 54:28:59)[/CODE] |
[QUOTE=Karl M Johnson;354483]Is it just me or did LL iteration timings got very stable?
Check the several hundred thousand iterations from my latest assignment, where the difference between each ten thousand iterations is up to +-0.0002ms. Of course, it only happens when the computer is idle, but still hooked up to the monitor. [CODE]<timing log snipped; identical to the log in the previous post>[/CODE][/QUOTE] I see the same thing. Timings are very stable when not using the machine. |
It was like that from the beginning, even for versions with power-of-two FFT lengths. If you have the patience to go through the last 100 pages of this topic, you will find repeated discussions where I argued that "non-constant" times mean something is wrong with your system (from the simply unbalanced, like being CPU-bottlenecked, having too much GPU power for the CPU, or bad settings for priorities and/or affinities, to more serious things like heat problems, throttling, etc.).
|
[QUOTE=LaurV;354554]It was like that from the beginning, even for versions with powers of two. If you have patience to go through the last 100 pages of this topic, you will find repeated discussions where I was arguing that "non-constant" times means that something is wrong with your system (from the simple "non well-balanced", like CPU-bottle-necked, too much GPU power for the CPU you have, bad settings for priorities and/or affinities, to the more serious things like heat problems, throttling, etc).[/QUOTE]
Must have been the throttling in my case. |
By the way, why is this thread not sticky?:smile:
Also, I've noticed a curious behavior of CUDALucas: after you properly close it and restart from the latest checkpoint, the error rate is lower than it was. Is there any explanation for this? |
That is normal. The error rate you see is the MAXIMUM registered since the program started, so an iteration producing garbage will not escape unnoticed. Also, the error is only checked after a bunch of iterations, not at each iteration, because checking the error at every iteration is costly. To check the error at every iteration (I have always highly recommended it!), launch the program with the "-t" switch. You will notice a time penalty of 1% to 10% (depending on your card), and you will also notice a "curious behavior": the error grows faster in the beginning (because it is checked at every iteration, and the maximum error is kept on screen, every 10k iterations or so, when the log is displayed). Launch with "-c 100" to see what's going on (only for fun, or for didactic purposes; otherwise printing to the screen wastes a lot of time): the error never "decreases", only "increases", because the maximum error is always kept. This is normal.
edit: To clarify: "-t" is highly recommended for first-time tests. It will save you headaches later. I don't recommend "-t" for DC (especially on fast cards), where the best "error check" is the final residue matching the original test. In case of a mismatch, you didn't lose much time (~15 hours for a 30M DC on a GTX 580, for example). In the long run, "-t" for DC on fast cards may be counter-productive. Example: if your penalty with -t is about 7%, and you run one DC in x hours, then you will run 20 DCs without -t in 20x hours. If one DC produced a mismatch and you re-run that test while watching the residues you got on the initial run, you lose another x hours, so in the end you spent 21x hours to clear 20 exponents. If you run with -t, you may catch the error immediately and resume without losing any time, but because each test is 7% slower, you will need 20*(x+7%) hours, about 21.4x, to clear 20 exponents, which is of course longer. And, like it or not, you WILL produce bad residues, no matter how good you think your card is. One in 20 is statistically reasonable. Therefore, for first-time tests, you will be sorry when I find that prime you missed because your card produced a bad residue which escaped undetected because you didn't want to lose time with "-t" :razz: |
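LaurV's break-even arithmetic above, as a quick sketch (the numbers are the ones from the post; x is normalized to one hour):

```python
# Back-of-envelope check of the "-t" tradeoff for DC work.
x = 1.0            # hours per DC without -t (normalized)
penalty = 0.07     # ~7% slowdown with -t
tests = 20         # expect roughly one bad residue per 20 tests

# Without -t: 20 tests, plus one full rerun when the mismatch shows up.
without_t = tests * x + x
# With -t: the error is caught immediately, but every test is 7% slower.
with_t = tests * x * (1 + penalty)
print(without_t, round(with_t, 1))  # 21.0 21.4
```

So with these particular numbers, skipping -t on DC comes out slightly ahead; a higher mismatch rate or a smaller penalty flips the conclusion.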
Good explanation, [B]LaurV[/B].
I always run with the -t flag, as both the memory and the GPU are overclocked. |
[QUOTE=Karl M Johnson;355209]Good explanation, [B]LaurV[/B].
I always run with the -t flag, as both the memory and the GPU is overclocked.[/QUOTE] For me, the performance hit for -t seems very small on a 580. Just guestimating ~5%, if that. Also, CPU use hovers around 1% or a bit more, even with Polite=70. Doesn't change much toggling 70-0-70. The 580 seems to react differently than the 570. I don't have a lot of hard figures as I mostly prefer factoring work. |
[QUOTE=kladner;355414]For me, the performance hit for -t seems very small on a 580. Just guestimating ~5%, if that. Also, CPU use hovers around 1% or a bit more, even with Polite=70. Doesn't change much toggling 70-0-70. The 580 seems to react differently than the 570. I don't have a lot of hard figures as I mostly prefer factoring work.[/QUOTE]
That is exactly the behavior I see on all my 580's (different brands), with the only difference that I use polite 1-0-1 (when I need the cards I switch to full polite, when not, to aggressive; I don't "game", but some CAD software needs the card sometimes). So, if you do DC and get a mismatch every 20 tests, and you rerun that one to get the right residue (caution! you can get a mismatch not because the card is bad or because something went wrong during testing; you can also get a mismatch because, simply, the residue in the DB is wrong; in any case, IF you rerun the test, to "be sure"), then you break even with the -t switch. If you get mismatches more often, you are better off with -t. If you get mismatches seldom, you waste time with -t (see my former post). |
I read this about cuFFT in CUDA 5.5 today:
On the GK110, the kernel occasionally may produce incorrect results. This happens when, either by loop unrolling or straight lines of code, there are more than 63 outstanding texture/LDG instructions at one point during the program execution. "Outstanding" in this case means none of the results of these instructions have been used. The underlying cause is that the texture barrier can track at most 63 outstanding texture/LDG instructions. If there are more than 63 such instructions, the texture barrier can no longer be relied on to ensure that any instruction's result is correct. This issue can be worked around by adding -maxrregcount 63 to ptxas. This guarantees there are at most 63 outstanding texture instructions because each texture/LDG will write at least one register. However, this may downgrade performance because it limits the maximum number of registers. (This issue has been fixed for CUDA 6.0.) [url]http://docs.nvidia.com/cuda/cuda-toolkit-release-notes/[/url] |
CUDALucas 2.05 beta and "CUDALucas Road Map"
[FONT=System]CUDALucas v2.05 Beta is posted to Sourceforge. The Windows executable is [/FONT][URL="https://sourceforge.net/projects/cudalucas/files/2.05%20Beta/"][FONT=System][COLOR=#0000ff]here[/COLOR][/FONT][/URL][FONT=System]. Linux is coming shortly.[/FONT]
[FONT=System]Windows .exe is compiled for CUDA 5.5 x64 with sm_13,sm_20,sm_30 & sm_35. If you need the library files, they're [/FONT][URL="https://sourceforge.net/projects/cudalucas/files/CUDA%20Libs/"][FONT=System][COLOR=#0000ff]here[/COLOR][/FONT][/URL][FONT=System]. If you want a different CUDA|sm version, let me know and I'll see if I can compile it. I'm still unable to compile a debug version. I'll keep working on it (anyone gotten a debug version out of the cmd window of MSVS using make?)[/FONT] [FONT=System]This is an [U]extensive[/U] update with code for:[/FONT] [FONT=System]- bit shift (for DoubleCheck capability)[/FONT] [FONT=System]- on-the-fly FFT increase/decrease[/FONT] [FONT=System]- CheckRoundoffAllIterations is no longer an option, it does it every time now[/FONT] [FONT=System]- and other updates listed [/FONT][URL="http://www.mersenneforum.org/showthread.php?t=18791&page=2"][FONT=System][COLOR=#0000ff]here[/COLOR][/FONT][/URL][FONT=System] and on SourceForge and in this thread.[/FONT] [FONT=System][U]We need testing please[/U]. Make a list of things that:[/FONT] [FONT=System]1) Don't work at all[/FONT] [FONT=System]2) Need work or fail after some testing[/FONT] [FONT=System]3) You would like to see added/modified[/FONT] [FONT=System]This is a quote from owftheevil about the updated FFT code:[QUOTE]Sorry, this is a bit too much for a diff file. Read the change log at the beginning of the file for a summary of what has been done. I have given the fft change routines a thorough checking, starting and later resuming a test with a much too small fft (273xxxxx exponent with 1200k fft). It quickly increases the fft length to the appropriate size and gives good residues. I ran a test of a 274xxxxx exponent with 1440k fft and the error limit set to 0.38. It increased and then decreased the fft 9 times during the test and ended with a matching residue. 
I also used the F and f options liberally with no ill effects on the residues.[/QUOTE]Some things I have noted that need work:[/FONT] [FONT=System]- Error recovery is a very serious issue. When P95 encounters a rounding error, it doesn't give up right away (usually). CUDALucas needs the same functionality as there is no reason to just quit after one rounding error.[/FONT] [FONT=System]-- Possible solution: Multiple save files (like P95) and when an error occurs, it needs to try the older file(s) before quitting. Possibly just jump to the oldest file and start comparing residues with the ones done before failure and as long as they match, keep going?[/FONT] [FONT=System]- Currently when running CUDALucas the program just stops. I'm still working to track down the cause, but I think it may be related to the rounding issue.[/FONT] [FONT=System]- How does P95 know to turn in a result with or without error codes? CUDALucas could use this functionality.[/FONT] [FONT=System]- Getting everyone to understand that CUDALucas is VERY different from mfaktc/o. [U]OC'ing is a CUDALucas killer on cards with no ECC memory.[/U][/FONT] [FONT=System]- CUDALucas needs to be tested and tested and tested to optimize exponents with FFTs for different GPUs. All that code should be in CUDALucas or an external fft.txt file. I know some of this functionality is in [/FONT][URL="http://www.mersenneforum.org/showthread.php?p=359102#post359102"][FONT=System][COLOR=#0000ff]CUDApm1[/COLOR][/FONT][/URL][FONT=System] (and already in CUDALucas's code). It just needs work.[/FONT] [FONT=System]- Early detection of overclocked/defective cards: CUDALucas needs a robust self test that runs on first-time cards, just like P95 runs a benchmark (on demand also). CUDALucas could be coded to recognize cards and their 'normal' clock and warn users that the OC is not a good idea. Moreover, I think that having a really good memory test in CUDALucas will allow users to test their cards properly. 
I know I read in the forum that someone developed a good GPU memory test, I just can't find the thread. Is that code open so we can add it to CUDALucas? EDIT: I found it [/FONT][URL="http://www.mersenneforum.org/showthread.php?p=339265#post339265"][FONT=System][COLOR=#0000ff]here[/COLOR][/FONT][/URL][FONT=System], written by owftheevil, and probably already in the code?[/FONT] [FONT=System]EDIT2: done :smile:[/FONT] |
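The multiple-savefile recovery idea proposed above can be sketched roughly like this (Python standing in for the CUDA host code; all names here are hypothetical, not actual CUDALucas code):

```python
from collections import deque

# Keep the last few checkpoints; on a roundoff error, fall back to an
# older one instead of quitting after the first failure.
class CheckpointRing:
    def __init__(self, depth=3):
        self.ring = deque(maxlen=depth)  # oldest checkpoint drops off automatically

    def save(self, iteration, res64):
        self.ring.append((iteration, res64))

    def rollback(self):
        """Discard the newest (suspect) checkpoint and return the one
        before it, or None if nothing older is available."""
        if len(self.ring) > 1:
            self.ring.pop()
            return self.ring[-1]
        return None

ring = CheckpointRing()
for it, res in [(10000, 0x6b79bd6d), (20000, 0x53064732), (30000, 0xe85abecf)]:
    ring.save(it, res)
it, res = ring.rollback()       # a roundoff error hit after iteration 30000
print(it, hex(res))             # 20000 0x53064732
```

The "compare residues against the pre-failure run" refinement would then re-iterate from the restored checkpoint and verify each stored Res64 before trusting the recovery.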
CuLu doesn't sound very good in Spanish or Portuguese. Or Italian.
Could we please limit the use of this silly abbreviation? It is called CUDALucas, as far as I know. |
Hi Jerry,
Compliments for a really informative, and link-filled roll out of this update. I'll give it a try on a 570 and a 580. I'm pretty confident running the Gigabyte 570 VRAM at 1600 MHz. It is rated for 1900, but is shaky running CL even at 1800. It might work at 1700, but I don't want to potentially waste a little over a day's work to find out. On the other hand, this particular 570 GPU is factory OC'd at 844 MHz, and has not given trouble once I found a safe range to run the VRAM. The memory on the Asus 580 has given good DC's at stock speed of 2004. GPUz reports default clock as 782 which I think is a 10 Mhz OC. It runs TF happily at 844 MHz, but I'd probably throttle back to 830, max, for CLucas. If I encounter problems, my first response will be to slow something down some more and try again. |
1 Attachment(s)
Here are a variety of (disorganized) results.
I will email the text output files to James, anon. |
Can we change the intermediate output (example below) so that it does not look very much like the final result lines?
[CODE]Iteration 54710000 M( 62807803 )C, 0x91c985b44391452b, n = 3670016, CUDALucas v2.03 err = 0.0735 (5:04 real, 30.3250 ms/iter, ETA 68:08:49) Iteration 54720000 M( 62807803 )C, 0xece05e44bdde87f2, n = 3670016, CUDALucas v2.03 err = 0.0735 (5:03 real, 30.3262 ms/iter, ETA 68:03:55) [/CODE] When "Lan_Party" submitted these lines, the manual web page gave CPU credit for each intermediate result. James, can the PHP script be modified to distinguish between the intermediate and final result lines? |
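For illustration (Python standing in for James's PHP, and the pattern is only an assumption based on the formats shown above), one way the parser could tell the two line types apart: intermediate lines begin with "Iteration N" and carry timing info, while final result lines start directly with "M( exponent )".

```python
import re

def is_final_result(line):
    # Hypothetical classifier: reject intermediate progress lines,
    # accept lines that open with the M( p )C / M( p )P result prefix.
    line = line.strip()
    return (not line.startswith("Iteration")
            and re.match(r"M\(\s*\d+\s*\)[CP],", line) is not None)

intermediate = ("Iteration 54710000 M( 62807803 )C, 0x91c985b44391452b, "
                "n = 3670016, CUDALucas v2.03 err = 0.0735 "
                "(5:04 real, 30.3250 ms/iter, ETA 68:08:49)")
final = "M( 62807803 )C, 0x91c985b44391452b, n = 3670016, CUDALucas v2.03"
print(is_final_result(intermediate), is_final_result(final))  # False True
```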
No problem. I can't believe you posted that, because literally I was just sitting here looking at the output line on the screen and working it so it will take up only one line. I'll get the code changed and uploaded.
|
[QUOTE=Prime95;359314]James, can the PHP script be modified to distinguish between the intermediate and final result lines?[/QUOTE]Shouldn't be a problem, I'll look at it when I get back on Sunday.
|
CUDALucas seems to think my GT640 has a hole where its memory should be; it says I have [COLOR=#ff0000](minus)[/COLOR]2GiB totalGlobalMem!
I find this strangely disquieting for software which is dealing with large numbers :smile: |
Well, if it gives me negative credit numbers when I report the results*, I won't object too much :razz:
----- * Like, instead of getting 30 GHzDays when I report a DC, I would get -30 GHzDays, which in hex would be (use the Windows calculator) FFFF FFFF FFFF FFE2, or 18446744073709551586 GHzDays... In fact, I would be OK even with a 32-bit value... (4294967266 GHzDays) :smile: |
[QUOTE=LaurV;359338]Well, if it gives me negative credit numbers when I report the results*, I won't object too much :razz:
----- * like instead getting 30GHzDays when I report a DC, I would get a -30GHzDays, which in hex would be (use windows calculator) FFFF FFFF FFFF FFE2, or 18446744073709551586 GHzDays... In fact I would be ok even with a 32 bit value... (4294967266 GHzDays) :smile:[/QUOTE] 2^32 - 6GB = -2Gb Luigi |
How much RAM does the card actually have? 2GB? 4GB?
|
[QUOTE=Antonio;359326]CUDALucas seems to think my GT640 has a hole where it's memory should be, it says I have [COLOR=#ff0000](minus)[/COLOR]2GiB totalGlobalmem!
I find this strangely disquieting for software which is dealing with large numbers :smile:[/QUOTE] This should be reported correctly in 2.05. |
[QUOTE=owftheevil;359371]This should be reported correctly in 2.05.[/QUOTE]
The -2GiB is reported by 2.05-Beta-x64, downloaded today. The card has +2GiB of memory installed. |
I am having a pretty rough time with 2.05 on the GTX 580.
'CUDALucas -cufftbench 1 8192 1' crashes on the GTX 580 and brings down graphics driver 327.23 (which restarts). 782 MHz core, 1600 VRAM. This is just the latest test of many; occasionally, the test completes. Rolling back the driver from 331.65 to 327.23 made no difference.
2.04-beta successfully completes 'CUDALucas -cufftbench 32768 3276800 32768' and has turned in good DCs at 830 MHz core, 1600 VRAM. I haven't yet tried running a DC on 2.05-beta. I have the card throttled back from where it normally runs mfaktc to stock: from 844 MHz to 782 MHz. The RAM is 400 MHz below stock. Any suggestions would be appreciated.
EDIT: Tried running an exponent, 30651xxx, on 2.05 at 830 MHz core, 1600 VRAM. It started with 1728K directly, instead of stepping up to it from 1600K as 2.04 did. Crashed a bit after the 40,000th iteration. [CODE]Iteration 10000 M( 30651671 )C, 0x6b79bd6d5adfb7de, n = 1728K, CUDALucas v2.05 Beta err = 0.05396 (0:26 real, 2.8857 ms/iter, ETA 24:33:42)
Iteration 20000 M( 30651671 )C, 0x53064732900985e9, n = 1728K, CUDALucas v2.05 Beta err = 0.06055 (0:26 real, 2.6133 ms/iter, ETA 22:14:09)
Iteration 30000 M( 30651671 )C, 0xe85abecfe0f40dce, n = 1728K, CUDALucas v2.05 Beta err = 0.05469 (0:26 real, 2.6123 ms/iter, ETA 22:13:12)
Iteration 40000 M( 30651671 )C, 0xa4208cf27dd73713, n = 1728K, CUDALucas v2.05 Beta err = 0.06250 (0:26 real, 2.6123 ms/iter, ETA 22:12:48)
CUDALucas.cu(310) : cudaSafeCall() Runtime API error 30: unknown error.[/CODE] |
[QUOTE=kladner;359469]I am having a pretty rough time with 2.05 on the GTX 580.
'CUDALucas -cufftbench 1 8192 1' crashes on GTX 580, brings down graphic driver 327.23 (which restarts). 782 MHz core, 1600 VRAM This is just the latest test of many. Occasionally, the test completes. Rolling back the driver from 331.65 to 327.23 made no difference. 2.04-beta successfully completes 'CUDALucas -cufftbench 32768 3276800 32768' and has turned in good DCs at 830 MHz core, 1600 VRAM. I haven't yet tried running a DC on 2.05-beta. I have the card throttled back from where it normally runs mfaktc to stock: from 844 MHz to 782 MHz. The RAM is 400 MHz below stock. Any suggestions would be appreciated. EDIT: Tried running an exponent, 30651xxx on 2.05, 830 MHz core, 1600 VRAM. Started with 1728K, instead of stepping up to it from 1600K, as 2.04 did. Crashed a bit after the 40,000th iteration. [/QUOTE] I tried your exponent on my GT 640, graphics driver 331.65, and it ran ok past your fail point (see below). So it looks like you may still have a hardware problem. [CODE]Iteration 10000 M( 30651671 )C, 0x6b79bd6d5adfb7de, n = 1728K, CUDALucas v2.05 Beta err = 0.05859 (2:21 real, 15.7012 ms/iter, ETA 133:38:30) Iteration 20000 M( 30651671 )C, 0x53064732900985e9, n = 1728K, CUDALucas v2.05 Beta err = 0.05664 (2:37 real, 15.6991 ms/iter, ETA 133:34:50) Iteration 30000 M( 30651671 )C, 0xe85abecfe0f40dce, n = 1728K, CUDALucas v2.05 Beta err = 0.05469 (2:37 real, 15.6950 ms/iter, ETA 133:30:06) Iteration 40000 M( 30651671 )C, 0xa4208cf27dd73713, n = 1728K, CUDALucas v2.05 Beta err = 0.05640 (2:37 real, 15.6912 ms/iter, ETA 133:25:34) Iteration 50000 M( 30651671 )C, 0x056716c2203b5c29, n = 1728K, CUDALucas v2.05 Beta err = 0.05859 (2:37 real, 15.6935 ms/iter, ETA 133:24:08) Iteration 60000 M( 30651671 )C, 0x5da5d75c80f2587c, n = 1728K, CUDALucas v2.05 Beta err = 0.05469 (2:37 real, 15.6927 ms/iter, ETA 133:21:05)[/CODE] ( I know it's slow - but that DDR3 memory is incredibly reliable :smile: ) |
Of course, it could always be hardware, but I did subsequently run the exponent up to ~750K iterations with a more aggressive core clock (830 MHz). As mentioned previously, the VRAM at 1600 has performed well in the past with 2.04-beta. I have the libraries up through 5.5.
Still just tossing things out there. |
[QUOTE=kladner;359480]Of course, it could always be hardware, but I did subsequently run the exponent up to ~750K it. with more aggressive core clock (830 MHz). As mentioned previously, the VRAM at 1600 has performed well in the past with 2.04-beta. I have the libraries up through 5.5.
Still just tossing things out there.[/QUOTE] Just a thought: are you using this card to drive your display as well as run CUDALucas? I've had problems in the past with (if my somewhat old and dimmed memory serves me correctly) very similar error reports, when I've tried using my display card for various CUDA work. Could it be a memory conflict when the screen is updated? (The GT 640 I'm using is not driving my display, I have a GTX 650 Ti for that). |
[QUOTE=Antonio;359482]Just a thought: are you using this card to drive your display as well as run CUDALucas?
I've had problems in the past with (if my somewhat old and dimmed memory serves me correctly) very similar error reports, when I've tried using my display card for various CUDA work. Could it be a memory conflict when the screen is updated? (The GT 640 I'm using is not driving my display, I have a GTX 650 Ti for that).[/QUOTE] Good point! Yes, the 580 is driving the display. I'll have to have a look at that. I'll try switching the display to the GTX 570. For that matter, I'll have another shot at running CL 2.05-beta on the 570. I'm currently back running the 331.65 driver, since the roll back didn't seem to make any difference. Thanks for the suggestion! I needed a new lead to follow. I currently have both cards back to doing LL-TF, but I do like to figure out things which don't work as they should. EDIT: Clarification of fuzzy thoughts from late night experiments: [QUOTE=kladner;359480]Of course, it could always be hardware, but I did subsequently run the exponent up to ~750K it. with more aggressive core clock (830 MHz). As mentioned previously, the VRAM at 1600 has performed well in the past with 2.04-beta. [/QUOTE] The above refers to running 2.04-beta on the 580 card. EDIT2: I also tried running with threads at 512 instead of the default 256, but it did not seem to make any difference. |
I just completed a DC on M57885161 with CUDALucas 2.05-Beta-x64. It completed without error and I even switched FFT sizes a few times. Since I have the full run of residues from the first time I ran it, I was able to check progress along the way.
The only issue I found so far was keyboard input. If Interactive is set to 1 in the .ini file, then anytime I pressed a key the program stopped making progress. GPU usage dropped to about 50%, but ^c still stopped the run. I could restart with no problems. Can some others test this to see whether keyboard input works in Windows and Linux? I haven't run all the FFT benchmarks yet; I'll do that now. Anyone else having a problem with the amount of memory reported by CUDALucas? |
Memory size is reported correctly on the 570 and 580.
The 580 has completed 'CUDALucas -cufftbench 1 8192 1' at least once, running at 810 MHz core, 1700 MHz VRAM. |
1 Attachment(s)
Another observation/question- should the savefiles of v 2.04beta be more than three times as large as those of v 2.05beta? .....EDIT: for the same exponent?
|
[QUOTE=kladner;359541]Another observation/question- should the savefiles of v 2.04beta be more than three times as large as those of v 2.05beta? .....EDIT: for the same exponent?[/QUOTE]
I have not looked at new code, but I guess the new code compresses the savefiles. |
Running the CUDALucas 2.05 beta (both the one downloaded from SourceForge and my own build), I get wrong residues running the selftest. Running 2.03 with the same parameters, everything is fine:
[CODE]Starting self test M43112609 fft length = 2304K
Running careful round off test for 1000 iterations.
If average error > 0.25, or maximum error > 0.35, the test will restart with a longer FFT.
Iteration 100, average error = 0.17969, max error = 0.28125
Iteration 200, average error = 0.20398, max error = 0.26563
Iteration 300, average error = 0.21162, max error = 0.27344
Iteration 400, average error = 0.21489, max error = 0.28125
Iteration 500, average error = 0.21730, max error = 0.28125
Iteration 600, average error = 0.21847, max error = 0.26563
Iteration 700, average error = 0.21941, max error = 0.25781
Iteration 800, average error = 0.22026, max error = 0.25879
Iteration 900, average error = 0.22068, max error = 0.26172
Iteration 1000, average error = 0.22089 <= 0.25 (max error = 0.28125), continuing test.
Iteration 10000 M( 43112609 )C, 0x62871c7027ff12c8, n = 2304K, CUDALucas v2.05 Beta err = 0.50000 (0:21 real, 2.0989 ms/iter)
Expected residue [e86891ebf6cd70c4] does not match actual residue [62871c7027ff12c8]
Starting self test M57885161 fft length = 3136K
Running careful round off test for 1000 iterations.
If average error > 0.25, or maximum error > 0.35, the test will restart with a longer FFT.
Iteration 100, average error = 0.15905, max error = 0.22656
Iteration 200, average error = 0.18188, max error = 0.23438
Iteration 300, average error = 0.19004, max error = 0.24219
Iteration 400, average error = 0.19322, max error = 0.23438
Iteration 500, average error = 0.19489, max error = 0.22021
Iteration 600, average error = 0.19627, max error = 0.22852
Iteration 700, average error = 0.19733, max error = 0.24023
Iteration 800, average error = 0.19812, max error = 0.25000
Iteration 900, average error = 0.19867, max error = 0.24219
Iteration 1000, average error = 0.19901 <= 0.25 (max error = 0.25000), continuing test.
Iteration 10000 M( 57885161 )C, 0x76c27556683cd84d, n = 3136K, CUDALucas v2.05 Beta err = 0.26953 (0:26 real, 2.5361 ms/iter)
This residue is correct.[/CODE]
I haven't figured out what's going on yet, just wanted to let people know. I have both compiled it from source code and tested it, and downloaded the prebuilt binary from SourceForge. I think more people should run the selftest:
[CODE]cudalucas2.05beta.exe -r[/CODE]
(tested on a Titan) |
FFT too large
1 Attachment(s)
I get a different error result. The attached has been consistent at a variety of speeds.
|
[QUOTE=kladner;359642]I get a different error result. The attached has been consistent at a variety of speeds.[/QUOTE]
Delete or rename your GeForce GTX --- fft.txt file and try again. |
An issue:
When running CUDALucas -r with a GeForce GTX --- fft.txt you [I]may[/I] get the error: [CODE]The fft length 32K is too large for the exponent 216091. Restart with smaller fft.[/CODE] Removing the file, as noted above, fixes the error. So when -cufftbench is run and the .txt file is generated, I presume the FFTs are tuned correctly. However, the new 'less tolerant' code won't accept those values for use in the self test. Also, can someone explain the updated threads in 2.05? Is it still necessary to have threads in the .ini file? Why three values instead of one? What is the interaction with the new .txt file? Thanks |
[QUOTE=flashjh;359644]Delete or rename your GeForce GTX --- fft.txt file and try again.[/QUOTE]
I now get this:
[CODE]Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.

E:\CUDA\2.05-BETA>CUDALucas -r
------- DEVICE 0 -------
name GeForce GTX 580
Compatibility 2.0
clockRate (MHz) 1564
memClockRate (MHz) 1600
totalGlobalMem 1610612736
totalConstMem 65536
l2CacheSize 786432
sharedMemPerBlock 49152
regsPerBlock 32768
warpSize 32
memPitch 2147483647
maxThreadsPerBlock 1024
maxThreadsPerMP 1536
multiProcessorCount 16
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 65535,65535,65535
textureAlignment 512
deviceOverlap 1

Starting self test M86243 fft length = 8K
Running careful round off test for 1000 iterations.
If average error > 0.25, or maximum error > 0.35, the test will restart with a longer FFT.
Iteration 100, average error = 0.00000, max error = 0.00000
Iteration 200, average error = 0.00000, max error = 0.00000
Iteration 300, average error = 0.00000, max error = 0.00000
Iteration 400, average error = 0.00000, max error = 0.00000
Iteration 500, average error = 0.00000, max error = 0.00000
Iteration 600, average error = 0.00000, max error = 0.00000
Iteration 700, average error = 0.00000, max error = 0.00000
Iteration 800, average error = 0.00000, max error = 0.00000
Iteration 900, average error = 0.00000, max error = 0.00000
Iteration 1000, average error = 0.00000 <= 0.25 (max error = 0.00000), continuing test.
Iteration 10000 M( 86243 )C, 0x23992ccd735a03d9, n = 8K, CUDALucas v2.05 Beta err = 0.00000 (0:01 real, 0.0812 ms/iter)
This residue is correct.
Starting self test M132049 fft length = 8K
Running careful round off test for 1000 iterations.
If average error > 0.25, or maximum error > 0.35, the test will restart with a longer FFT.
Iteration 100, average error = 0.00024, max error = 0.00037
Iteration 200, average error = 0.00026, max error = 0.00035
Iteration 300, average error = 0.00027, max error = 0.00037
Iteration 400, average error = 0.00027, max error = 0.00037
Iteration 500, average error = 0.00027, max error = 0.00035
Iteration 600, average error = 0.00027, max error = 0.00034
Iteration 700, average error = 0.00027, max error = 0.00037
Iteration 800, average error = 0.00027, max error = 0.00037
Iteration 900, average error = 0.00027, max error = 0.00037
Iteration 1000, average error = 0.00028 <= 0.25 (max error = 0.00037), continuing test.
Iteration 10000 M( 132049 )C, 0x4c52a92b54635f9e, n = 8K, CUDALucas v2.05 Beta err = 0.00044 (0:01 real, 0.0844 ms/iter)
This residue is correct.
fft length 14336 must be divisible by 4 * mult threads 1024

E:\CUDA\2.05-BETA>[/CODE] |
Check your .ini file: is it 1024 threads?
If so, change it down to 256 for residue tests. |
[QUOTE=Manpowre;359651]check your .ini file, is it 1024 threads ?
if so, change it down to 256 for residue tests.[/QUOTE] Bingo! Changed 1024 to 256 and 'cudalucas -r' completed successfully. Thanks, Manpowre! |
My fft.txt file says that Threads=512 256 256 is the best setting for me to use for the current FFT range I'm in, so I leave it there and that works for the -r test too.
|
[QUOTE=flashjh;359655]My fft.txt file says that Threads=512 256 256 is the best setting for me to use for the current FFT range I'm in, so I leave it there and that works for the -r test too.[/QUOTE]
Thanks, Jerry. That seems to complete the puzzle. "Threads=512 256 256" seems to have let me complete 'CUDALucas -cufftbench 1 8192 1' twice, when mostly it would not complete with 1024 1024 1024 or 256 256 256. I found the threads file which CUDAPm1 generated for my 580, and it agrees with your numbers. The card is still throttled way back. I'm going to see if it will run with at least the core clocked up a bit.
EDIT: A partial correction is in order. Before the last few runs I also switched the display from the 580 to the GTX 570. This was at Antonio's suggestion, and also seemed to play a role in stabilizing the 580.
EDIT2: I declared victory prematurely. The latest attempt with -r yielded:
[CODE]E:\CUDA\2.05-BETA>CUDALucas -r
------- DEVICE 0 -------
name GeForce GTX 580
Compatibility 2.0
clockRate (MHz) 1564
memClockRate (MHz) 1600
totalGlobalMem 1610612736
totalConstMem 65536
l2CacheSize 786432
sharedMemPerBlock 49152
regsPerBlock 32768
warpSize 32
memPitch 2147483647
maxThreadsPerBlock 1024
maxThreadsPerMP 1536
multiProcessorCount 16
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 65535,65535,65535
textureAlignment 512
deviceOverlap 1

Starting self test M86243 fft length = 4K
Running careful round off test for 1000 iterations.
If average error > 0.25, or maximum error > 0.35, the test will restart with a longer FFT.
Iteration 100, average error = 0.15317, max error = 0.23438
Iteration 200, average error = 0.16318, max error = 0.24521
Iteration 300, average error = 0.16738, max error = 0.23047
Iteration 400, average error = 0.17024, max error = 0.25000
Iteration 500, average error = 0.17168, max error = 0.25000
Iteration 600, average error = 0.17195, max error = 0.23438
Iteration 700, average error = 0.17195, max error = 0.20947
Iteration 800, average error = 0.17240, max error = 0.21875
Iteration 900, average error = 0.17285, max error = 0.25000
Iteration 1000, average error = 0.17264 <= 0.25 (max error = 0.25000), continuing test.
Iteration 10000 M( 86243 )C, 0x23992ccd735a03d9, n = 4K, CUDALucas v2.05 Beta err = 0.28125 (0:01 real, 0.0825 ms/iter)
This residue is correct.
The fft length 16K is too large for the exponent 132049. Restart with smaller fft.[/CODE]
The above happens regardless of which card the monitor is connected to. |
[QUOTE=Antonio;359427]The -2GiB is reported by 2.05-Beta-x64, downloaded today.
The card has +2GiB of memory installed.[/QUOTE] My mistake. I fixed the code, but only put it into CUDAPm1. I put it into CUDALucas this weekend. That version of the code will be up soon. |
[QUOTE=kladner;359469]I am having a pretty rough time with 2.05 on the GTX 580.
'CUDALucas -cufftbench 1 8192 1' crashes on GTX 580, brings down graphic driver 327.23 (which restarts). 782 MHz core, 1600 VRAM. This is just the latest test of many. Occasionally, the test completes. Rolling back the driver from 331.65 to 327.23 made no difference. 2.04-beta successfully completes 'CUDALucas -cufftbench 32768 3276800 32768' and has turned in good DCs at 830 MHz core, 1600 VRAM. I haven't yet tried running a DC on 2.05-beta. I have the card throttled back from where it normally runs mfaktc to stock: from 844 MHz to 782 MHz. The RAM is 400 MHz below stock. Any suggestions would be appreciated.
EDIT: Tried running an exponent, 30651xxx, on 2.05, 830 MHz core, 1600 VRAM. Started with 1728K, instead of stepping up to it from 1600K as 2.04 did. Crashed a bit after the 40,000th iteration.
[CODE]Iteration 10000 M( 30651671 )C, 0x6b79bd6d5adfb7de, n = 1728K, CUDALucas v2.05 Beta err = 0.05396 (0:26 real, 2.8857 ms/iter, ETA 24:33:42)
Iteration 20000 M( 30651671 )C, 0x53064732900985e9, n = 1728K, CUDALucas v2.05 Beta err = 0.06055 (0:26 real, 2.6133 ms/iter, ETA 22:14:09)
Iteration 30000 M( 30651671 )C, 0xe85abecfe0f40dce, n = 1728K, CUDALucas v2.05 Beta err = 0.05469 (0:26 real, 2.6123 ms/iter, ETA 22:13:12)
Iteration 40000 M( 30651671 )C, 0xa4208cf27dd73713, n = 1728K, CUDALucas v2.05 Beta err = 0.06250 (0:26 real, 2.6123 ms/iter, ETA 22:12:48)
CUDALucas.cu(310) : cudaSafeCall() Runtime API error 30: unknown error.[/CODE][/QUOTE]
This is an Nvidia driver error. I used to think it only occurred when the card was also driving the display. Recently I got this error on a 570 which was not driving the display (although it didn't report an error, it just hung indefinitely). It seems to have been introduced as of the 300+ drivers. |
[QUOTE=flashjh;359527]I just completed a DC on M57885161 with CUDALucas 2.05-Beta-x64. It completed without error and I even switched FFT sizes a few times. Since I have the full run of residues from the first time I ran it, I was able to check progress along the way.
The only issue I found so far was keyboard input. If Interactive is set to 1 in the .ini file, then anytime I pressed a key the program stopped making progress. GPU usage dropped to about 50%, but ^c still stopped the run. I could restart with no problems. Can some others test this to see whether keyboard input works in Windows and Linux? I haven't run all the FFT benchmarks yet; I'll do that now. Anyone else having a problem with the amount of memory reported by CUDALucas?[/QUOTE]
I have seen this keyboard input problem. If I run cmd.exe (?? I think that's its name) keyboard input doesn't work. The other console program, whatever it's called, does work with keyboard input. |
[QUOTE=kladner;359541]Another observation/question- should the savefiles of v 2.04beta be more than three times as large as those of v 2.05beta? .....EDIT: for the same exponent?[/QUOTE]
Yes they should be. |
[QUOTE=Manpowre;359641]
. . .
Starting self test M43112609 fft length = 2304K
Running careful round off test for 1000 iterations.
If average error > 0.25, or maximum error > 0.35, the test will restart with a longer FFT.
Iteration 100, average error = 0.17969, max error = 0.28125
Iteration 200, average error = 0.20398, max error = 0.26563
Iteration 300, average error = 0.21162, max error = 0.27344
Iteration 400, average error = 0.21489, max error = 0.28125
Iteration 500, average error = 0.21730, max error = 0.28125
Iteration 600, average error = 0.21847, max error = 0.26563
Iteration 700, average error = 0.21941, max error = 0.25781
Iteration 800, average error = 0.22026, max error = 0.25879
Iteration 900, average error = 0.22068, max error = 0.26172
Iteration 1000, average error = 0.22089 <= 0.25 (max error = 0.28125), continuing test.
Iteration 10000 M( 43112609 )C, 0x62871c7027ff12c8, n = 2304K, CUDALucas v2.05 Beta [COLOR=Olive]err = 0.50000[/COLOR] (0:21 real, 2.0989 ms/iter)
Expected residue [e86891ebf6cd70c4] does not match actual residue [62871c7027ff12c8]
. . .
(tested on titan)[/QUOTE]
Notice the round-off error. Which driver are you using? |
[QUOTE=flashjh;359647]An issue:
When running CUDALucas -r with a GeForce GTX --- fft.txt you [I]may[/I] get the error: [CODE]The fft length 32K is too large for the exponent 216091. Restart with smaller fft.[/CODE]Removing the file, as noted above, fixes the error. So when -cufftbench is run and the .txt file is gereated I presume the FFTs are tuned correctly. However, the new 'less tolerant' code won't accept those values for use in the self test. Also, can someone explain the updated threads in 2.05. Is it necessary to have .ini file threads anymore? Why three values instead of 1. What is the interaction with the new .txt file? Thanks[/QUOTE] Or instead of deleting the threads.txt file, insert a line with 16 as its only entry before the line with 32 on it. This is a lack of foresight on my part. Even though 32k ffts are faster than 16k or other smaller ffts big enough to handle 216091, some of those smaller ffts are still needed. I'll think about how to fix this. As for threads in the ini file, there are three kernels whose performance depends on the number of threads they are invoked with. 2.04 and earlier fixed the threads on two of them at 128, which is a good compromise. Those values should be the defaults in the ini file. I don't know how 1024 snuck its way in as the default.The values in threads.txt override the ini values. |
[QUOTE=owftheevil;359697]Notice the round-off error. Which driver are you using?[/QUOTE]
Latest nvidia, 331.65. CUDALucas 2.03 doesn't give me a wrong residue. I took the GPU clock down by 200 MHz, and it runs fine again with 2.05. |
[QUOTE=Manpowre;359724]Latest nvidia 331.65
cudalucas 2.03 doesn't give me a wrong residue. I took the GPU clock down by 200 MHz, and it runs fine again with 2.05.[/QUOTE]
Good to hear. What clocks are you running at now? |
[QUOTE=owftheevil;359736]Good to hear. What clocks are you running at now?[/QUOTE]
780 MHz, stock memory clock. I put the residue test up again for 20 repeats an hour ago, so I will complete this, then take it back up to the stock clock at 880 and retest 20 residue runs again. |
The new code is compiled and the windows binaries (release/debug) are posted on SourceForge.
@owftheevil: The -memtest functions, but something isn't right with the iterations. For example 56 1000 1 on my 580 says ETA 12181:18:07 :smile: I posted a working memtest.zip to [URL="https://sourceforge.net/projects/cudalucas/files/2.05%20Beta/?"]sourceforge[/URL] EDIT: Please only use 2.05 Beta .exe files for testing the code. It is not ready for production use yet. Thanks! |
[QUOTE=flashjh;359750]The new code is compiled and the windows binaries (release/debug) are posted on SourceForge.
@owftheevil: The -memtest functions, but something isn't right with the iterations. For example 56 1000 1 on my 580 says ETA 12181:18:07 :smile: I posted a working memtest.zip to [URL="https://sourceforge.net/projects/cudalucas/files/2.05%20Beta/?"]sourceforge[/URL] EDIT: Please only use 2.05 Beta .exe files for testing the code. It is not ready for production use yet. Thanks![/QUOTE] That does seem a bit slow. Usage: [CODE]./CUDALucas -memtest k n[/CODE] where k * 25 MB of memory are tested, n * 10000 iterations are done for each of 5 data types at each of the k positions. So with k = 56, n = 1000 you are reading 75MB and writing 25 MB 2.8 billion times. Only ~39GB/s bandwidth on the reads. I'll take a look. |
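For the curious, that arithmetic lines up with the sizes reported elsewhere in the thread; a quick sanity check based on my reading of the post, not code from CUDALucas:

```python
# Sanity check of the "-memtest k n" arithmetic described above
# (derived from the post, not from the CUDALucas source).
k, n = 56, 1000
mem_mb = k * 25                # size of the memory chunk under test
passes = 5 * k * (n * 10000)   # 5 data types x k positions x n*10000 iterations
assert mem_mb == 1400          # matches the "1400MB" figure reported later
assert passes == 2800000000    # the "2.8 billion times"
print(mem_mb, passes)
```
|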
That same test before only took a few seconds.
|
[QUOTE=owftheevil;359754]That does seem a bit slow.
Usage: [CODE]./CUDALucas -memtest k n[/CODE] where k * 25 MB of memory are tested, n * 10000 iterations are done for each of 5 data types at each of the k positions. So with k = 56, n = 1000 you are reading 75MB and writing 25 MB 2.8 billion times. Only ~39GB/s bandwidth on the reads. I'll take a look.[/QUOTE] hmm, it is x 10k iterations.. oo.. thats different than from memtest right ? second parameter was not multiplied with 10k ? |
Looked at the ETA code for memtest last night. Didn't find anything wrong, but changed the formula to smooth out the results. It's working as expected on a 570 and a 560 Ti. New code is up at SourceForge.
[CODE]./CUDALucas -memtest 35 10[/CODE] gives an ETA of just over 4 hours on the 560 ti. [CODE]./CUDALucas -memtest 28 2000[/CODE] gives an ETA of just over 1200 hours on the 570 while it is simultaneously running stage 1 of CUDAPm1. |
Those of you having the fft too big problem while running the self check, could you please post the *fft.txt files, at least up to the line with fft 32 in it? I need to make sure I understand what the problem is.
|
[QUOTE=owftheevil;359803]gives an ETA of just over 1200 hours on the 570 while it is simultaneously running stage 1 of CUDAPm1.[/QUOTE]
Maybe I missed some discussion about the memtest -- is it meant to run for 50 days in conjunction with CUDALucas or CUDAPm1? |
[QUOTE=flashjh;359808]Maybe I missed some discussion about the memtest -- is it meant to run for 50 days in conjunction with CUDALucas or CUDAPm1?[/QUOTE]
No, I was just trying to guess at what might have given you such a large ETA. Could you please try running CUDALucas with [CODE]-memtest 56 1 [/CODE] Add the -d 1 if you want it to run on device 1. |
Recompiled from r43.
-memtest 56 1:
[CODE]Initializing memory test using 1400MB of memory on device 0
Beginning test.
Position 0, Data Type 0, Iteration 10000, Errors: 0, completed 0.36%, Read 4.73GB/s, Write 1.58GB/s, ETA 11:59:48)[/CODE]
-memtest 35 1:
[CODE]Initializing memory test using 875MB of memory on device 0
Beginning test.
Position 0, Data Type 0, Iteration 10000, Errors: 0, completed 0.06%, Read 117.14GB/s, Write 39.05GB/s, ETA 3:02:15)
Position 0, Data Type 0, Iteration 20000, Errors: 0, completed 0.11%, Read 117.08GB/s, Write 39.03GB/s, ETA 3:02:12)
Position 0, Data Type 0, Iteration 30000, Errors: 0, completed 0.17%, Read 117.09GB/s, Write 39.03GB/s, ETA 3:02:06)
Position 0, Data Type 0, Iteration 40000, Errors: 0, completed 0.23%, Read 117.08GB/s, Write 39.03GB/s, ETA 3:02:00)
Position 0, Data Type 0, Iteration 50000, Errors: 0, completed 0.29%, Read 117.07GB/s, Write 39.02GB/s, ETA 3:01:55)[/CODE]
Maybe I'm asking the wrong question -- on the original memtest you wrote, -memtest 56 1 only took a few seconds. Did you rewrite the code to take 12 hours on purpose? Is the test 'updated' to run the way you think it needs to be written for a proper test? |
[QUOTE=flashjh;359825]Recompiled from r43.
-memtest 56 1:
[CODE]Initializing memory test using 1400MB of memory on device 0
Beginning test.
Position 0, Data Type 0, Iteration 10000, Errors: 0, completed 0.36%, Read 4.73GB/s, Write 1.58GB/s, ETA 11:59:48)[/CODE]
-memtest 35 1:
[CODE]Initializing memory test using 875MB of memory on device 0
Beginning test.
Position 0, Data Type 0, Iteration 10000, Errors: 0, completed 0.06%, Read 117.14GB/s, Write 39.05GB/s, ETA 3:02:15)
Position 0, Data Type 0, Iteration 20000, Errors: 0, completed 0.11%, Read 117.08GB/s, Write 39.03GB/s, ETA 3:02:12)
Position 0, Data Type 0, Iteration 30000, Errors: 0, completed 0.17%, Read 117.09GB/s, Write 39.03GB/s, ETA 3:02:06)
Position 0, Data Type 0, Iteration 40000, Errors: 0, completed 0.23%, Read 117.08GB/s, Write 39.03GB/s, ETA 3:02:00)
Position 0, Data Type 0, Iteration 50000, Errors: 0, completed 0.29%, Read 117.07GB/s, Write 39.02GB/s, ETA 3:01:55)[/CODE]
Maybe I'm asking the wrong question -- on the original memtest you wrote, -memtest 56 1 only took a few seconds. Did you rewrite the code to take 12 hours on purpose? Is the test 'updated' to run the way you think it needs to be written for a proper test?[/QUOTE]
Yes, kind of. Too few iterations, like 1000, will miss errors in marginal cases, so I made sure enough iterations are done on each part of the memory chunk. However, it's not supposed to last as long on those settings as it is. Also, something's wrong with your output. With 1 as the parameter for iterations, it should not be repeating Data Type 0. And for some reason, it's not reading or writing very fast with 56 for the size of the memory chunk it's testing. Thanks for posting this. Now I have something to look at.
Edit: How much real time is it taking between screen updates in those two cases?
Edit 2: I have a new version up with a diagnostic line. Could you please try the same thing with the new version when you get a chance? |
r46: -memtest 56 1
[CODE]C:\CUDA\CuLu\test>CUDALucas_205Betar46 -memtest 56 1
------- DEVICE 0 -------
name GeForce GTX 580
Initializing memory test using 1400MB of memory on device 0...
Input: size = 56, iterations = 1
Beginning test.
Position 0, Data Type 0, Iteration 10000, Errors: 0, completed 0.36%, Read 4.63GB/s, Write 1.54GB/s, ETA 12:14:50)
Position 0, Data Type 1, Iteration 20000, Errors: 0, completed 0.71%, Read 4.64GB/s, Write 1.55GB/s, ETA 12:12:05)
Position 0, Data Type 2, Iteration 30000, Errors: 0, completed 1.07%, Read 4.61GB/s, Write 1.54GB/s, ETA 12:10:50)
Position 0, Data Type 3, Iteration 40000, Errors: 0, completed 1.43%, Read 4.62GB/s, Write 1.54GB/s, ETA 12:08:32)
Position 0, Data Type 4, Iteration 50000, Errors: 0, completed 1.79%, Read 4.63GB/s, Write 1.54GB/s, ETA 12:05:44)[/CODE]
Observations: before, the GPU would stay at 100% usage; now every few seconds it drops down to between 20% and 80% and then goes back to 100%. I timed the last group: CUDALucas says 2:48 elapsed, real time was 2:37.9.
-memtest 35 1
[CODE]C:\CUDA\CuLu\test>CUDALucas_205Betar46 -memtest 35 1
------- DEVICE 0 -------
name GeForce GTX 580
Initializing memory test using 875MB of memory on device 0...
Input: size = 35, iterations = 1
Beginning test.
Position 0, Data Type 0, Iteration 10000, Errors: 0, completed 0.57%, Read 125.03GB/s, Write 41.68GB/s, ETA 16:59)
Position 0, Data Type 1, Iteration 20000, Errors: 0, completed 1.14%, Read 124.94GB/s, Write 41.65GB/s, ETA 16:53)
Position 0, Data Type 2, Iteration 30000, Errors: 0, completed 1.71%, Read 124.69GB/s, Write 41.56GB/s, ETA 16:48)
Position 0, Data Type 3, Iteration 40000, Errors: 0, completed 2.29%, Read 125.06GB/s, Write 41.69GB/s, ETA 16:42)
Position 0, Data Type 4, Iteration 50000, Errors: 0, completed 2.86%, Read 124.73GB/s, Write 41.58GB/s, ETA 16:36)
Position 1, Data Type 0, Iteration 60000, Errors: 0, completed 3.43%, Read 124.89GB/s, Write 41.63GB/s, ETA 16:31)[/CODE]
Observations: usage stays at 100%. CUDALucas time: 5 sec, timed: 5.8 sec |
I was able to get into windows last night to run some tests. I'm seeing the same thing you are. On a 570 with 1250MB of memory,
-memtest 41 1 runs normally; from 42 up to 46 it's very slow, like what you see with 56; at 47 it can't allocate all the memory and throws a cuda error. On Linux, everything is as expected: up to 47 it runs at full speed with no problems, and at 48 it can't allocate the memory. |
Ideas?
|
[QUOTE=flashjh;359875]r46: -memtest 56 1
. . . I timed the last group. CUDALucas says 2:48 elapsed, real time was 2:37.9 . . . CUDALucas Time: 5 sec, Timed: 5.8 sec[/QUOTE] It's not quite reporting the elapsed time, but rather how long it would take to finish the test if the rest is done at the same rate as what has already been done, so you will see some differences here, especially at the beginning. |
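Put another way: with fraction f of the test done in elapsed time t, the displayed figure is roughly t*(1-f)/f. A sketch of that relationship as I read the description, not the actual CUDALucas formula:

```python
# Remaining-time estimate at the rate achieved so far, per the
# description above (a sketch, not the actual CUDALucas code).
def eta_seconds(elapsed, fraction_done):
    assert 0.0 < fraction_done <= 1.0
    return elapsed * (1.0 - fraction_done) / fraction_done

# e.g. 0.36% done after ~156 s projects to roughly the 12-hour ETA
# seen in the "-memtest 56 1" output earlier in the thread.
print(eta_seconds(156, 0.0036) / 3600)
```
|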
New version up at sourceforge. The memory allocations are split into two chunks. On Windows, ~15% slowdown occurs only at the first report after switching chunks.
Also, the problem with gaps in the *fft.txt files is fixed. There should not be any more "fft too large" errors. |
[QUOTE=owftheevil;359988]New version up at sourceforge. The memory allocations are split into two chunks. On Windows, ~15% slowdown occurs only at the first report after switching chunks.
Also, the problem with gaps in the *fft.txt files is fixed. There should not be any more "fft too large" errors.[/QUOTE]
You are amazing! My titan node is back up. Corsair 860W modular PSU... pretty good one. I found a screw in the old PSU; no wonder it stopped working! lol. I'll download the latest code and give it a good go tonight! |
I'm having a problem with CUDALucas stopping:
[CODE]C:/CUDA/CuLu/src/CUDALucas.cu(372) : cudaSafeCall() Runtime API error 30: unknown error.[/CODE]
No specific error. Sometimes it goes for hours, sometimes minutes. Any ideas? |
[QUOTE=flashjh;360337]I'm having a problem with CUDALucas stopping:
[CODE]C:/CUDA/CuLu/src/CUDALucas.cu(372) : cudaSafeCall() Runtime API error 30: unknown error.[/CODE]
No specific error. Sometimes it goes for hours, sometimes minutes. Any ideas?[/QUOTE]
On my new node, which had heat issues (not in production yet), when the temperature gets close to or above 90 degrees, CUDA just doesn't want to calculate anymore. Could it be that? Checked your GPU temperature recently? Dust? Just ideas; I cleaned my Titans and 590s during the weekend to prevent this from happening. (I have put the new codebase on test on the new node.) |
Everything is clean. Temps stay at 72 under full load.
With MSVS, the debug build won't run from MSVS because it doesn't find the worktodo.txt file. Have you run debug from MSVS? Since it doesn't actually crash I haven't had the option to debug. I know it says unknown error, but what line can we add to output more error info from safecall? |
[QUOTE=flashjh;360340]Everything is clean. Temps stay at 72 under full load.
With MSVS, the debug build won't run from MSVS because it doesn't find the worktodo.txt file. Have you run debug from MSVS? Since it doesn't actually crash I haven't had the option to debug. I know it says unknown error, but what line can we add to output more error info from safecall?[/QUOTE] Yes, there is a parameter you either need to change to your directory with worktodo, or put worktodo in the project directory, same dir as the sln file I guess.. just try the different directories.. also the .ini file needs to go there.. but there is a setting for default working directory... |
I put the files there, but it still doesn't find them... something I'm doing wrong. My classes all end this week as the students take finals, hopefully I can find some time to look into it.
|
[QUOTE=flashjh;360342]I put the files there, but it still doesn't find them... something I'm doing wrong. My classes all end this week as the students take finals, hopefully I can find some time to look into it.[/QUOTE]
Ill check it when I get home later today.. |
[QUOTE=flashjh;360337]I'm having a problem with CUDALucas stopping:
[CODE]C:/CUDA/CuLu/src/CUDALucas.cu(372) : cudaSafeCall() Runtime API error 30: unknown error.[/CODE]No specific error. Sometimes it goes for hours, sometimes minutes. Any ideas?[/QUOTE]
That's the memcopy timing out because one of the previous cufft calls has hung. It's been a problem since the 300+ drivers came about. |
Can it be detected before it errors out and stops?
Possible fix after detection: stop the workers, move back to the last save, and continue? It has happened for a long time, but I never knew if it was my system or not, and with all the other issues I just restarted every time it stopped. I have tested on enough systems to know that the systems are not the problem. Now CUDALucas seems stable enough for a full run in beta testing, but I keep having to restart after the error. It ends up wasting a lot of time and prevents me from knowing if it will complete a good run all the way through. |
Yes, when detecting the error it doesn't have to exit the program. It could reset the device and restart from the last checkpoint. I tried to do this way back in February, but then my cuda skills were only a small part of the meager set I have now, and I couldn't get it to work. Maybe it's time to retry that.
In the meantime, I just run from a shell script that loops on a non-zero exit value. |
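That looping wrapper is also only a few lines of Python. A sketch: it assumes CUDALucas exits with a non-zero status on the error, and the command line shown in the usage comment is just an example.

```python
# Restart wrapper: re-run the command until it exits cleanly (assumes a
# non-zero exit status on the cudaSafeCall error; command is an example).
import subprocess

def run_until_clean(cmd):
    """Run cmd, restarting on any non-zero exit; return the restart count."""
    restarts = 0
    while subprocess.call(cmd) != 0:
        restarts += 1
        print("restart #%d" % restarts)
    return restarts

# usage: run_until_clean(["./CUDALucas", "-d", "1"])
```
|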
Good idea, I'll setup one of those tonight.
Did you read specific books or just learn it? |
I learned much of it by studying the cudalucas code, the rest by reading online articles and forums, and playing around a lot.
I'm not a programmer, just a mathematician with a compiler and internet access. |
I also run a shell script.. works just fine.
|
[QUOTE=Manpowre;360365]I also run a shell script.. works just fine.[/QUOTE]
On Windows: make a .bat file in the same folder as your CUDALucas .exe file. Then put the following inside it:
[CODE]:loop
echo "Starting Cudalucas:"
CudaLucas.exe -d 1
GOTO loop[/CODE]
This will ensure CUDALucas starts again if it crashes. |
I did get one working a little more complicated for the command line because I want to increase a counter to keep track of how many times it restarts. Thanks.
|
[QUOTE=flashjh;360425]I did get one working a little more complicated for the command line because I want to increase a counter to keep track of how many times it restarts. Thanks.[/QUOTE]
No problem. This one writes the number of restarts to the log.txt file as well as to the console:
[CODE]Set count=0
:loop
Set /A count+=1
echo "Starting Cudalucas: "
echo %count% > log.txt
echo %count%
CudaLucas2.03.exe -d 1
GOTO loop[/CODE] |
What is the current format recognized for CUDALucas results?
EDIT: For example, with this worktodo:
[CODE]Test=10061
Test=10061
DoubleCheck=10061
DoubleCheck=10061[/CODE]
I get this:
[CODE]M10061, 0x56eb9bb91825b188, offset = 9029, n = 1K, CUDALucas v2.05 Beta
M10061, 0x56eb9bb91825b188, offset = 9029, n = 1K, CUDALucas v2.05 Beta
M10061, 0x56eb9bb91825b188, offset = 4000, n = 1K, CUDALucas v2.05 Beta
M10061, 0x56eb9bb91825b188, offset = 4000, n = 1K, CUDALucas v2.05 Beta[/CODE]
then again:
[CODE]M10061, 0x56eb9bb91825b188, offset = 4052, n = 1K, CUDALucas v2.05 Beta
M10061, 0x56eb9bb91825b188, offset = 4054, n = 1K, CUDALucas v2.05 Beta
M10061, 0x56eb9bb91825b188, offset = 4054, n = 1K, CUDALucas v2.05 Beta
M10061, 0x56eb9bb91825b188, offset = 9086, n = 1K, CUDALucas v2.05 Beta[/CODE]
Also, I did a DC on another exponent, and after [I]many[/I] stops due to the memory error, restarts, troubleshooting, recompiles, and FFT length changes (up and down), the result is a match. I would like to see DoubleCheck results from others. As it looks like things are stable now, can we move to allow DoubleChecks from CUDALucas now? |
[QUOTE=flashjh;360454] can we move to allow DoubleChecks from CUDALuas now?[/QUOTE]
:shock: They have been allowed for ages, since 1.48 (the first stable version), a few years ago. How do you think we made those huge credits? I think you may be missing a couple of parentheses from the report, which might confuse James' script on the PrimeNet server. I have to go home to make sure (no reports here at the job), but someone else may confirm in the meantime.

Edit: sorry, let me be stupid for a few minutes each day... no coffee yet this morning. I thought you were trying to send a report and the server refused it. After more reading and trying to understand, I think you were talking about the new feature implementing the "shifting", weren't you? Well... I didn't move to 2.05 yet, as 2.04 works better and a bit faster with cc 2.0 cards. Besides the "shifts", any reason to switch?

Edit 2: some simple mechanism to protect against fraud is still missing, so I would [U]vote against[/U] accepting both a "first-time LL" [B][U]and[/U][/B] a "DC" from CUDALucas for the same exponent. What stops me from editing the "offset" parameter to get the credit twice? You will find out after 20 years that we missed a prime because of some idiot credit-whore (I learned the word here on the forum, as someone called it, sorry). At least with P95 it is not so easy for childish individuals to fake a report, due to the we1 checksum, etc. Some simple security mechanism should be implemented, besides shifting, to make it safer. Don't get me wrong, no disrespect for your work; shifting is an [B][U]immense[/U][/B] improvement as a guard against software (FFT) bugs, for which I am very grateful.

Edit 3: (BTW, after updating the drivers, I am also getting negative iteration times and negative ETAs, which are very accurate if you multiply them by (about) minus 28 (!?!??!) and read them in minutes, not hours :smile:, using the "good old version" 2.04, untouched since Dubslow made it. But the residues are right, and it is about 1% faster, so I let it be.) |
That's why I was asking about the correct format. I don't want to change the result line, but I can update CUDALucas to output the correct format.
|
[QUOTE=flashjh;360490]That's why I was asking about the correct format. I don't want to change the result line, but I can update CUDALucas to output the correct format.[/QUOTE]
I only have results from 2.04 beta. Not sure if these are helpful. [QUOTE]GTX 460 Match
M( 29862949 )C, 0x0f45a041ecb72f25, n = 1600K, CUDALucas v2.04 Beta, AID: 2D10C4DC57AA33BE93C85980189E980A

GTX 570 Match! 810/1600 MHz
M( 30147757 )C, 0x7f9545e16b069466, n = 1600K, CUDALucas v2.04 Beta, AID: 5F99F53583D4C3E63D3B644739AB6C37

GTX 570SO Match! 810/1600 MHz
M( 28581239 )C, 0xfa28ab555b8bceea, n = 1568K, CUDALucas v2.04 Beta, AID: C278B9C208B1E929C04BCB7EC5DCD91A

GTX 580 Match! 830/1800 MHz
M( 29691247 )C, 0xd1f3b18c4ba2384c, n = 1728K, CUDALucas v2.04 Beta, AID: AD65555AAC5AF9E5B7128182CF7158F8

GTX 580 Match! 830/2004 MHz
M( 30078823 )C, 0xc26f399dad087941, n = 1600K, CUDALucas v2.04 Beta, AID: 6BEFF1DF745E6BF8AE8350760B042A78[/QUOTE] |
Yes, I'll need to talk with James to have the PHP code updated to recognize the 2.05 format.
If anyone sees any changes that need to be made to the result output of 2.05, let us know now so changes are made before James updates the code (hopefully) :smile: |
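For anyone who ends up recognizing these result lines server-side, here is a rough sketch of pulling the interesting fields out of a v2.05 line with sed. The sample line is copied from the worktodo post above; exactly which fields James' parser needs (and the field names I chose) are my assumptions.

```shell
#!/bin/sh
# Sketch: extract exponent, Res64, and shift offset from a CUDALucas
# v2.05 result line using basic sed regular expressions.
line='M10061, 0x56eb9bb91825b188, offset = 9029, n = 1K, CUDALucas v2.05 Beta'

# "M<digits>," at the start of the line is the exponent
exponent=$(echo "$line" | sed -n 's/^M\([0-9]*\),.*/\1/p')
# "0x<hex>," is the 64-bit residue
res64=$(echo "$line" | sed -n 's/.*0x\([0-9a-f]*\),.*/\1/p')
# "offset = <digits>," is the new shift offset introduced in 2.05
offset=$(echo "$line" | sed -n 's/.*offset = \([0-9]*\),.*/\1/p')

echo "exponent=$exponent res64=$res64 offset=$offset"
```

One design note: anchoring on the literal `offset = ` token means older result lines (2.03/2.04, which lack the offset field) simply yield an empty `$offset`, so a parser can use that to tell the two formats apart.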