mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   The P-1 factoring CUDA program (https://www.mersenneforum.org/showthread.php?t=17835)

James Heinrich 2013-11-23 14:29

[QUOTE=ET_;360005]Sadly, I always get "[FONT="Courier New"][COLOR="Red"]CUDAPm1.cu(2163) : cufftSafeCall() CUFFT error 6: CUFFT_EXEC_FAILED[/COLOR][/FONT]" with r between 1 and 5 and Threads=128 or 256.[/QUOTE]I just tried running the FFT benchmark (CudaPm1 -cufftbench 1 8192 1) on my new GTX 580, and I also got a failure:[code]...
fft size = 3645K, ave time = 6.4376 msec, max-ave = 0.00000
fft size = 3675K, ave time = 6.9818 msec, max-ave = 0.00000
fft size = 3750K, ave time = 6.7061 msec, max-ave = 0.00000
C:/Users/filbert/Documents/Visual Studio 2010/Projects/CUDAPm1/CUDAPm1.cu(2279)
: cudaSafeCall() Runtime API error 30: unknown error.[/code]Screen went black for a second as the NVIDIA drivers recovered from the crash.
Win7, drivers v331.82, GTX 580 3GB
Line number is slightly different, but this is the 24-Sep-2013 Windows binary if that helps.

[i]edit: but a second attempt at running the same command, with no changes, resulted in success.[/i] :cmd:

kladner 2013-11-23 15:46

Owftheevil has said that some errors like this one are caused by a problem in the NVIDIA drivers starting with the 3xx series. While I have 64-bit drivers going back to 285.62, I have assumed that it is not worth installing anything that old, as they are probably not compatible with current CUDA libraries.

ET_ 2013-11-23 16:35

[QUOTE=James Heinrich;360071]I just tried running the FFT benchmark (CudaPm1 -cufftbench 1 8192 1) on my new GTX 580, and I also got a failure:[code]...
fft size = 3645K, ave time = 6.4376 msec, max-ave = 0.00000
fft size = 3675K, ave time = 6.9818 msec, max-ave = 0.00000
fft size = 3750K, ave time = 6.7061 msec, max-ave = 0.00000
C:/Users/filbert/Documents/Visual Studio 2010/Projects/CUDAPm1/CUDAPm1.cu(2279)
: cudaSafeCall() Runtime API error 30: unknown error.[/code]Screen went black for a second as the NVIDIA drivers recovered from the crash.
Win7, drivers v331.82, GTX 580 3GB
Line number is slightly different, but this is the 24-Sep-2013 Windows binary if that helps.

[i]edit: but a second attempt at running the same command, with no changes, resulted in success.[/i] :cmd:[/QUOTE]

I got the error while running [COLOR="Red"]Cudapm1 -cufftbench 4096 4096 3[/COLOR] and reaching [COLOR="Red"]Mult threads 1024[/COLOR].

My run of [COLOR="SeaGreen"]Cudapm1 -cufftbench 1 8192 1[/COLOR] ran smoothly :smile:

Luigi

owftheevil 2013-11-25 15:36

Revision 52, now up at SourceForge, has a partial fix. I haven't tested it, since the power was off this weekend due to a snowstorm. It might not even compile. But barring any stupid mistakes, it should allow you to run that benchmark. Looks like 4.1 is not as good at optimizing register use as 5.5. It will still fail in stage 2 if you try to test with mult threads = 1024, but I will wait until I can test it to make all the other necessary changes.

ET_ 2013-11-25 19:04

[QUOTE=owftheevil;360263]Revision 52, now up at SourceForge, has a partial fix. I haven't tested it, since the power was off this weekend due to a snowstorm. It might not even compile. But barring any stupid mistakes, it should allow you to run that benchmark. Looks like 4.1 is not as good at optimizing register use as 5.5. It will still fail in stage 2 if you try to test with mult threads = 1024, but I will wait until I can test it to make all the other necessary changes.[/QUOTE]

Same error. :no:

I'm planning on updating to 5.5 (although I'd rather wait for 6.0...)

Luigi

flashjh 2013-11-25 19:59

[QUOTE=ET_;360278]Same error. :no:

I'm planning on updating to 5.5 (although I'd rather wait for 6.0...)

Luigi[/QUOTE]
It's going to be a while before 6 comes out.

ET_ 2013-11-26 09:18

[QUOTE=flashjh;360282]It's going to be a while before 6 comes out.[/QUOTE]

Thanks for the hint. Now I have no reason to wait further...

Luigi

firejuggler 2014-04-14 19:38

Speed test on a GTX 750 Ti:
[code]
Device GeForce GTX 750 Ti
Compatibility 5.0
clockRate (MHz) 1110
memClockRate (MHz) 2700

fft   max exp   ms/iter
4 85933 0.0725
8 169409 0.1211
16 333803 0.1354
18 374587 0.1556
20 415253 0.1585
25 516589 0.1904
28 577177 0.1980
32 657719 0.1988
36 738083 0.2152
40 818239 0.2225
48 978041 0.2775
50 1017889 0.2837
56 1137271 0.2938
64 1296011 0.3271
72 1454273 0.3766
80 1612249 0.4374
81 1631969 0.4650
84 1691093 0.4911
90 1809193 0.5352
96 1927129 0.5400
100 2005673 0.5505
112 2240863 0.5863
128 2553659 0.6611
135 2690201 0.7673
144 2865601 0.7755
160 3176779 0.8483
168 3332107 0.9370
180 3564823 1.0058
200 3951977 1.0640
216 4261051 1.1387
224 4415431 1.1826
225 4434721 1.2089
256 5031737 1.2806
288 5646379 1.4322
320 6259537 1.7545
324 6336103 1.7849
360 7024163 1.9349
392 7634537 2.0081
400 7786967 2.1736
432 8395997 2.2762
448 8700169 2.3520
450 8738161 2.4762
512 9914521 2.4840
576 11125619 3.0138
588 11352347 3.4336
640 12333809 3.4786
648 12484649 3.4864
720 13840423 3.8105
729 14009689 3.9701
800 15343429 4.1448
864 16543493 4.5113
896 17142793 4.7358
900 17217653 5.1151
1024 19535569 5.1737
1080 20580341 5.9243
1152 21921901 6.0684
1280 24302527 6.8055
1296 24599717 7.0482
1344 25490893 7.6906
1350 25602229 7.7762
1440 27271147 7.9282
1512 28604657 8.2026
1568 29640913 8.3392
1600 30232693 8.4548
1728 32597297 9.1322
1792 33778141 9.5213
1800 33925711 10.2388
2048 38492887 10.4488
2304 43194913 11.8395
2560 47885689 13.6211
2592 48471289 14.2237
2688 50227213 15.4735
2880 53735041 16.1879
2916 54392209 16.8134
3072 57237889 17.1524
3136 58404433 17.1638
3200 59570449 18.1494
3240 60298969 18.7758
3584 66556463 19.2476
4096 75846319 21.2812
4608 85111207 25.4287
4800 88579669 28.4094
5120 94353877 28.4263
5184 95507747 29.3381
5376 98967641 32.0367
5600 103000823 32.5258
5760 105879517 33.4296
5832 107174381 33.9940
6048 111056879 35.1400
6144 112781477 35.5020
6272 115080019 35.8715
6400 117377567 38.3167
6912 126558077 38.6704
7168 131142761 40.1696
7200 131715607 41.7772
8192 149447533 44.0675
[/code]
The strange thing is that the memory speed is half of what it is supposed to be.
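
For anyone who wants to use the table above to pick an FFT length, here is a rough Python sketch. It assumes that the "fft" column is the FFT length in K (as in the "fft size = 3645K" benchmark lines earlier in the thread) and that "max exp" is the largest exponent the program will handle at that length; both are my reading of the output, not something confirmed here. The function names are made up for illustration, and only a few rows are copied in.

```python
# A few rows copied from the benchmark table above:
# FFT length (assumed to be in K), max exponent, ms per iteration.
ROWS = """
2048 38492887 10.4488
2304 43194913 11.8395
2560 47885689 13.6211
2592 48471289 14.2237
3072 57237889 17.1524
3584 66556463 19.2476
4096 75846319 21.2812
"""


def parse_rows(text):
    """Turn whitespace-separated rows into (fft, max_exp, ms) tuples, sorted by fft."""
    rows = []
    for line in text.strip().splitlines():
        fft, max_exp, ms = line.split()
        rows.append((int(fft), int(max_exp), float(ms)))
    return sorted(rows)


def smallest_fft_for(exponent, rows):
    """Return the first row whose max exponent covers `exponent`, or None."""
    for row in rows:
        if row[1] >= exponent:
            return row
    return None


if __name__ == "__main__":
    rows = parse_rows(ROWS)
    print(smallest_fft_for(60_000_000, rows))
```

With the rows shown, an exponent of 60,000,000 lands on the 3584 row, since 57237889 is too small and 66556463 is the next limit up. An exponent beyond the last copied row returns None.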

James Heinrich 2014-04-14 20:02

[QUOTE=firejuggler;371177]strange thing is that the mem speed is half of what it is supposed to be.[/QUOTE]It's not uncommon to see that. There's a subtle distinction between the clock frequency of the memory and the rate of data transfers. In the good old days there was one transfer per clock cycle. Then they invented [url=http://en.wikipedia.org/wiki/Double_data_rate]DDR = Double Data Rate[/url], where data is transferred twice per clock cycle. Utilities often report the memory clock frequency (for modern GDDR5 video cards that's usually in the 2.5-3.0 GHz range), whereas marketing materials report the number of transfers per second (double the clock rate), usually mislabeled as "GHz" (billion cycles per second) rather than GT/s (billion transfers per second).
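
As a quick worked example of that doubling, here is a small Python sketch using the memClockRate value from the GTX 750 Ti output quoted above. The function name is made up for illustration, and the factor of 2 is simply the "twice per clock cycle" rule described here, not a value read from the card.

```python
def transfers_per_second_mt(reported_clock_mhz, transfers_per_clock=2):
    """Convert a reported memory clock (MHz) to a transfer rate (MT/s).

    transfers_per_clock=2 is the double-data-rate assumption from the post.
    """
    return reported_clock_mhz * transfers_per_clock


if __name__ == "__main__":
    reported = 2700  # memClockRate (MHz) from the GTX 750 Ti output
    print(transfers_per_second_mt(reported))  # prints 5400
```

So a utility showing 2700 MHz and a spec sheet quoting roughly twice that figure can both be describing the same memory.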

houding 2014-07-22 07:18

Is there a way to specify the B1 and B2 values manually?

Currently I get a manual assignment, put it in the worktodo.txt file and run the program (using version 0.20). The software decides on the B1 and B2 values.

Adolf

LaurV 2014-07-22 08:30

One way to increase the limits for all assignments, while still letting the program calculate them optimally for each exponent, is to specify a higher number of "LL tests saved", i.e. replace the default "1" or "2" at the end of the line with "3"... "9" (it can be higher, but that is not effective, and higher values are generally a waste of time).
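
If you have many lines to change, something like the following Python sketch could do the substitution. It only relies on the "tests saved" value being the last comma-separated field, as described above. The example line is hypothetical and written in the Prime95-style Pfactor layout, which I am assuming the program accepts; the function name is also made up.

```python
def set_tests_saved(line, tests_saved):
    """Replace the last comma-separated field of a worktodo line."""
    head, sep, _old = line.rstrip().rpartition(",")
    if not sep:
        raise ValueError("line has no comma-separated fields")
    return f"{head},{tests_saved}"


if __name__ == "__main__":
    # Hypothetical assignment line, for illustration only.
    example = "Pfactor=N/A,1,2,61012769,-1,73,2"
    print(set_tests_saved(example, 4))
    # prints: Pfactor=N/A,1,2,61012769,-1,73,4
```

Everything before the final comma is left untouched, so the exponent and the factoring depth stay as assigned.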

