Hi,
>clFFT problems?
Yes.
>What does the OOURA code do? Only check rounding errors?
Translation from Complex FFT to HalfComplex FFT. |
I have completed my first LL test on an ATI GPU:
M( 64847711 )C, 0xffffffff80000000, n = 131072, clLucas v1.02
My card is a Gigabyte R9 290. The test took 5h 30m. |
[QUOTE=AK76;386081]M( 64847711 )C, [COLOR=Red]0xffffffff80000000[/COLOR], n = 131072, clLucas v1.02
[/QUOTE] This is not a valid result, though. You may want to tune down the OC on your card, check for dust, etc. The usual things. |
More likely, the FFT size is way too small. Let clLucas autoselect the FFT size, then use the next bigger power of two.
It is a fast card, but 5 hours can't be real (unfortunately). |
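The "next bigger power of two" advice can be sketched in a few lines of Python. This is only an illustration, not part of clLucas: the helper name `next_power_of_two` is made up here, and the input 3538944 is the FFT length clLucas autoselects for this exponent in the logs posted further down the thread.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # (n - 1).bit_length() is the number of bits needed to hold n - 1,
    # so shifting 1 by that amount gives the first power of two >= n.
    return 1 << (n - 1).bit_length()


# FFT length that clLucas picked automatically for M64847711 in this thread.
auto_selected = 3538944
print(next_power_of_two(auto_selected))
```

The result is the value you would then pass to clLucas with `-f`.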
The fft is too small. For an exponent of that size, you should be using n=4194304.
clLucas still only works with 2^k ffts, right? |
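To see why n = 131072 cannot work, it helps to look at how many bits of the number each FFT element has to carry, which is roughly the exponent divided by the FFT length. The snippet below is a rough sketch of that arithmetic only; the function name is made up, and the thread does not state the exact limit clLucas tolerates.

```python
def bits_per_word(exponent: int, fft_length: int) -> float:
    """Average number of bits of a p-bit residue stored in each FFT element."""
    return exponent / fft_length


exponent = 64847711
# The three FFT lengths that appear in this thread.
for n in (131072, 3538944, 4194304):
    print(f"n = {n:>7}: {bits_per_word(exponent, n):7.2f} bits per FFT element")
```

A double has a 53-bit mantissa, so several hundred bits per element is hopeless, while the two larger lengths land in the range where the reported rounding error becomes the deciding factor.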
[QUOTE=owftheevil;386087]The fft is too small. For an exponent of that size, you should be using n=4194304.
clLucas still only works with 2^k ffts, right?[/QUOTE]
Well, yes and no. Non-2^k FFT sizes work, but are slow.
EDIT: I'm surprised clLucas didn't stop right away... |
I did not set the -f parameter myself; clLucas chose this number. I will run the same test one more time. My ATI card is not overclocked.
|
For me, it looks like this (HD7950 OC'd to the max (1100/1400) for this test):
[code]
C:\projects\clLucas>clLucas_x64.exe -aggressive 64847711
Platform :Advanced Micro Devices, Inc.
Device 0 : Tahiti
Build Options are : -D KHR_DP_EXTENSION
start M64847711 fft length = 3276800
err = 0.5, increasing n from 3276800
start M64847711 fft length = 3538944
Iteration 10000 0xf063d79e2bcf83d7, n = 3538944 err = 0.2813 (2:50 real, 16.9916 ms/iter, ETA 305:59:27)
^C caught. Writing checkpoint.

C:\projects\clLucas>rm c64847711

C:\projects\clLucas>clLucas_x64.exe -f 4194304 -aggressive 64847711
Platform :Advanced Micro Devices, Inc.
Device 0 : Tahiti
Build Options are : -D KHR_DP_EXTENSION
start M64847711 fft length = 4194304
Iteration 10000 0xf063d79e2bcf83d7, n = 4194304 err = 0.005371 (1:04 real, 6.3979 ms/iter, ETA 115:12:58)
Iteration 20000 0x929ed99e32ae2e48, n = 4194304 err = 0.005371 (1:04 real, 6.3386 ms/iter, ETA 114:07:48)
Iteration 30000 0x20c4df779543306b, n = 4194304 err = 0.005371 (1:03 real, 6.3383 ms/iter, ETA 114:06:26)
^C caught. Writing checkpoint.
[/code]
Without "-aggressive" it is a bit slower, but the FFT selection is the same.
AK76, did you maybe experiment with the -f parameter and this exponent at some time so that there were checkpoint files containing the bad FFT size? |
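The ETA column in the log can be roughly reproduced from the per-iteration time. The sketch below assumes the total number of iterations is about equal to the exponent (an LL test needs p - 2 squarings); the function name is made up and this is not how clLucas itself necessarily computes its ETA.

```python
def eta_hms(exponent: int, iteration: int, ms_per_iter: float) -> str:
    """Estimate remaining run time as h:mm:ss from the current iteration rate."""
    remaining = exponent - iteration          # iterations still to do (approx.)
    seconds = int(remaining * ms_per_iter / 1000)
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Values from the first progress line of the -f 4194304 run above.
print(eta_hms(64847711, 10000, 6.3979))
# Values from the autoselected n = 3538944 run above.
print(eta_hms(64847711, 10000, 16.9916))
```

Both estimates land within a couple of minutes of the ETAs printed in the log, which is enough to sanity-check a run. By the same arithmetic, a full test of this exponent in 5h 30m would need about 0.3 ms per iteration.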
I set the FFT manually to 4194304 and the result is:
clLucas_x64 -f 4194304 64847711
Iteration 100000 0x8bc61577dc9b0813, n = 4194304 err = 0.004395 (1:06 real, 6.6892 ms/iter, ETA 120:17:37)
====================================
When run on automatic settings:
clLucas_x64 64847711
Iteration 20000 0x929ed99e32ae2e48, n = 3538944 err = 0.2344 (4:05 real, 24.5069 ms/iter, ETA 441:15:37)
The difference is biiiig. |
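For comparing runs like these, the progress lines can be parsed with a small script. This is a minimal sketch that assumes the line format shown in this thread (clLucas v1.02 output as posted); the regular expression and the function name are illustrative and may need adjusting for other versions.

```python
import re

# Pattern for progress lines as they appear in this thread.
LINE_RE = re.compile(
    r"Iteration (?P<iteration>\d+) (?P<residue>0x[0-9a-f]+), "
    r"n = (?P<n>\d+) err = (?P<err>[\d.]+) "
    r"\((?P<real>[\d:]+) real, (?P<ms>[\d.]+) ms/iter, ETA (?P<eta>[\d:]+)\)"
)


def parse_progress(line: str) -> dict:
    """Extract iteration, residue, FFT length, error and timing from one line."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError(f"not a progress line: {line!r}")
    return {
        "iteration": int(m["iteration"]),
        "residue": m["residue"],
        "n": int(m["n"]),
        "err": float(m["err"]),
        "ms_per_iter": float(m["ms"]),
        "eta": m["eta"],
    }


manual = parse_progress(
    "Iteration 100000 0x8bc61577dc9b0813, n = 4194304 err = 0.004395 "
    "(1:06 real, 6.6892 ms/iter, ETA 120:17:37)"
)
auto = parse_progress(
    "Iteration 20000 0x929ed99e32ae2e48, n = 3538944 err = 0.2344 "
    "(4:05 real, 24.5069 ms/iter, ETA 441:15:37)"
)
ratio = auto["ms_per_iter"] / manual["ms_per_iter"]
print(f"n = {auto['n']} is {ratio:.2f}x slower per iteration than n = {manual['n']}")
```

The same parsed fields also make it easy to check that residues at equal iteration counts agree between two runs with different FFT lengths.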
[QUOTE=AK76;386141]I did not set the -f parameter myself; clLucas chose this number. I will run the same test one more time. My ATI card is not overclocked.[/QUOTE]
Very strange... :loco: |
[QUOTE=AK76;386147]I set the FFT manually to 4194304 and the result is:
clLucas_x64 -f 4194304 64847711
Iteration 100000 0x8bc61577dc9b0813, n = 4194304 err = 0.004395 (1:06 real, 6.6892 ms/iter, ETA 120:17:37)
====================================
When run on automatic settings:
clLucas_x64 64847711
Iteration 20000 0x929ed99e32ae2e48, n = 3538944 err = 0.2344 (4:05 real, 24.5069 ms/iter, ETA 441:15:37)
The difference is biiiig.[/QUOTE]
Yes, the difference is very big, but it compares well to my test results. You can try setting "-aggressive" to improve the timings, but apart from that, "-f 4194304" is the best thing to do. With this FFT size, you can even run tests with slightly bigger exponents, as [URL="http://www.mersenneforum.org/showpost.php?p=366726&postcount=294"]LaurV explained[/URL] very nicely. |