mersenneforum.org

CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

kladner 2013-11-16 17:28

Memory size is reported correctly on the 570 and 580.

The 580 has completed 'CUDALucas -cufftbench 1 8192 1' at least once, running at 810 MHz core, 1700 MHz VRAM.

kladner 2013-11-16 19:13

Another observation/question: should the savefiles of v2.04 beta be more than three times as large as those of v2.05 beta? EDIT: for the same exponent?

Manpowre 2013-11-17 11:21

[QUOTE=kladner;359541]Another observation/question: should the savefiles of v2.04 beta be more than three times as large as those of v2.05 beta? EDIT: for the same exponent?[/QUOTE]

I have not looked at the new code, but my guess is that it compresses the savefiles.

Manpowre 2013-11-17 22:13

Running the self test with the CUDALucas 2.05 beta, using both the build downloaded from SourceForge and my own build, I get wrong residues.

With 2.03 and the same parameters, everything is fine. Output from the 2.05 beta:

Starting self test M43112609 fft length = 2304K
Running careful round off test for 1000 iterations.
If average error > 0.25, or maximum error > 0.35,
the test will restart with a longer FFT.
Iteration 100, average error = 0.17969, max error = 0.28125
Iteration 200, average error = 0.20398, max error = 0.26563
Iteration 300, average error = 0.21162, max error = 0.27344
Iteration 400, average error = 0.21489, max error = 0.28125
Iteration 500, average error = 0.21730, max error = 0.28125
Iteration 600, average error = 0.21847, max error = 0.26563
Iteration 700, average error = 0.21941, max error = 0.25781
Iteration 800, average error = 0.22026, max error = 0.25879
Iteration 900, average error = 0.22068, max error = 0.26172
Iteration 1000, average error = 0.22089 <= 0.25 (max error = 0.28125), continuing test.
Iteration 10000 M( 43112609 )C, 0x62871c7027ff12c8, n = 2304K, CUDALucas v2.05 Beta err = 0.50000 (0:21 real, 2.0989 ms/iter)
Expected residue [e86891ebf6cd70c4] does not match actual residue [62871c7027ff12c8]

Starting self test M57885161 fft length = 3136K
Running careful round off test for 1000 iterations.
If average error > 0.25, or maximum error > 0.35,
the test will restart with a longer FFT.
Iteration 100, average error = 0.15905, max error = 0.22656
Iteration 200, average error = 0.18188, max error = 0.23438
Iteration 300, average error = 0.19004, max error = 0.24219
Iteration 400, average error = 0.19322, max error = 0.23438
Iteration 500, average error = 0.19489, max error = 0.22021
Iteration 600, average error = 0.19627, max error = 0.22852
Iteration 700, average error = 0.19733, max error = 0.24023
Iteration 800, average error = 0.19812, max error = 0.25000
Iteration 900, average error = 0.19867, max error = 0.24219
Iteration 1000, average error = 0.19901 <= 0.25 (max error = 0.25000), continuing test.
Iteration 10000 M( 57885161 )C, 0x76c27556683cd84d, n = 3136K, CUDALucas v2.05 Beta err = 0.26953 (0:26 real, 2.5361 ms/iter)
This residue is correct.


I haven't figured out what's going on yet; I just wanted to let people know. I have tested both my own build compiled from source and the prebuilt binary downloaded from SourceForge.

I think more people should run the selftest:
cudalucas2.05beta.exe -r

(tested on a Titan)
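The round-off rule printed in the log above can be restated as a small check. This is only a sketch of the rule as the log words it, not CUDALucas's actual code; the function name `needs_longer_fft` and the constant names are made up for illustration, and the two thresholds are copied from the log text.

```python
# Thresholds as printed by the self test:
# "If average error > 0.25, or maximum error > 0.35,
#  the test will restart with a longer FFT."
AVG_LIMIT = 0.25
MAX_LIMIT = 0.35


def needs_longer_fft(average_error, max_error):
    """Return True when the rule quoted in the log would call for a longer FFT."""
    return average_error > AVG_LIMIT or max_error > MAX_LIMIT


# Values copied from iteration 1000 of the two self tests in the log.
print(needs_longer_fft(0.22089, 0.28125))  # M43112609 at 2304K
print(needs_longer_fft(0.19901, 0.25000))  # M57885161 at 3136K
```

Both runs stay under the limits during the first 1000 iterations, which matches the "continuing test" lines. The err = 0.50000 reported at iteration 10000 for M43112609 is above the 0.35 limit; the thread does not say whether the 2.05 beta still applies this rule after the careful round-off phase.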

kladner 2013-11-17 22:41

FFT too large
 
1 Attachment(s)
I get a different error. The attached result has been consistent at a variety of clock speeds.

flashjh 2013-11-17 23:01

[QUOTE=kladner;359642]I get a different error result. The attached has been consistent at a variety of speeds.[/QUOTE]
Delete or rename your GeForce GTX --- fft.txt file and try again.

flashjh 2013-11-18 00:59

An issue:

When running CUDALucas -r with a GeForce GTX --- fft.txt file present, you [I]may[/I] get the error:
[CODE]The fft length 32K is too large for the exponent 216091. Restart with smaller fft.[/CODE]

Removing the file, as noted above, fixes the error. So when -cufftbench is run and the .txt file is generated, I presume the FFTs are tuned correctly. However, the new 'less tolerant' code won't accept those values for use in the self test.
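For anyone who wants to script the workaround, here is a minimal sketch that renames the file instead of deleting it, so it can be restored later. The function name `set_aside_fft_files`, the glob pattern `*fft.txt`, and the `.bak` suffix are all assumptions made for illustration; check the pattern against the actual file name in your CUDALucas directory before using it.

```python
from pathlib import Path


def set_aside_fft_files(directory, pattern="*fft.txt", suffix=".bak"):
    """Rename files matching `pattern` in `directory` by appending `suffix`.

    Returns the list of new paths. Files whose backup name already exists
    are left alone so an earlier backup is never overwritten.
    """
    renamed = []
    for path in sorted(Path(directory).glob(pattern)):
        target = path.with_name(path.name + suffix)
        if target.exists():
            continue  # keep the earlier backup and leave this file in place
        path.rename(target)
        renamed.append(target)
    return renamed


if __name__ == "__main__":
    # Run from the CUDALucas directory, then start "CUDALucas -r".
    for new_path in set_aside_fft_files("."):
        print("renamed to", new_path)
```

Renaming the backup back to its original name undoes the change.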

Also, can someone explain the updated threads setting in 2.05? Is it still necessary to have threads in the .ini file? Why three values instead of one? What is the interaction with the new .txt file?

Thanks

kladner 2013-11-18 01:26

[QUOTE=flashjh;359644]Delete or rename your GeForce GTX --- fft.txt file and try again.[/QUOTE]

I now get this: [CODE]Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.

E:\CUDA\2.05-BETA>CUDALucas -r

------- DEVICE 0 -------
name GeForce GTX 580
Compatibility 2.0
clockRate (MHz) 1564
memClockRate (MHz) 1600
totalGlobalMem 1610612736
totalConstMem 65536
l2CacheSize 786432
sharedMemPerBlock 49152
regsPerBlock 32768
warpSize 32
memPitch 2147483647
maxThreadsPerBlock 1024
maxThreadsPerMP 1536
multiProcessorCount 16
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 65535,65535,65535
textureAlignment 512
deviceOverlap 1

Starting self test M86243 fft length = 8K
Running careful round off test for 1000 iterations.
If average error > 0.25, or maximum error > 0.35,
the test will restart with a longer FFT.
Iteration 100, average error = 0.00000, max error = 0.00000
Iteration 200, average error = 0.00000, max error = 0.00000
Iteration 300, average error = 0.00000, max error = 0.00000
Iteration 400, average error = 0.00000, max error = 0.00000
Iteration 500, average error = 0.00000, max error = 0.00000
Iteration 600, average error = 0.00000, max error = 0.00000
Iteration 700, average error = 0.00000, max error = 0.00000
Iteration 800, average error = 0.00000, max error = 0.00000
Iteration 900, average error = 0.00000, max error = 0.00000
Iteration 1000, average error = 0.00000 <= 0.25 (max error = 0.00000), continuing test.
Iteration 10000 M( 86243 )C, 0x23992ccd735a03d9, n = 8K, CUDALucas v2.05 Beta err = 0.00000 (0:01 real, 0.0812 ms/iter)
This residue is correct.

Starting self test M132049 fft length = 8K
Running careful round off test for 1000 iterations.
If average error > 0.25, or maximum error > 0.35,
the test will restart with a longer FFT.
Iteration 100, average error = 0.00024, max error = 0.00037
Iteration 200, average error = 0.00026, max error = 0.00035
Iteration 300, average error = 0.00027, max error = 0.00037
Iteration 400, average error = 0.00027, max error = 0.00037
Iteration 500, average error = 0.00027, max error = 0.00035
Iteration 600, average error = 0.00027, max error = 0.00034
Iteration 700, average error = 0.00027, max error = 0.00037
Iteration 800, average error = 0.00027, max error = 0.00037
Iteration 900, average error = 0.00027, max error = 0.00037
Iteration 1000, average error = 0.00028 <= 0.25 (max error = 0.00037), continuing test.
Iteration 10000 M( 132049 )C, 0x4c52a92b54635f9e, n = 8K, CUDALucas v2.05 Beta err = 0.00044 (0:01 real, 0.0844 ms/iter)
This residue is correct.

fft length 14336 must be divisible by 4 * mult threads 1024

E:\CUDA\2.05-BETA>[/CODE]

Manpowre 2013-11-18 01:32

Check your .ini file: is it set to 1024 threads?

If so, lower it to 256 for residue tests.

kladner 2013-11-18 02:20

[QUOTE=Manpowre;359651]Check your .ini file: is it set to 1024 threads?

If so, lower it to 256 for residue tests.[/QUOTE]

Bingo! Changed 1024 to 256 and 'cudalucas -r' completed successfully. Thanks, Manpowre!
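The arithmetic behind the fix, as a sketch: the error message says the FFT length must be divisible by 4 times the thread count. The check below is inferred from the wording of that message only, not taken from the CUDALucas source, and `threads_ok` is a made-up name.

```python
def threads_ok(fft_length, threads):
    """True when fft_length is divisible by 4 * threads, as the error message requires."""
    return fft_length % (4 * threads) == 0


# 8K = 8192 and 14K = 14336, the FFT lengths that appear in the log.
print(threads_ok(8192, 1024))   # True:  8192 = 2 * 4096
print(threads_ok(14336, 1024))  # False: 14336 = 3 * 4096 + 2048
print(threads_ok(14336, 256))   # True:  14336 = 14 * 1024
```

This fits what the log shows: the two 8K tests ran with 1024 threads, and the run stopped as soon as the FFT length reached 14336.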

flashjh 2013-11-18 02:28

My fft.txt file says that Threads=512 256 256 is the best setting for me to use for the current FFT range I'm in, so I leave it there and that works for the -r test too.


All times are UTC.