mersenneforum.org  

Old 2013-11-16, 17:28   #1981
kladner
 
 
"Kieren"
Jul 2011
In My Own Galaxy!

2×3×1,693 Posts

Memory size is reported correctly on the 570 and 580.

The 580 has completed 'CUDALucas -cufftbench 1 8192 1' at least once, running at 810 MHz core, 1700 MHz VRAM.
Old 2013-11-16, 19:13   #1982
kladner
 
 
"Kieren"
Jul 2011
In My Own Galaxy!

27AE₁₆ Posts

Another observation/question: should the savefiles of v2.04 beta be more than three times as large as those of v2.05 beta? EDIT: for the same exponent?
Attached image: CL_savefiles.PNG (122.1 KB)

Last fiddled with by kladner on 2013-11-16 at 19:14
Old 2013-11-17, 11:21   #1983
Manpowre
 
"Svein Johansen"
May 2013
Norway

3·67 Posts

Quote:
Originally Posted by kladner View Post
Another observation/question: should the savefiles of v2.04 beta be more than three times as large as those of v2.05 beta? EDIT: for the same exponent?
I have not looked at the new code, but my guess is that it compresses the savefiles.
Old 2013-11-17, 22:13   #1984
Manpowre
 
"Svein Johansen"
May 2013
Norway

311₈ Posts

Running the self-test with the CUDALucas 2.05 beta, both the binary downloaded from SourceForge and my own build, I get a wrong residue.

Running 2.03 with the same parameters, everything is fine. Output from the 2.05 beta:

Starting self test M43112609 fft length = 2304K
Running careful round off test for 1000 iterations.
If average error > 0.25, or maximum error > 0.35,
the test will restart with a longer FFT.
Iteration 100, average error = 0.17969, max error = 0.28125
Iteration 200, average error = 0.20398, max error = 0.26563
Iteration 300, average error = 0.21162, max error = 0.27344
Iteration 400, average error = 0.21489, max error = 0.28125
Iteration 500, average error = 0.21730, max error = 0.28125
Iteration 600, average error = 0.21847, max error = 0.26563
Iteration 700, average error = 0.21941, max error = 0.25781
Iteration 800, average error = 0.22026, max error = 0.25879
Iteration 900, average error = 0.22068, max error = 0.26172
Iteration 1000, average error = 0.22089 <= 0.25 (max error = 0.28125), continuing test.
Iteration 10000 M( 43112609 )C, 0x62871c7027ff12c8, n = 2304K, CUDALucas v2.05 Beta err = 0.50000 (0:21 real, 2.0989 ms/iter)
Expected residue [e86891ebf6cd70c4] does not match actual residue [62871c7027ff12c8]

Starting self test M57885161 fft length = 3136K
Running careful round off test for 1000 iterations.
If average error > 0.25, or maximum error > 0.35,
the test will restart with a longer FFT.
Iteration 100, average error = 0.15905, max error = 0.22656
Iteration 200, average error = 0.18188, max error = 0.23438
Iteration 300, average error = 0.19004, max error = 0.24219
Iteration 400, average error = 0.19322, max error = 0.23438
Iteration 500, average error = 0.19489, max error = 0.22021
Iteration 600, average error = 0.19627, max error = 0.22852
Iteration 700, average error = 0.19733, max error = 0.24023
Iteration 800, average error = 0.19812, max error = 0.25000
Iteration 900, average error = 0.19867, max error = 0.24219
Iteration 1000, average error = 0.19901 <= 0.25 (max error = 0.25000), continuing test.
Iteration 10000 M( 57885161 )C, 0x76c27556683cd84d, n = 3136K, CUDALucas v2.05 Beta err = 0.26953 (0:26 real, 2.5361 ms/iter)
This residue is correct.
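
The pass/fail rule that the log prints for the careful round-off test can be written out as a small check. This is only a sketch of the rule as stated in the output (an average error above 0.25 or a maximum error above 0.35 triggers a restart with a longer FFT); the function name is made up for illustration and is not taken from the CUDALucas source.

```python
# Thresholds copied from the self-test output above.
AVG_ERROR_LIMIT = 0.25
MAX_ERROR_LIMIT = 0.35


def needs_longer_fft(average_error: float, max_error: float) -> bool:
    """Hypothetical helper: True if the printed rule would restart with a longer FFT."""
    return average_error > AVG_ERROR_LIMIT or max_error > MAX_ERROR_LIMIT


# Iteration-1000 figures from the two runs in the log.
print(needs_longer_fft(0.22089, 0.28125))  # M43112609, 2304K
print(needs_longer_fft(0.19901, 0.25000))  # M57885161, 3136K
```

Both runs pass this check at iteration 1000, yet the M43112609 run reports err = 0.50000 at iteration 10000, which is above the 0.35 limit applied during the first 1000 iterations.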


I haven't figured out what's going on yet; I just wanted to let people know. I have tested both my own build, compiled from source, and the prebuilt binary downloaded from SourceForge.

I think more people should run the selftest:
cudalucas2.05beta.exe -r

(tested on titan)

Last fiddled with by Manpowre on 2013-11-17 at 22:23
Old 2013-11-17, 22:41   #1985
kladner
 
 
"Kieren"
Jul 2011
In My Own Galaxy!

2·3·1,693 Posts
FFT too large

I get a different error. The attached result has been consistent at a variety of clock speeds.
Attached file: self-test.txt (2.6 KB)
Old 2013-11-17, 23:01   #1986
flashjh
 
 
"Jerry"
Nov 2011
Vancouver, WA

10001100011₂ Posts

Quote:
Originally Posted by kladner View Post
I get a different error. The attached result has been consistent at a variety of clock speeds.
Delete or rename your GeForce GTX --- fft.txt file and try again.
Old 2013-11-18, 00:59   #1987
flashjh
 
 
"Jerry"
Nov 2011
Vancouver, WA

1,123 Posts

An issue:

When running CUDALucas -r with a GeForce GTX --- fft.txt file present, you may get the error:
Code:
The fft length 32K is too large for the exponent 216091. Restart with smaller fft.
Removing the file, as noted above, fixes the error. So when -cufftbench is run and the .txt file is generated, I presume the FFTs are tuned correctly. However, the new 'less tolerant' code won't accept those values for use in the self-test.
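
One way to see why 32K looks oversized for M216091 is to compare bits per FFT word (exponent divided by FFT length) with the other exponent/FFT pairs that appear in this thread. This is a rough sketch, assuming "K" means 1024 elements; the actual cutoff CUDALucas applies is not stated here, and the function name is illustrative only.

```python
def bits_per_word(exponent: int, fft_length_k: int) -> float:
    """Average number of exponent bits carried by each FFT element.

    Assumes an FFT length of N "K" means N * 1024 elements.
    """
    return exponent / (fft_length_k * 1024)


# (exponent, FFT length in K) pairs quoted in this thread.
pairs = [
    (216091, 32),      # rejected: "fft length 32K is too large"
    (86243, 8),        # self-test, accepted
    (132049, 8),       # self-test, accepted
    (43112609, 2304),  # self-test
    (57885161, 3136),  # self-test
]

for exponent, fft_k in pairs:
    print(f"M{exponent} @ {fft_k}K: {bits_per_word(exponent, fft_k):.2f} bits/word")
```

The rejected pair comes out far lower than the others (roughly 6.6 bits per word against about 10.5 to 18.3), which fits the message that the FFT is larger than the exponent needs.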

Also, can someone explain the updated threads setting in 2.05? Is it still necessary to have threads in the .ini file? Why three values instead of one? What is the interaction with the new .txt file?

Thanks

Last fiddled with by flashjh on 2013-11-18 at 01:00
Old 2013-11-18, 01:26   #1988
kladner
 
 
"Kieren"
Jul 2011
In My Own Galaxy!

10158₁₀ Posts

Quote:
Originally Posted by flashjh View Post
Delete or rename your GeForce GTX --- fft.txt file and try again.
I now get this:
Code:
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

E:\CUDA\2.05-BETA>CUDALucas -r

------- DEVICE 0 -------
name                GeForce GTX 580
Compatibility       2.0
clockRate (MHz)     1564
memClockRate (MHz)  1600
totalGlobalMem      1610612736
totalConstMem       65536
l2CacheSize         786432
sharedMemPerBlock   49152
regsPerBlock        32768
warpSize            32
memPitch            2147483647
maxThreadsPerBlock  1024
maxThreadsPerMP     1536
multiProcessorCount 16
maxThreadsDim[3]    1024,1024,64
maxGridSize[3]      65535,65535,65535
textureAlignment    512
deviceOverlap       1

Starting self test M86243 fft length = 8K
Running careful round off test for 1000 iterations.
If average error > 0.25, or maximum error > 0.35,
the test will restart with a longer FFT.
Iteration  100, average error = 0.00000, max error = 0.00000
Iteration  200, average error = 0.00000, max error = 0.00000
Iteration  300, average error = 0.00000, max error = 0.00000
Iteration  400, average error = 0.00000, max error = 0.00000
Iteration  500, average error = 0.00000, max error = 0.00000
Iteration  600, average error = 0.00000, max error = 0.00000
Iteration  700, average error = 0.00000, max error = 0.00000
Iteration  800, average error = 0.00000, max error = 0.00000
Iteration  900, average error = 0.00000, max error = 0.00000
Iteration 1000, average error = 0.00000 <= 0.25 (max error = 0.00000), continuing test.
Iteration 10000 M( 86243 )C, 0x23992ccd735a03d9, n = 8K, CUDALucas v2.05 Beta err = 0.00000 (0:01 real, 0.0812 ms/iter)
This residue is correct.

Starting self test M132049 fft length = 8K
Running careful round off test for 1000 iterations.
If average error > 0.25, or maximum error > 0.35,
the test will restart with a longer FFT.
Iteration  100, average error = 0.00024, max error = 0.00037
Iteration  200, average error = 0.00026, max error = 0.00035
Iteration  300, average error = 0.00027, max error = 0.00037
Iteration  400, average error = 0.00027, max error = 0.00037
Iteration  500, average error = 0.00027, max error = 0.00035
Iteration  600, average error = 0.00027, max error = 0.00034
Iteration  700, average error = 0.00027, max error = 0.00037
Iteration  800, average error = 0.00027, max error = 0.00037
Iteration  900, average error = 0.00027, max error = 0.00037
Iteration 1000, average error = 0.00028 <= 0.25 (max error = 0.00037), continuing test.
Iteration 10000 M( 132049 )C, 0x4c52a92b54635f9e, n = 8K, CUDALucas v2.05 Beta err = 0.00044 (0:01 real, 0.0844 ms/iter)
This residue is correct.

fft length 14336 must be divisible by 4 * mult threads 1024

E:\CUDA\2.05-BETA>
Old 2013-11-18, 01:32   #1989
Manpowre
 
"Svein Johansen"
May 2013
Norway

C9₁₆ Posts

Check your .ini file: is it set to 1024 threads?

If so, change it down to 256 for residue tests.
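
The arithmetic behind this can be checked directly. A minimal sketch, assuming the condition is exactly what the error message states (FFT length divisible by 4 × threads); the function name is hypothetical and not from the CUDALucas source.

```python
def threads_ok(fft_length: int, threads: int) -> bool:
    """Check the condition from the error message: fft_length % (4 * threads) == 0."""
    return fft_length % (4 * threads) == 0


FFT_LENGTH = 14336  # value from the error message

for threads in (1024, 256):
    print(threads, threads_ok(FFT_LENGTH, threads))
```

14336 is not a multiple of 4 × 1024 = 4096, but it is a multiple of 4 × 256 = 1024, so the smaller setting satisfies the check.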
Old 2013-11-18, 02:20   #1990
kladner
 
 
"Kieren"
Jul 2011
In My Own Galaxy!

10011110101110₂ Posts

Quote:
Originally Posted by Manpowre View Post
Check your .ini file: is it set to 1024 threads?

If so, change it down to 256 for residue tests.
Bingo! Changed 1024 to 256 and 'cudalucas -r' completed successfully. Thanks, Manpowre!
Old 2013-11-18, 02:28   #1991
flashjh
 
 
"Jerry"
Nov 2011
Vancouver, WA

1,123 Posts

My fft.txt file says that Threads=512 256 256 is the best setting for me to use for the current FFT range I'm in, so I leave it there and that works for the -r test too.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.