mersenneforum.org  

2013-12-15, 03:40   #2102
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
5×11×137 Posts

I just tried r52 - no luck. I get the error "device_number >= device_count". I'm presently running CUDALucas 2.00 without problems on this Windows 7 box with a GTX 460.

2013-12-15, 10:32   #2103
mognuts
Sep 2008
Bromley, England
43 Posts

Quote:
Originally Posted by Prime95 View Post
I just tried r52 - no luck. I get the error "device_number >= device_count". I'm presently running CUDALucas 2.00 without problems on this Windows 7 box with a GTX 460.
I get that error if I use version 2xx.xx drivers with r52. Upgrading the drivers solved this for me.

2013-12-15, 11:45   #2104
mognuts
Sep 2008
Bromley, England
43 Posts

I'm getting bad selftests on a GTX460 with r52. I have never had this before with earlier versions.

Code:
C:\Users\John\Desktop\cudalucas>CUDALucas_205Beta_x64_r52.exe -r
 ------- DEVICE 0 -------
name                GeForce GTX 460
Compatibility       2.1
clockRate (MHz)     1430
memClockRate (MHz)  1800
totalGlobalMem      1073741824
totalConstMem       65536
l2CacheSize         524288
sharedMemPerBlock   49152
regsPerBlock        32768
warpSize            32
memPitch            2147483647
maxThreadsPerBlock  1024
maxThreadsPerMP     1536
multiProcessorCount 7
maxThreadsDim[3]    1024,1024,64
maxGridSize[3]      65535,65535,65535
textureAlignment    512
deviceOverlap       1
 Using threads: norm1 256, mult 128, norm2 128.
Starting self test M86243 fft length = 4K
Iteration 10000 / 86243, 0x23992ccd735a03d9, 4K, CUDALucas v2.05 Beta err = 0.26563 (0:01 real, 0.0651 ms/iter)
This residue is correct.
 Using threads: norm1 256, mult 128, norm2 128.
Starting self test M132049 fft length = 8K
Iteration 10000 / 132049, 0x4c52a92b54635f9e, 8K, CUDALucas v2.05 Beta err = 0.00046 (0:01 real, 0.0709 ms/iter)
This residue is correct.
 Using threads: norm1 256, mult 128, norm2 128.
Starting self test M216091 fft length = 16K
Iteration 10000 / 216091, 0x30247786758b8792, 16K, CUDALucas v2.05 Beta err = 0.00001 (0:00 real, 0.0884 ms/iter)
This residue is correct.
 Using threads: norm1 256, mult 128, norm2 128.
Starting self test M756839 fft length = 40K
Iteration 10000 / 756839, 0x5d2cbe7cb24a109a, 40K, CUDALucas v2.05 Beta err = 0.03320 (0:02 real, 0.1868 ms/iter)
This residue is correct.
 Using threads: norm1 256, mult 128, norm2 128.
Starting self test M859433 fft length = 48K
Iteration 10000 / 859433, 0x3c4ad525c2d0aed0, 48K, CUDALucas v2.05 Beta err = 0.01074 (0:02 real, 0.1988 ms/iter)
This residue is correct.
 Using threads: norm1 256, mult 128, norm2 128.
Starting self test M1257787 fft length = 64K
Iteration 10000 / 1257787, 0x3f45bf9bea7213ea, 64K, CUDALucas v2.05 Beta err = 0.10938 (0:03 real, 0.2440 ms/iter)
This residue is correct.
 Using threads: norm1 256, mult 128, norm2 128.
Starting self test M1398269 fft length = 128K
Iteration 10000 / 1398269, 0xa4a6d2f0e34629db, 128K, CUDALucas v2.05 Beta err = 0.00000 (0:04 real, 0.4409 ms/iter)
This residue is correct.
 Using threads: norm1 256, mult 128, norm2 128.
Starting self test M2976221 fft length = 256K
Iteration 10000 / 2976221, 0x2a7111b7f70fea2f, 256K, CUDALucas v2.05 Beta err = 0.00001 (0:09 real, 0.8995 ms/iter)
This residue is correct.
 Using threads: norm1 256, mult 128, norm2 128.
Starting self test M3021377 fft length = 256K
Iteration 10000 / 3021377, 0x6387a70a85d46baf, 256K, CUDALucas v2.05 Beta err = 0.00001 (0:09 real, 0.8994 ms/iter)
This residue is correct.
 Using threads: norm1 256, mult 128, norm2 128.
Starting self test M6972593 fft length = 512K
Iteration 10000 / 6972593, 0x88f1d2640adb89e1, 512K, CUDALucas v2.05 Beta err = 0.00011 (0:18 real, 1.7766 ms/iter)
This residue is correct.
 Using threads: norm1 256, mult 128, norm2 128.
Starting self test M13466917 fft length = 1024K
Iteration 10000 / 13466917, 0x9fdc1f4092b15d69, 1024K, CUDALucas v2.05 Beta err = 0.00009 (0:37 real, 3.6937 ms/iter)
This residue is correct.
 The fft length 2048K is too large for exponent 20996011, decreasing to 1024K
Using threads: norm1 256, mult 128, norm2 128.
Starting self test M20996011 fft length = 1024K
Iteration 10000 / 20996011, 0x2a354d3a0f96e64e, 1024K, CUDALucas v2.05 Beta err = 0.50000 (0:37 real, 3.6876 ms/iter)
Expected residue [5fc58920a821da11] does not match actual residue [2a354d3a0f96e64e]
The fft length 2048K is too large for exponent 24036583, decreasing to 1024K
Using threads: norm1 256, mult 128, norm2 128.
Starting self test M24036583 fft length = 1024K
Iteration 10000 / 24036583, 0x47fba1785d32a924, 1024K, CUDALucas v2.05 Beta err = 1.00000 (0:51 real, 5.1785 ms/iter)
Expected residue [cbdef38a0bdc4f00] does not match actual residue [47fba1785d32a924]
 Using threads: norm1 256, mult 128, norm2 128.
Starting self test M25964951 fft length = 2048K
Iteration 10000 / 25964951, 0x62eb3ff0a5f6237c, 2048K, CUDALucas v2.05 Beta err = 0.00008 (1:14 real, 7.4363 ms/iter)
This residue is correct.
 Using threads: norm1 256, mult 128, norm2 128.
Starting self test M30402457 fft length = 2048K
Iteration 10000 / 30402457, 0x0b8600ef47e69d27, 2048K, CUDALucas v2.05 Beta err = 0.00131 (1:15 real, 7.4195 ms/iter)
This residue is correct.
 Using threads: norm1 256, mult 128, norm2 128.
Starting self test M32582657 fft length = 2048K
Iteration 10000 / 32582657, 0x02751b7fcec76bb1, 2048K, CUDALucas v2.05 Beta err = 0.00537 (1:14 real, 7.4358 ms/iter)
This residue is correct.
 Using threads: norm1 256, mult 128, norm2 128.
Starting self test M37156667 fft length = 2048K
Iteration 10000 / 37156667, 0x67ad7646a1fad514, 2048K, CUDALucas v2.05 Beta err = 0.11719 (1:14 real, 7.4356 ms/iter)
This residue is correct.
 The fft length 4096K is too large for exponent 42643801, decreasing to 2048K
Using threads: norm1 256, mult 128, norm2 128.
Starting self test M42643801 fft length = 2048K
Iteration 10000 / 42643801, 0x93ec1e0141513b57, 2048K, CUDALucas v2.05 Beta err = 1.00000 (1:15 real, 7.4357 ms/iter)
Expected residue [8f90d78d5007bba7] does not match actual residue [93ec1e0141513b57]
The fft length 4096K is too large for exponent 43112609, decreasing to 2048K
Using threads: norm1 256, mult 128, norm2 128.
Starting self test M43112609 fft length = 2048K
Iteration 10000 / 43112609, 0x93f526f2d01c1686, 2048K, CUDALucas v2.05 Beta err = 1.00000 (1:14 real, 7.4352 ms/iter)
Expected residue [e86891ebf6cd70c4] does not match actual residue [93f526f2d01c1686]
Using threads: norm1 256, mult 128, norm2 128.
Starting self test M57885161 fft length = 4096K
Iteration 10000 / 57885161, 0x76c27556683cd84d, 4096K, CUDALucas v2.05 Beta err = 0.00076 (2:37 real, 15.7022 ms/iter)
This residue is correct.
Error: There were 4 bad selftests!
C:\Users\John\Desktop\cudalucas>pause
Press any key to continue . . .
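
A log this long is easier to scan once each self test is paired with its verdict. The following is a minimal sketch, assuming the console output keeps the exact line shapes shown above; the names `summarize`, `START` and `ERR` are hypothetical helpers, not part of CUDALucas, and only two entries from the log are embedded as sample input.

```python
import re

# Two entries copied from the r52 log above; a real run would read the whole console output.
LOG = """\
Starting self test M13466917 fft length = 1024K
Iteration 10000 / 13466917, 0x9fdc1f4092b15d69, 1024K, CUDALucas v2.05 Beta err = 0.00009 (0:37 real, 3.6937 ms/iter)
This residue is correct.
Starting self test M20996011 fft length = 1024K
Iteration 10000 / 20996011, 0x2a354d3a0f96e64e, 1024K, CUDALucas v2.05 Beta err = 0.50000 (0:37 real, 3.6876 ms/iter)
Expected residue [5fc58920a821da11] does not match actual residue [2a354d3a0f96e64e]
"""

START = re.compile(r"Starting self test M(\d+) fft length = (\d+)K")
ERR = re.compile(r"err = ([0-9.]+)")


def summarize(log):
    """Return one dict per self test: exponent, FFT length in K, round-off error, verdict."""
    results = []
    current = None
    for line in log.splitlines():
        m = START.search(line)
        if m:
            current = {"exponent": int(m.group(1)), "fft_k": int(m.group(2)),
                       "err": None, "ok": None}
            results.append(current)
            continue
        if current is None:
            continue  # device info printed before the first test
        m = ERR.search(line)
        if m:
            current["err"] = float(m.group(1))
        elif "This residue is correct" in line:
            current["ok"] = True
        elif "does not match actual residue" in line:
            current["ok"] = False
    return results


for test in summarize(LOG):
    verdict = "ok" if test["ok"] else "BAD"
    print(f"M{test['exponent']} at {test['fft_k']}K: err = {test['err']}, {verdict}")
```

Run against the full log, the same loop would list the four mismatches together with their reported `err` values (0.5 or 1.0 in each case).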

2013-12-15, 12:03   #2105
flashjh
"Jerry"
Nov 2011
Vancouver, WA
1,123 Posts

I can't speak for the bad self test yet, but the other problems are probably from the driver version, as stated above. I build with CUDA 5.5 now. If you need a different version let me know and I'll try to build one. Otherwise, updating to the newest drivers should fix the problem.

The bad self test may have something to do with FFT selection. We'll look at it.
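
One rough way to look at the FFT-selection idea is to work out how many bits of the exponent each FFT word has to carry, i.e. exponent divided by FFT length. This is only a sketch: the exponents, lengths and verdicts are copied from the self tests at 1024K and above in the r52 log, `bits_per_word` is a hypothetical helper, and no particular CUDALucas limit is assumed.

```python
# (exponent, FFT length in K, passed?) for the self tests at 1024K and above in the r52 log.
SELFTESTS = [
    (13466917, 1024, True),
    (20996011, 1024, False),
    (24036583, 1024, False),
    (25964951, 2048, True),
    (30402457, 2048, True),
    (32582657, 2048, True),
    (37156667, 2048, True),
    (42643801, 2048, False),
    (43112609, 2048, False),
    (57885161, 4096, True),
]


def bits_per_word(exponent, fft_k):
    """Average number of bits each FFT element carries (1K = 1024 elements)."""
    return exponent / (fft_k * 1024)


for exponent, fft_k, passed in SELFTESTS:
    verdict = "ok" if passed else "BAD"
    print(f"M{exponent} at {fft_k}K: {bits_per_word(exponent, fft_k):.2f} bits/word, {verdict}")
```

For these entries the passing tests come out below 18 bits per word and the four failing ones above 20. Smaller FFTs in the same log pass at higher densities (M86243 at 4K is about 21), so this is not a universal cutoff, just a hint that the lengths chosen for the failing tests were on the small side.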

Last fiddled with by flashjh on 2013-12-15 at 12:05

2013-12-15, 16:47   #2106
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
7535₁₀ Posts

Quote:
Originally Posted by mognuts View Post
I get that error if I use version 2xx.xx drivers with r52. Upgrading the drivers solved this for me.
I'm using driver 311.06. I'll try a newer one.

2013-12-15, 18:00   #2107
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL
1110101101111₂ Posts

Quote:
Originally Posted by mognuts View Post
I'm getting bad selftests on a GTX460 with r52. I have never had this before with earlier versions.
FWIW, my GTX460 passes the selftest.

I do have one minor bug. I ran "-cufftbench 2000 4100 1". It ran all the benches successfully, but the file to mail to james contained only one line for FFT length 2048K.

2013-12-15, 18:53   #2108
mognuts
Sep 2008
Bromley, England
101011₂ Posts

Quote:
Originally Posted by Prime95 View Post
FWIW, my GTX460 passes the selftest.

I do have one minor bug. I ran "-cufftbench 2000 4100 1". It ran all the benches successfully, but the file to mail to james contained only one line for FFT length 2048K.
-cufftbench is broken for me with r52. It crashes but doesn't bring down the driver. Makes no difference if I'm benchmarking a range of FFTs, or threads for a given FFT. r50 was fine.

2013-12-15, 19:12   #2109
flashjh
"Jerry"
Nov 2011
Vancouver, WA
1,123 Posts

A lot of code was rewritten for r52, so it will need some debugging. Keep posting errors and bugs, thanks.

2013-12-16, 14:48   #2110
owftheevil
"Carl Darby"
Oct 2012
Spring Mountains, Nevada
3²·5·7 Posts

Quote:
Originally Posted by mognuts View Post
I'm getting bad selftests on a GTX460 with r52. I have never had this before with earlier versions.

This should be fixed with r53. Forgot to reinitialize a pointer after freeing the memory.

2013-12-16, 15:00   #2111
owftheevil
"Carl Darby"
Oct 2012
Spring Mountains, Nevada
3²·5·7 Posts

Quote:
Originally Posted by Prime95 View Post
FWIW, my GTX460 passes the selftest.

I do have one minor bug. I ran "-cufftbench 2000 4100 1". It ran all the benches successfully, but the file to mail to james contained only one line for FFT length 2048K.
Found the problem. I was making the silly assumption that limits would always be powers of 2. I should have the time to fix it tonight.
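
To illustrate the kind of range handling that does not depend on power-of-2 limits, here is a minimal sketch. It assumes the two limits are FFT lengths in K and that the candidates worth benchmarking are lengths whose only prime factors are 2, 3, 5 and 7; both points are assumptions for illustration, and `is_7_smooth` and `candidate_lengths` are hypothetical names, not CUDALucas functions.

```python
def is_7_smooth(n):
    """True if n has no prime factor larger than 7."""
    if n < 1:
        return False
    for p in (2, 3, 5, 7):
        while n % p == 0:
            n //= p
    return n == 1


def candidate_lengths(lo_k, hi_k):
    """All 7-smooth lengths (in K) in the inclusive range, for arbitrary limits."""
    return [n for n in range(lo_k, hi_k + 1) if is_7_smooth(n)]


lengths = candidate_lengths(2000, 4100)
print(len(lengths), "candidate lengths between", lengths[0], "and", lengths[-1])
```

With limits of 2000 and 4100 the list starts at 2000 itself and includes both 2048 and 4096, so neither limit has to be a power of 2 for the range to be covered.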

2013-12-16, 15:02   #2112
owftheevil
"Carl Darby"
Oct 2012
Spring Mountains, Nevada
3²·5·7 Posts

Quote:
Originally Posted by mognuts View Post
-cufftbench is broken for me with r52. It crashes but doesn't bring down the driver. Makes no difference if I'm benchmarking a range of FFTs, or threads for a given FFT. r50 was fine.
Crashes how?

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.