|
|
#1189 | |
|
Jul 2009
Tokyo
2·5·61 Posts |
Quote:
Code:
$ ./CUDALucas -threads 512 332220523
DEVICE:0------------------------
name GeForce GTX 550 Ti
totalGlobalMem 1072889856
...
start M332220523 fft length = 18874368
err = 0.35937, increasing n from 18874368
start M332220523 fft length = 18874368
err = 0.35937, increasing n from 18874368
start M332220523 fft length = 20971520
Iteration 10000 M( 332220523 )C, 0x1a313d709bfa6663, n = 20971520, CUDALucas v1.66 err = 0.03358 (22:30 real, 134.9292 ms/iter, ETA 12451:20:29)
Iteration 20000 M( 332220523 )C, 0x73dc7a5c8b839081, n = 20971520, CUDALucas v1.66 err = 0.03358 (22:26 real, 134.5456 ms/iter, ETA 12415:34:17) |
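As a rough sanity check on ETA figures like the ones in that log, the estimate can be reproduced from the ms/iter value alone. The sketch below assumes the ETA is simply the remaining iterations times the current per-iteration time (a Lucas-Lehmer test of M(p) needs p - 2 iterations); `eta_hms` is a hypothetical helper written for this illustration, not something taken from CUDALucas.

```python
def eta_hms(exponent, iteration, ms_per_iter):
    """Estimate time left in an LL test as H:MM:SS.

    Assumption: remaining time = remaining iterations * current ms/iter.
    An LL test of M(exponent) takes exponent - 2 iterations in total.
    """
    remaining = max(exponent - 2 - iteration, 0)
    seconds = round(remaining * ms_per_iter / 1000.0)
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    # Values copied from the "Iteration 10000" line of the log above.
    print(eta_hms(332220523, 10000, 134.9292))
```

With the numbers from the first progress line this lands in the same hour as the logged ETA, a little over a minute away from it, so the program's own estimate appears to be this kind of straight-line projection.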
|
|
|
|
|
|
#1190 | |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3·29·83 Posts |
Quote:
|
|
|
|
|
|
|
#1191 |
|
"James Heinrich"
May 2004
ex-Northern Ontario
3,407 Posts |
|
|
|
|
|
|
#1192 | ||
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3·29·83 Posts |
Quote:
Quote:
So with threads >= 256, you must have multiples of 32K. (Threads lower than that would significantly impact performance, I would think.) |
||
|
|
|
|
|
#1193 | |
|
Jan 2011
Dudley, MA, USA
73 Posts |
Quote:
e.g. I just ran a quick test, and got a valid result for:
./CUDALucas -threads 512 -f 175616 2700067 -t
M( 2700067 )C, 0x787c1272dc144ba2, n = 175616, CUDALucas v2.00
Granted, 175616 isn't one of the faster lengths, but it is (2^9)*(7^3), i.e. not a multiple of 32768 |
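The arithmetic in that post is easy to check for any candidate FFT length. The sketch below factors a length over the primes 2, 3, 5 and 7 (an assumption here: those are the radices one would normally try first for a cuFFT-based program) and tests whether it is a multiple of 32768. Both function names are hypothetical helpers for this illustration.

```python
def small_prime_factors(n, primes=(2, 3, 5, 7)):
    """Return ({prime: exponent}, leftover) for n over the given primes.

    A leftover of 1 means n is a pure product of those primes.
    """
    exps = {}
    for p in primes:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        if e:
            exps[p] = e
    return exps, n


def is_multiple_of_32k(n):
    """True if n is a multiple of 32768 (2^15)."""
    return n % 32768 == 0


if __name__ == "__main__":
    # FFT lengths that appear in this thread.
    for length in (175616, 2985984, 4194304, 18874368, 20971520):
        print(length, small_prime_factors(length), is_multiple_of_32k(length))
```

For 175616 this gives 2^9 * 7^3 with nothing left over, and the 32768 check fails, which matches the point being made: the length is not a multiple of 32K, yet the run produced a valid result.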
|
|
|
|
|
|
#1194 | |
|
"Jerry"
Nov 2011
Vancouver, WA
2143₈ Posts |
Quote:
My P95 run is complete: Code:
UID: flashjh/TF2, M28982959 is not prime. Res64: 5B3274500F7D17__. We4: 858095B2,16603096,00000000
Last fiddled with by flashjh on 2012-04-07 at 23:57 |
|
|
|
|
|
|
#1195 | |
|
Dec 2007
11100₂ Posts |
Aaron, thanks for that great insight, and would you believe, available for reading in a manual.
This immediately interested me in FFT_size = 4194304, which is 95.87% efficient, but unfortunately the program terminated with error too large.
Quote:
|
|
|
|
|
|
|
#1196 |
|
Dec 2007
2²·7 Posts |
So I picked FFT_size = 2985984 (2^12 * 3^6).
Does 45ms/iteration look OK for a GT 430? Code:
>cudalucas.2.00]$ ./cul -d 0 -f 2985984 -t 49845883
DEVICE:0------------------------
name GeForce GT 430
clockRate 1400000
start M49845883 fft length = 2985984
Iteration 10000 M( 49845883 )C, 0xbb8661cd90463e94, n = 2985984, CUDALucas v2.00 err = 0.03711 (7:38 real, 45.7782 ms/iter, ETA 633:38:49)
Iteration 20000 M( 49845883 )C, 0xf1d53981f966befa, n = 2985984, CUDALucas v2.00 err = 0.03711 (7:39 real, 45.8859 ms/iter, ETA 635:00:37)
. |
|
|
|
|
|
#1197 | |
|
Jan 2011
Dudley, MA, USA
49₁₆ Posts |
Quote:
I don't know why that FFT size is "too large" as it doesn't take much memory, and seems to work just fine for me. What's the specific error message? |
|
|
|
|
|
|
#1198 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3·29·83 Posts |
|
|
|
|
|
|
#1199 |
|
Dec 2007
2²·7 Posts |
|
|
|
|