#408
May 2011
1000₂ Posts
Quote:
For a precise example, using M38542223:

Auto-Select:
Iteration 300000 M( 38542223 )C, 0xfd1aedd779d62f55, n = 2240K, clLucas v1.04 err = 0.0308 (33:27 real, 20.0766 ms/iter, ETA 213:02:07)

Manual Select:
Iteration 300000 M( 38542223 )C, 0xfd1aedd779d62f55, n = 2560K, clLucas v1.04 err = 0.0019 (9:39 real, 5.7906 ms/iter, ETA 61:26:39)
#409
Jun 2003
2³·683 Posts
Quote:
2187 2304 2401 2500 2592
Last fiddled with by axn on 2016-05-03 at 03:15
#410
May 2011
1000₂ Posts
Quote:
**Update - Results of requested FFTs**

2187K:
Starting M38542223 fft length = 2187K
FFT length error.

2304K (thanks for the tip!):
Iteration 30000 M( 38542223 )C, 0xdbd7530295df2924, n = 2304K, clLucas v1.04 err = 0.0175 (0:52 real, 5.1583 ms/iter, ETA 55:10:47)

2401K:
Starting M38542223 fft length = 2401K
FFT length error.

2500K (still faster than 2560K, but not as good as 2304):
Iteration 30000 M( 38542223 )C, 0xdbd7530295df2924, n = 2500K, clLucas v1.04 err = 0.0028 (0:57 real, 5.6499 ms/iter, ETA 60:26:18)

2592K (faster than 2560K, but only slightly):
Iteration 30000 M( 38542223 )C, 0xdbd7530295df2924, n = 2592K, clLucas v1.04 err = 0.0016 (0:57 real, 5.7399 ms/iter, ETA 61:24:04)

clFFT-2.12.0 (if it matters, I have the sources to build 2.10.1 & 2.10.2 if you think those would show further improvement)

Also, given 2187K & 2401K weren't even in terms of K, I tried 2176K & 2402K & 2410K (2048+128 & 2304+92 & 2304+96, could see any as a reasonable typo), but that returned an error of a different nature:

Starting M38542223 fft length = 2176K
OPENCL_V_THROWERROR< CLFFT_NOTIMPLEMENTED > (1753): Failed to clfftCreateDefaultPlan.
terminate called after throwing an instance of 'std::runtime_error'
what(): OPENCL_V_THROWERROR< CLFFT_NOTIMPLEMENTED > (1753): Failed to clfftCreateDefaultPlan.
Aborted

Starting M38542223 fft length = 2402K
OPENCL_V_THROWERROR< CLFFT_NOTIMPLEMENTED > (1753): Failed to clfftCreateDefaultPlan.
terminate called after throwing an instance of 'std::runtime_error'
what(): OPENCL_V_THROWERROR< CLFFT_NOTIMPLEMENTED > (1753): Failed to clfftCreateDefaultPlan.
Aborted

Starting M38542223 fft length = 2410K
OPENCL_V_THROWERROR< CLFFT_NOTIMPLEMENTED > (1753): Failed to clfftCreateDefaultPlan.
terminate called after throwing an instance of 'std::runtime_error'
what(): OPENCL_V_THROWERROR< CLFFT_NOTIMPLEMENTED > (1753): Failed to clfftCreateDefaultPlan.
Aborted

Last fiddled with by bverka86 on 2016-05-03 at 09:28 Reason: Added Results
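For anyone comparing several FFT lengths this way, a small script can pull the numbers out of the status lines and rank the runs. The sketch below is only illustrative: the regular expression and the names `STATUS_RE`, `parse_status` and `rank_by_speed` are hypothetical, inferred from the three status lines posted above and not from the clLucas sources, so other clLucas versions may print a different format.

```python
import re

# Status lines copied from the runs above that completed without an error.
LOG_LINES = [
    "Iteration 30000 M( 38542223 )C, 0xdbd7530295df2924, n = 2304K, clLucas v1.04 err = 0.0175 (0:52 real, 5.1583 ms/iter, ETA 55:10:47)",
    "Iteration 30000 M( 38542223 )C, 0xdbd7530295df2924, n = 2500K, clLucas v1.04 err = 0.0028 (0:57 real, 5.6499 ms/iter, ETA 60:26:18)",
    "Iteration 30000 M( 38542223 )C, 0xdbd7530295df2924, n = 2592K, clLucas v1.04 err = 0.0016 (0:57 real, 5.7399 ms/iter, ETA 61:24:04)",
]

# Assumed line layout, reverse-engineered from the samples above.
STATUS_RE = re.compile(
    r"Iteration (?P<iteration>\d+) M\( (?P<exponent>\d+) \)C, "
    r"0x(?P<residue>[0-9a-f]{16}), n = (?P<fft_k>\d+)K, .*?"
    r"err = (?P<err>\d+\.\d+) \(.*?(?P<ms_per_iter>\d+\.\d+) ms/iter"
)


def parse_status(line):
    """Return a dict of fields from one status line, or None if it does not match."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    return {
        "iteration": int(match["iteration"]),
        "exponent": int(match["exponent"]),
        "residue": match["residue"],
        "fft_k": int(match["fft_k"]),
        "err": float(match["err"]),
        "ms_per_iter": float(match["ms_per_iter"]),
    }


def rank_by_speed(lines):
    """Parse all matching lines and sort them from fastest to slowest iteration time."""
    parsed = [p for p in map(parse_status, lines) if p is not None]
    return sorted(parsed, key=lambda p: p["ms_per_iter"])


if __name__ == "__main__":
    for run in rank_by_speed(LOG_LINES):
        print(f"{run['fft_k']}K: {run['ms_per_iter']} ms/iter, err = {run['err']}")
```

Lines such as "FFT length error." simply fail to match and are skipped. Note that the faster lengths here also show a larger round-off error, so speed is not the only column worth watching.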
#411
Jun 2003
2³·683 Posts
Quote:
FWIW, 2304 = 2^8*3^2, 2500 = 2^2*5^4, 2592 = 2^5*3^4

Here is the full list of even FFTs that follow the same pattern (between 2048K & 4096K):
2048=2^11
2304=2^8*3^2
2500=2^2*5^4
2560=2^9*5^1
2592=2^5*3^4
2744=2^3*7^3
2916=2^2*3^6
3072=2^10*3^1
3136=2^6*7^2
3200=2^7*5^2
3456=2^7*3^3
3584=2^9*7^1
3888=2^4*3^5
4000=2^5*5^3
4096=2^12
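The list can be regenerated with a few lines of Python. This is a sketch that assumes "same pattern" means an even size of the form 2^a * p^b with a single odd prime p from 3, 5 or 7 (b = 0 giving the pure powers of two); that reading is inferred from the factorizations listed, and the function names are made up for illustration.

```python
def single_odd_prime_sizes(lo, hi, odd_primes=(3, 5, 7)):
    """Map each even n in [lo, hi] of the form 2**a * p**b to (a, p, b).

    a >= 1, p is one of odd_primes, and b >= 0 (p is None when b == 0).
    """
    found = {}
    a = 1
    power_of_two = 2
    while power_of_two <= hi:
        if power_of_two >= lo:
            found[power_of_two] = (a, None, 0)  # pure power of two
        for p in odd_primes:
            b = 1
            value = power_of_two * p
            while value <= hi:
                if value >= lo:
                    found[value] = (a, p, b)
                b += 1
                value *= p
        a += 1
        power_of_two *= 2
    return dict(sorted(found.items()))


def describe(n, a, p, b):
    """Format one entry in the same style as the list above, e.g. 2304=2^8*3^2."""
    text = f"{n}=2^{a}"
    if b:
        text += f"*{p}^{b}"
    return text


if __name__ == "__main__":
    for n, (a, p, b) in single_odd_prime_sizes(2048, 4096).items():
        print(describe(n, a, p, b))
```

Under that assumption, 2187 = 3^7 and 2401 = 7^4 drop out because they are odd, which fits the "FFT length error" reported for them earlier in the thread.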
#412
May 2011
2³ Posts
I'll keep that list of FFTs in mind.
The breakdown into each FFT's "smoothness" is helpful, so thank you! Sometimes I question using these R9 270X's for LL tests, but then I consider things like M72895313...

R9 270X:
Iteration 18000000 M( 72895313 )C, 0xebe0881ebbc59556, n = 4096K, clLucas v1.04 err = 0.1094 (13:33 real, 8.1293 ms/iter, ETA 123:44:44)

Dual Xeon E5450 (6/8 cores):
[Work thread May 3 10:47] Iteration: 3400000 / 72895313 [4.66%], ms/iter: 21.244, ETA: 17d 02:05
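To put the two status lines side by side, here is a rough ETA sketch. It assumes the ETA is simply the remaining iterations (about p minus the current iteration) times the current ms/iter; how clLucas and the CPU client actually average their timings is not stated in the thread, so small differences from the printed ETAs are to be expected. The function names are hypothetical.

```python
def remaining_seconds(exponent, iteration, ms_per_iter):
    """Rough time left for an LL test of M(exponent).

    The test needs about `exponent` iterations (p - 2 exactly), so the
    remaining work is approximated as exponent - iteration.
    """
    return (exponent - iteration) * ms_per_iter / 1000.0


def format_eta(seconds):
    """Format a duration in seconds as H:MM:SS."""
    total = int(round(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    # Figures taken from the two status lines above.
    gpu = remaining_seconds(72895313, 18000000, 8.1293)
    cpu = remaining_seconds(72895313, 3400000, 21.244)
    print("R9 270X remaining:", format_eta(gpu))
    print("Dual Xeon remaining:", format_eta(cpu), f"({cpu / 86400:.2f} days)")
    print(f"Per-iteration speed ratio: {21.244 / 8.1293:.2f}x")
```

With these inputs the estimate comes out at roughly 124 hours for the GPU and about 17.1 days for the Xeon pair, in line with the printed ETAs, and the per-iteration ratio is a bit over 2.6x in favor of the R9 270X.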
#413
Apr 2016
2 Posts
On my 290, why does the ETA go from about 150 hours to 180 hours for clLucas when I increase the fan speed of the GPU? The ETA is correctly adjusted based on the ms/iter, but why does it take a little longer?
All other programs I have seen are either quicker or not affected when I increase the fan speed. Note: Increasing the fan speed does not affect the speed of the memory, but it does allow the GPU to run faster for longer.
#414
Apr 2016
2 Posts
When I increase the fan speed on my 290, why does the ETA increase from about 150 hours to 180 hours in clLucas for an LL test? In all other programs this either has no effect, or it makes the program run faster because it allows the GPU to run closer to full speed.
|
#415
Romulan Interpreter
"name field"
Jun 2011
Thailand
41·251 Posts
Quote:
Last fiddled with by LaurV on 2016-05-16 at 02:31
#416
"David"
Jul 2015
Ohio
11·47 Posts
I did some testing with clFFT 2.12.1 and the latest Crimson drivers.
Nothing of significant note to report, although as of 2.10 Fury X cards can run > 16384K FFTs...

Iteration 10000 M( 332213573 )C, 0x216c5a1819bd595d, n = 19200K, clLucas v1.04 err = 0.1367 (4:42 real, 28.1697 ms/iter, ETA 2599:26:26)
Iteration 20000 M( 332213573 )C, 0xebc18094b00fad87, n = 19200K, clLucas v1.04 err = 0.1406 (4:42 real, 28.1144 ms/iter, ETA 2594:15:35)

108 days for a 100M-digit test on a Fury X. ($629, 200/220W during test)
#417
Jun 2003
2³·683 Posts
clLucas doesn't necessarily choose the best FFT sizes. There might be larger FFTs that can give much higher performance. I would, at minimum, try these alternate FFTs: 18432k, 19208k, 19600k, 20000k, 20480k, 20736k to see if there is a potential for higher performance.
Last fiddled with by axn on 2016-06-13 at 09:56 |
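To see why those particular sizes are worth trying, one can factor them into small primes. The sketch below is illustrative only (the helper name `factor_small` is made up): it divides out 2, 3, 5 and 7, which as far as I know are radices clFFT handles, and reports any leftover factor. The sizes are in units of K, so the real transform length carries an extra factor of 2^10. Whether a given size is actually faster still has to be measured.

```python
def factor_small(n, primes=(2, 3, 5, 7)):
    """Divide the given small primes out of n.

    Returns ({prime: exponent}, leftover); leftover == 1 means n is built
    entirely from those primes.
    """
    exponents = {}
    for p in primes:
        while n % p == 0:
            n //= p
            exponents[p] = exponents.get(p, 0) + 1
    return exponents, n


# 19200 is the size clLucas picked in the run above; the rest are the suggested alternatives.
CANDIDATES_K = [19200, 18432, 19208, 19600, 20000, 20480, 20736]

if __name__ == "__main__":
    for size in CANDIDATES_K:
        exponents, leftover = factor_small(size)
        parts = "*".join(f"{p}^{e}" for p, e in exponents.items())
        note = "" if leftover == 1 else f" (leftover factor {leftover})"
        print(f"{size}K = {parts}{note}")
```

All seven sizes factor completely over 2, 3, 5 and 7, for example 18432 = 2^11*3^2 and 19208 = 2^3*7^4.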
#418
Romulan Interpreter
"name field"
Jun 2011
Thailand
24063₈ Posts
Yes, correct! We would also suggest trying 32768k, which is the next power of two (although that is a bit close to the limit here), to see how the iteration times behave.
|