#17
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
Quote:
Use the May 5 2016 beta of CUDALucas, which catches such errors and others. If you haven't already, check what thread count is being used. Some card types and thread counts don't mix. (Check the ini file, the threads file, and any command-line parameters.) Various errors will produce repeating residue series, in which the residue value is repeatedly zero, 0xfffffffffffffffd, or two. A real Mersenne prime has a residue series where only the last residue is zero. Sometimes these errors will lead to quicker-than-expected iteration times. Such runs should be abandoned as early as possible and the issue resolved. An erroneous run may look something like this:

Code:
Using threads: square 1024, splice 128.
Starting M80381387 fft length = 4320K
|  Date   Time     |  Test Num   Iter   Residue             |  FFT   Error    ms/It   Time  |    ETA    Done  |
| Mar 27  21:50:33 | M80381387  10000   0xfffffffffffffffd  | 4320K  0.11852  0.6739  6.73s | 15:02:43  0.01% |
| Mar 27  21:50:40 | M80381387  20000   0xfffffffffffffffd  | 4320K  0.10074  0.6607  6.60s | 14:53:48  0.02% |

Note that the expected ms/iter value for this FFT length is much larger. The fft.txt file for this GPU contained the following record, meaning the expected iteration time was about 8.8 msec, not 0.7:

Code:
4374 80879779 8.8042

Relaunching after correcting whatever the problem was, without getting rid of the bad checkpoint files, produces bad residues from then on. Garbage in, garbage out, at expected speed.

Code:
Using threads: square 256, splice 128.
Continuing M80381387 @ iteration 246202 with fft length 4320K, 0.31% done
|  Date   Time     |  Test Num   Iter     Residue             |  FFT   Error    ms/It   Time   |     ETA     Done  |
| Mar 28  00:22:29 | M80381387  250000    0x0000000000000002  | 4320K  0.00005  8.2352  31.27s |   17:22:30  0.31% |
| Mar 28  00:23:51 | M80381387  260000    0x0000000000000002  | 4320K  0.00005  8.2653  82.65s |   23:46:47  0.32% |
| Mar 28  00:25:14 | M80381387  270000    0x0000000000000002  | 4320K  0.00005  8.2623  82.62s | 1:05:42:21  0.33% |

Normal function output for the same GPU looks like this:

Code:
|  Date   Time     |  Test Num   Iter     Residue             |  FFT   Error    ms/It   Time   |     ETA      Done   |
| Mar 05  01:30:03 | M78157153  15410000  0xe5e7a3b4c1deab80  | 4320K  0.14337  8.2636  82.62s | 6:00:08:24  19.71% |
| Mar 05  01:31:26 | M78157153  15420000  0x32f8bddda17ba94e  | 4320K  0.14230  8.2600  82.60s | 6:00:07:01  19.72% |

Some cards seem to produce bad residues if square 1024 is used, and work otherwise. A symptom of that case, visible in the thread-benchmarking console output, is a sharp discontinuity in per-iteration timings:

Code:
fft = 4320K, ave time = 1.2804 ms, square: 32, splice: 128
fft = 4320K, ave time = 1.2819 ms, square: 64, splice: 128
fft = 4320K, ave time = 1.4082 ms, square: 128, splice: 128
fft = 4320K, ave time = 1.4996 ms, square: 256, splice: 128
fft = 4320K, ave time = 1.5713 ms, square: 512, splice: 128
fft = 4320K, ave time = 0.6497 ms, square: 1024, splice: 128
fft = 4320K, ave time = 0.6513 ms, square: 1024, splice: 32
fft = 4320K, ave time = 0.6498 ms, square: 1024, splice: 64
fft = 4320K, ave time = 0.6498 ms, square: 1024, splice: 128
fft = 4320K, ave time = 0.6500 ms, square: 1024, splice: 256
fft = 4320K, ave time = 0.6498 ms, square: 1024, splice: 512
fft = 4320K, ave time = 0.6533 ms, square: 1024, splice: 1024
fft = 4320K, min time = 0.6498 ms, square: 1024, splice: 64

Such a card should not be benchmarked for or used with 1024 squaring threads. It will produce incorrect residues and an incorrect threads file. See the m parameter of -threadbench to avoid the 1024-thread count during benchmarking. A similar issue exists with 32 threads on some cards.
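If you want to automate that check, here is a minimal sketch (my own helper, not part of CUDALucas) that scans a console log for the known-bad or repeating interim residues described above. It assumes the pipe-delimited output layout shown in the examples; the script name and the regex are mine.

Code:
# check_residues.py - hypothetical helper, not part of CUDALucas.
# Scans console-log lines of the form
#   | Mar 27 21:50:33 | M80381387 10000 0xfffffffffffffffd | ...
# and flags known-bad or repeating interim residues so a broken run
# can be stopped early.
import re
import sys
from collections import Counter

# Residue values identified above as symptoms of a broken run.
BAD_RESIDUES = {"0x0000000000000000", "0xfffffffffffffffd", "0x0000000000000002"}

def check_log(path):
    residues = []
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            # exponent, iteration count, then a 64-bit interim residue
            m = re.search(r"\bM\d+\s+\d+\s+(0x[0-9a-fA-F]{16})", line)
            if m:
                residues.append(m.group(1).lower())
    if not residues:
        print(f"{path}: no interim residues found")
        return
    bad = sorted({r for r in residues if r in BAD_RESIDUES})
    repeated = sorted(r for r, n in Counter(residues).items() if n > 1)
    if bad or repeated:
        print(f"{path}: SUSPECT run - bad residues {bad}, repeated residues {repeated}")
    else:
        print(f"{path}: residues look normal ({len(residues)} checked)")

if __name__ == "__main__":
    for p in sys.argv[1:]:
        check_log(p)

Run it against whatever file the console output is redirected to, e.g. python check_residues.py cl_console.log (the file name is just an example).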
#18
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
Quote:
In that case, try an older driver if you can; that would put it back at CUDA 8 (or earlier).

Last fiddled with by kriesel on 2018-02-02 at 17:18
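To see which CUDA level the currently installed driver supports before deciding how far to roll back, here is a small sketch (mine, not from the post) using the driver API call cuDriverGetVersion(). The library name assumes Linux; on Windows it would be nvcuda.dll.

Code:
# Sketch: report the highest CUDA version the installed NVIDIA driver supports.
import ctypes

cuda = ctypes.CDLL("libcuda.so.1")          # Linux driver library (assumption)
version = ctypes.c_int()
if cuda.cuDriverGetVersion(ctypes.byref(version)) == 0:   # 0 == CUDA_SUCCESS
    major, minor = version.value // 1000, (version.value % 1000) // 10
    print(f"Installed driver supports up to CUDA {major}.{minor}")
else:
    print("cuDriverGetVersion failed")

If it reports 9.0 or later, rolling back the driver is what would take the system back to the CUDA 8 level mentioned above.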