#1178 | |
|
"Composite as Heck"
Oct 2017
1100111011₂ Posts |
Quote:
Code:
2019-05-19 14:07:39 Note: no config.txt file found
2019-05-19 14:07:39 config: -prp 1275001
2019-05-19 14:07:39 1275001 FFT 64K: Width 8x8, Height 64x8; 19.45 bits/word
2019-05-19 14:07:39 using short carry kernels
2019-05-19 14:07:41 OpenCL compilation in 2079 ms, with "-DEXP=1275001u -DWIDTH=64u -DSMALL_HEIGHT=512u -DMIDDLE=1u -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-05-19 14:07:41 1275001.owl not found, starting from the beginning.
2019-05-19 14:07:42 1275001 OK 2000 0.16%; 0.12 ms/sq; ETA 0d 00:03; d19a9c6b08d199b6 (check 0.13s)
2019-05-19 14:07:44 1275001 20000 1.57%; 0.13 ms/sq; ETA 0d 00:03; 65e3704fff61d046
2019-05-19 14:07:45 Stopping, please wait..
2019-05-19 14:07:46 1275001 OK 31000 2.43%; 0.12 ms/sq; ETA 0d 00:03; 19d3b2da2559da70 (check 0.15s)
2019-05-19 14:07:46 Exiting because "stop requested"
2019-05-19 14:07:46 Bye

Code:
2019-05-19 14:07:07 Note: no config.txt file found
2019-05-19 14:07:07 config: -prp 1275001 -fft 72K
2019-05-19 14:07:07 1275001 FFT 72K: Width 8x8, Height 8x8, Middle 9; 17.29 bits/word
2019-05-19 14:07:07 using short carry kernels
2019-05-19 14:07:10 OpenCL compilation in 1984 ms, with "-DEXP=1275001u -DWIDTH=64u -DSMALL_HEIGHT=64u -DMIDDLE=9u -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-05-19 14:07:10 1275001.owl not found, starting from the beginning.
2019-05-19 14:07:10 1275001 EE loaded: 0, blockSize 1000, 0000000000000000 (expected 0000000000000003x)
2019-05-19 14:07:10 Exiting because "error on load"
2019-05-19 14:07:10 Bye

Code:
2019-05-19 14:08:02 Note: no config.txt file found
2019-05-19 14:08:02 config: -prp 1275001 -fft 80K
2019-05-19 14:08:02 1275001 FFT 80K: Width 8x8, Height 8x8, Middle 10; 15.56 bits/word
2019-05-19 14:08:02 using short carry kernels
2019-05-19 14:08:04 OpenCL compilation in 1985 ms, with "-DEXP=1275001u -DWIDTH=64u -DSMALL_HEIGHT=64u -DMIDDLE=10u -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-05-19 14:08:04 1275001.owl not found, starting from the beginning.
2019-05-19 14:08:05 1275001 EE loaded: 0, blockSize 1000, 0000000000000000 (expected 0000000000000003x)
2019-05-19 14:08:05 Exiting because "error on load"
2019-05-19 14:08:05 Bye

Code:
2019-05-19 14:08:15 Note: no config.txt file found
2019-05-19 14:08:15 config: -prp 1275001 -fft 128K
2019-05-19 14:08:15 1275001 FFT 128K: Width 256x4, Height 8x8; 9.73 bits/word
2019-05-19 14:08:15 using long carry kernels
2019-05-19 14:08:17 OpenCL compilation in 1920 ms, with "-DEXP=1275001u -DWIDTH=1024u -DSMALL_HEIGHT=64u -DMIDDLE=1u -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-05-19 14:08:17 1275001.owl not found, starting from the beginning.
2019-05-19 14:08:18 1275001 OK 2000 0.16%; 0.15 ms/sq; ETA 0d 00:03; d19a9c6b08d199b6 (check 0.16s)
2019-05-19 14:08:20 1275001 20000 1.57%; 0.15 ms/sq; ETA 0d 00:03; 65e3704fff61d046
2019-05-19 14:08:23 1275001 40000 3.14%; 0.15 ms/sq; ETA 0d 00:03; ddca1e3b88d59ea2
2019-05-19 14:08:24 Stopping, please wait..
2019-05-19 14:08:24 1275001 OK 44000 3.45%; 0.15 ms/sq; ETA 0d 00:03; 50e59fd6714c3a09 (check 0.16s)
2019-05-19 14:08:24 Exiting because "stop requested"
2019-05-19 14:08:24 Bye

Code:
2019-05-19 14:23:05 1275001 710000 98.40%; 0.15 ms/sq; ETA 0d 00:00; 0000000000000000
2019-05-19 14:23:06 1275001 720000 99.79%; 0.15 ms/sq; ETA 0d 00:00; 0000000000000000
2019-05-19 14:25:19 Round 0 of 1: init 1.88 s; 0.17 ms/mul; 764090 muls
2019-05-19 14:25:19 1275001 P-1 stage1 GCD: no factor
gpuowl: GmpUtil.cpp:25: std::__cxx11::string GCD(u32, const std::vector<unsigned int>&, u32): Assertion `mpz_cmp_ui(b, 0)' failed.
Aborted (core dumped) |
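The "bits/word" figures in the logs above appear to be just the exponent divided by the FFT length in words, taking "64K" to mean 64 × 1024 words. A minimal sketch that reproduces the four logged values under that assumption (the function name `bits_per_word` is made up for illustration and is not part of gpuowl):

```python
def bits_per_word(exponent: int, fft_k: int) -> float:
    """Average number of bits carried per FFT word.

    fft_k is the FFT size in units of 1024 words (e.g. 64 for "64K").
    This is an illustrative helper, not a gpuowl function.
    """
    return exponent / (fft_k * 1024)


if __name__ == "__main__":
    # Exponent and FFT sizes taken from the logs in this post.
    for fft_k in (64, 72, 80, 128):
        print(f"{fft_k}K: {bits_per_word(1275001, fft_k):.2f} bits/word")
```

The larger the FFT, the fewer bits each word has to hold, so the 72K and 80K failures are not caused by running out of precision.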
|
|
|
|
|
|
#1179 | |
|
19×151 Posts |
Quote:
Sometimes I have the same all-zeroes residue, but I don't know if the issue is the same. Gpuowl should reload the last checkpoint after a check. |
|
|
|
|
#1180 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,437 Posts |
Quote:
Gpuowl blithely accepting and continuing on all-0 res64 values in P-1 is a missed opportunity for error detection. Printing that it completed stage one when the interim res64s are all zeros is unfortunate. Zero and one are known error conditions in P-1 (in CUDAPm1, for example). And the Gerbicz check is not applicable to P-1 computations, so adding the zero check back in for P-1 would be useful in this otherwise unchecked case. Per Preda, there was a zero check present in the PRP code a while ago. https://www.mersenneforum.org/showpo...&postcount=189
Last fiddled with by kriesel on 2019-05-19 at 14:22 |
|
|
|
|
|
|
#1181 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1010100111101₂ Posts |
Quote:
Gpuowl blithely accepting and continuing on all-0 res64 values is a missed opportunity for error detection. Zero and one are known error conditions in P-1 (CUDAPm1 for example). And the Gerbicz check is not applicable to P-1 computations, so adding that zero check back in for P-1 computations would be useful. Per Preda, there was a zero check present in the PRP code a while ago. https://www.mersenneforum.org/showpo...&postcount=189 |
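As a rough illustration of the zero check discussed here, the sketch below scans gpuowl-style progress lines and flags a res64 of zero or one. It works on log text only, not inside gpuowl; the regular expression and the function names are assumptions made for this example, based on the log format posted earlier in the thread.

```python
import re
from typing import Optional

# Residues treated as error indicators; the thread names zero and one for P-1.
SUSPECT_RES64 = {0, 1}

# Assumed format: the res64 is a 16-hex-digit field following a "; " separator.
RES64_RE = re.compile(r";\s*([0-9a-f]{16})(?:\s|$)")


def extract_res64(log_line: str) -> Optional[int]:
    """Return the res64 of a progress line as an int, or None if absent."""
    matches = RES64_RE.findall(log_line)
    return int(matches[-1], 16) if matches else None


def is_suspect(log_line: str) -> bool:
    """True if the line carries a res64 that is a known error value."""
    res64 = extract_res64(log_line)
    return res64 is not None and res64 in SUSPECT_RES64


if __name__ == "__main__":
    good = "2019-05-19 14:08:20 1275001 20000 1.57%; 0.15 ms/sq; ETA 0d 00:03; 65e3704fff61d046"
    bad = "2019-05-19 14:23:05 1275001 710000 98.40%; 0.15 ms/sq; ETA 0d 00:00; 0000000000000000"
    print(is_suspect(good), is_suspect(bad))
```

Since the res64 is already computed for console output, the added cost is a single 64-bit comparison per printed line.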
|
|
|
|
|
|
#1182 | |
|
101010000100₂ Posts |
Quote:
Absolutely. The fact is that after an all-zeroes residue the GEC fails and gpuowl reloads the last checkpoint file. For PRP, of course.
Last fiddled with by SELROC on 2019-05-19 at 14:52 |
|
|
|
|
#1183 | |
|
"Composite as Heck"
Oct 2017
1473₈ Posts |
Quote:
|
|
|
|
|
|
|
#1184 | ||
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,437 Posts |
Quote:
If a zero error occurs, and its first appearance is uniformly distributed over iteration numbers, a separate zero res64 check can detect it on average console-output-interval/2 iterations later, while the Gerbicz error check would take on average blocksize-squared/2 iterations. For V6.5 default operation, those averages would be 10,000 and 500,000 iterations respectively.

Per Preda and Ewmayer, res64 determination in gpuowl and Mlucas is fast. And using the res64 already determined for console output makes even that small cost vanish, leaving only the very small cost of a 64-bit compare or 16-char string compare.

A 490,000-iteration saving on my RX480 at 3.8 ms/iter for current wavefront exponents is of order 1862 seconds, just over half an hour. (About 59 ppm per occurrence per year, so it would take 17 of them per year to accumulate to a 0.1% performance difference.)

But hopefully these zero errors are rare occurrences in PRP. They seem to be rare, from a casual look at my logs. I don't recall ever seeing a zero from gpuowl.
Last fiddled with by kriesel on 2019-05-19 at 15:42 |
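The arithmetic in this post can be reproduced with a short calculation. This is only a sketch: the console output interval of 20,000 iterations is inferred from the quoted 10,000-iteration average, the block size of 1000 is taken from the "blockSize 1000" log lines earlier in the thread, and a 365-day year is assumed.

```python
import math

CONSOLE_INTERVAL = 20_000   # iterations between console lines (inferred)
BLOCK_SIZE = 1_000          # Gerbicz block size, from the "blockSize 1000" log lines
MS_PER_ITER = 3.8           # RX480 timing quoted in the post
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year

# Mean detection delay for an error that first appears at a uniformly random iteration.
zero_check_delay = CONSOLE_INTERVAL / 2   # separate zero res64 check
gerbicz_delay = BLOCK_SIZE ** 2 / 2       # Gerbicz check, per the post

saved_iterations = gerbicz_delay - zero_check_delay
saved_seconds = saved_iterations * MS_PER_ITER / 1000
ppm_per_occurrence = saved_seconds / SECONDS_PER_YEAR * 1e6
occurrences_for_0_1_percent = math.ceil(1000 / ppm_per_occurrence)

if __name__ == "__main__":
    print(f"zero check: {zero_check_delay:.0f} iterations on average")
    print(f"Gerbicz check: {gerbicz_delay:.0f} iterations on average")
    print(f"saved: {saved_iterations:.0f} iterations, about {saved_seconds:.0f} s")
    print(f"about {ppm_per_occurrence:.0f} ppm of a year per occurrence")
    print(f"{occurrences_for_0_1_percent} occurrences per year for 0.1%")
```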
||
|
|
|
|
|
#1185 | |
|
2,713 Posts |
Quote:
For me the zero error does not occur often, and I have not found a way to reproduce it reliably. It may happen two or three times in one day and then not at all the next day. |
|
|
|
|
#1186 |
|
"Mihai Preda"
Apr 2015
1371₁₀ Posts |
I suspect fft-64 is broken, along with all the sizes that use it. I need to investigate. Give me a few days.
|
|
|
|
|
|
#1187 | |
|
32·23·43 Posts |
Quote:
With the new version there is a remarkable speedup on the 332M exponent: it went from 4.13 ms/sq to 3.7 ms/sq. Good! I did change the FFT, however, from -fft +2 to the default FFT with no arguments; -fft +2 now fails to load.
Last fiddled with by SELROC on 2019-05-20 at 05:14 |
|
|
|
|
#1188 | |
|
"Mihai Preda"
Apr 2015
3·457 Posts |
Quote:
Turns out in the current implementation, the MIDDLE step of the FFT can't be done correctly when H < 256. I think all your failing cases were in that situation. Anyway, I updated the FFTConfig to not generate the invalid size combinations anymore; please retry. |
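To make the restriction concrete, here is a hypothetical filter over FFT shapes. It is not gpuowl's FFTConfig code. It assumes that the FFT length in words is 2 × WIDTH × SMALL_HEIGHT × MIDDLE (which matches the four sizes in the logs earlier in the thread) and that MIDDLE = 1 means no MIDDLE step is run, since the 128K run with SMALL_HEIGHT = 64 and MIDDLE = 1 passed its checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FFTShape:
    """Illustrative FFT shape; field names mirror the -D defines in the logs."""
    width: int
    small_height: int
    middle: int

    @property
    def size_words(self) -> int:
        # Assumption inferred from the logs: length = 2 * W * H * M words.
        return 2 * self.width * self.small_height * self.middle


def is_valid(shape: FFTShape) -> bool:
    """Reject shapes that need a MIDDLE step but have H < 256 (assumed rule)."""
    return shape.middle == 1 or shape.small_height >= 256


if __name__ == "__main__":
    # Shapes taken from the OpenCL compile flags in the posted logs.
    for shape in (FFTShape(64, 512, 1), FFTShape(64, 64, 9),
                  FFTShape(64, 64, 10), FFTShape(1024, 64, 1)):
        status = "ok" if is_valid(shape) else "invalid"
        print(f"{shape.size_words // 1024}K {shape}: {status}")
```

Under these assumptions the 72K and 80K shapes are the only two rejected, which matches the two runs that exited with "error on load".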
|
|
|
|