- **Software**
(*https://www.mersenneforum.org/forumdisplay.php?f=10*)

  - **SkylakeX teasers (aka prime95 29.5)**
(*https://www.mersenneforum.org/showthread.php?t=23723*)

On occasion I have found roundoff errors on a Haswell i5 chip.
When I add PRP=FFT2=[B][COLOR=Red]xxx[/COLOR][/B]K,x,x,xxxxx,x to worktodo, the errors disappear, but I have never seen a time difference. Let's say Prime95 says a candidate is 384K in length, and I specify that it is 400K in length instead. Will 400K increase the PRP time for that candidate?
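A larger FFT does cost more per iteration. Under the rough assumption that an n-point FFT squaring costs on the order of n·log2(n) per iteration (a simplified model that ignores prime95's per-size code tuning and memory effects), going from 384K to 400K would be expected to cost roughly 4-5% more:

```python
import math

def fft_iter_cost(n):
    # Rough model: per-iteration cost of an n-point FFT squaring ~ n * log2(n).
    # This is only a ballpark estimate, not a prime95 timing.
    return n * math.log2(n)

n_384k = 384 * 1024
n_400k = 400 * 1024
slowdown = fft_iter_cost(n_400k) / fft_iter_cost(n_384k)
print(f"estimated slowdown 384K -> 400K: {(slowdown - 1) * 100:.1f}%")  # ~4.5%
```

A few percent per iteration is easy to miss when eyeballing ms/iter, which may be why no time difference was noticed.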

[QUOTE=Prime95;498680]I will investigate.[/QUOTE]
[QUOTE=Asimov]The most exciting phrase to hear in science, the one that heralds new discoveries, is not “Eureka!” (I found it!) but “That’s funny …”[/QUOTE] I tried posting just the above quotes, but the forum rejected it for lack of original content. Hopefully this paragraph will prove I'm sentient.

Got a segfault while running a PRP.
The issue happened when a benchmark started in the middle of the PRP test. Nothing happened during the previous LL-D test. Now I will download the new build. [code]
[Work thread Oct 26 11:22] Iteration: 14110000 / 81950377 [17.21%], ms/iter: 9.027, ETA: 7d 02:06
[Main thread Oct 26 11:23] Benchmarking multiple workers to tune FFT selection.
[Work thread Oct 26 11:23] Stopping PRP test of M81950377 at iteration 14118224 [17.22%]
[Work thread Oct 26 11:23] Worker stopped while running needed benchmarks.
[Main thread Oct 26 11:23] Timing 4320K FFT, 2 cores, 1 worker.  Average times:  8.80 ms.  Total throughput: 113.64 iter/sec.
[Main thread Oct 26 11:23] Timing 4320K FFT, 2 cores, 1 worker.  Average times:  8.93 ms.  Total throughput: 112.01 iter/sec.
[Main thread Oct 26 11:24] Timing 4320K FFT, 2 cores, 1 worker.  Average times:  8.96 ms.  Total throughput: 111.58 iter/sec.
[Main thread Oct 26 11:24] Timing 4320K FFT, 2 cores, 1 worker.  Average times:  8.84 ms.  Total throughput: 113.13 iter/sec.
[Main thread Oct 26 11:24] Timing 4320K FFT, 2 cores, 1 worker.  Average times:  8.82 ms.  Total throughput: 113.32 iter/sec.
[Main thread Oct 26 11:24] Timing 4320K FFT, 2 cores, 1 worker.  Average times:  9.62 ms.  Total throughput: 104.00 iter/sec.
[Main thread Oct 26 11:24] Timing 4320K FFT, 2 cores, 1 worker.  Average times:  9.39 ms.  Total throughput: 106.51 iter/sec.
[Main thread Oct 26 11:25] Timing 4320K FFT, 2 cores, 1 worker.  Average times:  9.50 ms.  Total throughput: 105.24 iter/sec.
*** Error in `./mprime': double free or corruption (!prev): 0x00007f7c100471c0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81499)[0x7f7c18f3a499]
./mprime[0x45edbc]
./mprime[0x440a3a]
./mprime[0x441aa7]
./mprime[0x44986e]
./mprime[0x47cbca]
/lib64/libpthread.so.0(+0x7de5)[0x7f7c1990fde5]
/lib64/libc.so.6(clone+0x6d)[0x7f7c18fb7bad]
======= Memory map: ========
00400000-026a0000 r-xp 00000000 103:02 18997 /home/ec2-user/mprime/29.5/mprime
0289f000-028a1000 r-xp 0229f000 103:02 18997 /home/ec2-user/mprime/29.5/mprime
028a1000-028dc000 rwxp 022a1000 103:02 18997 /home/ec2-user/mprime/29.5/mprime
028dc000-02903000 rwxp 00000000 00:00 0
036da000-036fb000 rwxp 00000000 00:00 0 [heap]
7f7bfc000000-7f7bfc4fe000 rwxp 00000000 00:00 0
7f7bfc4fe000-7f7c00000000 ---p 00000000 00:00 0
7f7c00000000-7f7c019e9000 rwxp 00000000 00:00 0
7f7c019e9000-7f7c04000000 ---p 00000000 00:00 0
7f7c075e6000-7f7c075fc000 r-xp 00000000 103:02 2338 /lib64/libresolv-2.17.so
7f7c075fc000-7f7c077fb000 ---p 00016000 103:02 2338 /lib64/libresolv-2.17.so
7f7c077fb000-7f7c077fc000 r-xp 00015000 103:02 2338 /lib64/libresolv-2.17.so
7f7c077fc000-7f7c077fd000 rwxp 00016000 103:02 2338 /lib64/libresolv-2.17.so
7f7c077fd000-7f7c077ff000 rwxp 00000000 00:00 0
7f7c077ff000-7f7c07800000 ---p 00000000 00:00 0
7f7c07800000-7f7c08000000 rwxp 00000000 00:00 0
7f7c08000000-7f7c0b9f5000 rwxp 00000000 00:00 0
7f7c0b9f5000-7f7c0c000000 ---p 00000000 00:00 0
7f7c0c000000-7f7c0d9e6000 rwxp 00000000 00:00 0
7f7c0d9e6000-7f7c10000000 ---p 00000000 00:00 0
7f7c10000000-7f7c11f7c000 rwxp 00000000 00:00 0
7f7c11f7c000-7f7c14000000 ---p 00000000 00:00 0
7f7c140e6000-7f7c140eb000 r-xp 00000000 103:02 2326 /lib64/libnss_dns-2.17.so
7f7c140eb000-7f7c142eb000 ---p 00005000 103:02 2326 /lib64/libnss_dns-2.17.so
7f7c142eb000-7f7c142ec000 r-xp 00005000 103:02 2326 /lib64/libnss_dns-2.17.so
7f7c142ec000-7f7c142ed000 rwxp 00006000 103:02 2326 /lib64/libnss_dns-2.17.so
7f7c142ed000-7f7c142f9000 r-xp 00000000 103:02 2328 /lib64/libnss_files-2.17.so
7f7c142f9000-7f7c144f8000 ---p 0000c000 103:02 2328 /lib64/libnss_files-2.17.so
7f7c144f8000-7f7c144f9000 r-xp 0000b000 103:02 2328 /lib64/libnss_files-2.17.so
7f7c144f9000-7f7c144fa000 rwxp 0000c000 103:02 2328 /lib64/libnss_files-2.17.so
7f7c144fa000-7f7c14500000 rwxp 00000000 00:00 0
7f7c14500000-7f7c14501000 ---p 00000000 00:00 0
7f7c14501000-7f7c14d01000 rwxp 00000000 00:00 0
7f7c174a0000-7f7c174b6000 r-xp 00000000 103:02 2250 /lib64/libgcc_s-7-20170915.so.1
7f7c174b6000-7f7c176b5000 ---p 00016000 103:02 2250 /lib64/libgcc_s-7-20170915.so.1
7f7c176b5000-7f7c176b6000 rwxp 00015000 103:02 2250 /lib64/libgcc_s-7-20170915.so.1
7f7c176b6000-7f7c176b7000 ---p 00000000 00:00 0
7f7c176b7000-7f7c17eb7000 rwxp 00000000 00:00 0
7f7c17eb7000-7f7c17eb8000 ---p 00000000 00:00 0
7f7c17eb8000-7f7c186b8000 rwxp 00000000 00:00 0
7f7c186b8000-7f7c186b9000 ---p 00000000 00:00 0
7f7c186b9000-7f7c18eb9000 rwxp 00000000 00:00 0
7f7c18eb9000-7f7c1907c000 r-xp 00000000 103:02 2310 /lib64/libc-2.17.so
7f7c1907c000-7f7c1927b000 ---p 001c3000 103:02 2310 /lib64/libc-2.17.so
7f7c1927b000-7f7c1927f000 r-xp 001c2000 103:02 2310 /lib64/libc-2.17.so
7f7c1927f000-7f7c19281000 rwxp 001c6000 103:02 2310 /lib64/libc-2.17.so
7f7c19281000-7f7c19286000 rwxp 00000000 00:00 0
7f7c19286000-7f7c192fb000 r-xp 00000000 103:02 3420 /usr/lib64/libgmp.so.10.2.0
7f7c192fb000-7f7c194fa000 ---p 00075000 103:02 3420 /usr/lib64/libgmp.so.10.2.0
7f7c194fa000-7f7c194fc000 rwxp 00074000 103:02 3420 /usr/lib64/libgmp.so.10.2.0
7f7c194fc000-7f7c194fe000 r-xp 00000000 103:02 2316 /lib64/libdl-2.17.so
7f7c194fe000-7f7c196fe000 ---p 00002000 103:02 2316 /lib64/libdl-2.17.so
7f7c196fe000-7f7c196ff000 r-xp 00002000 103:02 2316 /lib64/libdl-2.17.so
7f7c196ff000-7f7c19700000 rwxp 00003000 103:02 2316 /lib64/libdl-2.17.so
7f7c19700000-7f7c19707000 r-xp 00000000 103:02 2340 /lib64/librt-2.17.so
7f7c19707000-7f7c19906000 ---p 00007000 103:02 2340 /lib64/librt-2.17.so
7f7c19906000-7f7c19907000 r-xp 00006000 103:02 2340 /lib64/librt-2.17.so
7f7c19907000-7f7c19908000 rwxp 00007000 103:02 2340 /lib64/librt-2.17.so
7f7c19908000-7f7c1991f000 r-xp 00000000 103:02 2336 /lib64/libpthread-2.17.so
7f7c1991f000-7f7c19b1e000 ---p 00017000 103:02 2336 /lib64/libpthread-2.17.so
7f7c19b1e000-7f7c19b1f000 r-xp 00016000 103:02 2336 /lib64/libpthread-2.17.so
7f7c19b1f000-7f7c19b20000 rwxp 00017000 103:02 2336 /lib64/libpthread-2.17.so
7f7c19b20000-7f7c19b24000 rwxp 00000000 00:00 0
7f7c19b24000-7f7c19c25000 r-xp 00000000 103:02 2318 /lib64/libm-2.17.so
7f7c19c25000-7f7c19e24000 ---p 00101000 103:02 2318 /lib64/libm-2.17.so
7f7c19e24000-7f7c19e25000 r-xp 00100000 103:02 2318 /lib64/libm-2.17.so
7f7c19e25000-7f7c19e26000 rwxp 00101000 103:02 2318 /lib64/libm-2.17.so
7f7c19e26000-7f7c19e48000 r-xp 00000000 103:02 2303 /lib64/ld-2.17.so
7f7c1a03c000-7f7c1a041000 rwxp 00000000 00:00 0
7f7c1a044000-7f7c1a047000 rwxp 00000000 00:00 0
7f7c1a047000-7f7c1a048000 r-xp 00021000 103:02 2303 /lib64/ld-2.17.so
7f7c1a048000-7f7c1a049000 rwxp 00022000 103:02 2303 /lib64/ld-2.17.so
7f7c1a049000-7f7c1a04a000 rwxp 00000000 00:00 0
7ffec9b5f000-7ffec9b80000 rwxp 00000000 00:00 0 [stack]
7ffec9bcd000-7ffec9bd0000 r--p 00000000 00:00 0 [vvar]
7ffec9bd0000-7ffec9bd2000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Annullato
[/code]

The EC2 instance I posted logs from earlier with all the roundoff errors finished the DC, and it matched despite the errors.
It now got a new exponent 30K [I]higher[/I] than the last one, it STILL chose a 4M FFT, and it already got 7 new roundoff errors: [URL="http://hoegge.dk/mersenne/295b3-2.txt"]295b3-2.txt[/URL]. I am setting it back to 4200K manually again. I added this to prime.txt to try to prevent this issue from now on:
SoftCrossover=1.0
SoftCrossoverAdjust=-0.008
Edit: Another instance finished its DC successfully, got a new exponent (77.97M), chose a 4M FFT, and quickly got 5 roundoff errors. It is now also set to 4200K manually.

[QUOTE=ATH;498866]The EC2 instance I posted logs from earlier with all the roundoff errors finished the DC and it matched despite the errors.
[/QUOTE] From your posted file:
[CODE]
[Work thread Oct 26 14:42:43] Iteration: 77900000 / 77947687 [99.938821%], roundoff: 0.243, ms/iter: 13.860, ETA: 00:11:00
[Work thread Oct 26 14:42:43] Possible hardware errors have occurred during the test! 24 ROUNDOFF > 0.4.
[Work thread Oct 26 14:42:43] Confidence in final result is excellent.
[Work thread Oct 26 14:53:46] Gerbicz error check passed at iteration 77946729.
[Work thread Oct 26 14:54:00] Gerbicz error check passed at iteration 77947629.
[Work thread Oct 26 14:54:04] Gerbicz error check passed at iteration 77947678.
[Work thread Oct 26 14:54:15] M77947687 is not prime. RES64: 4BCF9784E9A93DEE. Wh8: 34742F74,19637789,00001800
[/CODE]
We see it differently: yes, that really is a valid RES64 with incredibly high probability, because of my error checks. And for this you wouldn't even need to do/see roundoff checks in the run. If my check fails, then you need to fall back to a previous iteration, and you lose 1M iterations of work (in your run), but the confidence is still high. How many times did the check fail in that test? Seeing the roundoff errors could be very useful when we decide the FFT table limits (for new code or a new processor?). Of course there is a trade-off here: with a higher FFT the iteration time is (in general) larger, but there are fewer fallbacks.
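For readers unfamiliar with the check being discussed: alongside the squarings u → u² mod N, the Gerbicz check maintains a running product d of the residues at every B-th iteration. Since each checkpoint residue is the 2^B-th power of the previous one, the congruence d_old^(2^B) · 3 ≡ d_old · u (mod N) must hold at every checkpoint, and a hardware error in between breaks it with overwhelming probability. A minimal pure-Python sketch of the idea (my illustration, not prime95's code; tiny p and B, and a verification at every checkpoint, whereas real tests use B ~ 1000 and verify rarely to keep the overhead tiny):

```python
def prp_with_gerbicz(p, B, iters, inject_error_at=None):
    """Base-3 PRP squarings mod N = 2^p - 1 with a Gerbicz-style check.
    Returns False as soon as an inconsistency is detected."""
    N = (1 << p) - 1
    u = 3        # running residue: 3^(2^i) mod N
    d_prev = 3   # product of checkpoint residues, starts at u (i = 0)
    for i in range(1, iters + 1):
        u = u * u % N
        if inject_error_at == i:
            u ^= 1                    # simulate a one-bit hardware error
        if i % B == 0:
            # B squarings later, d_prev^(2^B) * 3 must equal d_prev * u mod N
            check = pow(d_prev, 1 << B, N) * 3 % N
            d = d_prev * u % N
            if check != d:
                return False          # caller would roll back to a checkpoint
            d_prev = d
    return True

print(prp_with_gerbicz(127, 8, 64))                      # clean run: True
print(prp_with_gerbicz(127, 8, 64, inject_error_at=20))  # error caught: False
```

On failure, the real program rolls back to the last verified checkpoint and recomputes, which is the rollback overhead being debated in this thread.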

Yes, there were 24 errors before I manually switched to the 4200K FFT; it is very nice that it still works fine.
But it is still a bug in this version: it should either choose a higher FFT or disable the roundoff error messages. Full log since I switched to AVX-512 on that instance: [URL="http://hoegge.dk/mersenne/295b3.txt"]295b3.txt[/URL]
[CODE]
[Work thread Oct 21 09:15:59] Iteration: 45256282/77947687, Possible error: round off (0.4344111008) > 0.42188
[Work thread Oct 21 11:26:42] Iteration: 45857274/77947687, Possible error: round off (0.4226175989) > 0.42188
[Work thread Oct 21 14:56:41] Iteration: 46817196/77947687, Possible error: round off (0.4234568715) > 0.42188
[Work thread Oct 21 16:00:42] Iteration: 47110891/77947687, Possible error: round off (0.4270896037) > 0.42188
[Work thread Oct 21 19:56:46] Iteration: 48197311/77947687, Possible error: round off (0.428642894) > 0.42188
[Work thread Oct 21 20:17:11] Iteration: 48291440/77947687, Possible error: round off (0.429588787) > 0.42188
[Work thread Oct 21 22:09:44] Iteration: 48809957/77947687, Possible error: round off (0.430043636) > 0.42188
[Work thread Oct 21 23:16:45] Iteration: 49117544/77947687, Possible error: round off (0.4256841092) > 0.42188
[Work thread Oct 22 04:15:09] Iteration: 50490754/77947687, Possible error: round off (0.4303004887) > 0.42188
[Work thread Oct 22 09:16:28] Iteration: 51847612/77947687, Possible error: round off (0.4338173701) > 0.42188
[Work thread Oct 22 09:55:51] Iteration: 52027255/77947687, Possible error: round off (0.4343059627) > 0.42188
[Work thread Oct 22 09:58:39] Iteration: 52040154/77947687, Possible error: round off (0.454604666) > 0.42188
[Work thread Oct 22 12:18:37] Iteration: 52683329/77947687, Possible error: round off (0.4277743109) > 0.42188
[Work thread Oct 22 13:44:48] Iteration: 53078067/77947687, Possible error: round off (0.4273455066) > 0.42188
[Work thread Oct 22 16:58:38] Iteration: 53968523/77947687, Possible error: round off (0.4307389637) > 0.42188
[Work thread Oct 22 19:02:21] Iteration: 54535700/77947687, Possible error: round off (0.4228008512) > 0.42188
[Work thread Oct 22 22:48:33] Iteration: 55573616/77947687, Possible error: round off (0.4268757585) > 0.42188
[Work thread Oct 23 01:15:29] Iteration: 56247746/77947687, Possible error: round off (0.4692008444) > 0.42188
[Work thread Oct 23 03:08:10] Iteration: 56738296/77947687, Possible error: round off (0.4224648114) > 0.42188
[Work thread Oct 23 04:05:36] Iteration: 57000989/77947687, Possible error: round off (0.4358626009) > 0.42188
[Work thread Oct 23 04:27:30] Iteration: 57101702/77947687, Possible error: round off (0.425993915) > 0.42188
[Work thread Oct 23 09:57:31] Iteration: 58617472/77947687, Possible error: round off (0.4493439792) > 0.42188
[Work thread Oct 23 11:18:36] Iteration: 58990146/77947687, Possible error: round off (0.4567632981) > 0.42188
[Work thread Oct 23 15:07:51] Iteration: 60041159/77947687, Possible error: round off (0.4326492791) > 0.42188
[/CODE]

[QUOTE=ATH;498895]Yes there were 24 errors before I manually switched to 4200K FFT, it is very nice that it still works fine.
But it is still a bug in this version, it should either choose a higher FFT or disable the roundoff error messages. [/QUOTE] No, I asked for the number of lines where the check failed. I think you should see a line like: [CODE] ERROR: Comparing Gerbicz checksum values failed. Rolling back to iteration ... [/CODE] with the iteration number. In that partial file I don't see such a line. The number of roundoff errors alone doesn't matter; I think that with a larger p we could see even more of them. What matters is how many times you need to roll back, because that increases the additional overhead (0.2%) of my check. And of course there is a relation between these numbers (the expected number of roundoff errors and the rollbacks for a given p and FFT), so they are not independent.

[QUOTE=ATH;498895]But it is still a bug in this version, it should either choose a higher FFT or disable the roundoff error messages.
[CODE][Work thread Oct 21 09:15:59] Iteration: 45256282/77947687, Possible error: round off (0.4344111008) > 0.42188[/CODE][/QUOTE] Yes, this is unexpected. I used the same FFT crossovers as for the AVX FFTs, which for a 4M FFT is 77990000. I do not see why AVX-512 FFTs have worse round-off behavior than AVX FFTs. More to investigate....

@Gerbicz: It is important for me to get the FFT crossovers right, as the gwnum FFT routines are used by LL, LLR, PFGW, etc., where Gerbicz error checking is not used. I could (should?) change prime95 to not even look for roundoff errors during a Gerbicz PRP test, especially since calculating the roundoff error is not free.
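For readers wondering what is actually being measured: the "roundoff" is how far the floating-point FFT convolution outputs land from exact integers before rounding. Once it nears 0.5, a coefficient can round the wrong way and silently corrupt the result, which is why the crossovers matter. A toy pure-Python sketch of the idea (a plain complex FFT on fixed-size digits; prime95's irrational-base DWT with balanced digits is far more sophisticated):

```python
import cmath

def fft(a, invert=False):
    # Recursive radix-2 FFT (illustrative, not optimized).
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def fft_square(digits):
    """Square an integer given as little-endian digits; return the raw
    (uncarried) product coefficients and the maximum roundoff, i.e. how
    far the floating-point results landed from exact integers."""
    n = 1
    while n < 2 * len(digits):
        n *= 2
    a = [complex(d) for d in digits] + [0j] * (n - len(digits))
    spectrum = fft(a)
    squared = [x * x for x in spectrum]
    raw = [x.real / n for x in fft(squared, invert=True)]
    roundoff = max(abs(v - round(v)) for v in raw)
    return [round(v) for v in raw], roundoff

base = 1 << 8          # tiny digits keep the roundoff far below 0.5
x = 3**200
digits = []
t = x
while t:
    digits.append(t % base)
    t //= base
coeffs, err = fft_square(digits)
product = sum(c * base**i for i, c in enumerate(coeffs))
print(f"max roundoff: {err:.2e}, product correct: {product == x * x}")
```

Packing more bits into each digit (the analogue of stretching an FFT size past its crossover exponent) pushes that maximum toward the 0.5 cliff, which is the regime the logged warnings in this thread are flagging.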

[QUOTE=Prime95;498906]
@Gerbicz: It is important for me to get the FFT crossovers right as the gwnum FFT routines are used for LL, LLR, PFGW etc. where Gerbicz error checking is not used. I could (should?) change prime95 to not even look for roundoff errors during a Gerbicz PRP test, esp. since calculating the roundoff error is not free.[/QUOTE] Yes, we don't need those roundoff error calculations, at least for PRP; Preda's gpuOwl has already removed them for PRP. As written, you would still need them only when you have new code or a new processor, to get the code's new FFT crossovers (they basically don't change a lot).

@Prime95: if you remove those roundoff error checks from the code, will that affect all PRP testing, or only PRP testing on base 2? (Since the Gerbicz error test is only for base 2.)
I use Prime95 in CRUS searching as well as in my personal search for primes, on base 2 but also on other bases.

[QUOTE=pepi37;498910]@Prime95: if you remove those roundoff error checks from the code, will that affect all PRP testing, or only PRP testing on base 2? (Since the Gerbicz error test is only for base 2.)
I use Prime95 in CRUS searching as well as in my personal search for primes, on base 2 but also on other bases.[/QUOTE] Of course, I was speaking about PRP with my error checking. For base != 2 you don't have this, hence for that you should keep those roundoff error checks.

