|
|
#34 |
|
Dec 2011
After milion nines:)
1451₁₀ Posts |
I have occasionally seen roundoff errors on a Haswell i5 chip.
When I add PRP=FFT2=xxxK,x,x,xxxxx,x to worktodo, the error disappears, but I never saw a time difference. Let's say Prime95 says the FFT is 384K in length, and I specify 400K for that candidate instead. Will 400K increase the PRP time for that candidate? |
|
|
|
|
|
#35 | |
|
If I May
"Chris Halsall"
Sep 2002
Barbados
9,767 Posts |
Quote:
|
|
|
|
|
|
|
#36 |
|
Banned
"Luigi"
Aug 2002
Team Italia
2·3·11·73 Posts |
Got my segfault while running a PRP.
The issue happened when a benchmark started in the middle of the PRP test. Nothing happened during the previous LL-D test. Now I will download the new build. Code:
[Work thread Oct 26 11:22] Iteration: 14110000 / 81950377 [17.21%], ms/iter: 9.027, ETA: 7d 02:06
[Main thread Oct 26 11:23] Benchmarking multiple workers to tune FFT selection.
[Work thread Oct 26 11:23] Stopping PRP test of M81950377 at iteration 14118224 [17.22%]
[Work thread Oct 26 11:23] Worker stopped while running needed benchmarks.
[Main thread Oct 26 11:23] Timing 4320K FFT, 2 cores, 1 worker. Average times: 8.80 ms. Total throughput: 113.64 iter/sec.
[Main thread Oct 26 11:23] Timing 4320K FFT, 2 cores, 1 worker. Average times: 8.93 ms. Total throughput: 112.01 iter/sec.
[Main thread Oct 26 11:24] Timing 4320K FFT, 2 cores, 1 worker. Average times: 8.96 ms. Total throughput: 111.58 iter/sec.
[Main thread Oct 26 11:24] Timing 4320K FFT, 2 cores, 1 worker. Average times: 8.84 ms. Total throughput: 113.13 iter/sec.
[Main thread Oct 26 11:24] Timing 4320K FFT, 2 cores, 1 worker. Average times: 8.82 ms. Total throughput: 113.32 iter/sec.
[Main thread Oct 26 11:24] Timing 4320K FFT, 2 cores, 1 worker. Average times: 9.62 ms. Total throughput: 104.00 iter/sec.
[Main thread Oct 26 11:24] Timing 4320K FFT, 2 cores, 1 worker. Average times: 9.39 ms. Total throughput: 106.51 iter/sec.
[Main thread Oct 26 11:25] Timing 4320K FFT, 2 cores, 1 worker. Average times: 9.50 ms. Total throughput: 105.24 iter/sec.
*** Error in `./mprime': double free or corruption (!prev): 0x00007f7c100471c0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81499)[0x7f7c18f3a499]
./mprime[0x45edbc]
./mprime[0x440a3a]
./mprime[0x441aa7]
./mprime[0x44986e]
./mprime[0x47cbca]
/lib64/libpthread.so.0(+0x7de5)[0x7f7c1990fde5]
/lib64/libc.so.6(clone+0x6d)[0x7f7c18fb7bad]
======= Memory map: ========
00400000-026a0000 r-xp 00000000 103:02 18997 /home/ec2-user/mprime/29.5/mprime
0289f000-028a1000 r-xp 0229f000 103:02 18997 /home/ec2-user/mprime/29.5/mprime
028a1000-028dc000 rwxp 022a1000 103:02 18997 /home/ec2-user/mprime/29.5/mprime
028dc000-02903000 rwxp 00000000 00:00 0
036da000-036fb000 rwxp 00000000 00:00 0 [heap]
7f7bfc000000-7f7bfc4fe000 rwxp 00000000 00:00 0
7f7bfc4fe000-7f7c00000000 ---p 00000000 00:00 0
7f7c00000000-7f7c019e9000 rwxp 00000000 00:00 0
7f7c019e9000-7f7c04000000 ---p 00000000 00:00 0
7f7c075e6000-7f7c075fc000 r-xp 00000000 103:02 2338 /lib64/libresolv-2.17.so
7f7c075fc000-7f7c077fb000 ---p 00016000 103:02 2338 /lib64/libresolv-2.17.so
7f7c077fb000-7f7c077fc000 r-xp 00015000 103:02 2338 /lib64/libresolv-2.17.so
7f7c077fc000-7f7c077fd000 rwxp 00016000 103:02 2338 /lib64/libresolv-2.17.so
7f7c077fd000-7f7c077ff000 rwxp 00000000 00:00 0
7f7c077ff000-7f7c07800000 ---p 00000000 00:00 0
7f7c07800000-7f7c08000000 rwxp 00000000 00:00 0
7f7c08000000-7f7c0b9f5000 rwxp 00000000 00:00 0
7f7c0b9f5000-7f7c0c000000 ---p 00000000 00:00 0
7f7c0c000000-7f7c0d9e6000 rwxp 00000000 00:00 0
7f7c0d9e6000-7f7c10000000 ---p 00000000 00:00 0
7f7c10000000-7f7c11f7c000 rwxp 00000000 00:00 0
7f7c11f7c000-7f7c14000000 ---p 00000000 00:00 0
7f7c140e6000-7f7c140eb000 r-xp 00000000 103:02 2326 /lib64/libnss_dns-2.17.so
7f7c140eb000-7f7c142eb000 ---p 00005000 103:02 2326 /lib64/libnss_dns-2.17.so
7f7c142eb000-7f7c142ec000 r-xp 00005000 103:02 2326 /lib64/libnss_dns-2.17.so
7f7c142ec000-7f7c142ed000 rwxp 00006000 103:02 2326 /lib64/libnss_dns-2.17.so
7f7c142ed000-7f7c142f9000 r-xp 00000000 103:02 2328 /lib64/libnss_files-2.17.so
7f7c142f9000-7f7c144f8000 ---p 0000c000 103:02 2328 /lib64/libnss_files-2.17.so
7f7c144f8000-7f7c144f9000 r-xp 0000b000 103:02 2328 /lib64/libnss_files-2.17.so
7f7c144f9000-7f7c144fa000 rwxp 0000c000 103:02 2328 /lib64/libnss_files-2.17.so
7f7c144fa000-7f7c14500000 rwxp 00000000 00:00 0
7f7c14500000-7f7c14501000 ---p 00000000 00:00 0
7f7c14501000-7f7c14d01000 rwxp 00000000 00:00 0
7f7c174a0000-7f7c174b6000 r-xp 00000000 103:02 2250 /lib64/libgcc_s-7-20170915.so.1
7f7c174b6000-7f7c176b5000 ---p 00016000 103:02 2250 /lib64/libgcc_s-7-20170915.so.1
7f7c176b5000-7f7c176b6000 rwxp 00015000 103:02 2250 /lib64/libgcc_s-7-20170915.so.1
7f7c176b6000-7f7c176b7000 ---p 00000000 00:00 0
7f7c176b7000-7f7c17eb7000 rwxp 00000000 00:00 0
7f7c17eb7000-7f7c17eb8000 ---p 00000000 00:00 0
7f7c17eb8000-7f7c186b8000 rwxp 00000000 00:00 0
7f7c186b8000-7f7c186b9000 ---p 00000000 00:00 0
7f7c186b9000-7f7c18eb9000 rwxp 00000000 00:00 0
7f7c18eb9000-7f7c1907c000 r-xp 00000000 103:02 2310 /lib64/libc-2.17.so
7f7c1907c000-7f7c1927b000 ---p 001c3000 103:02 2310 /lib64/libc-2.17.so
7f7c1927b000-7f7c1927f000 r-xp 001c2000 103:02 2310 /lib64/libc-2.17.so
7f7c1927f000-7f7c19281000 rwxp 001c6000 103:02 2310 /lib64/libc-2.17.so
7f7c19281000-7f7c19286000 rwxp 00000000 00:00 0
7f7c19286000-7f7c192fb000 r-xp 00000000 103:02 3420 /usr/lib64/libgmp.so.10.2.0
7f7c192fb000-7f7c194fa000 ---p 00075000 103:02 3420 /usr/lib64/libgmp.so.10.2.0
7f7c194fa000-7f7c194fc000 rwxp 00074000 103:02 3420 /usr/lib64/libgmp.so.10.2.0
7f7c194fc000-7f7c194fe000 r-xp 00000000 103:02 2316 /lib64/libdl-2.17.so
7f7c194fe000-7f7c196fe000 ---p 00002000 103:02 2316 /lib64/libdl-2.17.so
7f7c196fe000-7f7c196ff000 r-xp 00002000 103:02 2316 /lib64/libdl-2.17.so
7f7c196ff000-7f7c19700000 rwxp 00003000 103:02 2316 /lib64/libdl-2.17.so
7f7c19700000-7f7c19707000 r-xp 00000000 103:02 2340 /lib64/librt-2.17.so
7f7c19707000-7f7c19906000 ---p 00007000 103:02 2340 /lib64/librt-2.17.so
7f7c19906000-7f7c19907000 r-xp 00006000 103:02 2340 /lib64/librt-2.17.so
7f7c19907000-7f7c19908000 rwxp 00007000 103:02 2340 /lib64/librt-2.17.so
7f7c19908000-7f7c1991f000 r-xp 00000000 103:02 2336 /lib64/libpthread-2.17.so
7f7c1991f000-7f7c19b1e000 ---p 00017000 103:02 2336 /lib64/libpthread-2.17.so
7f7c19b1e000-7f7c19b1f000 r-xp 00016000 103:02 2336 /lib64/libpthread-2.17.so
7f7c19b1f000-7f7c19b20000 rwxp 00017000 103:02 2336 /lib64/libpthread-2.17.so
7f7c19b20000-7f7c19b24000 rwxp 00000000 00:00 0
7f7c19b24000-7f7c19c25000 r-xp 00000000 103:02 2318 /lib64/libm-2.17.so
7f7c19c25000-7f7c19e24000 ---p 00101000 103:02 2318 /lib64/libm-2.17.so
7f7c19e24000-7f7c19e25000 r-xp 00100000 103:02 2318 /lib64/libm-2.17.so
7f7c19e25000-7f7c19e26000 rwxp 00101000 103:02 2318 /lib64/libm-2.17.so
7f7c19e26000-7f7c19e48000 r-xp 00000000 103:02 2303 /lib64/ld-2.17.so
7f7c1a03c000-7f7c1a041000 rwxp 00000000 00:00 0
7f7c1a044000-7f7c1a047000 rwxp 00000000 00:00 0
7f7c1a047000-7f7c1a048000 r-xp 00021000 103:02 2303 /lib64/ld-2.17.so
7f7c1a048000-7f7c1a049000 rwxp 00022000 103:02 2303 /lib64/ld-2.17.so
7f7c1a049000-7f7c1a04a000 rwxp 00000000 00:00 0
7ffec9b5f000-7ffec9b80000 rwxp 00000000 00:00 0 [stack]
7ffec9bcd000-7ffec9bd0000 r--p 00000000 00:00 0 [vvar]
7ffec9bd0000-7ffec9bd2000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Annullato |
|
|
|
|
|
#37 |
|
Einyen
Dec 2003
Denmark
2·1,579 Posts |
The EC2 instance I posted logs from earlier with all the roundoff errors finished the DC and it matched despite the errors.
It now got a new exponent 30K higher than the last one, and it STILL chose 4M FFT, and already got 7 new roundoff errors: 295b3-2.txt
Setting it back to 4200K manually again. Added this to prime.txt to try and prevent this issue from now on:
SoftCrossover=1.0
SoftCrossoverAdjust=-0.008
Edit: Another instance finished its DC successfully, got a new exponent: 77.97M and chose 4M FFT and quickly got 5 roundoff errors. It is now also set to 4200K manually.
Last fiddled with by ATH on 2018-10-27 at 01:41 |
|
|
|
|
|
#38 | |
|
"Robert Gerbicz"
Oct 2005
Hungary
1486₁₀ Posts |
Quote:
Code:
[Work thread Oct 26 14:42:43] Iteration: 77900000 / 77947687 [99.938821%], roundoff: 0.243, ms/iter: 13.860, ETA: 00:11:00
[Work thread Oct 26 14:42:43] Possible hardware errors have occurred during the test! 24 ROUNDOFF > 0.4.
[Work thread Oct 26 14:42:43] Confidence in final result is excellent.
[Work thread Oct 26 14:53:46] Gerbicz error check passed at iteration 77946729.
[Work thread Oct 26 14:54:00] Gerbicz error check passed at iteration 77947629.
[Work thread Oct 26 14:54:04] Gerbicz error check passed at iteration 77947678.
[Work thread Oct 26 14:54:15] M77947687 is not prime. RES64: 4BCF9784E9A93DEE. Wh8: 34742F74,19637789,00001800
Of course there is a trade-off here: with a larger FFT the iteration time is (in general) longer, but there are fewer fallbacks.
Last fiddled with by R. Gerbicz on 2018-10-27 at 09:14 Reason: typo |
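The trade-off described here can be put into rough numbers. The sketch below is a back-of-the-envelope model, not anything taken from Prime95: the function name `total_hours`, both per-iteration timings, the rollback counts, and the number of iterations redone per rollback are made-up assumptions. Only the iteration count 77947687 comes from the log in this post.

```python
def total_hours(iterations, ms_per_iter, rollbacks=0, redo_per_rollback=0):
    """Estimate wall-clock hours, counting iterations repeated after rollbacks."""
    executed = iterations + rollbacks * redo_per_rollback
    return executed * ms_per_iter / 3_600_000  # 3,600,000 ms per hour

ITERATIONS = 77947687  # from the log; every other number below is hypothetical

# Hypothetical smaller FFT: faster per iteration, but some work is redone.
small_fft_few = total_hours(ITERATIONS, 13.0, rollbacks=2, redo_per_rollback=1_000_000)
small_fft_many = total_hours(ITERATIONS, 13.0, rollbacks=24, redo_per_rollback=1_000_000)

# Hypothetical larger FFT: slower per iteration, no rollbacks at all.
large_fft = total_hours(ITERATIONS, 13.86)

for label, hours in [
    ("small FFT, 2 rollbacks", small_fft_few),
    ("small FFT, 24 rollbacks", small_fft_many),
    ("large FFT, no rollbacks", large_fft),
]:
    print(f"{label}: {hours:.1f} h")
```

With these made-up inputs the smaller FFT wins when rollbacks are rare and loses when they are frequent, which is the trade-off in a nutshell.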
|
|
|
|
|
|
#39 |
|
Einyen
Dec 2003
Denmark
6126₈ Posts |
Yes, there were 24 errors before I manually switched to 4200K FFT; it is very nice that it still works fine.
But it is still a bug in this version: it should either choose a larger FFT or disable the roundoff error messages. Full log since I switched to AVX-512 on that instance: 295b3.txt Code:
[Work thread Oct 21 09:15:59] Iteration: 45256282/77947687, Possible error: round off (0.4344111008) > 0.42188
[Work thread Oct 21 11:26:42] Iteration: 45857274/77947687, Possible error: round off (0.4226175989) > 0.42188
[Work thread Oct 21 14:56:41] Iteration: 46817196/77947687, Possible error: round off (0.4234568715) > 0.42188
[Work thread Oct 21 16:00:42] Iteration: 47110891/77947687, Possible error: round off (0.4270896037) > 0.42188
[Work thread Oct 21 19:56:46] Iteration: 48197311/77947687, Possible error: round off (0.428642894) > 0.42188
[Work thread Oct 21 20:17:11] Iteration: 48291440/77947687, Possible error: round off (0.429588787) > 0.42188
[Work thread Oct 21 22:09:44] Iteration: 48809957/77947687, Possible error: round off (0.430043636) > 0.42188
[Work thread Oct 21 23:16:45] Iteration: 49117544/77947687, Possible error: round off (0.4256841092) > 0.42188
[Work thread Oct 22 04:15:09] Iteration: 50490754/77947687, Possible error: round off (0.4303004887) > 0.42188
[Work thread Oct 22 09:16:28] Iteration: 51847612/77947687, Possible error: round off (0.4338173701) > 0.42188
[Work thread Oct 22 09:55:51] Iteration: 52027255/77947687, Possible error: round off (0.4343059627) > 0.42188
[Work thread Oct 22 09:58:39] Iteration: 52040154/77947687, Possible error: round off (0.454604666) > 0.42188
[Work thread Oct 22 12:18:37] Iteration: 52683329/77947687, Possible error: round off (0.4277743109) > 0.42188
[Work thread Oct 22 13:44:48] Iteration: 53078067/77947687, Possible error: round off (0.4273455066) > 0.42188
[Work thread Oct 22 16:58:38] Iteration: 53968523/77947687, Possible error: round off (0.4307389637) > 0.42188
[Work thread Oct 22 19:02:21] Iteration: 54535700/77947687, Possible error: round off (0.4228008512) > 0.42188
[Work thread Oct 22 22:48:33] Iteration: 55573616/77947687, Possible error: round off (0.4268757585) > 0.42188
[Work thread Oct 23 01:15:29] Iteration: 56247746/77947687, Possible error: round off (0.4692008444) > 0.42188
[Work thread Oct 23 03:08:10] Iteration: 56738296/77947687, Possible error: round off (0.4224648114) > 0.42188
[Work thread Oct 23 04:05:36] Iteration: 57000989/77947687, Possible error: round off (0.4358626009) > 0.42188
[Work thread Oct 23 04:27:30] Iteration: 57101702/77947687, Possible error: round off (0.425993915) > 0.42188
[Work thread Oct 23 09:57:31] Iteration: 58617472/77947687, Possible error: round off (0.4493439792) > 0.42188
[Work thread Oct 23 11:18:36] Iteration: 58990146/77947687, Possible error: round off (0.4567632981) > 0.42188
[Work thread Oct 23 15:07:51] Iteration: 60041159/77947687, Possible error: round off (0.4326492791) > 0.42188
Last fiddled with by ATH on 2018-10-27 at 13:50 |
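For anyone wanting to tally warnings like these from a saved log, a small parser might look like the sketch below. It assumes the message format is exactly as shown in the log in this post; the function name `parse_roundoff_warnings` and the two-line sample are chosen for illustration and are not part of Prime95.

```python
import re

# Matches the warning lines in the form shown in the log in this post.
ROUNDOFF_RE = re.compile(
    r"Iteration: (\d+)/(\d+), Possible error: round off \(([0-9.]+)\) > ([0-9.]+)"
)

def parse_roundoff_warnings(log_text):
    """Return a list of (iteration, roundoff, threshold) tuples found in log_text."""
    warnings = []
    for match in ROUNDOFF_RE.finditer(log_text):
        iteration = int(match.group(1))
        roundoff = float(match.group(3))
        threshold = float(match.group(4))
        warnings.append((iteration, roundoff, threshold))
    return warnings

# Two lines copied from the log, used as sample input.
sample = (
    "[Work thread Oct 21 09:15:59] Iteration: 45256282/77947687, "
    "Possible error: round off (0.4344111008) > 0.42188\n"
    "[Work thread Oct 23 01:15:29] Iteration: 56247746/77947687, "
    "Possible error: round off (0.4692008444) > 0.42188\n"
)

found = parse_roundoff_warnings(sample)
worst = max(found, key=lambda w: w[1])
print(len(found), "warnings; worst roundoff", worst[1], "at iteration", worst[0])
```

Because the pattern does not rely on line boundaries, it also works on a log that has been pasted as one long line.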
|
|
|
|
|
#40 | |
|
"Robert Gerbicz"
Oct 2005
Hungary
2·743 Posts |
Quote:
Code:
ERROR: Comparing Gerbicz checksum values failed. Rolling back to iteration ...
The number of roundoff errors alone doesn't matter; I think that with larger p we could see even more errors. What matters is how many times you need to roll back, because that increases the additional overhead (0.2%) of my check. And of course these numbers are related (the expected number of roundoff errors and the number of rollbacks for a given p and FFT), so they are not independent.
Last fiddled with by R. Gerbicz on 2018-10-27 at 14:42 Reason: more info, typo |
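To make the rollback cost concrete, here is a toy Python sketch of the idea behind a checksum-and-rollback scheme for repeated squaring. It is an illustration under stated assumptions, not Prime95's code: the function name, the tiny block size `L=4`, the `fault_at` test hook, and the modulus 2^61 - 1 are all chosen for the example. It relies on the identity that if u(i) = base^(2^i) mod n and d is the running product of u(0), u(L), u(2L), ..., then the updated d must equal u(0) times the previous d raised to the power 2^L (mod n).

```python
def checked_repeated_squaring(n, iterations, base=3, L=4, fault_at=None):
    """Compute base^(2^iterations) mod n, verifying every L*L iterations.

    fault_at is a hypothetical test hook: the iteration at which one squaring
    result is corrupted once, to simulate a hardware error.
    Returns (result, number_of_rollbacks).
    """
    block = L * L
    if iterations % block != 0:
        raise ValueError("this sketch assumes iterations is a multiple of L*L")

    u = base % n        # u(i) = base^(2^i) mod n, starting at i = 0
    d = u               # running product of u(0), u(L), u(2L), ...
    prev_d = d          # value of d before its most recent update
    i = 0
    saved = (i, u, d)   # last state that passed the check
    rollbacks = 0

    while i < iterations:
        u = u * u % n
        i += 1
        if i == fault_at:
            u = (u + 1) % n   # inject a single error
            fault_at = None   # only once, so the retry succeeds
        if i % L == 0:
            prev_d = d
            d = d * u % n
        if i % block == 0:
            # The new d must equal u(0) * (old d)^(2^L) mod n.
            expected = base * pow(prev_d, 2 ** L, n) % n
            if d == expected:
                saved = (i, u, d)
            else:
                rollbacks += 1
                i, u, d = saved   # redo everything since the last good state
    return u, rollbacks

n = 2 ** 61 - 1
clean, clean_rollbacks = checked_repeated_squaring(n, 64)
faulty, faulty_rollbacks = checked_repeated_squaring(n, 64, fault_at=21)
print(clean == faulty, clean_rollbacks, faulty_rollbacks)
```

In this toy form a check costs L extra squarings every L*L iterations plus one extra multiplication every L iterations, so the fixed overhead shrinks as L grows, while each rollback repeats up to L*L iterations. That is why the number of rollbacks, rather than the raw count of roundoff warnings, drives the extra cost.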
|
|
|
|
|
|
#41 | |
|
P90 years forever!
Aug 2002
Yeehaw, FL
5·11·137 Posts |
Quote:
@Gerbicz: It is important for me to get the FFT crossovers right as the gwnum FFT routines are used for LL, LLR, PFGW etc. where Gerbicz error checking is not used. I could (should?) change prime95 to not even look for roundoff errors during a Gerbicz PRP test, esp. since calculating the roundoff error is not free. Last fiddled with by Prime95 on 2018-10-27 at 16:32 |
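As a rough picture of why the roundoff check is not free: assuming the usual meaning of roundoff error in floating-point FFT multiplication (every output should be an integer, and the error is how far the computed value lands from the nearest one), measuring it takes an extra pass over all outputs. The sketch below is illustrative only and is not gwnum code; the function names are hypothetical, and the limit 0.42188 is the threshold printed in the log messages earlier in this thread.

```python
ROUNDOFF_LIMIT = 0.42188  # threshold value printed in the log messages in this thread

def max_roundoff(values):
    """Largest distance between any value and its nearest integer."""
    return max(abs(v - round(v)) for v in values)

def check_roundoff(values, limit=ROUNDOFF_LIMIT):
    """Return (worst_error, is_suspicious) for one batch of outputs."""
    worst = max_roundoff(values)
    return worst, worst > limit

# Made-up sample outputs: close to integers, so not suspicious.
ok_worst, ok_flag = check_roundoff([12.0, 7.03125, -3.96875])

# A value nearly halfway between two integers is flagged.
bad_worst, bad_flag = check_roundoff([5.4375])

print(ok_worst, ok_flag, bad_worst, bad_flag)
```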
|
|
|
|
|
|
#42 | |
|
"Robert Gerbicz"
Oct 2005
Hungary
2716₈ Posts |
Quote:
|
|
|
|
|
|
|
#43 |
|
Dec 2011
After milion nines:)
1,451 Posts |
@Prime95: if you remove those roundoff error checks from the code, will that affect all PRP testing or only PRP testing in base 2 (since the Gerbicz error check is only for base 2)?
I use Prime95 for CRUS searching as well as for my personal search for primes, in base 2 but also in other bases. |
|
|
|
|
|
#44 | |
|
"Robert Gerbicz"
Oct 2005
Hungary
1486₁₀ Posts |
Quote:
|
|
|
|
|