|
|
#1 |
|
Jul 2005
266₈ Posts |
Hi,
I'm using the 29.1.8 pre-release for TF and noticed an issue when it writes the job state to disk.

[Work thread Jan 30 19:34] Trial factoring M748000307 to 2^76 is 62.28% complete. Time: 1364.301 sec.
[Work thread Jan 30 19:57] Trial factoring M748000307 to 2^76 is 70.06% complete. Time: 1363.816 sec.
[Work thread Jan 30 20:19] Trial factoring M748000307 to 2^76 is 77.85% complete. Time: 1364.220 sec.
[Work thread Jan 30 20:42] Trial factoring M748000307 to 2^76 is 85.64% complete. Time: 1364.398 sec.
[Main thread Jan 30 21:04] Stopping all worker threads.
[Work thread Jan 30 21:04] Error writing intermediate file: f748000307
[Work thread Jan 30 21:04] Worker stopped.
[Main thread Jan 30 21:04] Execution halted.
[Main thread Jan 30 21:04] Choose Test/Continue to restart.
### here I killed mprime with SIGTERM and started again
[Main thread Jan 30 21:08] Mersenne number primality test program version 29.1
[Main thread Jan 30 21:08] Optimizing for CPU architecture: Core i3/i5/i7, L2 cache size: 256 KB, L3 cache size: 6 MB
[Main thread Jan 30 21:08] Starting worker.
[Work thread Jan 30 21:08] Worker starting
[Work thread Jan 30 21:08] Setting affinity to run worker on CPU core #1
[Work thread Jan 30 21:08] Setting affinity to run helper thread 2 on CPU core #3
[Work thread Jan 30 21:08] Setting affinity to run helper thread 1 on CPU core #2
[Work thread Jan 30 21:08] Resuming trial factoring of M748000307 to 2^77
[Work thread Jan 30 21:08] Trial factoring M748000307 to 2^76 is 70.50% complete.

So it lost about one hour of work. In the working directory there is now a ".write" file (I have never seen one before): Code:
-rw-r--r-- 1 rudi users 80 2017-01-30 19:58 f748000307
-rw-r--r-- 1 rudi users 80 2017-01-30 17:58 f748000307.bu
-rw-r--r-- 1 rudi users 0 2017-01-30 21:04 f748000307.write

Last fiddled with by rudi_m on 2017-01-30 at 20:18 |
|
|
|
|
|
#2 |
|
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
9476₁₀ Posts |
The .write files are always created. You just don't normally see them, because everything that follows usually happens very fast. In sequence: .bu2 gets deleted, .bu gets renamed to .bu2, the "p*" (or "f*", or "m*") file gets renamed to .bu, and then .write gets renamed to the "p*" (or "f*", or "m*") file.
Was the disk full, or did the anti-virus act up, or something like that? |
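The rotation described above can be sketched in Python. This is only an illustration of the rename order, not mprime's actual code: the function name `rotate_save` is hypothetical, and the `fsync` call is an assumption added for the example.

```python
import os
import tempfile
from pathlib import Path


def rotate_save(base: Path, data: bytes) -> None:
    """Save `data` as `base`, keeping two older generations (.bu, .bu2).

    Hypothetical helper: it mirrors the order of operations described
    in the post, nothing more.
    """
    write_file = base.with_name(base.name + ".write")
    bu = base.with_name(base.name + ".bu")
    bu2 = base.with_name(base.name + ".bu2")

    # Write the new state to the temporary ".write" file first.
    # (Flushing to disk with fsync is an assumption of this sketch.)
    with open(write_file, "wb") as fh:
        fh.write(data)
        fh.flush()
        os.fsync(fh.fileno())

    # 1. .bu2 gets deleted (if it exists). Needs Python 3.8+.
    bu2.unlink(missing_ok=True)
    # 2. .bu gets renamed to .bu2.
    if bu.exists():
        os.replace(bu, bu2)
    # 3. The current save file gets renamed to .bu.
    if base.exists():
        os.replace(base, bu)
    # 4. .write gets renamed to the save file.
    os.replace(write_file, base)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        save = Path(tmp) / "f748000307"
        for state in (b"state-1", b"state-2", b"state-3"):
            rotate_save(save, state)
        # After three saves all three generations exist and no .write is left.
        print(sorted(p.name for p in Path(tmp).iterdir()))
```

In this sketch, a failure during the initial write would leave a `.write` file behind with the older save files untouched, which would be consistent with the zero-byte `f748000307.write` in the listing from the first post.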
|
|
|
|
|
#3 | |
|
Jul 2005
182₁₀ Posts |
I've also noticed another issue. The log below shows a crash that happened right after a TF job finished and the worker moved on to an LL job: Code:
[...]
[Work thread Jan 30 21:45] Trial factoring M776000053 to 2^77 is 81.06% complete. Time: 1471.301 sec.
[Comm thread Jan 30 22:09] Updating computer information on the server
[Comm thread Jan 30 22:09] Sending expected completion date for M776000053: Jan 31 2017
[Comm thread Jan 30 22:09] Sending expected completion date for M39320129: Feb 2 2017
[Comm thread Jan 30 22:09] Sending expected completion date for M906000041: Feb 2 2017
[Comm thread Jan 30 22:09] Sending expected completion date for M907000063: Feb 3 2017
[Comm thread Jan 30 22:09] Sending expected completion date for M908000053: Feb 3 2017
[Comm thread Jan 30 22:09] Sending expected completion date for M909000061: Feb 4 2017
[Comm thread Jan 30 22:09] Done communicating with server.
[Work thread Jan 30 22:10] Trial factoring M776000053 to 2^77 is 85.10% complete. Time: 1476.194 sec.
[Work thread Jan 30 22:34] Trial factoring M776000053 to 2^77 is 89.14% complete. Time: 1468.951 sec.
[Work thread Jan 30 22:59] Trial factoring M776000053 to 2^77 is 93.18% complete. Time: 1469.562 sec.
[Work thread Jan 30 23:24] Trial factoring M776000053 to 2^77 is 97.22% complete. Time: 1513.716 sec.
[Work thread Jan 30 23:41] M776000053 no factor from 2^76 to 2^77, Wg8: 17ED2BFC
[Comm thread Jan 30 23:41] Sending result to server: UID: rudimeier/lakshmi, M776000053 no factor from 2^76 to 2^77, Wg8: 17ED2BFC, AID: FDFC9C4E4763840E571DB046AC77D795
[Comm thread Jan 30 23:41]
[Work thread Jan 30 23:41] Setting affinity to run helper thread 2 on CPU core #3
[Work thread Jan 30 23:41] Setting affinity to run helper thread 1 on CPU core #2
[Work thread Jan 30 23:41] Starting primality test of M39320129 using FMA3 FFT length 2M, Pass1=512, Pass2=4K, 3 threads
[Comm thread Jan 30 23:41] PrimeNet success code with additional info:
[Comm thread Jan 30 23:41] CPU credit is 19.7219 GHz-days.
[Comm thread Jan 30 23:41] Done communicating with server.
[Work thread Jan 30 23:43] Iteration: 36117/39320129, Possible error: round off (7.220114937e+22) > 0.40625
[Work thread Jan 30 23:43] Continuing from last save file.
*** Error in `./mprime': free(): invalid pointer: 0x00007f6e5c0123f0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7364f)[0x7f6e653ed64f]
/lib64/libc.so.6(+0x78eae)[0x7f6e653f2eae]
/lib64/libc.so.6(+0x79b87)[0x7f6e653f3b87]
./mprime[0x44dc49]
./mprime[0x44dd3e]
./mprime[0x436182]
./mprime[0x439155]
./mprime[0x439375]
./mprime[0x4394e8]
./mprime[0x463efa]
/lib64/libpthread.so.0(+0x80db)[0x7f6e65e440db]
/lib64/libc.so.6(clone+0x6d)[0x7f6e65460e3d]
======= Memory map: ========
00400000-0263a000 r-xp 00000000 00:26 19924973 /home/rudi/MPrime/mprime-29.1.8/mprime
0283a000-0284c000 rwxp 0223a000 00:26 19924973 /home/rudi/MPrime/mprime-29.1.8/mprime
0284c000-02871000 rwxp 00000000 00:00 0
02af7000-02b18000 rwxp 00000000 00:00 0 [heap]
7f6e4afc3000-7f6e4c000000 rwxp 00000000 00:00 0
7f6e4c000000-7f6e4c021000 rwxp 00000000 00:00 0
7f6e4c021000-7f6e50000000 ---p 00000000 00:00 0
7f6e50000000-7f6e50086000 rwxp 00000000 00:00 0
7f6e50086000-7f6e54000000 ---p 00000000 00:00 0
7f6e54000000-7f6e54021000 rwxp 00000000 00:00 0
7f6e54021000-7f6e58000000 ---p 00000000 00:00 0
7f6e58000000-7f6e58024000 rwxp 00000000 00:00 0
7f6e58024000-7f6e5c000000 ---p 00000000 00:00 0
7f6e5c000000-7f6e5c05f000 rwxp 00000000 00:00 0
7f6e5c05f000-7f6e60000000 ---p 00000000 00:00 0
7f6e606dd000-7f6e61b34000 rwxp 00000000 00:00 0
7f6e61b34000-7f6e61b48000 r-xp 00000000 fe:02 1452658 /lib64/libresolv-2.18.so
7f6e61b48000-7f6e61d47000 ---p 00014000 fe:02 1452658 /lib64/libresolv-2.18.so
7f6e61d47000-7f6e61d48000 r-xp 00013000 fe:02 1452658 /lib64/libresolv-2.18.so
7f6e61d48000-7f6e61d49000 rwxp 00014000 fe:02 1452658 /lib64/libresolv-2.18.so
7f6e61d49000-7f6e61d4b000 rwxp 00000000 00:00 0
7f6e61d4b000-7f6e61d50000 r-xp 00000000 fe:02 1441945 /lib64/libnss_dns-2.18.so
7f6e61d50000-7f6e61f4f000 ---p 00005000 fe:02 1441945 /lib64/libnss_dns-2.18.so
7f6e61f4f000-7f6e61f50000 r-xp 00004000 fe:02 1441945 /lib64/libnss_dns-2.18.so
7f6e61f50000-7f6e61f51000 rwxp 00005000 fe:02 1441945 /lib64/libnss_dns-2.18.so
7f6e61f51000-7f6e61f5c000 r-xp 00000000 fe:02 1452653 /lib64/libnss_files-2.18.so
7f6e61f5c000-7f6e6215b000 ---p 0000b000 fe:02 1452653 /lib64/libnss_files-2.18.so
7f6e6215b000-7f6e6215c000 r-xp 0000a000 fe:02 1452653 /lib64/libnss_files-2.18.so
7f6e6215c000-7f6e6215d000 rwxp 0000b000 fe:02 1452653 /lib64/libnss_files-2.18.so
7f6e6215d000-7f6e6215e000 ---p 00000000 00:00 0
7f6e6215e000-7f6e6295e000 rwxp 00000000 00:00 0
7f6e6295e000-7f6e6295f000 ---p 00000000 00:00 0
7f6e6295f000-7f6e6315f000 rwxp 00000000 00:00 0
7f6e6315f000-7f6e63160000 ---p 00000000 00:00 0
7f6e63160000-7f6e63960000 rwxp 00000000 00:00 0
7f6e63960000-7f6e63961000 ---p 00000000 00:00 0
7f6e63961000-7f6e64161000 rwxp 00000000 00:00 0
7f6e64161000-7f6e64162000 ---p 00000000 00:00 0
7f6e64162000-7f6e64962000 rwxp 00000000 00:00 0
7f6e64962000-7f6e64963000 ---p 00000000 00:00 0
7f6e64963000-7f6e65163000 rwxp 00000000 00:00 0
7f6e65163000-7f6e65179000 r-xp 00000000 fe:02 1441914 /lib64/libgcc_s.so.1
7f6e65179000-7f6e65378000 ---p 00016000 fe:02 1441914 /lib64/libgcc_s.so.1
7f6e65378000-7f6e65379000 r-xp 00015000 fe:02 1441914 /lib64/libgcc_s.so.1
7f6e65379000-7f6e6537a000 rwxp 00016000 fe:02 1441914 /lib64/libgcc_s.so.1
7f6e6537a000-7f6e6551e000 r-xp 00000000 fe:02 1441863 /lib64/libc-2.18.so
7f6e6551e000-7f6e6571e000 ---p 001a4000 fe:02 1441863 /lib64/libc-2.18.so
7f6e6571e000-7f6e65722000 r-xp 001a4000 fe:02 1441863 /lib64/libc-2.18.so
7f6e65722000-7f6e65724000 rwxp 001a8000 fe:02 1441863 /lib64/libc-2.18.so
7f6e65724000-7f6e65728000 rwxp 00000000 00:00 0
7f6e65728000-7f6e6572b000 r-xp 00000000 fe:02 1441967 /lib64/libdl-2.18.so
7f6e6572b000-7f6e6592a000 ---p 00003000 fe:02 1441967 /lib64/libdl-2.18.so
7f6e6592a000-7f6e6592b000 r-xp 00002000 fe:02 1441967 /lib64/libdl-2.18.so
7f6e6592b000-7f6e6592c000 rwxp 00003000 fe:02 1441967 /lib64/libdl-2.18.so
7f6e6592c000-7f6e65a16000 r-xp 00000000 fe:02 22008 /usr/lib64/libstdc++.so.6.0.18
7f6e65a16000-7f6e65c15000 ---p 000ea000 fe:02 22008 /usr/lib64/libstdc++.so.6.0.18
7f6e65c15000-7f6e65c1d000 r-xp 000e9000 fe:02 22008 /usr/lib64/libstdc++.so.6.0.18
7f6e65c1d000-7f6e65c1f000 rwxp 000f1000 fe:02 22008 /usr/lib64/libstdc++.so.6.0.18
7f6e65c1f000-7f6e65c34000 rwxp 00000000 00:00 0
7f6e65c34000-7f6e65c3b000 r-xp 00000000 fe:02 1452659 /lib64/librt-2.18.so
7f6e65c3b000-7f6e65e3a000 ---p 00007000 fe:02 1452659 /lib64/librt-2.18.so
7f6e65e3a000-7f6e65e3b000 r-xp 00006000 fe:02 1452659 /lib64/librt-2.18.so
7f6e65e3b000-7f6e65e3c000 rwxp 00007000 fe:02 1452659 /lib64/librt-2.18.so
7f6e65e3c000-7f6e65e54000 r-xp 00000000 fe:02 1452649 /lib64/libpthread-2.18.so
7f6e65e54000-7f6e66054000 ---p 00018000 fe:02 1452649 /lib64/libpthread-2.18.so
7f6e66054000-7f6e66055000 r-xp 00018000 fe:02 1452649 /lib64/libpthread-2.18.so
7f6e66055000-7f6e66056000 rwxp 00019000 fe:02 1452649 /lib64/libpthread-2.18.so
7f6e66056000-7f6e6605a000 rwxp 00000000 00:00 0
7f6e6605a000-7f6e6615c000 r-xp 00000000 fe:02 1441968 /lib64/libm-2.18.so
7f6e6615c000-7f6e6635b000 ---p 00102000 fe:02 1441968 /lib64/libm-2.18.so
7f6e6635b000-7f6e6635c000 r-xp 00101000 fe:02 1441968 /lib64/libm-2.18.so
7f6e6635c000-7f6e6635d000 rwxp 00102000 fe:02 1441968 /lib64/libm-2.18.so
7f6e6635d000-7f6e6637d000 r-xp 00000000 fe:02 1452665 /lib64/ld-2.18.so
7f6e6653e000-7f6e66545000 rwxp 00000000 00:00 0
7f6e6657a000-7f6e6657c000 rwxp 00000000 00:00 0
7f6e6657c000-7f6e6657d000 r-xp 0001f000 fe:02 1452665 /lib64/ld-2.18.so
7f6e6657d000-7f6e6657e000 rwxp 00020000 fe:02 1452665 /lib64/ld-2.18.so
7f6e6657e000-7f6e6657f000 rwxp 00000000 00:00 0
7ffe9750d000-7ffe9752e000 rwxp 00000000 00:00 0 [stack]
7ffe975a5000-7ffe975a8000 r--p 00000000 00:00 0 [vvar]
7ffe975a8000-7ffe975aa000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

And gdb backtrace: Code:
$ gdb ./mprime core-mprime.1000.lakshmi.1485816220.16652.dump
[...]
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f6e653af4c9 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007f6e653af4c9 in raise () from /lib64/libc.so.6
#1 0x00007f6e653b0958 in abort () from /lib64/libc.so.6
#2 0x00007f6e653ed654 in __libc_message () from /lib64/libc.so.6
#3 0x00007f6e653f2eae in malloc_printerr () from /lib64/libc.so.6
#4 0x00007f6e653f3b87 in _int_free () from /lib64/libc.so.6
#5 0x000000000044dc49 in multithread_term ()
#6 0x000000000044dd3e in gwdone ()
#7 0x0000000000436182 in prime ()
#8 0x0000000000439155 in primeContinue ()
#9 0x0000000000439375 in LauncherDispatch ()
#10 0x00000000004394e8 in Launcher ()
#11 0x0000000000463efa in ThreadStarter ()
#12 0x00007f6e65e440db in start_thread () from /lib64/libpthread.so.0
#13 0x00007f6e65460e3d in clone () from /lib64/libc.so.6
(gdb)
|
|
|
|
|