
mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Software (https://www.mersenneforum.org/forumdisplay.php?f=10)
-   -   SkylakeX teasers (aka prime95 29.5) (https://www.mersenneforum.org/showthread.php?t=23723)

Prime95 2018-10-20 05:05

SkylakeX teasers (aka prime95 29.5)
 
If you own a Skylake-X machine (or a Xeon supporting AVX-512) and you are feeling brave then Prime95 version 29.5 build 2 supporting AVX-512 is available.

This is not heavily tested but ought to work. Doing a double-check or Gerbicz PRP might be wise. Let me know if you find bugs or it is not an improvement over 29.4.

Download link:

____________original d/l link removed_______________


Linux, Windows 64-bit: [COLOR=Red] EDIT: new download links in post #[URL="https://www.mersenneforum.org/showpost.php?p=506535&postcount=163"]163[/URL][/COLOR]

mackerel 2018-10-20 07:52

I'd like to run benchmarks with it once a Windows version is available.

ET_ 2018-10-20 08:39

I'd like to run benchmarks with it once a 7820X is available to me. :smile:

ET_ 2018-10-20 11:46

[QUOTE=Prime95;498324]If you own a Skylake-X machine (or a Xeon supporting AVX-512) and you are feeling brave then Prime95 version 29.5 build 2 supporting AVX-512 is available.

This is not heavily tested but ought to work. Doing a double-check or Gerbicz PRP might be wise. Let me know if you find bugs or it is not an improvement over 29.4.

Download link:
Linux 64-bit: [URL]ftp://mersenne.org/gimps/p95v295b2.linux64.tar.gz[/URL]

If there is enough interest, I'll make a Windows build.[/QUOTE]

Is it safe to stop a run with v29.4 and restart it with 29.5, or is the savefile not compatible?

[COLOR="Red"]EDIT:[/COLOR] Yes, it is. It restarted after a Jacobi test. Running a test on AWS.

Chuck 2018-10-20 14:25

I'd like a Windows version to try.

GP2 2018-10-20 14:59

It crashes in the benchmark:

[CODE]
Benchmark type (0 = Throughput, 1 = FFT timings, 2 = Trial factoring) (0):

FFTs to benchmark
Minimum FFT size (in K) (2048):
Maximum FFT size (in K) (8192):
Benchmark with round-off checking enabled (N):
Benchmark all-complex FFTs (for LLR,PFGW,PRP users) (N):
Limit FFT sizes (mimic older benchmarking code) (N):

CPU cores to benchmark
Benchmark hyperthreading (Y):

Throughput benchmark options
Benchmark all FFT implementations to find best one for your machine (N):
Time to run each benchmark (in seconds) (15):

Accept the answers above? (Y):

....

[Work thread Oct 20 14:43] Timing 4800K FFT, 1 core, 1 worker. Average times: 21.78 ms. Total throughput: 45.92 iter/sec.
[Work thread Oct 20 14:43] Timing 4800K FFT, 1 core hyperthreaded, 1 worker. Average times: 26.67 ms. Total throughput: 37.50 iter/sec.
free(): invalid pointer
Aborted
[/CODE]

Second run crashed in a different place:

[CODE]
[Work thread Oct 20 14:55] Timing 3000K FFT, 1 core, 1 worker. Average times: 11.69 ms. Total throughput: 85.56 iter/sec.
[Work thread Oct 20 14:56] Timing 3000K FFT, 1 core hyperthreaded, 1 worker. Average times: 13.79 ms. Total throughput: 72.49 iter/sec.
free(): invalid pointer
Aborted
[/CODE]


[CODE]
model name : Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
stepping : 4
microcode : 0x2000043
cpu MHz : 3411.502
cache size : 25344 KB
[/CODE]

ATH 2018-10-20 19:14

More crashes in the Throughput benchmark on EC2 c5d.large.

0: Throughput
1K - 8192K
roundoff: No
All-complex: No
Limit FFT sizes: No
hyperthreading: Y
all FFT: N
15 sec

After the 12K FFT it fails with "invalid next size", no matter whether the range is 1K-8192K or 10K-20K:
[QUOTE]*** Error in `./mprime': free(): invalid next size (normal): 0x00007f92200505e0 ***[/QUOTE]

At 4800K and 5760K it fails several times when benchmarking from lower sizes, but not when starting from those sizes going up.

It failed once at 46080K but not the second time.

[CODE]Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Xeon(R)
CPU speed: 2999.97 MHz, with hyperthreading
CPU features: Prefetchw, SSE, SSE2, SSE4, AVX, AVX2, FMA, AVX512F
L1 cache size: 32 KB
L2 cache size: 256 KB, L3 cache size: 25344 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Machine topology as determined by hwloc library:
Machine#0 (local=3803772KB, total=3803772KB, DMIProductName=c5d.large, DMIProductVersion=, DMIBoardVendor="Amazon EC2", DMIBoardName=, DMIBoardVersion=, DMIBoardAssetTag=i-083f59d8339403d6c, DMIChassisVendor="Amazon EC2", DMIChassisType=1, DMIChassisVersion=, DMIChassisAssetTag="Amazon EC2", DMIBIOSVendor="Amazon EC2", DMIBIOSVersion=1.0, DMIBIOSDate=10/16/2017, DMISysVendor="Amazon EC2", Backend=Linux, OSName=Linux, OSRelease=4.9.76-3.78.amzn1.x86_64, OSVersion="#1 SMP Fri Jan 12 19:51:35 UTC 2018", HostName=ip-172-31-35-125, Architecture=x86_64, hwlocVersion=1.11.10, ProcessName=mprime)
Package#0 (CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=85, CPUModel="Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz", CPUStepping=4)
L3 (size=25344KB, linesize=64, ways=11, Inclusive=0)
L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
Core#0 (cpuset: 0x00000003)
PU#0 (cpuset: 0x00000001)
PU#1 (cpuset: 0x00000002)





[Work thread Oct 20 16:49] Timing 12K FFT, 1 core, 1 worker. Average times: 0.02 ms. Total throughput: 40185.95 iter/sec.
[Work thread Oct 20 16:49] Timing 12K FFT, 1 core hyperthreaded, 1 worker. Average times: 0.04 ms. Total throughput: 23843.36 iter/sec.
*** Error in `./mprime': free(): invalid next size (normal): 0x00007f92200505e0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81499)[0x7f9228ae2499]
./mprime[0x45ecf6]
./mprime[0x4409fc]
./mprime[0x440f9e]
./mprime[0x444bb0]
./mprime[0x444cf2]
./mprime[0x448d54]
./mprime[0x47cbca]
/lib64/libpthread.so.0(+0x7de5)[0x7f92294b7de5]
/lib64/libc.so.6(clone+0x6d)[0x7f9228b5fbad]
======= Memory map: ========
00400000-026a0000 r-xp 00000000 00:15 5878764499047460622 /mnt-efs/z/p95v295b2/mprime
0289f000-028a1000 r-xp 0229f000 00:15 5878764499047460622 /mnt-efs/z/p95v295b2/mprime
028a1000-028dc000 rwxp 022a1000 00:15 5878764499047460622 /mnt-efs/z/p95v295b2/mprime
028dc000-02903000 rwxp 00000000 00:00 0
02ffa000-0301b000 rwxp 00000000 00:00 0 [heap]
7f9218000000-7f921804b000 rwxp 00000000 00:00 0
7f921804b000-7f921c000000 ---p 00000000 00:00 0
7f921c000000-7f921c021000 rwxp 00000000 00:00 0
7f921c021000-7f9220000000 ---p 00000000 00:00 0
7f9220000000-7f9220076000 rwxp 00000000 00:00 0
7f9220076000-7f9224000000 ---p 00000000 00:00 0
7f9226847000-7f922685d000 r-xp 00000000 103:02 2617 /lib64/libgcc_s-7-20170915.so.1
7f922685d000-7f9226a5c000 ---p 00016000 103:02 2617 /lib64/libgcc_s-7-20170915.so.1
7f9226a5c000-7f9226a5d000 rwxp 00015000 103:02 2617 /lib64/libgcc_s-7-20170915.so.1
7f9226a5d000-7f9226a5e000 ---p 00000000 00:00 0
7f9226a5e000-7f922725e000 rwxp 00000000 00:00 0
7f922725e000-7f922725f000 ---p 00000000 00:00 0
7f922725f000-7f9227a5f000 rwxp 00000000 00:00 0
7f9227a5f000-7f9227a60000 ---p 00000000 00:00 0
7f9227a60000-7f9228260000 rwxp 00000000 00:00 0
7f9228260000-7f9228261000 ---p 00000000 00:00 0
7f9228261000-7f9228a61000 rwxp 00000000 00:00 0
7f9228a61000-7f9228c24000 r-xp 00000000 103:02 2690 /lib64/libc-2.17.so
7f9228c24000-7f9228e23000 ---p 001c3000 103:02 2690 /lib64/libc-2.17.so
7f9228e23000-7f9228e27000 r-xp 001c2000 103:02 2690 /lib64/libc-2.17.so
7f9228e27000-7f9228e29000 rwxp 001c6000 103:02 2690 /lib64/libc-2.17.so
7f9228e29000-7f9228e2e000 rwxp 00000000 00:00 0
7f9228e2e000-7f9228ea3000 r-xp 00000000 103:02 3438 /usr/lib64/libgmp.so.10.2.0
7f9228ea3000-7f92290a2000 ---p 00075000 103:02 3438 /usr/lib64/libgmp.so.10.2.0
7f92290a2000-7f92290a4000 rwxp 00074000 103:02 3438 /usr/lib64/libgmp.so.10.2.0
7f92290a4000-7f92290a6000 r-xp 00000000 103:02 2696 /lib64/libdl-2.17.so
7f92290a6000-7f92292a6000 ---p 00002000 103:02 2696 /lib64/libdl-2.17.so
7f92292a6000-7f92292a7000 r-xp 00002000 103:02 2696 /lib64/libdl-2.17.so
7f92292a7000-7f92292a8000 rwxp 00003000 103:02 2696 /lib64/libdl-2.17.so
7f92292a8000-7f92292af000 r-xp 00000000 103:02 2720 /lib64/librt-2.17.so
7f92292af000-7f92294ae000 ---p 00007000 103:02 2720 /lib64/librt-2.17.so
7f92294ae000-7f92294af000 r-xp 00006000 103:02 2720 /lib64/librt-2.17.so
7f92294af000-7f92294b0000 rwxp 00007000 103:02 2720 /lib64/librt-2.17.so
7f92294b0000-7f92294c7000 r-xp 00000000 103:02 2716 /lib64/libpthread-2.17.so
7f92294c7000-7f92296c6000 ---p 00017000 103:02 2716 /lib64/libpthread-2.17.so
7f92296c6000-7f92296c7000 r-xp 00016000 103:02 2716 /lib64/libpthread-2.17.so
7f92296c7000-7f92296c8000 rwxp 00017000 103:02 2716 /lib64/libpthread-2.17.so
7f92296c8000-7f92296cc000 rwxp 00000000 00:00 0
7f92296cc000-7f92297cd000 r-xp 00000000 103:02 2698 /lib64/libm-2.17.so
7f92297cd000-7f92299cc000 ---p 00101000 103:02 2698 /lib64/libm-2.17.so
7f92299cc000-7f92299cd000 r-xp 00100000 103:02 2698 /lib64/libm-2.17.so
7f92299cd000-7f92299ce000 rwxp 00101000 103:02 2698 /lib64/libm-2.17.so
7f92299ce000-7f92299f0000 r-xp 00000000 103:02 2683 /lib64/ld-2.17.so
7f9229be4000-7f9229be9000 rwxp 00000000 00:00 0
7f9229bec000-7f9229bef000 rwxp 00000000 00:00 0
7f9229bef000-7f9229bf0000 r-xp 00021000 103:02 2683 /lib64/ld-2.17.so
7f9229bf0000-7f9229bf1000 rwxp 00022000 103:02 2683 /lib64/ld-2.17.so
7f9229bf1000-7f9229bf2000 rwxp 00000000 00:00 0
7ffce22f0000-7ffce2311000 rwxp 00000000 00:00 0 [stack]
7ffce23d6000-7ffce23d8000 r--p 00000000 00:00 0 [vvar]
7ffce23d8000-7ffce23da000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]






[Work thread Oct 20 17:51] Timing 4800K FFT, 1 core, 1 worker. Average times: 18.15 ms. Total throughput: 55.10 iter/sec.
[Work thread Oct 20 17:51] Timing 4800K FFT, 1 core hyperthreaded, 1 worker. Average times: 23.30 ms. Total throughput: 42.91 iter/sec.
*** Error in `./mprime': free(): invalid pointer: 0x00007f619843cbc0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81499)[0x7f61a0776499]
./mprime[0x4582e8]
./mprime[0x45ed65]
./mprime[0x440a3a]
./mprime[0x440f9e]
./mprime[0x444bb0]
./mprime[0x444cf2]
./mprime[0x448d54]
./mprime[0x47cbca]
/lib64/libpthread.so.0(+0x7de5)[0x7f61a114bde5]
/lib64/libc.so.6(clone+0x6d)[0x7f61a07f3bad]
======= Memory map: ========
00400000-026a0000 r-xp 00000000 00:15 5878764499047460622 /mnt-efs/z/p95v295b2/mprime
0289f000-028a1000 r-xp 0229f000 00:15 5878764499047460622 /mnt-efs/z/p95v295b2/mprime
028a1000-028dc000 rwxp 022a1000 00:15 5878764499047460622 /mnt-efs/z/p95v295b2/mprime
028dc000-02903000 rwxp 00000000 00:00 0
03e2d000-03e4e000 rwxp 00000000 00:00 0 [heap]
7f618fdea000-7f618fe00000 r-xp 00000000 103:02 2617 /lib64/libgcc_s-7-20170915.so.1
7f618fe00000-7f618ffff000 ---p 00016000 103:02 2617 /lib64/libgcc_s-7-20170915.so.1
7f618ffff000-7f6190000000 rwxp 00015000 103:02 2617 /lib64/libgcc_s-7-20170915.so.1
7f6190000000-7f6192743000 rwxp 00000000 00:00 0
7f6192743000-7f6194000000 ---p 00000000 00:00 0
7f6194000000-7f6194021000 rwxp 00000000 00:00 0
7f6194021000-7f6198000000 ---p 00000000 00:00 0
7f6198000000-7f619896e000 rwxp 00000000 00:00 0
7f619896e000-7f619c000000 ---p 00000000 00:00 0
7f619c158000-7f619e6f1000 rwxp 00000000 00:00 0
7f619e6f1000-7f619e6f2000 ---p 00000000 00:00 0
7f619e6f2000-7f619eef2000 rwxp 00000000 00:00 0
7f619eef2000-7f619eef3000 ---p 00000000 00:00 0
7f619eef3000-7f619f6f3000 rwxp 00000000 00:00 0
7f619f6f3000-7f619f6f4000 ---p 00000000 00:00 0
7f619f6f4000-7f619fef4000 rwxp 00000000 00:00 0
7f619fef4000-7f619fef5000 ---p 00000000 00:00 0
7f619fef5000-7f61a06f5000 rwxp 00000000 00:00 0
7f61a06f5000-7f61a08b8000 r-xp 00000000 103:02 2690 /lib64/libc-2.17.so
7f61a08b8000-7f61a0ab7000 ---p 001c3000 103:02 2690 /lib64/libc-2.17.so
7f61a0ab7000-7f61a0abb000 r-xp 001c2000 103:02 2690 /lib64/libc-2.17.so
7f61a0abb000-7f61a0abd000 rwxp 001c6000 103:02 2690 /lib64/libc-2.17.so
7f61a0abd000-7f61a0ac2000 rwxp 00000000 00:00 0
7f61a0ac2000-7f61a0b37000 r-xp 00000000 103:02 3438 /usr/lib64/libgmp.so.10.2.0
7f61a0b37000-7f61a0d36000 ---p 00075000 103:02 3438 /usr/lib64/libgmp.so.10.2.0
7f61a0d36000-7f61a0d38000 rwxp 00074000 103:02 3438 /usr/lib64/libgmp.so.10.2.0
7f61a0d38000-7f61a0d3a000 r-xp 00000000 103:02 2696 /lib64/libdl-2.17.so
7f61a0d3a000-7f61a0f3a000 ---p 00002000 103:02 2696 /lib64/libdl-2.17.so
7f61a0f3a000-7f61a0f3b000 r-xp 00002000 103:02 2696 /lib64/libdl-2.17.so
7f61a0f3b000-7f61a0f3c000 rwxp 00003000 103:02 2696 /lib64/libdl-2.17.so
7f61a0f3c000-7f61a0f43000 r-xp 00000000 103:02 2720 /lib64/librt-2.17.so
7f61a0f43000-7f61a1142000 ---p 00007000 103:02 2720 /lib64/librt-2.17.so
7f61a1142000-7f61a1143000 r-xp 00006000 103:02 2720 /lib64/librt-2.17.so
7f61a1143000-7f61a1144000 rwxp 00007000 103:02 2720 /lib64/librt-2.17.so
7f61a1144000-7f61a115b000 r-xp 00000000 103:02 2716 /lib64/libpthread-2.17.so
7f61a115b000-7f61a135a000 ---p 00017000 103:02 2716 /lib64/libpthread-2.17.so
7f61a135a000-7f61a135b000 r-xp 00016000 103:02 2716 /lib64/libpthread-2.17.so
7f61a135b000-7f61a135c000 rwxp 00017000 103:02 2716 /lib64/libpthread-2.17.so
7f61a135c000-7f61a1360000 rwxp 00000000 00:00 0
7f61a1360000-7f61a1461000 r-xp 00000000 103:02 2698 /lib64/libm-2.17.so
7f61a1461000-7f61a1660000 ---p 00101000 103:02 2698 /lib64/libm-2.17.so
7f61a1660000-7f61a1661000 r-xp 00100000 103:02 2698 /lib64/libm-2.17.so
7f61a1661000-7f61a1662000 rwxp 00101000 103:02 2698 /lib64/libm-2.17.so
7f61a1662000-7f61a1684000 r-xp 00000000 103:02 2683 /lib64/ld-2.17.so
7f61a1878000-7f61a187d000 rwxp 00000000 00:00 0
7f61a1880000-7f61a1883000 rwxp 00000000 00:00 0
7f61a1883000-7f61a1884000 r-xp 00021000 103:02 2683 /lib64/ld-2.17.so
7f61a1884000-7f61a1885000 rwxp 00022000 103:02 2683 /lib64/ld-2.17.so
7f61a1885000-7f61a1886000 rwxp 00000000 00:00 0
7ffd15a2b000-7ffd15a4c000 rwxp 00000000 00:00 0 [stack]
7ffd15ad3000-7ffd15ad5000 r--p 00000000 00:00 0 [vvar]
7ffd15ad5000-7ffd15ad7000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]





[Work thread Oct 20 18:17] Timing 5760K FFT, 1 core, 1 worker. Average times: 22.84 ms. Total throughput: 43.78 iter/sec.
[Work thread Oct 20 18:17] Timing 5760K FFT, 1 core hyperthreaded, 1 worker. Average times: 29.25 ms. Total throughput: 34.19 iter/sec.
*** Error in `./mprime': free(): invalid pointer: 0x00007f0f304febc0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81499)[0x7f0f37d4f499]
./mprime[0x4582e8]
./mprime[0x45ed65]
./mprime[0x440a3a]
./mprime[0x440f9e]
./mprime[0x444bb0]
./mprime[0x444cf2]
./mprime[0x448d54]
./mprime[0x47cbca]
/lib64/libpthread.so.0(+0x7de5)[0x7f0f38724de5]
/lib64/libc.so.6(clone+0x6d)[0x7f0f37dccbad]
======= Memory map: ========
00400000-026a0000 r-xp 00000000 00:15 5878764499047460622 /mnt-efs/z/p95v295b2/mprime
0289f000-028a1000 r-xp 0229f000 00:15 5878764499047460622 /mnt-efs/z/p95v295b2/mprime
028a1000-028dc000 rwxp 022a1000 00:15 5878764499047460622 /mnt-efs/z/p95v295b2/mprime
028dc000-02903000 rwxp 00000000 00:00 0
03431000-03452000 rwxp 00000000 00:00 0 [heap]
7f0f20000000-7f0f20021000 rwxp 00000000 00:00 0
7f0f20021000-7f0f24000000 ---p 00000000 00:00 0
7f0f28000000-7f0f28b4c000 rwxp 00000000 00:00 0
7f0f28b4c000-7f0f2c000000 ---p 00000000 00:00 0
7f0f2d2e7000-7f0f30000000 rwxp 00000000 00:00 0
7f0f30000000-7f0f305c4000 rwxp 00000000 00:00 0
7f0f305c4000-7f0f34000000 ---p 00000000 00:00 0
7f0f358b7000-7f0f358b8000 ---p 00000000 00:00 0
7f0f358b8000-7f0f360b8000 rwxp 00000000 00:00 0
7f0f362b5000-7f0f362cb000 r-xp 00000000 103:02 2617 /lib64/libgcc_s-7-20170915.so.1
7f0f362cb000-7f0f364ca000 ---p 00016000 103:02 2617 /lib64/libgcc_s-7-20170915.so.1
7f0f364ca000-7f0f364cb000 rwxp 00015000 103:02 2617 /lib64/libgcc_s-7-20170915.so.1
7f0f364cb000-7f0f364cc000 ---p 00000000 00:00 0
7f0f364cc000-7f0f36ccc000 rwxp 00000000 00:00 0
7f0f36ccc000-7f0f36ccd000 ---p 00000000 00:00 0
7f0f36ccd000-7f0f374cd000 rwxp 00000000 00:00 0
7f0f374cd000-7f0f374ce000 ---p 00000000 00:00 0
7f0f374ce000-7f0f37cce000 rwxp 00000000 00:00 0
7f0f37cce000-7f0f37e91000 r-xp 00000000 103:02 2690 /lib64/libc-2.17.so
7f0f37e91000-7f0f38090000 ---p 001c3000 103:02 2690 /lib64/libc-2.17.so
7f0f38090000-7f0f38094000 r-xp 001c2000 103:02 2690 /lib64/libc-2.17.so
7f0f38094000-7f0f38096000 rwxp 001c6000 103:02 2690 /lib64/libc-2.17.so
7f0f38096000-7f0f3809b000 rwxp 00000000 00:00 0
7f0f3809b000-7f0f38110000 r-xp 00000000 103:02 3438 /usr/lib64/libgmp.so.10.2.0
7f0f38110000-7f0f3830f000 ---p 00075000 103:02 3438 /usr/lib64/libgmp.so.10.2.0
7f0f3830f000-7f0f38311000 rwxp 00074000 103:02 3438 /usr/lib64/libgmp.so.10.2.0
7f0f38311000-7f0f38313000 r-xp 00000000 103:02 2696 /lib64/libdl-2.17.so
7f0f38313000-7f0f38513000 ---p 00002000 103:02 2696 /lib64/libdl-2.17.so
7f0f38513000-7f0f38514000 r-xp 00002000 103:02 2696 /lib64/libdl-2.17.so
7f0f38514000-7f0f38515000 rwxp 00003000 103:02 2696 /lib64/libdl-2.17.so
7f0f38515000-7f0f3851c000 r-xp 00000000 103:02 2720 /lib64/librt-2.17.so
7f0f3851c000-7f0f3871b000 ---p 00007000 103:02 2720 /lib64/librt-2.17.so
7f0f3871b000-7f0f3871c000 r-xp 00006000 103:02 2720 /lib64/librt-2.17.so
7f0f3871c000-7f0f3871d000 rwxp 00007000 103:02 2720 /lib64/librt-2.17.so
7f0f3871d000-7f0f38734000 r-xp 00000000 103:02 2716 /lib64/libpthread-2.17.so
7f0f38734000-7f0f38933000 ---p 00017000 103:02 2716 /lib64/libpthread-2.17.so
7f0f38933000-7f0f38934000 r-xp 00016000 103:02 2716 /lib64/libpthread-2.17.so
7f0f38934000-7f0f38935000 rwxp 00017000 103:02 2716 /lib64/libpthread-2.17.so
7f0f38935000-7f0f38939000 rwxp 00000000 00:00 0
7f0f38939000-7f0f38a3a000 r-xp 00000000 103:02 2698 /lib64/libm-2.17.so
7f0f38a3a000-7f0f38c39000 ---p 00101000 103:02 2698 /lib64/libm-2.17.so
7f0f38c39000-7f0f38c3a000 r-xp 00100000 103:02 2698 /lib64/libm-2.17.so
7f0f38c3a000-7f0f38c3b000 rwxp 00101000 103:02 2698 /lib64/libm-2.17.so
7f0f38c3b000-7f0f38c5d000 r-xp 00000000 103:02 2683 /lib64/ld-2.17.so
7f0f38e51000-7f0f38e56000 rwxp 00000000 00:00 0
7f0f38e59000-7f0f38e5c000 rwxp 00000000 00:00 0
7f0f38e5c000-7f0f38e5d000 r-xp 00021000 103:02 2683 /lib64/ld-2.17.so
7f0f38e5d000-7f0f38e5e000 rwxp 00022000 103:02 2683 /lib64/ld-2.17.so
7f0f38e5e000-7f0f38e5f000 rwxp 00000000 00:00 0
7ffc599f3000-7ffc59a14000 rwxp 00000000 00:00 0 [stack]
7ffc59a63000-7ffc59a65000 r--p 00000000 00:00 0 [vvar]
7ffc59a65000-7ffc59a67000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]







[Work thread Oct 20 18:56] Timing 46080K FFT, 1 core, 1 worker. Average times: 222.34 ms. Total throughput: 4.50 iter/sec.
[Work thread Oct 20 18:56] Timing 46080K FFT, 1 core hyperthreaded, 1 worker. Average times: 288.53 ms. Total throughput: 3.47 iter/sec.
*** Error in `./mprime': free(): invalid pointer: 0x00007f71224ca3c0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81499)[0x7f71264a3499]
./mprime[0x4582e8]
./mprime[0x45ed65]
./mprime[0x440a3a]
./mprime[0x440f9e]
./mprime[0x444bb0]
./mprime[0x444cf2]
./mprime[0x448d54]
./mprime[0x47cbca]
/lib64/libpthread.so.0(+0x7de5)[0x7f7126e78de5]
/lib64/libc.so.6(clone+0x6d)[0x7f7126520bad]
======= Memory map: ========
00400000-026a0000 r-xp 00000000 00:15 5878764499047460622 /mnt-efs/z/p95v295b2/mprime
0289f000-028a1000 r-xp 0229f000 00:15 5878764499047460622 /mnt-efs/z/p95v295b2/mprime
028a1000-028dc000 rwxp 022a1000 00:15 5878764499047460622 /mnt-efs/z/p95v295b2/mprime
028dc000-02903000 rwxp 00000000 00:00 0
02c50000-02c71000 rwxp 00000000 00:00 0 [heap]
7f70f9277000-7f7110000000 rwxp 00000000 00:00 0
7f7110000000-7f7110021000 rwxp 00000000 00:00 0
7f7110021000-7f7114000000 ---p 00000000 00:00 0
7f7118000000-7f711a50c000 rwxp 00000000 00:00 0
7f711a50c000-7f711c000000 ---p 00000000 00:00 0
7f711cae6000-7f711cae7000 ---p 00000000 00:00 0
7f711cae7000-7f711d2e7000 rwxp 00000000 00:00 0
7f7120000000-7f7122ab2000 rwxp 00000000 00:00 0
7f7122ab2000-7f7124000000 ---p 00000000 00:00 0
7f7124a09000-7f7124a1f000 r-xp 00000000 103:02 2617 /lib64/libgcc_s-7-20170915.so.1
7f7124a1f000-7f7124c1e000 ---p 00016000 103:02 2617 /lib64/libgcc_s-7-20170915.so.1
7f7124c1e000-7f7124c1f000 rwxp 00015000 103:02 2617 /lib64/libgcc_s-7-20170915.so.1
7f7124c1f000-7f7124c20000 ---p 00000000 00:00 0
7f7124c20000-7f7125420000 rwxp 00000000 00:00 0
7f7125420000-7f7125421000 ---p 00000000 00:00 0
7f7125421000-7f7125c21000 rwxp 00000000 00:00 0
7f7125c21000-7f7125c22000 ---p 00000000 00:00 0
7f7125c22000-7f7126422000 rwxp 00000000 00:00 0
7f7126422000-7f71265e5000 r-xp 00000000 103:02 2690 /lib64/libc-2.17.so
7f71265e5000-7f71267e4000 ---p 001c3000 103:02 2690 /lib64/libc-2.17.so
7f71267e4000-7f71267e8000 r-xp 001c2000 103:02 2690 /lib64/libc-2.17.so
7f71267e8000-7f71267ea000 rwxp 001c6000 103:02 2690 /lib64/libc-2.17.so
7f71267ea000-7f71267ef000 rwxp 00000000 00:00 0
7f71267ef000-7f7126864000 r-xp 00000000 103:02 3438 /usr/lib64/libgmp.so.10.2.0
7f7126864000-7f7126a63000 ---p 00075000 103:02 3438 /usr/lib64/libgmp.so.10.2.0
7f7126a63000-7f7126a65000 rwxp 00074000 103:02 3438 /usr/lib64/libgmp.so.10.2.0
7f7126a65000-7f7126a67000 r-xp 00000000 103:02 2696 /lib64/libdl-2.17.so
7f7126a67000-7f7126c67000 ---p 00002000 103:02 2696 /lib64/libdl-2.17.so
7f7126c67000-7f7126c68000 r-xp 00002000 103:02 2696 /lib64/libdl-2.17.so
7f7126c68000-7f7126c69000 rwxp 00003000 103:02 2696 /lib64/libdl-2.17.so
7f7126c69000-7f7126c70000 r-xp 00000000 103:02 2720 /lib64/librt-2.17.so
7f7126c70000-7f7126e6f000 ---p 00007000 103:02 2720 /lib64/librt-2.17.so
7f7126e6f000-7f7126e70000 r-xp 00006000 103:02 2720 /lib64/librt-2.17.so
7f7126e70000-7f7126e71000 rwxp 00007000 103:02 2720 /lib64/librt-2.17.so
7f7126e71000-7f7126e88000 r-xp 00000000 103:02 2716 /lib64/libpthread-2.17.so
7f7126e88000-7f7127087000 ---p 00017000 103:02 2716 /lib64/libpthread-2.17.so
7f7127087000-7f7127088000 r-xp 00016000 103:02 2716 /lib64/libpthread-2.17.so
7f7127088000-7f7127089000 rwxp 00017000 103:02 2716 /lib64/libpthread-2.17.so
7f7127089000-7f712708d000 rwxp 00000000 00:00 0
7f712708d000-7f712718e000 r-xp 00000000 103:02 2698 /lib64/libm-2.17.so
7f712718e000-7f712738d000 ---p 00101000 103:02 2698 /lib64/libm-2.17.so
7f712738d000-7f712738e000 r-xp 00100000 103:02 2698 /lib64/libm-2.17.so
7f712738e000-7f712738f000 rwxp 00101000 103:02 2698 /lib64/libm-2.17.so
7f712738f000-7f71273b1000 r-xp 00000000 103:02 2683 /lib64/ld-2.17.so
7f71275a5000-7f71275aa000 rwxp 00000000 00:00 0
7f71275ad000-7f71275b0000 rwxp 00000000 00:00 0
7f71275b0000-7f71275b1000 r-xp 00021000 103:02 2683 /lib64/ld-2.17.so
7f71275b1000-7f71275b2000 rwxp 00022000 103:02 2683 /lib64/ld-2.17.so
7f71275b2000-7f71275b3000 rwxp 00000000 00:00 0
7ffdd4a9a000-7ffdd4abb000 rwxp 00000000 00:00 0 [stack]
7ffdd4adc000-7ffdd4ade000 r--p 00000000 00:00 0 [vvar]
7ffdd4ade000-7ffdd4ae0000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

[/CODE]

fivemack 2018-10-20 19:36

I tried running the benchmark on my i9-7940X and got very confusing behaviour: 'mpstat -P ALL 1' showed that nothing was scheduled on cores 14-27 (the other hyperthreads of 0-13) when the test was running 28 workers, but 'top' indicated that it was using more than 1400% CPU.

Prime95 2018-10-20 21:15

Grr, a memory corruption problem.

I did run a full benchmark, but with an earlier debug version that allocated more memory for sin/cos tables than was necessary.

I'll work on a fix and post a Linux & Windows binary.

@Fivemack: I'll check that the machine I built on has hwloc 2.0 installed.

Prime95 2018-10-21 01:25

The length 12800 and 120K FFTs were allocating too little memory. Any FFT with a large pass 1 size was zeroing several hundred bytes too many (which may or may not cause an issue based on how the allocated memory block was aligned to a 4KB boundary).
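
As a purely illustrative sketch of why such an overrun is alignment-dependent (this is not gwnum's actual code): whether the stray zeroing does visible damage depends on how much padding happens to sit between the end of the block and the next 4KB boundary or the heap allocator's next chunk header; glibc's "free(): invalid next size" abort in the logs above is what you get when that header is clobbered.

[CODE]/* Illustrative only -- not gwnum's allocation code. */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

int main(void)
{
    size_t fft_bytes = 4800u * 1024u * sizeof(double);  /* e.g. a 4800K FFT buffer */
    char *buf = malloc(fft_bytes);
    if (buf == NULL) return 1;

    uintptr_t end = (uintptr_t) buf + fft_bytes;
    size_t slack = (4096 - (end % 4096)) % 4096;        /* room before the next 4KB boundary */
    printf("buffer ends at %p, %zu bytes of slack to the next 4KB boundary\n",
           (void *) end, slack);

    /* Zeroing a few hundred bytes past the end is survivable only if the
     * overrun stays inside that slack / the allocator's padding; otherwise
     * the next chunk's bookkeeping gets corrupted and free() aborts. */
    free(buf);
    return 0;
}[/CODE]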

My SkyLakeX had hwloc 1.11 installed -- hopefully the newer hwloc will solve fivemack's issues.

Running a few more tests, then I'll build new executables.

Prime95 2018-10-21 04:02

29.5 build 3. Hopefully I didn't do anything stupid building these.

Linux 64-bit: [URL="https://www.dropbox.com/s/x08exhqopjia7f6/p95v295b3.linux64.tar.gz?dl=0"]https://www.dropbox.com/s/x08exhqopjia7f6/p95v295b3.linux64.tar.gz?dl=0[/URL]
Windows 64-bit (totally untested): [URL="https://www.dropbox.com/s/hq6bwc4tgubkjy8/p95v295b3.win64.zip?dl=0"]https://www.dropbox.com/s/hq6bwc4tgubkjy8/p95v295b3.win64.zip?dl=0[/URL]

Chuck 2018-10-21 05:59

This is an excellent improvement. I am running a PRP first test with an AVX-512 FFT length of 4608K and it is about 20% faster.

The test was only about 3% complete, so I just shut down Prime95, copied in the new executable (Windows) and restarted. It resumed from the savefile with no difficulty.

mackerel 2018-10-21 09:29

Doing testing on the 7800X now... getting some scary temperatures if HT is used: >100C on a delidded CPU with TIM (can't use liquid metal, as I test extreme cooling at other times) and a watercooling loop. Temps are more sane with HT disabled. The CPU is at stock settings apart from the cache at 3000 (2000 stock; I didn't touch the voltage).

GP2 2018-10-21 12:32

[QUOTE=mackerel;498421]Doing testing on 7800X now... getting some scary temperatures if HT used, >100C on a delidded CPU, with TIM (can't use liquid metal as testing extreme cooling other times), and watercooling loop. Temps more sane if HT disabled.[/QUOTE]

The benchmarks I did (before they crashed) indicated that on Skylake with AVX-512, hyperthreading gives lower throughput with v29.5, whereas with v29.4 the opposite was true.

GP2 2018-10-21 12:39

Since it's a new release, has any further thought been given to enabling an option to output 2048-bit residues?

Prime95 2018-10-21 15:16

[QUOTE=GP2;498424]Since it's a new release, has any further thought been given to enabling an option to output 2048-bit residues?[/QUOTE]

I believe that feature is on by default for PRP tests.

mackerel 2018-10-21 17:00

On the 7800X, running 1 core per worker, I saw a ballpark 70% increase in throughput for small FFTs up to around 256K, before it drops into the RAM-limited zone around the 512K FFT. I can't be 100% sure I didn't change my RAM settings compared to the older test, but there's still a boost of ballpark 10% there.

It's harder to say what's happening with 1 worker on 6 cores, but there seems to be a clear gain between 1024K and 2560K FFT, and it's less clear either side of that.

I did a spot check of power usage at 64K FFT with a 6-thread in-place stress test. Estimated CPU power usage went up 25% (system power up 14%) for a 60%+ throughput increase; that is roughly 1.6/1.25 ≈ 28% more throughput per CPU watt, which sounds like a good deal to me, assuming this still applies to other relatively small FFT sizes and I can keep temperatures in check. I will look forward to this getting rolled into LLR.

tshinozk 2018-10-22 04:23

I ran the benchmark with all-complex FFTs in Windows 10 on a 7980XE at stock settings.
It shows a 50%+ improvement on multithreaded work (i.e. within the L3 cache).

However, it hangs during the 10 cores hyperthreaded, 4 workers test, where CPU usage remains at 0% in Task Manager.

[CODE]
[Main thread Oct 22 09:50] Mersenne number primality test program version 29.5
[Main thread Oct 22 09:50] Optimizing for CPU architecture: Core i3/i5/i7, L2 cache size: 256 KB, L3 cache size: 25344 KB
...
[Worker #1 Oct 22 10:16] Timing 2048K all-complex FFT, 10 cores, 7 workers. Average times: 11.57, 11.75, 11.74, 11.51, 5.81, 4.57, 4.84 ms. Total throughput: 941.13 iter/sec.
[Worker #1 Oct 22 10:16] Timing 2048K all-complex FFT, 10 cores, 8 workers. Average times: 11.72, 10.76, 11.86, 11.78, 11.78, 11.88, 5.29, 4.44 ms. Total throughput: 930.87 iter/sec.
[Worker #1 Oct 22 10:16] Timing 2048K all-complex FFT, 10 cores, 9 workers. Average times: 11.99, 10.89, 11.80, 12.06, 11.99, 12.02, 11.81, 10.91, 4.55 ms. Total throughput: 905.85 iter/sec.
[Worker #1 Oct 22 10:17] Timing 2048K all-complex FFT, 10 cores, 10 workers. Average times: 12.50, 12.32, 11.13, 12.25, 12.27, 12.23, 12.51, 12.22, 12.23, 11.15 ms. Total throughput: 829.16 iter/sec.
[Worker #1 Oct 22 10:17] Timing 2048K all-complex FFT, 10 cores hyperthreaded, 1 worker. Average times: 0.78 ms. Total throughput: 1289.81 iter/sec.
[Worker #1 Oct 22 10:17] Timing 2048K all-complex FFT, 10 cores hyperthreaded, 2 workers. Average times: 1.43, 1.43 ms. Total throughput: 1401.32 iter/sec.
[Worker #1 Oct 22 10:17] Timing 2048K all-complex FFT, 10 cores hyperthreaded, 3 workers. Average times: 3.47, 3.25, 2.10 ms. Total throughput: 1071.34 iter/sec.
[Worker #1 Oct 22 10:18] Timing 2048K all-complex FFT, 10 cores hyperthreaded, 4 workers. [Main thread Oct 22 10:29] Stopping all worker threads.
[/CODE]

I stopped it manually, and because I could not exit it, I killed prime95 using Task Manager.
It is reproducible, but at different positions.

ATH 2018-10-23 15:42

I resumed an ongoing PRP DC assignment on EC2 with 29.5b2 and later b3. There might be an issue with FFT selection.

I did not check whether worktodo.txt had FFT2=4M in the line, but I doubt it, since 29.5b2 started out at a 4200K AVX-512 FFT.

It quickly switched to a 4M FFT, which it seems is too small, because it got a lot of ROUNDOFF > 0.4 errors.
I noticed now that it had FFT2=4M in the worktodo line, and I removed it, but it still chose the 4M FFT when I restarted.
I stopped again and added FFT2=4200K to the line instead, so it will finish the exponent at 4200K.

I have SoftCrossoverAdjust=-0.004 in prime.txt, but it never tested the FFT size, probably because it was already halfway through.

[URL="http://hoegge.dk/mersenne/295b3.txt"]295b3.txt[/URL]

ATH 2018-10-23 16:51

In the throughput benchmark, when you answer No to "Limit FFT sizes (mimic older benchmarking code) (N):", it still skips the 4096K (4M) and 8192K (8M) FFTs, even though it clearly uses the 4M AVX-512 FFT (see the previous post).

If you answer Yes to "Limit FFT sizes (mimic older benchmarking code) (N):", it only tests one FFT size and then stops, even though a large range was chosen.

GP2 2018-10-23 17:45

PRP fails with very small exponents
 
1 Attachment(s)
I am attaching a worktodo.txt file suitable for PRP cofactor testing of the known fully-factored Mersenne exponents.

With 29.5 there are problems with very small exponents.

For the tiniest exponents the program halts with this kind of error message:

[CODE]
[Tue Oct 23 17:16:24 2018]
PRP cannot initialize FFT code for M11, errcode=1002
Number sent to gwsetup is too large for the FFTs to handle.
[Tue Oct 23 17:17:03 2018]
PRP cannot initialize FFT code for M101, errcode=1002
Number sent to gwsetup is too large for the FFTs to handle.
[Tue Oct 23 17:17:25 2018]
PRP cannot initialize FFT code for M503, errcode=1002
Number sent to gwsetup is too large for the FFTs to handle.
[Tue Oct 23 17:18:11 2018]
PRP cannot initialize FFT code for M1009, errcode=1002
Number sent to gwsetup is too large for the FFTs to handle.
[/CODE]

For exponents like 1531 and 2069, you get an infinite loop:

[CODE]
[Tue Oct 23 17:18:45 2018]
ERROR: Comparing Gerbicz checksum values failed. Rolling back to iteration 0.
Continuing from last save file.
ERROR: Comparing Gerbicz checksum values failed. Rolling back to iteration 0.
Continuing from last save file.
ERROR: Comparing Gerbicz checksum values failed. Rolling back to iteration 0.
Continuing from last save file.
...
[/CODE]

The save file gets created during the run itself; it shows one iteration completed.

For exponent 3041 it worked, but it had "finite loop" issues at the start:

[CODE]
[Work thread Oct 23 17:20] Starting Gerbicz error-checking PRP test of M3041/24329/5565031 using AVX-512 FFT length 1K
[Work thread Oct 23 17:20] ERROR: Comparing Gerbicz checksum values failed. Rolling back to iteration 0.
[Work thread Oct 23 17:20] Continuing from last save file.
[Work thread Oct 23 17:20] Starting Gerbicz error-checking PRP test of M3041/24329/5565031 using AVX-512 FFT length 1K
[Work thread Oct 23 17:20] ERROR: Comparing Gerbicz checksum values failed. Rolling back to iteration 0.
[Work thread Oct 23 17:20] Continuing from last save file.
[Work thread Oct 23 17:20] Starting Gerbicz error-checking PRP test of M3041/24329/5565031 using AVX-512 FFT length 1K
[Work thread Oct 23 17:20] ERROR: Comparing Gerbicz checksum values failed. Rolling back to iteration 0.
[Work thread Oct 23 17:20] Continuing from last save file.
[Work thread Oct 23 17:20] Starting Gerbicz error-checking PRP test of M3041/24329/5565031 using AVX-512 FFT length 1K
[Work thread Oct 23 17:20] ERROR: Comparing Gerbicz checksum values failed. Rolling back to iteration 0.
[Work thread Oct 23 17:20] Continuing from last save file.
[Work thread Oct 23 17:20] Starting Gerbicz error-checking PRP test of M3041/24329/5565031 using AVX-512 FFT length 1K
[Work thread Oct 23 17:20] Gerbicz error check passed at iteration 3025.
[Work thread Oct 23 17:20] M3041/24329/5565031 is a probable prime! Wh8: 17C217C2,2135,00400000
[Work thread Oct 23 17:20] Starting Gerbicz error-checking PRP test of M3079/25324846649810648887383180721 using AVX-512 FFT length 1K
[/CODE]

And for larger exponents all is well.

Similar behavior is seen for Wagstaff PRP testing.

We do occasionally see new factors for exponents this small, for instance in 2017 and 2018 there were new factors for [M]1471[/M], [M]1489[/M], [M]1549[/M], [M]2789[/M], [M]2819[/M], [M]2861[/M], [M]2957[/M].
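
For anyone who has not seen the mechanism spelled out, here is a rough sketch of the kind of identity the failing check verifies, using a toy 64-bit modulus instead of Prime95's multi-precision FFT arithmetic (the modulus, the block length L, and the check-every-block schedule below are arbitrary choices for illustration, not Prime95's parameters): the running product of residues sampled every L squarings must equal 3 times the previous product squared a further L times, so a bad squaring anywhere in the block breaks the comparison, which is what "Comparing Gerbicz checksum values failed" reports.

[CODE]/* Toy Gerbicz-style product check -- NOT Prime95's code.
 * u starts at 3 and is squared repeatedly mod N.  Every L squarings the
 * current residue is folded into a running product P.  P can be re-derived
 * from its previous value with L extra squarings:
 *     P_new == 3 * P_prev^(2^L)   (mod N)
 * Real implementations only verify every L blocks, keeping the overhead
 * near 1/L; here we verify every block for brevity. */
#include <stdio.h>
#include <stdint.h>

static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t n)
{
    return (uint64_t) ((unsigned __int128) a * b % n);
}

int main(void)
{
    const uint64_t N = 0xFFFFFFFFFFFFFFC5ULL;  /* any odd modulus works for the demo */
    const int L = 8;                           /* block length, arbitrary here */
    uint64_t u = 3, P = 3, P_prev = 3;

    for (int block = 1; block <= 4; block++) {
        for (int i = 0; i < L; i++) u = mulmod(u, u, N);  /* the "real" squarings */
        P = mulmod(P, u, N);                              /* running checksum */

        uint64_t check = P_prev;                          /* redundant recomputation */
        for (int i = 0; i < L; i++) check = mulmod(check, check, N);
        check = mulmod(check, 3, N);

        printf("block %d: %s\n", block,
               check == P ? "Gerbicz check passed" : "checksum comparison FAILED");
        P_prev = P;
    }
    return 0;
}[/CODE]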

GP2 2018-10-23 21:06

Actually, taking a closer look, the exponent 3547 also had "finite loop" issues, just like 3041. However, the intermediate exponents 3079, 3259, and 3359 did not.

[CODE]
[Work thread Oct 23 17:20] Starting Gerbicz error-checking PRP test of M3547/148823192092809407/1948447035193 using AVX-512 FFT length 1K
[Work thread Oct 23 17:20] ERROR: Comparing Gerbicz checksum values failed. Rolling back to iteration 0.
[Work thread Oct 23 17:20] Continuing from last save file.
[Work thread Oct 23 17:20] Starting Gerbicz error-checking PRP test of M3547/148823192092809407/1948447035193 using AVX-512 FFT length 1K
[Work thread Oct 23 17:20] ERROR: Comparing Gerbicz checksum values failed. Rolling back to iteration 0.
[Work thread Oct 23 17:20] Continuing from last save file.
[Work thread Oct 23 17:20] Starting Gerbicz error-checking PRP test of M3547/148823192092809407/1948447035193 using AVX-512 FFT length 1K
[Work thread Oct 23 17:20] Gerbicz error check passed at iteration 3481.
[Work thread Oct 23 17:20] Gerbicz error check passed at iteration 3545.
[Work thread Oct 23 17:20] M3547/148823192092809407/1948447035193 is a probable prime! Wh8: 1BB61BB6,1795,00200000
[/CODE]

Prime95 2018-10-23 23:26

[QUOTE=GP2;498588]With 29.5 there are problems with very small exponents..[/QUOTE]

Yes, I should have mentioned this. The smallest AVX-512 FFT is 1K. There may be issues with propagating carries when there are only 3 or 4 bits per FFT word.
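(For scale: M3041 with the 1K FFT in the log above works out to 3041/1024 ≈ 3.0 bits per FFT word, and M3547 to about 3.5.)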

I intend to revert to AVX FFTs for small exponents, but I have not yet investigated where the crossover needs to be.

Prime95 2018-10-23 23:36

[QUOTE=ATH;498581]Throughput benchmark when you choose "Limit FFT sizes (mimic older benchmarking code) (N):" to No, it still skips 4096K (4M) and 8192K (8M) FFTs, even though it clearly uses 4M AVX512 FFT, see previous post.

If you choose "Limit FFT sizes (mimic older benchmarking code) (N):" to Yes it only tests 1 FFT size and then stops even though a large range was chosen.[/QUOTE]

Believe it or not this is expected. The 4M and 8M FFT sizes exist and you can force prime95 to use them with the FFT2= worktodo trick. However, the next larger FFT size gets more throughput (for me at least). Thus, the default setup for prime95 is to not use the 4M and 8M FFT sizes. You can benchmark these sizes by selecting the bench-all-implementations option.

If 4M or 8M has better throughput than the next larger FFT size for you, that would be interesting.

Note that anomalies such as slower-than-expected 4M and 8M timings need to be looked into by me. They could indicate macros that need more optimization, a memory layout problem, or a prefetching bug.
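
Concretely, the FFT2= worktodo trick is a line of the form shown below; the assignment key and exponent are placeholders, and the same syntax with FFT2=4200K is what ATH used above to push his exponent back to the larger size.

[CODE]PRP=<assignmentkey>,FFT2=4M,1,2,<exponent>,-1[/CODE]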

GP2 2018-10-24 01:02

[QUOTE=Prime95;498622]Yes, I should have mentioned this. The smallest AVX-512 FFT is 1K. There may be issues with propagating carries when there are only 3 or 4 bits per FFT word.[/QUOTE]

I wonder why exponents 3041 and 3547 do succeed eventually after failing the first few times. You'd think it would be an infinite loop. Is some parameter randomly changed each time the program tries to recover from a Gerbicz checksum error?

ATH 2018-10-24 05:20

[QUOTE=Prime95;498624]If 4M and 8M has better throughput than the next larger FFT size for you, that would be interesting.[/QUOTE]

It seems to be better at 4M if you look at the log I linked:
[URL="http://hoegge.dk/mersenne/295b3.txt"]295b3.txt[/URL]

The best benchmarks:
[Main thread Oct 21 05:20:46] Timing 4096K FFT, 1 core hyperthreaded, 1 worker. Average times: 13.02 ms. Total throughput: 76.78 iter/sec.
[Main thread Oct 21 05:21:24] Timing 4116K FFT, 1 core, 1 worker. Average times: 13.52 ms. Total throughput: 73.95 iter/sec.
[Main thread Oct 21 05:22:52] Timing 4200K FFT, 1 core hyperthreaded, 1 worker. Average times: 13.80 ms. Total throughput: 72.49 iter/sec.

(The 4116K FFT is not part of the throughput benchmark either.)

The average iteration time during the 4M part was 13.041 ms/iter;
after the switch to 4200K it was 16.731 ms/iter.

Prime95 2018-10-24 05:26

[QUOTE=ATH;498641]
The best benchmarks:
[Main thread Oct 21 05:20:46] Timing 4096K FFT, 1 core hyperthreaded, 1 worker. Average times: 13.02 ms. Total throughput: 76.78 iter/sec.
[Main thread Oct 21 05:21:24] Timing 4116K FFT, 1 core, 1 worker. Average times: 13.52 ms. Total throughput: 73.95 iter/sec.[/QUOTE]

I based my default FFT selection on the throughput benchmark using 8 cores (of my 8 core SkylakeX).

Are you benching an Amazon EC2 instance?

ATH 2018-10-24 05:35

Yes, this exponent is running on a c5d.large, which has 2 vCPUs, so it is just 1 core hyperthreaded.

HyperthreadLL=1 is a tiny bit faster (this is from another instance):

[CODE]With Hyperthreading:
[Work thread Oct 21 10:25:06] Iteration: 43100000 / 77310307 [55.749358%], roundoff: 0.362, ms/iter: 13.917, ETA: 5d 12:15
[Work thread Oct 21 10:48:19] Iteration: 43200000 / 77310307 [55.878707%], roundoff: 0.362, ms/iter: 13.906, ETA: 5d 11:45
[Work thread Oct 21 11:11:30] Iteration: 43300000 / 77310307 [56.008055%], roundoff: 0.362, ms/iter: 13.890, ETA: 5d 11:13
[Work thread Oct 21 11:34:42] Iteration: 43400000 / 77310307 [56.137404%], roundoff: 0.362, ms/iter: 13.897, ETA: 5d 10:53
[Work thread Oct 21 11:57:53] Iteration: 43500000 / 77310307 [56.266753%], roundoff: 0.362, ms/iter: 13.896, ETA: 5d 10:30
[Work thread Oct 21 12:21:04] Iteration: 43600000 / 77310307 [56.396102%], roundoff: 0.362, ms/iter: 13.892, ETA: 5d 10:05
[Work thread Oct 21 12:44:16] Iteration: 43700000 / 77310307 [56.525451%], roundoff: 0.362, ms/iter: 13.893, ETA: 5d 09:42
[Work thread Oct 21 13:07:26] Iteration: 43800000 / 77310307 [56.654800%], roundoff: 0.362, ms/iter: 13.891, ETA: 5d 09:18
[Work thread Oct 21 13:30:38] Iteration: 43900000 / 77310307 [56.784149%], roundoff: 0.362, ms/iter: 13.894, ETA: 5d 08:56
[Work thread Oct 21 13:53:50] Iteration: 44000000 / 77310307 [56.913497%], roundoff: 0.362, ms/iter: 13.901, ETA: 5d 08:37

Without hyperthreading:
[Work thread Oct 22 06:08:52] Iteration: 48100000 / 77310307 [62.216801%], roundoff: 0.321, ms/iter: 14.151, ETA: 4d 18:49
[Work thread Oct 22 06:32:30] Iteration: 48200000 / 77310307 [62.346150%], roundoff: 0.321, ms/iter: 14.153, ETA: 4d 18:26
[Work thread Oct 22 06:56:06] Iteration: 48300000 / 77310307 [62.475498%], roundoff: 0.336, ms/iter: 14.150, ETA: 4d 18:01
[Work thread Oct 22 07:19:49] Iteration: 48400000 / 77310307 [62.604847%], roundoff: 0.336, ms/iter: 14.199, ETA: 4d 18:01
[Work thread Oct 22 07:43:34] Iteration: 48500000 / 77310307 [62.734196%], roundoff: 0.336, ms/iter: 14.240, ETA: 4d 17:57
[Work thread Oct 22 08:07:20] Iteration: 48600000 / 77310307 [62.863545%], roundoff: 0.336, ms/iter: 14.239, ETA: 4d 17:33
[Work thread Oct 22 08:31:07] Iteration: 48700000 / 77310307 [62.992894%], roundoff: 0.336, ms/iter: 14.244, ETA: 4d 17:12
[Work thread Oct 22 08:54:53] Iteration: 48800000 / 77310307 [63.122243%], roundoff: 0.336, ms/iter: 14.245, ETA: 4d 16:48
[Work thread Oct 22 09:18:39] Iteration: 48900000 / 77310307 [63.251592%], roundoff: 0.336, ms/iter: 14.246, ETA: 4d 16:25
[Work thread Oct 22 09:42:26] Iteration: 49000000 / 77310307 [63.380940%], roundoff: 0.336, ms/iter: 14.247, ETA: 4d 16:02

Hyperthreading back on:
[Work thread Oct 22 11:18:27] Iteration: 49400000 / 77310307 [63.898336%], roundoff: 0.318, ms/iter: 13.959, ETA: 4d 12:13
[Work thread Oct 22 11:41:46] Iteration: 49500000 / 77310307 [64.027685%], roundoff: 0.357, ms/iter: 13.972, ETA: 4d 11:56
[Work thread Oct 22 12:05:05] Iteration: 49600000 / 77310307 [64.157034%], roundoff: 0.357, ms/iter: 13.971, ETA: 4d 11:32
[Work thread Oct 22 12:28:25] Iteration: 49700000 / 77310307 [64.286382%], roundoff: 0.357, ms/iter: 13.975, ETA: 4d 11:10
[Work thread Oct 22 12:51:44] Iteration: 49800000 / 77310307 [64.415731%], roundoff: 0.357, ms/iter: 13.975, ETA: 4d 10:47
[Work thread Oct 22 13:15:03] Iteration: 49900000 / 77310307 [64.545080%], roundoff: 0.357, ms/iter: 13.980, ETA: 4d 10:26
[Work thread Oct 22 13:38:24] Iteration: 50000000 / 77310307 [64.674429%], roundoff: 0.357, ms/iter: 13.979, ETA: 4d 10:02
[/CODE]

Chuck 2018-10-24 12:42

Program error messages this morning
 
I was nearing the end of the PRP first test of 87255083 when the program began outputting error messages (I obscured the AID).

It then started outputting a continuous stream of messages for the next queued test, a PRP double-check of 77979067.

[CODE][Wed Oct 24 07:43:41 2018]
Iteration: 87255060/87255083, Possible error: round off (0.2268580493) > -42387
Iteration: 87255060/87255083, Possible error: round off (0.1564778853) > -42387
Iteration: 87255062/87255083, Possible error: round off (0.2268580493) > -35310
Iteration: 87255065/87255083, Possible error: round off (0.2268580493) > -1.0911e+005
Iteration: 87255067/87255083, Possible error: round off (0.2268580493) > -87582
Iteration: 87255067/87255083, Possible error: round off (0.153231718) > -87582
Iteration: 87255068/87255083, Possible error: round off (0.2268580493) > -41809
Iteration: 87255069/87255083, Possible error: round off (0.2268580493) > -1.1253e+005
Iteration: 87255070/87255083, Possible error: round off (0.2268580493) > -14919
Iteration: 87255073/87255083, Possible error: round off (0.2268580493) > -64820
Iteration: 87255068/87255083, Possible error: round off (0.2149417585) > -1.2398e+005
Iteration: 87255069/87255083, Possible error: round off (0.2268580493) > -14680
Iteration: 87255073/87255083, Possible error: round off (0.2268580493) > -1.0284e+005
Iteration: 87255074/87255083, Possible error: round off (0.2268580493) > -35973
Iteration: 87255074/87255083, Possible error: round off (0.1575920773) > -35973
Iteration: 87255076/87255083, Possible error: round off (0.2268580493) > -1.1785e+005
Iteration: 87255080/87255083, Possible error: round off (0.2268580493) > -86994
Iteration: 87255081/87255083, Possible error: round off (0.2268580493) > -45048
Iteration: 87255083/87255083, Possible error: round off (0.2268580493) > -69856
Iteration: 87255076/87255083, Possible error: round off (0.2268580493) > -1.1785e+005
Iteration: 87255080/87255083, Possible error: round off (0.2268580493) > -86994
Iteration: 87255081/87255083, Possible error: round off (0.2268580493) > -45048
Iteration: 87255083/87255083, Possible error: round off (0.2268580493) > -69856
{"status":"C", "k":1, "b":2, "n":87255083, "c":-1, "worktype":"PRP-3", "res64":"B753D12F3E0435D3", "residue-type":1, "res2048":"F572805416335AB7767ED1208B6CA1873B96438C66EBD147B7DBC451144A4535265274A678657B36FDE4E19B4A6B9DC9697B68C7D5BE60F94063A9A09F5AEFF0980F97F832D641D64097C2CDA17225BE491E781AE684A5BD62BC3692670B3C22FED772058D7F8D3995A67DDC4D2F19F023DDF8A28A4B72D3CA70B9A4B7DF674F56B59DD4ACFD293F5E67CCD71D38D3CD57FC1AA3B45FF8A5E98A4D601708540D1EA06ADFB10D7B0589CEA026ED794B178904B5CC0B46F4B8B59131244D6952FED053C789CB41DB748DA1F676CAB6DAC26FA8FF41C895CBCCF4CB88FE6192F50290EBC1FF863B14FB9B75EAAB3E8A63D02CC9415078EC7070B753D12F3E0435D3", "fft-length":4718592, "shift-count":557107, "error-code":"00001700", "security-code":"C5B35D13", "program":{"name":"Prime95", "version":"29.5", "build":3, "port":4}, "timestamp":"2018-10-24 11:43:51", "errors":{"gerbicz":0}, "user":"jaxbuilder", "computer":"Maingear_i7-7800", "aid":"#################"}
Iteration: 1/77979067, Possible error: round off (0.1412162686) > 0
Iteration: 2/77979067, Possible error: round off (0.1367013401) > 0
Iteration: 3/77979067, Possible error: round off (0.1334150282) > 0
Iteration: 4/77979067, Possible error: round off (0.136214313) > 0
Iteration: 5/77979067, Possible error: round off (0.1355326745) > 0
Iteration: 6/77979067, Possible error: round off (0.1466572128) > 0
Iteration: 7/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 8/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 9/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 10/77979067, Possible error: round off (0.1399357129) > 0
Iteration: 11/77979067, Possible error: round off (0.135522705) > 0
Iteration: 12/77979067, Possible error: round off (0.1533570224) > 0
Iteration: 13/77979067, Possible error: round off (0.1372912802) > 0
Iteration: 14/77979067, Possible error: round off (0.1359849303) > 0
Iteration: 15/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 16/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 17/77979067, Possible error: round off (0.1423625708) > 0
Iteration: 18/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 19/77979067, Possible error: round off (0.1359548851) > 0
Iteration: 20/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 21/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 22/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 24/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 25/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 26/77979067, Possible error: round off (0.1351397599) > -29938
Iteration: 27/77979067, Possible error: round off (0.1351397599) > -1.2351e+005
Iteration: 28/77979067, Possible error: round off (0.1351397599) > -22360
Iteration: 30/77979067, Possible error: round off (0.1351397599) > -88632
Iteration: 31/77979067, Possible error: round off (0.1484038053) > -88632
Iteration: 32/77979067, Possible error: round off (0.1475221147) > -88632
Iteration: 33/77979067, Possible error: round off (0.1330651214) > -88632
Iteration: 34/77979067, Possible error: round off (0.1481876628) > -88632
Iteration: 35/77979067, Possible error: round off (0.137989409) > -88632
Iteration: 36/77979067, Possible error: round off (0.1441307379) > -88632
Iteration: 37/77979067, Possible error: round off (0.1288571716) > -88632
Iteration: 38/77979067, Possible error: round off (0.1533668243) > -88632
Iteration: 39/77979067, Possible error: round off (0.1419894314) > -88632
Iteration: 40/77979067, Possible error: round off (0.1306012338) > -88632
Iteration: 41/77979067, Possible error: round off (0.1453630195) > -88632
Iteration: 42/77979067, Possible error: round off (0.1452567279) > -88632
Iteration: 43/77979067, Possible error: round off (0.1455717164) > -88632
Iteration: 44/77979067, Possible error: round off (0.150328931) > -88632[/CODE]

Chuck 2018-10-24 12:50

Error on PRP double check
 
I deleted the backup files and restarted the PRP double check of 77979067. I immediately got a stream of errors.

Starting Gerbicz error-checking PRP test of M77979067 using AVX-512 FFT length 4200K, Pass1=1920, Pass2=2240, clm=1, 5 threads

[CODE][Wed Oct 24 08:44:28 2018]
Iteration: 1/77979067, Possible error: round off (0.1569584618) > 0
Iteration: 2/77979067, Possible error: round off (0.14001889) > 0
Iteration: 3/77979067, Possible error: round off (0.1397896085) > 0
Iteration: 4/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 5/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 6/77979067, Possible error: round off (0.1334150282) > 0
Iteration: 7/77979067, Possible error: round off (0.1334150282) > 0
Iteration: 8/77979067, Possible error: round off (0.1688922839) > 0
Iteration: 9/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 10/77979067, Possible error: round off (0.1401436378) > 0
Iteration: 11/77979067, Possible error: round off (0.1356045784) > 0
Iteration: 12/77979067, Possible error: round off (0.1440369386) > 0
Iteration: 13/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 14/77979067, Possible error: round off (0.1471612118) > 0
Iteration: 15/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 16/77979067, Possible error: round off (0.1373752097) > 0
Iteration: 17/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 18/77979067, Possible error: round off (0.1383202986) > 0
Iteration: 19/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 20/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 21/77979067, Possible error: round off (0.1356798274) > 0
Iteration: 22/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 23/77979067, Possible error: round off (0.1351397599) > 0
Iteration: 24/77979067, Possible error: round off (0.1334150282) > 0
Iteration: 26/77979067, Possible error: round off (0.1351397599) > -23943
Iteration: 27/77979067, Possible error: round off (0.1351397599) > -96928
Iteration: 30/77979067, Possible error: round off (0.1351397599) > -13599
Iteration: 31/77979067, Possible error: round off (0.1360927037) > -13599
Iteration: 32/77979067, Possible error: round off (0.1462716058) > -13599
Iteration: 33/77979067, Possible error: round off (0.1318965346) > -13599
Iteration: 34/77979067, Possible error: round off (0.1508120858) > -13599
Iteration: 35/77979067, Possible error: round off (0.1354844715) > -13599
Iteration: 36/77979067, Possible error: round off (0.1301119176) > -13599
Iteration: 37/77979067, Possible error: round off (0.141492689) > -13599
Iteration: 38/77979067, Possible error: round off (0.1330696416) > -13599
Iteration: 39/77979067, Possible error: round off (0.13587089) > -13599
Iteration: 40/77979067, Possible error: round off (0.1363885126) > -13599
Iteration: 41/77979067, Possible error: round off (0.1480976757) > -13599
Iteration: 42/77979067, Possible error: round off (0.1353371833) > -13599
Iteration: 43/77979067, Possible error: round off (0.1297843534) > -13599
Iteration: 44/77979067, Possible error: round off (0.1496594111) > -13599
Iteration: 45/77979067, Possible error: round off (0.1391846252) > -13599
Iteration: 46/77979067, Possible error: round off (0.1279003836) > -13599
Iteration: 47/77979067, Possible error: round off (0.142696835) > -13599
Iteration: 48/77979067, Possible error: round off (0.1402622526) > -13599
Iteration: 49/77979067, Possible error: round off (0.1429398311) > -13599
Iteration: 50/77979067, Possible error: round off (0.1426179457) > -13599
Iteration: 129/77979067, Possible error: round off (0.1429085169) > -13599
Iteration: 257/77979067, Possible error: round off (0.1594783021) > -13599[/CODE]

I restarted with version 29.4 and the program used a larger FFT and is running OK.

[Wed Oct 24 08:51:59 2018]
Trying 1000 iterations for exponent 77979067 using 4096K FFT.
If average roundoff error is above 0.143, then a larger FFT will be used.
Final average roundoff error is 0.2176, using 4480K FFT for exponent 77979067.

ATH 2018-10-24 14:31

You could try adding FFT2=4480K to the worktodo.txt line for the 29.5 version, like this:

PRP=<assignmentkey>,FFT2=4480K,1,2,77979067,-1

Chuck 2018-10-24 16:24

I made the post to point out that something is wrong with the new version.

Prime95 2018-10-24 18:37

[QUOTE=Chuck;498668]I made the post to point out that something is wrong with the new version.[/QUOTE]

I will investigate. It looks like the FFT was running fine -- the roundoff errors are reasonable. The stack variable containing the value to compare against was roached.

My gut reaction (I could well be wrong) is that there is a memory corruption problem running multithreaded FFTs. You were running 5 threads per worker. tshinozk had a problem in the 10 cores, 4 or 5 workers benchmark. Whereas I've been running single-threaded PRP tests for the last few months without an issue.

pepi37 2018-10-24 18:54

I have occasionally found a roundoff error on a Haswell i5 chip.
When I add this to the worktodo line

PRP=FFT2=[B][COLOR=Red]xxx[/COLOR][/B]K,x,x,xxxxx,x

the error disappears. But I never saw a time difference.

Let's say Prime95 says the candidate is 384K in length, and I specify that it is 400K in length.
Will 400K increase the PRP time for that candidate?

chalsall 2018-10-24 18:58

[QUOTE=Prime95;498680]I will investigate.[/QUOTE]

[QUOTE=Asimov]The most exciting phrase to hear in science, the one that heralds new discoveries, is not “Eureka!” (I found it!) but “That’s funny …”[/QUOTE]

I tried posting just the above quotes, but the forum rejected it for lack of original content. Hopefully this paragraph will prove I'm sentient.

ET_ 2018-10-26 11:38

Got my segfault while running a PRP.
The issue happened when a benchmark started in the middle of the PRP test.

Nothing happened during the previous LL-D test.

Now I will download the new build.


[code]
[Work thread Oct 26 11:22] Iteration: 14110000 / 81950377 [17.21%], ms/iter: 9.027, ETA: 7d 02:06
[Main thread Oct 26 11:23] Benchmarking multiple workers to tune FFT selection.
[Work thread Oct 26 11:23] Stopping PRP test of M81950377 at iteration 14118224 [17.22%]
[Work thread Oct 26 11:23] Worker stopped while running needed benchmarks.
[Main thread Oct 26 11:23] Timing 4320K FFT, 2 cores, 1 worker. Average times: 8.80 ms. Total throughput: 113.64 iter/sec.
[Main thread Oct 26 11:23] Timing 4320K FFT, 2 cores, 1 worker. Average times: 8.93 ms. Total throughput: 112.01 iter/sec.
[Main thread Oct 26 11:24] Timing 4320K FFT, 2 cores, 1 worker. Average times: 8.96 ms. Total throughput: 111.58 iter/sec.
[Main thread Oct 26 11:24] Timing 4320K FFT, 2 cores, 1 worker. Average times: 8.84 ms. Total throughput: 113.13 iter/sec.
[Main thread Oct 26 11:24] Timing 4320K FFT, 2 cores, 1 worker. Average times: 8.82 ms. Total throughput: 113.32 iter/sec.
[Main thread Oct 26 11:24] Timing 4320K FFT, 2 cores, 1 worker. Average times: 9.62 ms. Total throughput: 104.00 iter/sec.
[Main thread Oct 26 11:24] Timing 4320K FFT, 2 cores, 1 worker. Average times: 9.39 ms. Total throughput: 106.51 iter/sec.
[Main thread Oct 26 11:25] Timing 4320K FFT, 2 cores, 1 worker. Average times: 9.50 ms. Total throughput: 105.24 iter/sec.
*** Error in `./mprime': double free or corruption (!prev): 0x00007f7c100471c0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81499)[0x7f7c18f3a499]
./mprime[0x45edbc]
./mprime[0x440a3a]
./mprime[0x441aa7]
./mprime[0x44986e]
./mprime[0x47cbca]
/lib64/libpthread.so.0(+0x7de5)[0x7f7c1990fde5]
/lib64/libc.so.6(clone+0x6d)[0x7f7c18fb7bad]
======= Memory map: ========
00400000-026a0000 r-xp 00000000 103:02 18997 /home/ec2-user/mprime/29.5/mprime
0289f000-028a1000 r-xp 0229f000 103:02 18997 /home/ec2-user/mprime/29.5/mprime
028a1000-028dc000 rwxp 022a1000 103:02 18997 /home/ec2-user/mprime/29.5/mprime
028dc000-02903000 rwxp 00000000 00:00 0
036da000-036fb000 rwxp 00000000 00:00 0 [heap]
7f7bfc000000-7f7bfc4fe000 rwxp 00000000 00:00 0
7f7bfc4fe000-7f7c00000000 ---p 00000000 00:00 0
7f7c00000000-7f7c019e9000 rwxp 00000000 00:00 0
7f7c019e9000-7f7c04000000 ---p 00000000 00:00 0
7f7c075e6000-7f7c075fc000 r-xp 00000000 103:02 2338 /lib64/libresolv-2.17.so
7f7c075fc000-7f7c077fb000 ---p 00016000 103:02 2338 /lib64/libresolv-2.17.so
7f7c077fb000-7f7c077fc000 r-xp 00015000 103:02 2338 /lib64/libresolv-2.17.so
7f7c077fc000-7f7c077fd000 rwxp 00016000 103:02 2338 /lib64/libresolv-2.17.so
7f7c077fd000-7f7c077ff000 rwxp 00000000 00:00 0
7f7c077ff000-7f7c07800000 ---p 00000000 00:00 0
7f7c07800000-7f7c08000000 rwxp 00000000 00:00 0
7f7c08000000-7f7c0b9f5000 rwxp 00000000 00:00 0
7f7c0b9f5000-7f7c0c000000 ---p 00000000 00:00 0
7f7c0c000000-7f7c0d9e6000 rwxp 00000000 00:00 0
7f7c0d9e6000-7f7c10000000 ---p 00000000 00:00 0
7f7c10000000-7f7c11f7c000 rwxp 00000000 00:00 0
7f7c11f7c000-7f7c14000000 ---p 00000000 00:00 0
7f7c140e6000-7f7c140eb000 r-xp 00000000 103:02 2326 /lib64/libnss_dns-2.17.so
7f7c140eb000-7f7c142eb000 ---p 00005000 103:02 2326 /lib64/libnss_dns-2.17.so
7f7c142eb000-7f7c142ec000 r-xp 00005000 103:02 2326 /lib64/libnss_dns-2.17.so
7f7c142ec000-7f7c142ed000 rwxp 00006000 103:02 2326 /lib64/libnss_dns-2.17.so
7f7c142ed000-7f7c142f9000 r-xp 00000000 103:02 2328 /lib64/libnss_files-2.17.so
7f7c142f9000-7f7c144f8000 ---p 0000c000 103:02 2328 /lib64/libnss_files-2.17.so
7f7c144f8000-7f7c144f9000 r-xp 0000b000 103:02 2328 /lib64/libnss_files-2.17.so
7f7c144f9000-7f7c144fa000 rwxp 0000c000 103:02 2328 /lib64/libnss_files-2.17.so
7f7c144fa000-7f7c14500000 rwxp 00000000 00:00 0
7f7c14500000-7f7c14501000 ---p 00000000 00:00 0
7f7c14501000-7f7c14d01000 rwxp 00000000 00:00 0
7f7c174a0000-7f7c174b6000 r-xp 00000000 103:02 2250 /lib64/libgcc_s-7-20170915.so.1
7f7c174b6000-7f7c176b5000 ---p 00016000 103:02 2250 /lib64/libgcc_s-7-20170915.so.1
7f7c176b5000-7f7c176b6000 rwxp 00015000 103:02 2250 /lib64/libgcc_s-7-20170915.so.1
7f7c176b6000-7f7c176b7000 ---p 00000000 00:00 0
7f7c176b7000-7f7c17eb7000 rwxp 00000000 00:00 0
7f7c17eb7000-7f7c17eb8000 ---p 00000000 00:00 0
7f7c17eb8000-7f7c186b8000 rwxp 00000000 00:00 0
7f7c186b8000-7f7c186b9000 ---p 00000000 00:00 0
7f7c186b9000-7f7c18eb9000 rwxp 00000000 00:00 0
7f7c18eb9000-7f7c1907c000 r-xp 00000000 103:02 2310 /lib64/libc-2.17.so
7f7c1907c000-7f7c1927b000 ---p 001c3000 103:02 2310 /lib64/libc-2.17.so
7f7c1927b000-7f7c1927f000 r-xp 001c2000 103:02 2310 /lib64/libc-2.17.so
7f7c1927f000-7f7c19281000 rwxp 001c6000 103:02 2310 /lib64/libc-2.17.so
7f7c19281000-7f7c19286000 rwxp 00000000 00:00 0
7f7c19286000-7f7c192fb000 r-xp 00000000 103:02 3420 /usr/lib64/libgmp.so.10.2.0
7f7c192fb000-7f7c194fa000 ---p 00075000 103:02 3420 /usr/lib64/libgmp.so.10.2.0
7f7c194fa000-7f7c194fc000 rwxp 00074000 103:02 3420 /usr/lib64/libgmp.so.10.2.0
7f7c194fc000-7f7c194fe000 r-xp 00000000 103:02 2316 /lib64/libdl-2.17.so
7f7c194fe000-7f7c196fe000 ---p 00002000 103:02 2316 /lib64/libdl-2.17.so
7f7c196fe000-7f7c196ff000 r-xp 00002000 103:02 2316 /lib64/libdl-2.17.so
7f7c196ff000-7f7c19700000 rwxp 00003000 103:02 2316 /lib64/libdl-2.17.so
7f7c19700000-7f7c19707000 r-xp 00000000 103:02 2340 /lib64/librt-2.17.so
7f7c19707000-7f7c19906000 ---p 00007000 103:02 2340 /lib64/librt-2.17.so
7f7c19906000-7f7c19907000 r-xp 00006000 103:02 2340 /lib64/librt-2.17.so
7f7c19907000-7f7c19908000 rwxp 00007000 103:02 2340 /lib64/librt-2.17.so
7f7c19908000-7f7c1991f000 r-xp 00000000 103:02 2336 /lib64/libpthread-2.17.so
7f7c1991f000-7f7c19b1e000 ---p 00017000 103:02 2336 /lib64/libpthread-2.17.so
7f7c19b1e000-7f7c19b1f000 r-xp 00016000 103:02 2336 /lib64/libpthread-2.17.so
7f7c19b1f000-7f7c19b20000 rwxp 00017000 103:02 2336 /lib64/libpthread-2.17.so
7f7c19b20000-7f7c19b24000 rwxp 00000000 00:00 0
7f7c19b24000-7f7c19c25000 r-xp 00000000 103:02 2318 /lib64/libm-2.17.so
7f7c19c25000-7f7c19e24000 ---p 00101000 103:02 2318 /lib64/libm-2.17.so
7f7c19e24000-7f7c19e25000 r-xp 00100000 103:02 2318 /lib64/libm-2.17.so
7f7c19e25000-7f7c19e26000 rwxp 00101000 103:02 2318 /lib64/libm-2.17.so
7f7c19e26000-7f7c19e48000 r-xp 00000000 103:02 2303 /lib64/ld-2.17.so
7f7c1a03c000-7f7c1a041000 rwxp 00000000 00:00 0
7f7c1a044000-7f7c1a047000 rwxp 00000000 00:00 0
7f7c1a047000-7f7c1a048000 r-xp 00021000 103:02 2303 /lib64/ld-2.17.so
7f7c1a048000-7f7c1a049000 rwxp 00022000 103:02 2303 /lib64/ld-2.17.so
7f7c1a049000-7f7c1a04a000 rwxp 00000000 00:00 0
7ffec9b5f000-7ffec9b80000 rwxp 00000000 00:00 0 [stack]
7ffec9bcd000-7ffec9bd0000 r--p 00000000 00:00 0 [vvar]
7ffec9bd0000-7ffec9bd2000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Annullato   [Italian locale for "Aborted"]

[/code]

ATH 2018-10-27 01:31

The EC2 instance I posted logs from earlier with all the roundoff errors finished the DC and it matched despite the errors.

It now got a new exponent 30K [I]higher[/I] than the last one, and it STILL chose 4M FFT, and already got 7 new roundoff errors: [URL="http://hoegge.dk/mersenne/295b3-2.txt"]295b3-2.txt[/URL]

Setting it back to 4200K manually again.

Added this to prime.txt to try and prevent this issue from now on:
[CODE]SoftCrossover=1.0
SoftCrossoverAdjust=-0.008[/CODE]


Edit: Another instance finished its DC successfully, got a new exponent: 77.97M and chose 4M FFT and quickly got 5 roundoff errors. It is now also set to 4200K manually.

R. Gerbicz 2018-10-27 09:13

[QUOTE=ATH;498866]The EC2 instance I posted logs from earlier with all the roundoff errors finished the DC and it matched despite the errors.
[/QUOTE]

From your posted file:
[CODE]
[Work thread Oct 26 14:42:43] Iteration: 77900000 / 77947687 [99.938821%], roundoff: 0.243, ms/iter: 13.860, ETA: 00:11:00
[Work thread Oct 26 14:42:43] Possible hardware errors have occurred during the test! 24 ROUNDOFF > 0.4.
[Work thread Oct 26 14:42:43] Confidence in final result is excellent.
[Work thread Oct 26 14:53:46] Gerbicz error check passed at iteration 77946729.
[Work thread Oct 26 14:54:00] Gerbicz error check passed at iteration 77947629.
[Work thread Oct 26 14:54:04] Gerbicz error check passed at iteration 77947678.
[Work thread Oct 26 14:54:15] M77947687 is not prime. RES64: 4BCF9784E9A93DEE. Wh8: 34742F74,19637789,00001800
[/CODE]

We see it differently: yes, that is really a valid RES64 with incredibly high probability, because of my error checks. And for this you wouldn't even need to do/see roundoff checks in the run. If my check fails, then you need to fall back to a previous iteration and you lose 1M iterations of work (in your run), but the confidence is still high. How many times did the check fail in that test? Seeing the roundoff errors could still be very useful when we decide the FFT table limits (for new code or a new processor?).

Of course there is a trade-off here: with a larger FFT the iteration time is (in general) longer, but there are fewer fall-backs.

ATH 2018-10-27 13:50

Yes there were 24 errors before I manually switched to 4200K FFT, it is very nice that it still works fine.

But it is still a bug in this version, it should either choose a higher FFT or disable the roundoff error messages.

Full log since I switched to AVX-512 on that instance: [URL="http://hoegge.dk/mersenne/295b3.txt"]295b3.txt[/URL]

[CODE][Work thread Oct 21 09:15:59] Iteration: 45256282/77947687, Possible error: round off (0.4344111008) > 0.42188
[Work thread Oct 21 11:26:42] Iteration: 45857274/77947687, Possible error: round off (0.4226175989) > 0.42188
[Work thread Oct 21 14:56:41] Iteration: 46817196/77947687, Possible error: round off (0.4234568715) > 0.42188
[Work thread Oct 21 16:00:42] Iteration: 47110891/77947687, Possible error: round off (0.4270896037) > 0.42188
[Work thread Oct 21 19:56:46] Iteration: 48197311/77947687, Possible error: round off (0.428642894) > 0.42188
[Work thread Oct 21 20:17:11] Iteration: 48291440/77947687, Possible error: round off (0.429588787) > 0.42188
[Work thread Oct 21 22:09:44] Iteration: 48809957/77947687, Possible error: round off (0.430043636) > 0.42188
[Work thread Oct 21 23:16:45] Iteration: 49117544/77947687, Possible error: round off (0.4256841092) > 0.42188
[Work thread Oct 22 04:15:09] Iteration: 50490754/77947687, Possible error: round off (0.4303004887) > 0.42188
[Work thread Oct 22 09:16:28] Iteration: 51847612/77947687, Possible error: round off (0.4338173701) > 0.42188
[Work thread Oct 22 09:55:51] Iteration: 52027255/77947687, Possible error: round off (0.4343059627) > 0.42188
[Work thread Oct 22 09:58:39] Iteration: 52040154/77947687, Possible error: round off (0.454604666) > 0.42188
[Work thread Oct 22 12:18:37] Iteration: 52683329/77947687, Possible error: round off (0.4277743109) > 0.42188
[Work thread Oct 22 13:44:48] Iteration: 53078067/77947687, Possible error: round off (0.4273455066) > 0.42188
[Work thread Oct 22 16:58:38] Iteration: 53968523/77947687, Possible error: round off (0.4307389637) > 0.42188
[Work thread Oct 22 19:02:21] Iteration: 54535700/77947687, Possible error: round off (0.4228008512) > 0.42188
[Work thread Oct 22 22:48:33] Iteration: 55573616/77947687, Possible error: round off (0.4268757585) > 0.42188
[Work thread Oct 23 01:15:29] Iteration: 56247746/77947687, Possible error: round off (0.4692008444) > 0.42188
[Work thread Oct 23 03:08:10] Iteration: 56738296/77947687, Possible error: round off (0.4224648114) > 0.42188
[Work thread Oct 23 04:05:36] Iteration: 57000989/77947687, Possible error: round off (0.4358626009) > 0.42188
[Work thread Oct 23 04:27:30] Iteration: 57101702/77947687, Possible error: round off (0.425993915) > 0.42188
[Work thread Oct 23 09:57:31] Iteration: 58617472/77947687, Possible error: round off (0.4493439792) > 0.42188
[Work thread Oct 23 11:18:36] Iteration: 58990146/77947687, Possible error: round off (0.4567632981) > 0.42188
[Work thread Oct 23 15:07:51] Iteration: 60041159/77947687, Possible error: round off (0.4326492791) > 0.42188
[/CODE]

R. Gerbicz 2018-10-27 14:39

[QUOTE=ATH;498895]Yes there were 24 errors before I manually switched to 4200K FFT, it is very nice that it still works fine.

But it is still a bug in this version, it should either choose a higher FFT or disable the roundoff error messages.
[/QUOTE]

No, I asked how many times the check failed; I think you should see a line like this:
[CODE]
ERROR: Comparing Gerbicz checksum values failed. Rolling back to iteration
...
[/CODE]
with the iteration number. In that partial file I don't see such a line.
The number of roundoff errors alone doesn't matter; I think that with larger p we could see even more errors. What matters is how many times you need to roll back, because that increases the additional overhead (0.2%) of my check. And of course these numbers are related (the expected number of roundoff errors and of rollbacks for a given p and FFT), so they are not independent.

Prime95 2018-10-27 16:25

[QUOTE=ATH;498895]But it is still a bug in this version, it should either choose a higher FFT or disable the roundoff error messages.

[CODE][Work thread Oct 21 09:15:59] Iteration: 45256282/77947687, Possible error: round off (0.4344111008) > 0.42188[/CODE][/QUOTE]

Yes, this is unexpected. I used the same FFT crossovers as for AVX FFTs, which for a 4M FFT is 77990000. I do not see why AVX-512 FFTs have worse round-off behavior than AVX FFTs. More to investigate....

@Gerbicz: It is important for me to get the FFT crossovers right as the gwnum FFT routines are used for LL, LLR, PFGW etc. where Gerbicz error checking is not used. I could (should?) change prime95 to not even look for roundoff errors during a Gerbicz PRP test, esp. since calculating the roundoff error is not free.

R. Gerbicz 2018-10-27 16:59

[QUOTE=Prime95;498906]
@Gerbicz: It is important for me to get the FFT crossovers right as the gwnum FFT routines are used for LL, LLR, PFGW etc. where Gerbicz error checking is not used. I could (should?) change prime95 to not even look for roundoff errors during a Gerbicz PRP test, esp. since calculating the roundoff error is not free.[/QUOTE]

Yes, we don't need those roundoff error calculations, at least for PRP; Preda's gpuOwl has already removed them for PRP. As I wrote, you would still need them only when you have new code or a new processor, to get the code's new FFT crossovers (basically that doesn't change a lot).

pepi37 2018-10-27 17:03

@Prime95: if you erase those roundoff errors from the code, will that affect all PRP testing or only PRP testing on base 2? (since the Gerbicz error test is only for base 2)
I use Prime95 in CRUS searching as well as in my personal search for primes, on base 2 but also on other bases.

R. Gerbicz 2018-10-27 17:22

[QUOTE=pepi37;498910]@Prime95: if you erase those roundoff errors from the code, will that affect all PRP testing or only PRP testing on base 2? (since the Gerbicz error test is only for base 2)
I use Prime95 in CRUS searching as well as in my personal search for primes, on base 2 but also on other bases.[/QUOTE]

Of course I was speaking about PRP with my error checking. For base != 2 you don't have this, hence for that you should keep those roundoff error checks.

preda 2018-10-28 12:43

[QUOTE=R. Gerbicz;498914]Ofcourse I've spoken about prp with my error checking[/QUOTE]
[unrelated to the thread]
Robert, I would personally be interested in a small write-up about how you came up with the error check. A small "history" of the idea of the check.

R. Gerbicz 2018-10-28 22:19

[QUOTE=preda;498961][unrelated to the thread]
Robert, I would personally be interested in a small write-up about how you came up with the error check. [/QUOTE]

[off-topic]
First of all, without the Jacobi check (and its thread) I wouldn't have discovered my check. And it had a handicap, because the topic was in the GPU section, and I rarely read those topics (this has changed). My first impression when I saw the Jacobi check was that there could be something more in this area, because of the simplicity of the test.

For some days I searched for another Jacobi check and other possible error checks in the LL sequence. When these seemed a dead end to me, I switched to working with the PRP test; here the Proth numbers seemed easier, because there the exponent is N-1=k*2^n, and if you delay the k-th powmod (or even start with it), then you need to find the error check in a^(2^n) mod N. Actually it was easy to extend this to the general N=(k*2^n+c)/d form.

But return to the easier version, how to find the check:
let f(x)=a^(2^x) mod N, and in the following equations we'll omit writing the mod N to shorten the formulas.
from this we can see that
f(s+t)=f(s)^(2^t) for any s,t.

if we want to see a "check" at x=s, then what else could we do:
f(s)=f(s-t)^(2^t)
but this checks "only" the last t iterations at duplicated cost.
And basically this is the same thing as just redoing the last t iterations.
But we can "continue" this, checking the previous t iterations in a block:
f(s-t)=f(s-2*t)^(2^t)
f(s-2*t)=f(s-3*t)^(2^t)
...
When we see such equations, the standard technique is to multiply (or, say, add) all of them:
f(s)*f(s-t)*f(s-2*t)*...=a*(f(s-t)*f(s-2*t)*f(s-3*t)*...)^(2^t)
if s is divisible by t.

Checking at every s divisible by t is still costly, so just delay it; the rest was pretty trivial.
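
To make the block structure concrete, here is a minimal Python sketch of a check of this shape (my own illustration, not prime95's implementation): take a checkpoint every t squarings and keep two running products of checkpoints, which must satisfy the multiplied-up relation above; on a mismatch you would fall back to the last verified state and redo the work since then.

[CODE]# Minimal sketch of a Gerbicz-style product check for a^(2^n) mod N.
# Illustrative only (not prime95 code); assumes t divides n.

def prp_with_product_check(a, n, t, N):
    x = a % N                        # running residue, starts at f(0) = a
    d_prev = 1                       # product of f(0), f(t), ..., f(n - t)
    d_curr = 1                       # product of f(t), f(2t), ..., f(n)
    for _ in range(n // t):
        d_prev = (d_prev * x) % N    # checkpoint entering this block
        for _ in range(t):
            x = (x * x) % N          # one squaring iteration
        d_curr = (d_curr * x) % N    # checkpoint leaving this block
    ok = (d_curr == pow(d_prev, 1 << t, N))   # the multiplied-up equations
    return x, ok                     # x = a^(2^n) mod N, ok = check passed

# Example: 64 squarings of 3 modulo the Mersenne prime 2^89 - 1
res, ok = prp_with_product_check(3, 64, 8, 2**89 - 1)
assert ok and res == pow(3, 1 << 64, 2**89 - 1)
[/CODE]

In this sketch the verification itself costs about one extra block of squarings plus a couple of multiplications per block, which is why in practice it is done only occasionally and a failure triggers a rollback rather than aborting the whole run.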

additional short/easy examples
In general, when we multiply/add equations the only goal is to solve a recursion, like in:
f(n)=f(n-1)+n^2, here you add(!) these equations.

Or with this we can also solve a system of (linear) equations in a nice way, like in:
y+z=a
z+x=b
x+y=c
this is coming from geometry (if a,b,c are the sides of a triangle), see: [url]http://www.cut-the-knot.org/triangle/InExCircles.shtml[/url]
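
As a quick sanity check of both "add all the equations" examples (my own illustration): summing the recursion telescopes to f(n) = f(0) + 1^2 + 2^2 + ... + n^2, and summing the three triangle equations gives x+y+z = (a+b+c)/2 = s, so x = s-a, y = s-b, z = s-c.

[CODE]# Numeric check of the two "add the equations" examples (illustrative only).

# 1) f(n) = f(n-1) + n^2 telescopes to f(n) = f(0) + sum of the first n squares.
f0, n = 5, 10
f = f0
for k in range(1, n + 1):
    f += k * k                                     # apply the recursion step by step
assert f == f0 + n * (n + 1) * (2 * n + 1) // 6    # closed form of the square sum

# 2) y+z=a, z+x=b, x+y=c: adding all three gives x+y+z = (a+b+c)/2 = s.
a, b, c = 7, 8, 9                                  # sides of a triangle
s = (a + b + c) / 2
x, y, z = s - a, s - b, s - c                      # tangent lengths discussed at the link
assert (y + z, z + x, x + y) == (a, b, c)
[/CODE]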

petrw1 2018-10-28 22:44

Immediate crash
 
1 Attachment(s)
…Sorry had to trim the picture to get under 1MB.

It was part of the standard Sad Face Blue Screen referring to windows stop codes

29.4.7 is running fine

S485122 2018-10-29 06:08

[QUOTE=petrw1;498989]…Sorry had to trim the picture to get under 1MB.
...[/QUOTE]Even more efficient would be to just cite the error, "WHEA_Uncorrectable_Error" or whatever. Less than 25 bytes!

Jacob

chalsall 2018-10-29 19:21

[QUOTE=petrw1;498989]It was part of the standard Sad Face Blue Screen referring to windows stop codes. 29.4.7 is running fine[/QUOTE]

This is on your brand new, seriously expensive and seriously bleeding edge machine. Correct?

Neither Prime95 nor mfaktc should crash your machine. Unless there is something fundamentally wrong...

Have you looked at your temperatures, your voltages, and your power draw?

Have you tried running your new kit under a Linux environment (to try to eliminate a driver issue)?

Have you tried running only Prime95, and then only mfaktc?

I'm sorry if I'm telling you how to chew gum. But when "making friends with new kit" it is important to change only one variable at a time when the kit is being obstinate.

petrw1 2018-10-29 20:15

[QUOTE=chalsall;499039]This is on your brand new, seriously expensive and seriously bleeding edge machine. Correct?
============= YES

Neither Prime95 nor mfaktc should crash your machine. Unless there is something fundamentally wrong...
============= AGREED....I did try the second version that was supposed to have fixed a memory corruption error (this would/could crash a machine)

Have you looked at your temperatures, your voltages, and your power draw?
============= NOT YET

Have you tried running your new kit under a Linux environment (to try to eliminate a driver issue)?
============= NOT THAT BRAVE....and SWMBO uses it also and would not approve LINUX

Have you tried running only Prime95, and then only mfaktc?
============= I did run Prime95 only first with the SkyLake version.....then backed off to 29.4 which runs fine.

I'm sorry if I'm telling you how to chew gum. But when "making friends with new kit" it is important to change only one variable at a time when the kit is being obstinate.
============ I appreciate all the help
[/QUOTE]

P.S. Still some growing pains....they installed 32GB (8x4) but Windows only sees 24GB.

Also mfaktc on the GPU "messes up" the display then "blue screens" the CPU ... after running about 1 minute.

chalsall 2018-10-29 20:48

[QUOTE=petrw1;499046]Also mfaktc on the GPU "messes up" the display then "blue screens" the CPU ... after running about 1 minute.[/QUOTE]

Just putting this out there...

On all of my machines which have GPUs, I have the motherboard drive the displays and the GPU(s) just run CUDA code (but, then, I'm not a "gamer").

It's possible there is a conflict in the code trying to drive the display and that trying to run compute.

P.S. I understand all too well the SWMBO dimension of the problem space. Personally I've found that handing off something to be used which doesn't work perfectly is rarely rewarding.... :wink:

petrw1 2018-10-29 21:02

[QUOTE=chalsall;499049]Just putting this out there...

On all of my machines which have GPUs, I have the motherboard drive the displays and the GPU(s) just run CUDA code (but, then, I'm not a "gamer").

It's possible there is a conflict in the code trying to drive the display and that trying to run compute.

P.S. I understand all too well the SWMBO dimension of the problem space. Personally I've found that handing off something to be used which doesn't work perfectly is rarely rewarding.... :wink:[/QUOTE]

Does this require a video card on the MB? Or do they all have this capability? And if so would we NOT lose a lot of screen speed or quality?

AND.....
I had to buy a new monitor because the ONLY video ports on the back of the case were HDMI ports from the GPU.

chalsall 2018-10-29 21:20

[QUOTE=petrw1;499050]Does this require a video card on the MB? Or do they all have this capability? And if so would we NOT lose a lot of screen speed or quality?[/QUOTE]

Yes, no (but most do, look for video ports on the MB), and yes.

[QUOTE=petrw1;499050]I had to buy a new monitor because the ONLY video ports on the back of the case were HDMI ports from the GPU.[/QUOTE]

Are you sure about that?

Most Intel-based MBs have HDMI and legacy VGA video ports, driven by the GPU contained within the processor (which is crap at OpenCL). Further, with the correct cable, you can drive a VGA monitor from an HDMI port.

I'm not suggesting this is the optimal solution, but you might consider driving your display(s) from a device separate from the GPU you want to run compute on, and see if the system is stable.

Mysticial 2018-10-29 21:25

[QUOTE=chalsall;499053]Yes, no (but most do, look for video ports on the MB), and yes.



Are you sure about that?

Most Intel-based MBs have HDMI and legacy VGA video ports, driven by the GPU contained within the processor (which is crap at OpenCL). Further, with the correct cable, you can drive a VGA monitor from an HDMI port.

I'm not suggesting this is the optimal solution, but you might consider driving your display(s) from a device separate from the GPU you want to run compute on, and see if the system is stable.[/QUOTE]

Skylake X doesn't have onboard video.

chalsall 2018-10-29 21:35

[QUOTE=Mysticial;499054]Skylake X doesn't have onboard video.[/QUOTE]

OK. Thanks. I didn't know that.

I guess going "headless" for a diagnostic "deep dive" is going to be more difficult than expected....

ET_ 2018-10-29 21:57

[QUOTE=Mysticial;499054]Skylake X doesn't have onboard video.[/QUOTE]

So, pairing a Skylake-X with an AsRock X299 Extreme LGA 2066, would it need an external video board, or does this MB have integrated video, although not driven by a GPU in the processor?

Mysticial 2018-10-29 22:03

[QUOTE=ET_;499059]So, pairing a Skylake-X with an AsRock X299 Extreme LGA 2066, would it need an external video board, or does this MB have integrated video, although not driven by a GPU in the processor?[/QUOTE]

I don't see any video ports on that mobo. So no, it doesn't have onboard video on the mobo. You will need external graphics if you don't want to run headless.

p.s. My Designare X299 has DP ports, but they're actually inputs for the thunderbolt ports. And they need to be powered with an external GPU. (yeah, weird)

ET_ 2018-10-30 12:07

[QUOTE=Mysticial;499061]I don't see any video ports on that mobo. So no, it doesn't have onboard video on the mobo. You will need external graphics if you don't want to run headless.

p.s. My Designare X299 has DP ports, but they're actually inputs for the thunderbolt ports. And they need to be powered with an external GPU. (yeah, weird)[/QUOTE]

Thank you Mysticial. :tu: Any hints on a "cheap" but "beefy" X299- and LGA2066-ready MB with the same characteristics as the AsRock, but with a graphics output?

I will search the web for it, but the experiences of you (people reading) might be helpful, thanks.

Mark Rose 2018-10-30 15:23

There have been high failure rates with the RTX 2080 Ti's. It's probably defective.

Mysticial 2018-10-30 16:45

[QUOTE=ET_;499097]Thank you Mysticial. :tu: Any hints on a "cheap" but "beefy" X299- and LGA2066-ready MB with the same characteristics as the AsRock, but with a graphics output?

I will search the web for it, but the experiences of you (people reading) might be helpful, thanks.[/QUOTE]

I'm not sure they even exist. Do you really need every last PCIe slot on the motherboard? You can easily pop in a low-end card into one of the PCIe 2 slots to leave all the 3.0 slots available.

ET_ 2018-10-30 18:42

[QUOTE=Mysticial;499120]I'm not sure they even exist. Do you really need every last PCIe slot on the motherboard? You can easily pop in a low-end card into one of the PCIe 2 slots to leave all the 3.0 slots available.[/QUOTE]

Indeed, that's what I plan to do. Such MBs DID exist in the not-too-distant past.

ZK19 2018-10-31 09:31

Hi. I have a problem with the prime95 29.5b3 Linux build returning a certain result.


prime.log contains:


Got assignment _: PRP M40493
Sending expected completion date for M40493: Oct 31 2018
Sending result to server: UID: _, M40493/known_factors is not prime. Type-5 RES64: 28B346EDA839BB__. Wh8: _,13840,00000000, AID: _

PrimeNet error 13: Server database full or broken
ar Insert t_gimps_results_log failed: _ GUID: _, exponent: 40493, C2D_CPU_GHz_days: 2.8907502777778E-5


(some values replaced by _ or ... for this post). Similar messages about the same exponent repeat after this, apparently every 70 minutes. I can't see the result listed when I query the exponent and the exponent stays assigned to me. This is the corresponding result for the test, in the local results.txt:


{"status":"C", "exponent":40493, "known-factors":"92433209129,1667510486489,171398538300197951,269366686360389758713349
089,72084381262372670043690044179567991113", "worktype":"PRP-3", "res64":"28B346EDA839BB__", "residue-type":5, "res2048
":"3C4F...}

overall the line is a few characters >1029 in length.


Later I'll try to clear the spool file and rerun the exponent, falling back to prime 29.4 if the same problem happens. If that doesn't work I'll unreserve the exponent so somebody else can try.

petrw1 2018-10-31 14:48

I7-7820X WITH DDR4 3600....I expected better
 
I am disappointed that the benchmark for this is no better than my son's i7-6700 running at 3.4 GHz with much slower RAM (2400, I recall).

Anyone else with similar hardware have any comments?

When I look at Benchmarks for 7820s I don't see a big difference.

[url]https://www.mersenne.org/report_benchmarks/?exv25=1&exv26=1&64bit=1&speed_lo=100&speed_hi=10000&min_num=1&exp_date=2017-01-01&B1=Get+Benchmarks[/url]

Mine is running at 3990 MHz.

GP2 2018-10-31 15:58

I am also using 29.5 build 3 on Linux. I am running on a single core (c5.large on AWS).

I put the line
[CODE]
PRP=1,2,40493,-1,99,0,3,5,"92433209129,1667510486489,171398538300197951,269366686360389758713349089,72084381262372670043690044179567991113"
[/CODE]
in my worktodo and ran it.

It completed successfully.
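
For anyone wanting to reproduce this, my reading of that worktodo line (based on the PRP= syntax as I understand it; the field names below are my interpretation, not an official reference) is:

[CODE]PRP=k,b,n,c,how_far_factored,tests_saved,prp_base,residue_type,"known_factors"
    k=1, b=2, n=40493, c=-1, how_far_factored=99, tests_saved=0,
    prp_base=3, residue_type=5, known_factors="92433209129,..."
[/CODE]

i.e. a base-3 PRP test with a type-5 residue on the cofactor (2^40493-1)/(product of the known factors), which matches the "PRP-3" worktype and "Type-5 RES64" shown in the logs above.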

Can you try submitting the results.txt line manually via [url]https://www.mersenne.org/manual_result/[/url] ?


[QUOTE=ZK19;499152]Hi. I have a problem with the prime95 29.5b3 Linux build returning a certain result.


prime.log contains:


Got assignment _: PRP M40493
Sending expected completion date for M40493: Oct 31 2018
Sending result to server: UID: _, M40493/known_factors is not prime. Type-5 RES64: 28B346EDA839BB__. Wh8: _,13840,00000000, AID: _

PrimeNet error 13: Server database full or broken
ar Insert t_gimps_results_log failed: _ GUID: _, exponent: 40493, C2D_CPU_GHz_days: 2.8907502777778E-5


(some values replaced by _ or ... for this post). Similar messages about the same exponent repeat after this, apparently every 70 minutes. I can't see the result listed when I query the exponent and the exponent stays assigned to me. This is the corresponding result for the test, in the local results.txt:


{"status":"C", "exponent":40493, "known-factors":"92433209129,1667510486489,171398538300197951,269366686360389758713349
089,72084381262372670043690044179567991113", "worktype":"PRP-3", "res64":"28B346EDA839BB__", "residue-type":5, "res2048
":"3C4F...}

overall the line is a few characters >1029 in length.


Later I'll try to clear the spool file and rerun the exponent, falling back to prime 29.4 if the same problem happens. If that doesn't work I'll unreserve the exponent so somebody else can try.[/QUOTE]

ZK19 2018-10-31 16:25

[QUOTE=GP2;499165]I am also using 29.5 build 3 on Linux. I am running on a single core (c5.large on AWS).

I put the line
[CODE]
PRP=1,2,40493,-1,99,0,3,5,"92433209129,1667510486489,171398538300197951,269366686360389758713349089,72084381262372670043690044179567991113"
[/CODE]in my worktodo and ran it.

It completed successfully.

Can you try submitting the results.txt line manually via [URL]https://www.mersenne.org/manual_result/[/URL] ?[/QUOTE]

Excellent, thank you for running the exponent. My worker was running on an r5.2xlarge AWS instance (one of 4 single-core workers). I tried submitting the results line from results.txt

{"status":"C", "exponent":40493, ...}

to the manual result page. It gave this error:

Found 1 lines to process.
[COLOR=darkgreen]processing: PRP=(false) for [URL="https://www.mersenne.org/M40493"]M40493[/URL]/92433209129,1667510486489,171398538300197951,269366686360389758713349089,72084381262372670043690044179567991113[/COLOR]
Warning: odbc_do(): SQL error: [Microsoft][ODBC SQL Server Driver][SQL Server]String or binary data would be truncated., SQL state 22001 in SQLExecDirect in C:\inetpub\v5\v5server\0.95_database.inc.php on line 286 [COLOR=red]Error code: 13, error text: ar Insert t_gimps_results_log failed: _ GUID: _, exponent: 40493, C2D_CPU_GHz_days: 2.8907502777778E-5[/COLOR]

This is just a guess, but since you added the task without an assignment id, it may not have written that or some other fields into your results line, making it shorter. I thought that perhaps that is why it does not accept the result I found.

GP2 2018-10-31 16:46

[QUOTE=ZK19;499168]This is just a guess, but since you added the task without an assignment id, it may not have written that or some other fields into your results line, making it shorter. I thought that perhaps that is why it does not accept the result I found.[/QUOTE]

Yes, my results line was lacking the usual "aid" field at the end. It's hard to imagine that makes a difference though.

Maybe you could ask Madpoo to look at it.

ATH 2018-10-31 17:05

[QUOTE=petrw1;499162]I am disappointed that the benchmark for this is no better than my son's i7-6700 running at 3.4 GHz with much slower RAM (2400, I recall).

Anyone else with similar hardware have any comments?

When I look at Benchmarks for 7820s I don't see a big difference.

[url]https://www.mersenne.org/report_benchmarks/?exv25=1&exv26=1&64bit=1&speed_lo=100&speed_hi=10000&min_num=1&exp_date=2017-01-01&B1=Get+Benchmarks[/url]

Mine is running at 3990 MHz.[/QUOTE]

AVX-512 should be faster; try the 29.5 beta, but watch your CPU temperature to make sure your cooling can handle it. According to Mysticial, AVX-512 makes the CPU run much hotter.

In the benchmark window I assume you chose the Throughput benchmark; make sure "Benchmark all FFT implementations to find the best one for your machine" is checked. There can be a big speed difference between the different FFT implementations at the same FFT size.

Make sure your RAM is running in dual channel: on the CPU-Z Memory tab it should say "Channel #: Dual":
[url]https://www.cpuid.com/softwares/cpu-z.html[/url]

Mysticial 2018-10-31 17:14

[QUOTE=ATH;499174]AVX-512 should be faster; try the 29.5 beta, but watch your CPU temperature to make sure your cooling can handle it. According to Mysticial, AVX-512 makes the CPU run much hotter.[/QUOTE]

At the same frequency as non-AVX/AVX, yes it will be much hotter. But AVX512 was never meant to run without a throttle.

These chips are all supposed to have the same TDP regardless of load: non-AVX, AVX, or AVX-512. They are normalized to the same TDP by offsetting the frequencies of each workload accordingly.

The only problem is that most people don't understand this (even the mobo vendors got it wrong initially). They try to run everything at the same speed and of course things blow up. Then they blame it on Intel because "AVX-512 is stupid" or something like that.

TheJudger 2018-10-31 19:16

[QUOTE=Mysticial;499175]These chips are all supposed to have the same TDP regardless of load: non-AVX, AVX, or AVX-512. They are normalized to the same TDP by offsetting the frequencies of each workload accordingly.

The only problem is that most people don't understand this (even the mobo vendors got it wrong initially). They try to run everything at the same speed and of course things blow up. Then they blame it on Intel because "AVX-512 is stupid" or something like that.[/QUOTE]

So true. Throttling is when the frequency is below the base frequency (which can differ for different workloads, read: non-AVX, AVX and AVX-512 for current Intel CPUs). But consumers and board makers think it should always run at the maximum single-core non-AVX speed. Same with GPUs: if they don't run at the maximum specified frequency, people call it throttling and the product gets bad reviews...

Oliver

VictordeHolland 2018-10-31 21:38

The problem is also partly due to marketing and expectations.
"This chip has a max turbo of 4.x GHz" [SUB](but only under single-thread load)[/SUB]
The xxxW TDP applies only when it runs at its base frequency (it can consume a lot more in turbo modes, see also: [URL]https://www.anandtech.com/show/13400/intel-9th-gen-core-i9-9900k-i7-9700k-i5-9600k-review/21[/URL])

Prime95 2018-10-31 23:34

[QUOTE=ZK19;499152]Hi. I have a problem with the prime95 29.5b3 Linux build returning a certain result.

Got assignment _: PRP M40493

overall the line is a few characters >1029 in length.[/QUOTE]

I changed the SQL table to accept result_text up to 2048 characters

Prime95 2018-10-31 23:39

[QUOTE=petrw1;499162]I am disappointed that the benchmark for this is no better than my son's i7-6700 running at 3.4 GHz with much slower RAM (2400, I recall).[/QUOTE]

In the BIOS, make sure your RAM is running at the fastest speed (XMP profile).

For throughput benchmarks, I've found my AVX-512 CPU gets much more throughput using 8 cores/8 workers rather than 8 cores/1 worker.

petrw1 2018-11-01 00:22

CPU-Z
 
1 Attachment(s)
I've no experience reading these, but it doesn't look like the RAM is 3600, even though I watched the support guy spend 2 hours in the BIOS until he got it to say the RAM was 3600 and stable… What do you see?

The "SPD" tab showed the same in slots 1, 3, 5, and 7.

Thanks

ATH 2018-11-01 01:08

So on the SPD tab you can see the RAM modules are actually DDR4-2133: 1066 MHz is the clock rate and 2133 MHz is the "effective" speed, since they can transfer data twice per clock. You are running quad-channel RAM, which is great; it is the best you can get without buying very expensive Xeon processors.

The Memory tab shows the RAM running overclocked at ~1900 MHz actual clock rate, so an "effective" rate of 3800 MHz, and your processor is slightly overclocked to 3.7 GHz (CPU tab).

I hope you only paid for DDR4-2133 (PC4-17000) modules and not something higher? The question is whether it is running stably at 3800 MHz; you will have to test that with stress tests and double checks.


It is strange you are getting bad benchmarks with this setup; it could be throttling because the CPU is getting too hot. Try watching the CPU temperature during benchmarks. Also try benchmarking with the 29.4b8 Prime95 (with AVX2 instead of AVX-512), and check the temperature running AVX2.

petrw1 2018-11-01 02:18

29.5 much better
 
1 Attachment(s)
Heat should not be an issue, I hope, with a 360 mm "Liquid Freezer 360" cooler.

ATH 2018-11-01 02:24

So you got 7.89 ms/iteration at 3072K FFT but how many cores is that? What are you getting with 8 cores on 1 worker? Should get < 3ms then at 3072K FFT

petrw1 2018-11-01 03:00

[QUOTE=ATH;499216]So you got 7.89 ms/iteration at 3072K FFT but how many cores is that? What are you getting with 8 cores on 1 worker? Should get < 3ms then at 3072K FFT[/QUOTE]

Someone can correct me, but I believe the benchmark timings are for 1 core/1 worker.

[CODE]Timings for 3072K FFT length (8 cores, 1 worker): 1.38 ms. Throughput: 723.83 iter/sec.
Timings for 3072K FFT length (8 cores, 4 workers): 7.29, 7.23, 7.18, 7.01 ms. Throughput: 557.47 iter/sec.
Timings for 3072K FFT length (8 cores, 8 workers): 14.50, 14.40, 14.33, 14.36, 14.34, 14.36, 14.41, 14.06 ms. Throughput: 557.74 iter/sec.[/CODE]
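
For what it's worth, the reported "Throughput" appears to be just the sum over workers of 1000/(ms per iteration); the figures above reproduce to within rounding (a quick check under that assumption):

[CODE]# Recompute the reported throughputs from the per-worker ms/iter figures above.
def throughput(ms_per_iter):
    return sum(1000.0 / ms for ms in ms_per_iter)  # iterations/sec summed over workers

print(round(throughput([1.38]), 2))                            # ~724.6 (reported 723.83)
print(round(throughput([7.29, 7.23, 7.18, 7.01]), 2))          # ~557.4 (reported 557.47)
print(round(throughput([14.50, 14.40, 14.33, 14.36,
                        14.34, 14.36, 14.41, 14.06]), 2))      # ~557.7 (reported 557.74)
[/CODE]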

ATH 2018-11-01 03:08

That's more like it: 1.38 ms/iteration at 3072K; should be < 3 ms/iter at 4096K and maybe even at 4608K. That is pretty good; is it still not better than your son's computer?

petrw1 2018-11-01 03:08

[QUOTE=ATH;499209]So on the SPD tab you can see the RAM modules are actually DDR4-2133: 1066 MHz is the clock rate and 2133 MHz is the "effective" speed, since they can transfer data twice per clock. You are running quad-channel RAM, which is great; it is the best you can get without buying very expensive Xeon processors.

The Memory tab shows the RAM running overclocked at ~1900 MHz actual clock rate, so an "effective" rate of 3800 MHz, and your processor is slightly overclocked to 3.7 GHz (CPU tab).

I hope you only paid for DDR4-2133 (PC4-17000) modules and not something higher? The question is whether it is running stably at 3800 MHz; you will have to test that with stress tests and double checks.


It is strange you are getting bad benchmarks with this setup; it could be throttling because the CPU is getting too hot. Try watching the CPU temperature during benchmarks. Also try benchmarking with the 29.4b8 Prime95 (with AVX2 instead of AVX-512), and check the temperature running AVX2.[/QUOTE]

I don't know what to say about the DDR4-2133. Assuming the techie wasn't lying, he read the labels on the RAM and said they are DDR4-3600; my invoice says the same.

He had to OC to 3800 because the RAM is CL18 and the only speeds the BIOS/MB supported with CL18 were 2133, a couple of others, and 3800.

If he chose a speed (like 3600) that wasn't CL=18 then only 24GB (3 of the 4 sticks) would show.

When he ran MemTest it showed the RAM running at 3598 or something else really close to 3600.

I have a liquid cooler so I hope heat won't be an issue.

Another thread shows much better iteration times with v29.5

sdbardwick 2018-11-01 04:26

Your RAM is DDR4-3600, as verified by the SPD tab in CPU-Z; the XMP profile shows it as 3600 C18 (well, 3596 C18). All RAM is advertised using the XMP figures.

Usually, you just need to tell the BIOS to use the XMP profile (and sometimes manually set the memory voltage to match that in the XMP profile - some motherboards do that automatically, some don't) and you are good to go.

science_man_88 2018-11-01 09:24

[QUOTE=petrw1;499220]
If he chose a speed (like 3600) that wasn't CL=18 then only 24GB (3 of the 4 sticks) would show.
[/QUOTE]

Really, your Memory tab says 32 GB quad channel; your SPD tab says 8192 megabytes = 8 gibibytes.

sdbardwick 2018-11-01 10:37

[QUOTE=science_man_88;499236]Really, your Memory tab says 32 GB quad channel; your SPD tab says 8192 megabytes = 8 gibibytes.[/QUOTE]
Because it is only showing information for the DIMM in slot 1.

Prime95 2018-11-18 21:48

29.5 build 4.

1) Slightly faster small FFTs (under 128K)
2) Reduced round off error for FFT lengths divisible by 7.
3) Proper FFT crossover points.

Not fixed:
1) Switching to AVX FFTs for ultra-tiny tests.
2) The possible stack corruption in erroneously reporting round off errors. Multi-thread crash bugs. These do not seem to be repeatable and could be caused by AVX-512 overheating the CPU. Please continue testing and reporting any issues along these lines.

Linux 64-bit: [URL="https://www.dropbox.com/s/67wgupwxlz6shqb/p95v295b4.linux64.tar.gz?dl=0"]https://www.dropbox.com/s/67wgupwxlz6shqb/p95v295b4.linux64.tar.gz?dl=0[/URL]
Windows 64-bit (totally untested): [URL="https://www.dropbox.com/s/yzfx2jcryf6nrr3/p95v295b4.win64.zip?dl=0"]https://www.dropbox.com/s/yzfx2jcryf6nrr3/p95v295b4.win64.zip?dl=0[/URL]

Chuck 2018-11-19 00:03

So far so good...
 
1 Attachment(s)
Just started using the new Windows version. Using 1 worker, 5 threads. Working OK so far. Good improvement in speed.

Prime95 2018-11-29 03:05

29.5 build 5.

1) Fixed AVX FFTs where FFT length was divisible by 7.
2) Fixed several zero padded FFT bugs.
3) Wider CPU dialog box.
4) Correct reporting of Skylake-X L2 cache size.

Not fixed:
1) Switching to AVX FFTs for ultra-tiny tests.
2) The possible stack corruption in erroneously reporting round off errors. Multi-thread crash bugs. These do not seem to be repeatable and could be caused by AVX-512 overheating the CPU. Please continue testing and reporting any issues along these lines.

Linux 64-bit: [URL="ftp://mersenne.org/gimps/p95v295b5.linux64.tar.gz"]ftp://mersenne.org/gimps/p95v295b5.linux64.tar.gz[/URL]
Windows 64-bit (less tested): [URL="ftp://mersenne.org/gimps/p95v295b5.win64.zip"]ftp://mersenne.org/gimps/p95v295b5.win64.zip[/URL]

Yuno 2018-11-29 04:59

[QUOTE=Prime95;501210]29.5 build 5.

1) Fixed AVX FFTs where FFT length was divisible by 7.
2) Fixed several zero padded FFT bugs.
3) Wider CPU dialog box.
4) Correct reporting of Skylake-X L2 cache size.

Not fixed:
1) Switching to AVX FFTs for ultra-tiny tests.
2) The possible stack corruption in erroneously reporting round off errors. Multi-thread crash bugs. These do not seem to be repeatable and could be caused by AVX-512 overheating the CPU. Please continue testing and reporting any issues along these lines.

Linux 64-bit: [URL]ftp://mersenne.org/gimps/p95v295b5.linux64.tar.gz[/URL]
Windows 64-bit (less tested): [URL]ftp://mersenne.org/gimps/p95v295b5.win64.zip[/URL][/QUOTE]


Date error on the FTP: it says [URL="ftp://mersenne.org/gimps/p95v295b5.win64.zip"]p95v295b5.win64.zip[/URL] 6953 KB 11/29/[B]2017[/B] 2:23:00 AM

Chuck 2018-12-01 04:47

29.5 build 5 Windows problem with PRP
 
4 Attachment(s)
This new AVX-512 version still isn't working correctly on my Skylake-X system. When the new build was released, I had an in-process exponent 88042723 running. I exited the program, copied in the new version and libraries, and restarted. Everything seemed fine for several days until it got to the final iterations. The first screen capture shows the results.
I also started getting errors on the next queued exponent 88135807.

I exited the program, deleted the save file, and started 88135807 again. Both runs were using 1 worker, 5 threads. The second screen capture shows the errors.

I erased the save file again and restarted with 1 worker, 1 thread. Still the errors, as shown in the third capture. I also uploaded a CPU-Z information image. NOTE: when I came back to reread this post I see I didn't capture an image with just 1 worker; however, I still got the stream of roundoff errors.

I had to revert to version 29.4 again. I wish the new AVX-512 version would work for me since it is quite a bit faster.

This is not an issue with the CPU overheating as there is no noticeable difference in the temperature (high 70s).

Prime95 2018-12-01 06:07

[QUOTE=Chuck;501399]This new AVX-512 version still isn't working correctly [/QUOTE]

Good info. The first 30 and last 30 iterations of an LL test are done differently. Maybe this clue will get me closer to the bug.

Chuck 2018-12-01 15:22

Are my results for [URL="https://www.mersenne.org/report_exponent/?exp_lo=88042723&full=1"]88042723[/URL] valid? Should someone run a PRP double check?

Prime95 2018-12-01 16:41

[QUOTE=Chuck;501437]Are my results for [URL="https://www.mersenne.org/report_exponent/?exp_lo=88042723&full=1"]88042723[/URL] valid?[/QUOTE]

Almost certainly.

tshinozk 2018-12-08 00:06

The LLR test works fine using 18-core multi-threaded work.
I finished some known primes with no problems.
M30402457 is prime! Wh4: DB61A8B9,00000000
M77232917 is prime! Wh4: F9C5D09F,00000000

Only for the benchmark, there may be issues in my case.

Prime95 2018-12-08 05:21

[QUOTE=Chuck;501399]This new AVX-512 version still isn't working correctly on my Skylake-X system. [/QUOTE]

I have an idea. Some routines were not saving and restoring xmm8 - xmm15 as per the Windows 64 ABI. If I can build an executable for you before I go on a cruise, you can test out that theory.

The idea is iffy as I build with MSVC 2005. Those registers did not exist back then.

Edit: Chuck, you have a PM.

Chuck 2018-12-08 14:05

[QUOTE=Prime95;502047]I have an idea. Some routines were not saving and restoring xmm8 - xmm15 as per the Windows 64 ABI. If I can build an executable for you before I go on a cruise, you can test out that theory.

Edit: Chuck, you have a PM.[/QUOTE]

George, the version you sent me is working correctly. I renamed my existing save files and started the LL test from the beginning with no error messages.

Madpoo 2018-12-08 17:51

[QUOTE=Prime95;502047]...The idea is iffy as I build with MSVC 2005...[/QUOTE]

We need to get you an updated copy of Visual Studio. :smile:

GP2 2018-12-08 19:26

[QUOTE=Madpoo;502089]We need to get you an updated copy of Visual Studio. :smile:[/QUOTE]

Surely it's a backwards compatibility issue rather than an affordability issue.

ewmayer 2018-12-08 21:05

[QUOTE=Prime95;502047]I have an idea. Some routines were not saving and restoring xmm8 - xmm15 as per the Windows 64 ABI. If I can build an executable for you before I go on a cruise, you can test out that theory.

The idea is iffy as I build with MSVC 2005. Those registers did not exist back then.[/QUOTE]

LOL, and here I thought *I* was a toolchain Luddite. :)

Enjoy your cruise - will you be celebrating the holidays at sea?

Prime95 2018-12-09 02:52

I was wrong, xmm8-xmm15 arrived with SSE2 in 2004. Saving the registers fixed Chuck's problem.

@Ernst: A short 10-day cruise in the Caribbean. I'll be back in time for Christmas with the grandkids. Some say I got out of town just in time as 15 to 20 inches of snow is due tomorrow. I think I would have enjoyed the once-a-decade storm.

As to upgrading MSVC, if it ain't broke don't fix it :)

