|
|
#23 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1010011110011₂ Posts |
Mlucas V19 built very smoothly on an Ubuntu install (19.04 I think; what's Linux's equivalent to DOS/Windows "ver"?) atop Windows Subsystem for Linux atop Windows 10 Home 64-bit build 18362, and passed self-test. WSL (https://en.wikipedia.org/wiki/Window...stem_for_Linux) is only available for Windows 10 and Server 2019. (VM approaches such as Oracle VirtualBox could be taken for Windows 8.x and 7, presumably with considerable overhead.)
Mlucas for x86, SSE2, and FMA3 built multithreaded without issue.
Code:
ken@peregrine:~/mlucas_v19/mlucas_v19/build$ gcc -c -O3 -DUSE_THREADS ../src/*.c >& build.log
ken@peregrine:~/mlucas_v19/mlucas_v19/build$ grep error build.log
ken@peregrine:~/mlucas_v19/mlucas_v19/build$ gcc -o mlucas-x86-mt *.o -lm -lpthread -lrt
Code:
ken@peregrine:~/mlucas_v19/mlucas_v19/build$ gcc -c -O3 -DUSE_SSE2 -DUSE_THREADS ../src/*.c >& build.log
ken@peregrine:~/mlucas_v19/mlucas_v19/build$ grep error build.log
ken@peregrine:~/mlucas_v19/mlucas_v19/build$ gcc -o mlucas-sse2-mt *.o -lm -lpthread -lrt
Code:
ken@peregrine:~/mlucas_v19/mlucas_v19/build$ gcc -c -O3 -DUSE_AVX2 -mavx2 -DUSE_THREADS ../src/*.c >& build.log
ken@peregrine:~/mlucas_v19/mlucas_v19/build$ grep error build.log
ken@peregrine:~/mlucas_v19/mlucas_v19/build$ gcc -o mlucas-fma3-mt *.o -lm -lpthread -lrt
Code:
ken@peregrine:~/mlucas_v19/mlucas_v19/build$ ./mlucas-fma3-mt -fftlen 192 -iters 100 -radset 0 -nthread 12
Mlucas 19.0
http://www.mersenneforum.org/mayer/README.html
INFO: testing qfloat routines...
CPU Family = x86_64, OS = Linux, 64-bit Version, compiled with Gnu C [or other compatible], Version 7.4.0.
INFO: Build uses AVX2 instruction set.
INFO: Using inline-macro form of MUL_LOHI64.
INFO: Using FMADD-based 100-bit modmul routines for factoring.
INFO: MLUCAS_PATH is set to ""
INFO: using 64-bit-significand form of floating-double rounding constant for scalar-mode DNINT emulation.
Setting DAT_BITS = 10, PAD_BITS = 2
INFO: testing IMUL routines...
INFO: System has 12 available processor cores.
INFO: testing FFT radix tables...
Set affinity for the following 12 cores: 0.1.2.3.4.5.6.7.8.9.10.11.
Mlucas selftest running.....
/****************************************************************************/
INFO: Unable to find/open mlucas.cfg file in r+ mode ... creating from scratch.
NTHREADS = 12
INFO: Maximum recommended exponent for this runlength = 3888516; p[ = 3888509]/pmax_rec = 0.9999981998.
Initial DWT-multipliers chain length = [short] in carry step.
M3888509: using FFT length 192K = 196608 8-byte floats, initial residue shift count = 3736240
this gives an average 19.777979532877605 bits per digit
Using complex FFT radices 192 16 32
mers_mod_square: Init threadpool of 12 threads
Using 8 threads in carry step
100 iterations of M3888509 with FFT length 196608 = 192 K, final residue shift count = 744463
Res64: 71E61322CCFB396C. AvgMaxErr = 0.262946429. MaxErr = 0.312500000. Program: E19.0
Res mod 2^35 - 1 = 29259839105
Res mod 2^36 - 1 = 50741070790
Clocks = 00:00:00.331
Done ...
Code:
19.0
2048 msec/iter = 49.19 ROE[avg,max] = [0.231026786, 0.281250000] radices = 256 16 16 16 0 0 0 0 0 0
2304 msec/iter = 58.64 ROE[avg,max] = [0.188309152, 0.226562500] radices = 288 16 16 16 0 0 0 0 0 0
2560 msec/iter = 64.61 ROE[avg,max] = [0.223995536, 0.281250000] radices = 160 16 16 32 0 0 0 0 0 0
2816 msec/iter = 72.57 ROE[avg,max] = [0.215401786, 0.250000000] radices = 352 16 16 16 0 0 0 0 0 0
3072 msec/iter = 83.32 ROE[avg,max] = [0.243191964, 0.281250000] radices = 192 16 16 32 0 0 0 0 0 0
3328 msec/iter = 87.62 ROE[avg,max] = [0.216573661, 0.281250000] radices = 208 16 16 32 0 0 0 0 0 0
3584 msec/iter = 91.89 ROE[avg,max] = [0.256696429, 0.312500000] radices = 224 16 16 32 0 0 0 0 0 0
3840 msec/iter = 96.61 ROE[avg,max] = [0.197767857, 0.218750000] radices = 240 16 16 32 0 0 0 0 0 0
4096 msec/iter = 104.96 ROE[avg,max] = [0.184507533, 0.218750000] radices = 256 16 16 32 0 0 0 0 0 0
4608 msec/iter = 117.71 ROE[avg,max] = [0.195703125, 0.234375000] radices = 288 16 16 32 0 0 0 0 0 0
5120 msec/iter = 142.07 ROE[avg,max] = [0.194757952, 0.218750000] radices = 320 16 16 32 0 0 0 0 0 0
5632 msec/iter = 151.15 ROE[avg,max] = [0.187960379, 0.222656250] radices = 352 16 16 32 0 0 0 0 0 0
6144 msec/iter = 173.96 ROE[avg,max] = [0.214758301, 0.250000000] radices = 768 16 16 16 0 0 0 0 0 0
6656 msec/iter = 198.43 ROE[avg,max] = [0.189202009, 0.250000000] radices = 208 32 32 16 0 0 0 0 0 0
7168 msec/iter = 210.43 ROE[avg,max] = [0.199651228, 0.218750000] radices = 224 16 32 32 0 0 0 0 0 0
7680 msec/iter = 232.87 ROE[avg,max] = [0.233147321, 0.312500000] radices = 240 16 32 32 0 0 0 0 0 0
For comparison, prime95 v29.8b6 on the same i7-8750H, running directly on the Windows OS and using all 6 cores, gives benchmark values ranging 42-80 iter/sec at 5M, and is currently running ~16.5 ms/iter on 93M. Note that I've not spent any time trying to tune Mlucas performance or run all cores. A fairer test might be running mprime on Ubuntu on WSL on Win10 for comparison, or single-core prime95 on Win10, versus Mlucas compiled single-core on msys2, if I could get that to compile and link there.
Last fiddled with by kriesel on 2020-03-08 at 16:39 |
|
|
|
|
|
#24 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
31·173 Posts |
Running mprime benchmarking, 5120K fft, 1 core hyperthreaded, gave 66ms/iter.
Next time I do these I should stop the production prime95, mfakto on the IGP, and anything else that might be competing for CPU cycles or memory bandwidth, instead of leaving them running as in the previously described timings. |
|
|
|
|
|
#25 |
|
∂²ω=0
Sep 2002
República de California
103·113 Posts |
By way of comparison, my current PRP-3 run of p~103M @5632K using all 4 cores of my aged Haswell system (stock 3.3GHz, no HT) is getting ~16-17 ms/iter, implying a 1-thread timing of no worse than 66 ms on that system. So a whopping 142 ms/iter @5120K using 2 threads on a single HT physical core of your system indicates ... I'm not sure what. OTOH if Prime95/mprime gets 66 ms/iter on the same system using a similar 2-threads-1-physical-core setup, that ~2x is typical of the single-physical-core speed difference I've seen.
I suggest first rerunning the self-test/benchmark with 1, 2, 4 and 6 threads on the same number of physical cores (i.e. -cpu 0, -cpu 0:1, -cpu 0:3, -cpu 0:5), using 100 iters to get the total runtime down to something reasonable, and posting those numbers. Then we can examine || scaling, whether overloading 1 core (your -cpu 0,6) was faster than not, etc.
Last fiddled with by ewmayer on 2020-03-08 at 22:12 |
|
|
|
|
|
#26 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
14F3₁₆ Posts |
Ok, Duck-Duck-Go is my Linux tutor:
"Ubuntu linux" from the MS Store on WSL on Win10:
Code:
$ uname -a
Linux peregrine 4.4.0-18362-Microsoft #476-Microsoft Fri Nov 01 16:53:00 PST 2019 x86_64 x86_64 x86_64 GNU/Linux
$ cat /proc/version
Linux version 4.4.0-18362-Microsoft (Microsoft@Microsoft.com) (gcc version 5.4.0 (GCC) ) #476 Microsoft Fri Nov 01 16:53:00 PST 2019
$ cat /etc/issue
Ubuntu 18.04.2 LTS \n \l
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.2 LTS
Release: 18.04
Codename: bionic
Code:
$ uname -a
MINGW64_NT-6.1-7601 condorella 3.0.7-338.x86_64 2019-07-11 10:58 UTC x86_64 Msys
$ cat /proc/version
MINGW64_NT-6.1-7601 version 3.0.7-338.x86_64 (Alexx@WARLOCK) (gcc version 9.1.0 (GCC) ) 2019-07-11 10:58 UTC
$ cat /etc/issue
cat: /etc/issue: No such file or directory
$ lsb_release -a
bash: lsb_release: command not found
Code:
$ uname -a
Linux ken-peregrine-ubuntu 5.0.0-38-generic #41-Ubuntu SMP Tue Dec 3 00:27:35 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$ cat /proc/version
Linux version 5.0.0-38-generic (buildd@lgw01-amd64-036) (gcc version 8.3.0 (Ubuntu 8.3.0-6ubuntu1)) #41-Ubuntu SMP Tue Dec 3 00:27:35 UTC 2019
$ cat /etc/issue
Ubuntu 19.04 \n \l
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 19.04
Release: 19.04
Codename: disco
Code:
>ver
Microsoft Windows [Version 6.1.7601]
Code:
>ver
Microsoft Windows [Version 10.0.18362.657] |
|
|
|
|
|
#27 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
12363₈ Posts |
Mlucas V19 on "Ubuntu" on WSL on Win10, i7-8750H laptop; production prime95 and mfakto stopped for the test; incidental other background activity using ~5% of CPU capacity. Note only one of the two SODIMM sockets is occupied; 16GB RAM.

./mlucas-fma3-mt -fftlen 192 -iters 1000 -radset 0 -nthread x, varied nthread
1: 3.108 sec
2: 1.819
3: 1.883
4: 2.012
6: 2.020
12: 2.100

./mlucas-fma3-mt -fftlen 5120 -iters 100 -radset 0 -nthread x, varied nthread
1: 4.389 sec
2: 2.689
3: 2.351
4: 1.832
6: 1.736
12: 1.825

./mlucas-fma3-mt -fftlen 5632 -iters 100 -radset 0 -nthread x, varied nthread
1: 4.796 sec
2: 2.829
3: 2.442
4: 2.025
6: 1.888
12: 2.035

./mlucas-fma3-mt -fftlen 5632 -iters 100 -radset 0 -cpu 0:11
2.012 sec
./mlucas-fma3-mt -fftlen 5632 -iters 100 -radset 0 -cpu 0:5
1.870 sec
./mlucas-fma3-mt -fftlen 5632 -iters 100 -radset 0 -cpu 0:3
1.929 sec
./mlucas-fma3-mt -fftlen 5632 -iters 100 -radset 0 -cpu 0:1
2.773 sec; repeated timings 2.781, 2.788, 2.805, 2.924.
./mlucas-fma3-mt -fftlen 5632 -iters 100 -radset 0 -cpu 0
4.634 sec; repeated timings 4.624, 4.647, 4.647, 4.665.
Last fiddled with by kriesel on 2020-03-09 at 21:52 |
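To make the thread scaling in these numbers easier to read, here is a minimal Python sketch that turns the 5632K `-nthread` wall-clock timings above into ms/iter, speedup and per-thread efficiency. The function name and table layout are illustrative assumptions, and the inputs are whole-run wall times (so they presumably include program startup, not just the iteration loop).

```python
# Wall-clock seconds for 100 iterations at FFT length 5632K, keyed by -nthread,
# copied from the timings posted above.
TIMINGS_5632K = {1: 4.796, 2: 2.829, 3: 2.442, 4: 2.025, 6: 1.888, 12: 2.035}


def scaling_table(wall_seconds, iters):
    """Return rows of (threads, ms_per_iter, speedup, efficiency).

    speedup is relative to the 1-thread run; efficiency is speedup / threads.
    """
    base = wall_seconds[1]
    rows = []
    for threads in sorted(wall_seconds):
        t = wall_seconds[threads]
        speedup = base / t
        rows.append((threads, 1000.0 * t / iters, speedup, speedup / threads))
    return rows


if __name__ == "__main__":
    print("threads  ms/iter  speedup  efficiency")
    for threads, ms, speedup, eff in scaling_table(TIMINGS_5632K, iters=100):
        print(f"{threads:7d}  {ms:7.2f}  {speedup:7.2f}  {eff:10.2f}")
```

With these inputs the best run is 6 threads, at roughly 2.5x the 1-thread time, which is the kind of figure the requested 1/2/4/6-thread comparison is meant to expose.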
|
|
|
|
|
#28 | |
|
∂²ω=0
Sep 2002
República de California
103×113 Posts |
My PRP test of M103928393 on my famously glitchy and error-prone Haswell quad finished a couple days ago. I noticed it had no fewer than 4 Gerbicz-check failure-and-retries along the way, so decided a DC using gpuOwl on my GPU was in order. Problem was, when I tried reserving it via the Manual Tests pages - not completely necessary, since gpuOwl can test without a proper Primenet assignment ID - I got
Code:
Error code: 40
Error text: No assignment available meeting CPU, program code and work preference requirements, cpu_id: 145323, cpu # = 0, user_id = 20047
Quote:
Code:
[2020-02-24 12:17:07] M103928393 Iter# = 27000000 [25.98% complete] clocks = 00:03:23.189 [ 20.3190 msec/iter] Res64: F3BE8DF410F5D624. AvgMaxErr = 0.148711378. MaxErr = 0.218750000. Residue shift count = 16345459.
Gerbicz check failed! Restarting from last-good-Gerbicz-check data, or from scratch if iteration < 1000000
Restarting M103928393 at iteration = 26000000. Res64: 0E65A742C2BB20B1, residue shift count = 69955037
M103928393: using FFT length 5632K = 5767168 8-byte floats, initial residue shift count = 69955037
this gives an average 18.020698027177289 bits per digit
The test will be done in form of a 3-PRP test.
[2020-02-24 12:20:53] M103928393 Iter# = 26010000 [25.03% complete] clocks = 00:03:23.089 [ 20.3089 msec/iter] Res64: 475BF991D56F1BA0. AvgMaxErr = 0.148887068. MaxErr = 0.218750000. Residue shift count = 84179301.
...
[2020-02-24 18:01:56] M103928393 Iter# = 27000000 [25.98% complete] clocks = 00:03:22.197 [ 20.2198 msec/iter] Res64: 6A0AD63777D57CDB. AvgMaxErr = 0.148770224. MaxErr = 0.203125000. Residue shift count = 16345459.
At iteration 27000000, shift = 16345459: Gerbicz check passed.
[2020-02-24 18:05:42] M103928393 Iter# = 27010000 [25.99% complete] clocks = 00:03:22.284 [ 20.2285 msec/iter] Res64: D5ECCFB2699E1849. AvgMaxErr = 0.148870173. MaxErr = 0.218750000. Residue shift count = 44187191.
Code:
[2020-02-24 03:36:55] M103928393 Iter# = 26600000 [25.59% complete] clocks = 00:03:23.000 [ 20.3000 msec/iter] Res64: 24006083D0E3DC08. AvgMaxErr = 0.148862082. MaxErr = 0.218750000. Residue shift count = 26402859.
[2020-02-24 03:40:20] M103928393 Iter# = 26610000 [25.60% complete] clocks = 00:03:22.976 [ 20.2977 msec/iter] Res64: 3D5C81ED9B60E581. AvgMaxErr = 0.148812770. MaxErr = 0.203125000. Residue shift count = 17047481.
[2020-02-24 03:43:46] M103928393 Iter# = 26620000 [25.61% complete] clocks = 00:03:22.984 [ 20.2984 msec/iter] Res64: CC78A4D41D7C36B0. AvgMaxErr = 0.148618110. MaxErr = 0.203125000. Residue shift count = 90014462.
[2020-02-24 03:47:13] M103928393 Iter# = 26630000 [25.62% complete] clocks = 00:03:25.069 [ 20.5070 msec/iter] Res64: 28666DBCDDDACF95. AvgMaxErr = 0.148698978. MaxErr = 0.218750000. Residue shift count = 71209289.
[2020-02-24 03:50:41] M103928393 Iter# = 26640000 [25.63% complete] clocks = 00:03:25.529 [ 20.5530 msec/iter] Res64: 97A4BE71C4C69F35. AvgMaxErr = 0.148884626. MaxErr = 0.234375000. Residue shift count = 74415220.
[2020-02-24 03:54:06] M103928393 Iter# = 26650000 [25.64% complete] clocks = 00:03:23.486 [ 20.3487 msec/iter] Res64: E21867090F1ED800. AvgMaxErr = 0.148870620. MaxErr = 0.218750000. Residue shift count = 35507203.
[2020-02-24 03:57:33] M103928393 Iter# = 26660000 [25.65% complete] clocks = 00:03:23.890 [ 20.3891 msec/iter] Res64: 80B2F7731FECF0FF. AvgMaxErr = 0.148867484. MaxErr = 0.187500000. Residue shift count = 44352630.
[2020-02-24 04:00:58] M103928393 Iter# = 26670000 [25.66% complete] clocks = 00:03:23.111 [ 20.3112 msec/iter] Res64: BCE2C7BC49D369AB. AvgMaxErr = 0.148717655. MaxErr = 0.250000000. Residue shift count = 94577777.
[2020-02-24 04:04:23] M103928393 Iter# = 26680000 [25.67% complete] clocks = 00:03:22.826 [ 20.2826 msec/iter] Res64: 8F058E4D087CAA99. AvgMaxErr = 0.148834566. MaxErr = 0.218750000. Residue shift count = 43649701.
M103928393 Roundoff warning on iteration 26687954, maxerr = 0.500000000000
Retrying iteration interval to see if roundoff error is reproducible.
Restarting M103928393 at iteration = 26680000. Res64: 8F058E4D087CAA99, residue shift count = 43649701
M103928393: using FFT length 5632K = 5767168 8-byte floats, initial residue shift count = 43649701
this gives an average 18.020698027177289 bits per digit
The test will be done in form of a 3-PRP test.
Retry of iteration interval with fatal roundoff error was successful.
[2020-02-24 04:10:03] M103928393 Iter# = 26690000 [25.68% complete] clocks = 00:02:56.183 [ 17.6184 msec/iter] Res64: BBC72ACB4A413178. AvgMaxErr = 0.148836098. MaxErr = 0.250000000. Residue shift count = 21946860.
Code:
[2020-02-24 15:43:21] M103928393 Iter# = 26600000 [25.59% complete] clocks = 00:03:23.937 [ 20.3937 msec/iter] Res64: 24006083D0E3DC08. AvgMaxErr = 0.148862082. MaxErr = 0.218750000. Residue shift count = 26402859.
[2020-02-24 15:46:48] M103928393 Iter# = 26610000 [25.60% complete] clocks = 00:03:24.153 [ 20.4153 msec/iter] Res64: 3D5C81ED9B60E581. AvgMaxErr = 0.148812770. MaxErr = 0.203125000. Residue shift count = 17047481.
[2020-02-24 15:50:13] M103928393 Iter# = 26620000 [25.61% complete] clocks = 00:03:22.396 [ 20.2396 msec/iter] Res64: CC78A4D41D7C36B0. AvgMaxErr = 0.148618110. MaxErr = 0.203125000. Residue shift count = 90014462.
[2020-02-24 15:53:38] M103928393 Iter# = 26630000 [25.62% complete] clocks = 00:03:23.522 [ 20.3522 msec/iter] Res64: 28666DBCDDDACF95. AvgMaxErr = 0.148698978. MaxErr = 0.218750000. Residue shift count = 71209289.
[2020-02-24 15:57:04] M103928393 Iter# = 26640000 [25.63% complete] clocks = 00:03:23.579 [ 20.3579 msec/iter] Res64: 97A4BE71C4C69F35. AvgMaxErr = 0.148884626. MaxErr = 0.234375000. Residue shift count = 74415220.
[2020-02-24 16:00:30] M103928393 Iter# = 26650000 [25.64% complete] clocks = 00:03:23.273 [ 20.3273 msec/iter] Res64: E21867090F1ED800. AvgMaxErr = 0.148870620. MaxErr = 0.218750000. Residue shift count = 35507203.
[2020-02-24 16:03:55] M103928393 Iter# = 26660000 [25.65% complete] clocks = 00:03:23.569 [ 20.3570 msec/iter] Res64: 80B2F7731FECF0FF. AvgMaxErr = 0.148867484. MaxErr = 0.187500000. Residue shift count = 44352630.
[2020-02-24 16:07:21] M103928393 Iter# = 26670000 [25.66% complete] clocks = 00:03:23.612 [ 20.3612 msec/iter] Res64: BCE2C7BC49D369AB. AvgMaxErr = 0.148717655. MaxErr = 0.250000000. Residue shift count = 94577777.
[2020-02-24 16:10:46] M103928393 Iter# = 26680000 [25.67% complete] clocks = 00:03:22.849 [ 20.2850 msec/iter] Res64: 8F058E4D087CAA99. AvgMaxErr = 0.148834566. MaxErr = 0.218750000. Residue shift count = 43649701.
[2020-02-24 16:14:12] M103928393 Iter# = 26690000 [25.68% complete] clocks = 00:03:23.453 [ 20.3454 msec/iter] Res64: 070333818EEE6AE7. AvgMaxErr = 0.148934921. MaxErr = 0.203125000. Residue shift count = 21946860.
M103928393 Roundoff warning on iteration 26694909, maxerr = 0.500000000000
Retrying iteration interval to see if roundoff error is reproducible.
Restarting M103928393 at iteration = 26690000. Res64: 070333818EEE6AE7, residue shift count = 21946860
M103928393: using FFT length 5632K = 5767168 8-byte floats, initial residue shift count = 21946860
this gives an average 18.020698027177289 bits per digit
The test will be done in form of a 3-PRP test.
Retry of iteration interval with fatal roundoff error was successful.
It's funny - I used to regularly curse this, um, accursed Haswell system, but now that it's no longer doing a significant % of my GIMPS crunching I'm really enjoying the if-I-can-get-reliable-results-from-this-piece-of-crap aspects. But no more LL-tests or LL-DCs on this system, that's for sure.
Oh, here is the full set of every-1M-iter interim residues for my run, including the G-check failures:
Code:
[2020-02-17 01:50:20] M103928393 Iter# = 1000000 Res64: BBBC3545748228D8. MaxErr = 0.250000000. shift = 72344423.
[2020-02-17 07:43:10] M103928393 Iter# = 2000000 Res64: 025C080BC4C6055B. MaxErr = 0.250000000. shift = 44575791.
[2020-02-17 13:36:33] M103928393 Iter# = 3000000 Res64: 85DB2AC8E470DFEC. MaxErr = 0.281250000. shift = 89726158.
[2020-02-17 19:21:57] M103928393 Iter# = 4000000 Res64: BA66000615EF926A. MaxErr = 0.218750000. shift = 87626016.
[2020-02-18 00:57:26] M103928393 Iter# = 5000000 Res64: E721C49DB5954694. MaxErr = 0.203125000. shift = 16969828.
[2020-02-18 06:24:33] M103928393 Iter# = 6000000 Res64: 9FDEA3BDD4B7FC45. MaxErr = 0.218750000. shift = 99400703.
[2020-02-18 13:07:25] M103928393 Iter# = 7000000 Res64: BE95DA8D2E615F52. MaxErr = 0.203125000. shift = 20741788.
[2020-02-18 22:01:34] M103928393 Iter# = 8000000 Res64: 60B0C6EE5B0DDCE6. MaxErr = 0.203125000. shift = 37090148.
[2020-02-19 06:30:59] M103928393 Iter# = 9000000 Res64: 511F198D4F2BE727. MaxErr = 0.203125000. shift = 93281939.
[2020-02-19 11:18:56] M103928393 Iter# = 10000000 Res64: AB10CE8109FF2E7E. MaxErr = 0.218750000. shift = 40201342.
[2020-02-19 17:02:02] M103928393 Iter# = 11000000 Res64: 0D79D26162CBA4B5. MaxErr = 0.203125000. shift = 101752387.
[2020-02-20 00:06:20] M103928393 Iter# = 12000000 Res64: EDA5832B97C533DC. MaxErr = 0.195312500. shift = 10233705.
[2020-02-20 09:02:10] M103928393 Iter# = 13000000 Res64: BF586784A1B0D9DD. MaxErr = 0.218750000. shift = 48015562.
[2020-02-20 18:00:31] M103928393 Iter# = 14000000 Res64: 48452729BF0A0732. MaxErr = 0.203125000. shift = 3158862.
[2020-02-21 02:56:16] M103928393 Iter# = 15000000 Res64: 8B80BB93C1361106. MaxErr = 0.203125000. shift = 96856822.
[2020-02-21 11:32:12] M103928393 Iter# = 16000000 Res64: FA48BCC7E0ABC5F7. MaxErr = 0.203125000. shift = 32146366.
[2020-02-21 20:44:32] M103928393 Iter# = 17000000 Res64: 377C185E322A9F98. MaxErr = 0.218750000. shift = 17221864.
[2020-02-22 02:20:33] M103928393 Iter# = 18000000 Res64: CD8C589AFB89DB19. MaxErr = 0.218750000. shift = 50146915.
[2020-02-22 08:04:17] M103928393 Iter# = 19000000 Res64: 24A95ADC07142A48. MaxErr = 0.187500000. shift = 69632311.
[2020-02-22 13:52:49] M103928393 Iter# = 20000000 Res64: 3C675B06ED79F705. MaxErr = 0.203125000. shift = 26706482.
[2020-02-22 19:42:02] M103928393 Iter# = 21000000 Res64: EC5660CA1082F345. MaxErr = 0.203125000. shift = 45377540.
[2020-02-23 01:23:59] M103928393 Iter# = 22000000 Res64: 5A363FFEAF9FFDD8. MaxErr = 0.218750000. shift = 33006999.
[2020-02-23 07:07:14] M103928393 Iter# = 23000000 Res64: 134074260FFFEFA5. MaxErr = 0.218750000. shift = 63572438.
[2020-02-23 12:49:41] M103928393 Iter# = 24000000 Res64: 3A557523E1D1B19A. MaxErr = 0.203125000. shift = 78368660.
[2020-02-23 18:27:31] M103928393 Iter# = 25000000 Res64: 138552F613F4BD6D. MaxErr = 0.203125000. shift = 16942643.
[2020-02-24 00:10:02] M103928393 Iter# = 26000000 Res64: 0E65A742C2BB20B1. MaxErr = 0.218750000. shift = 69955037.
Gerbicz check failed! Restarting from last-good-Gerbicz-check data, or from scratch if iteration < 1000000
[2020-02-24 12:17:07] M103928393 Iter# = 27000000 Res64: F3BE8DF410F5D624. MaxErr = 0.218750000. shift = 16345459.
[2020-02-24 18:01:56] M103928393 Iter# = 27000000 Res64: 6A0AD63777D57CDB. MaxErr = 0.203125000. shift = 16345459.
[2020-02-25 13:36:24] M103928393 Iter# = 28000000 Res64: A8F349B3B5B1AFF7. MaxErr = 0.203125000. shift = 58352056.
[2020-02-25 19:04:37] M103928393 Iter# = 29000000 Res64: F1CF00263AA7CC28. MaxErr = 0.203125000. shift = 48574191.
[2020-02-26 12:58:12] M103928393 Iter# = 30000000 Res64: 99D4FB9E0C76ADE4. MaxErr = 0.203125000. shift = 74711714.
[2020-02-26 20:16:40] M103928393 Iter# = 31000000 Res64: CD7BA2D0CBF319C6. MaxErr = 0.203125000. shift = 46484356.
[2020-02-27 01:38:02] M103928393 Iter# = 32000000 Res64: D26AFB429CFD18AD. MaxErr = 0.203125000. shift = 60793685.
[2020-02-27 06:51:41] M103928393 Iter# = 33000000 Res64: 02C2DAC6FA89A512. MaxErr = 0.203125000. shift = 37220461.
[2020-02-27 12:07:02] M103928393 Iter# = 34000000 Res64: F73E3B502508EC17. MaxErr = 0.250000000. shift = 73230751.
[2020-02-27 17:20:32] M103928393 Iter# = 35000000 Res64: 378304DEB0552788. MaxErr = 0.203125000. shift = 16550084.
[2020-02-28 01:06:29] M103928393 Iter# = 36000000 Res64: 44931F56605C0665. MaxErr = 0.218750000. shift = 50708651.
[2020-02-28 06:17:32] M103928393 Iter# = 37000000 Res64: 440D64F93493A5F0. MaxErr = 0.203125000. shift = 87534239.
[2020-02-28 11:31:11] M103928393 Iter# = 38000000 Res64: D5A0E22C8AC79E65. MaxErr = 0.218750000. shift = 35802507.
[2020-02-28 17:02:25] M103928393 Iter# = 39000000 Res64: 2C198B4B490A06E3. MaxErr = 0.218750000. shift = 30983032.
[2020-02-28 22:18:04] M103928393 Iter# = 40000000 Res64: 9F79C2894C3D7FBC. MaxErr = 0.187500000. shift = 44669779.
[2020-02-29 03:34:38] M103928393 Iter# = 41000000 Res64: B13F55C1BAE6FCC8. MaxErr = 0.218750000. shift = 46462316.
[2020-02-29 22:57:48] M103928393 Iter# = 42000000 Res64: EF9620EBB65ACC84. MaxErr = 0.218750000. shift = 12147797.
[2020-03-01 15:15:37] M103928393 Iter# = 43000000 Res64: 052C1F13AE12DFD3. MaxErr = 0.218750000. shift = 22454771.
[2020-03-01 20:24:23] M103928393 Iter# = 44000000 Res64: 7CDD9FA077AFEF10. MaxErr = 0.187500000. shift = 41381271.
[2020-03-02 01:33:42] M103928393 Iter# = 45000000 Res64: B1A45BB8243F2532. MaxErr = 0.203125000. shift = 46613693.
[2020-03-02 13:17:34] M103928393 Iter# = 46000000 Res64: 2C0A2FFA764C82BA. MaxErr = 0.203125000. shift = 92985671.
[2020-03-02 18:22:38] M103928393 Iter# = 47000000 Res64: 00CD4BD9278888D4. MaxErr = 0.218750000. shift = 46300678.
[2020-03-02 23:45:02] M103928393 Iter# = 48000000 Res64: B98EDEC6431ACA7A. MaxErr = 0.218750000. shift = 76984717.
[2020-03-03 16:22:45] M103928393 Iter# = 49000000 Res64: D3C28C254DD0D26F. MaxErr = 0.218750000. shift = 100048267.
Gerbicz check failed! Restarting from last-good-Gerbicz-check data, or from scratch if iteration < 1000000
[2020-03-03 21:20:31] M103928393 Iter# = 50000000 Res64: 5752753AF252BE20. MaxErr = 0.203125000. shift = 57343354.
[2020-03-04 03:14:25] M103928393 Iter# = 50000000 Res64: 4063D9F2B7BE34EE. MaxErr = 0.203125000. shift = 57343354.
[2020-03-04 08:23:33] M103928393 Iter# = 51000000 Res64: 490F2A5D543108AA. MaxErr = 0.218750000. shift = 45026483.
[2020-03-04 14:27:26] M103928393 Iter# = 52000000 Res64: 8A931552D40A1C3C. MaxErr = 0.187500000. shift = 50135037.
[2020-03-04 20:42:32] M103928393 Iter# = 53000000 Res64: 46C9814B2B4B4429. MaxErr = 0.218750000. shift = 60994797.
[2020-03-05 13:27:18] M103928393 Iter# = 54000000 Res64: F424E1ABA000A55D. MaxErr = 0.218750000. shift = 36384582.
[2020-03-05 18:46:37] M103928393 Iter# = 55000000 Res64: 9695E2732CD478FE. MaxErr = 0.218750000. shift = 66125097.
[2020-03-06 00:08:23] M103928393 Iter# = 56000000 Res64: 9BD8E857DAFCCD37. MaxErr = 0.218750000. shift = 58385411.
[2020-03-06 05:32:35] M103928393 Iter# = 57000000 Res64: 3E97DC5E3101937E. MaxErr = 0.203125000. shift = 30148663.
[2020-03-06 10:59:50] M103928393 Iter# = 58000000 Res64: 681CA06F306588A8. MaxErr = 0.218750000. shift = 57683924.
[2020-03-06 16:36:30] M103928393 Iter# = 59000000 Res64: 2313B6026B7D5B5A. MaxErr = 0.218750000. shift = 100909260.
[2020-03-06 21:54:06] M103928393 Iter# = 60000000 Res64: 79DA089E14798438. MaxErr = 0.203125000. shift = 52052766.
[2020-03-07 03:10:30] M103928393 Iter# = 61000000 Res64: 0604748846C7BA43. MaxErr = 0.203125000. shift = 50100806.
[2020-03-07 13:24:02] M103928393 Iter# = 62000000 Res64: 73BA8F7177734BFC. MaxErr = 0.218750000. shift = 24937938.
[2020-03-07 18:42:52] M103928393 Iter# = 63000000 Res64: E15DB4126ED9FDCC. MaxErr = 0.203125000. shift = 102809206.
[2020-03-08 00:02:44] M103928393 Iter# = 64000000 Res64: ACE78A6D6AFCD490. MaxErr = 0.187500000. shift = 10622924.
[2020-03-08 06:24:27] M103928393 Iter# = 65000000 Res64: 8B6F19963789D44D. MaxErr = 0.203125000. shift = 15187609.
[2020-03-08 11:48:59] M103928393 Iter# = 66000000 Res64: A13DA6AEB79BA5BC. MaxErr = 0.218750000. shift = 19703128.
[2020-03-08 17:08:14] M103928393 Iter# = 67000000 Res64: 9DE7325DEBECE0A7. MaxErr = 0.203125000. shift = 31776614.
[2020-03-08 22:27:35] M103928393 Iter# = 68000000 Res64: 0DD39D6A8AEA83C5. MaxErr = 0.187500000. shift = 16852816.
[2020-03-09 03:47:32] M103928393 Iter# = 69000000 Res64: CB9C15D14E09FA8A. MaxErr = 0.218750000. shift = 34242940.
[2020-03-09 09:10:22] M103928393 Iter# = 70000000 Res64: 2EF92E9C7474B7DF. MaxErr = 0.203125000. shift = 13081306.
[2020-03-09 14:32:28] M103928393 Iter# = 71000000 Res64: 68E9F52257DED1CB. MaxErr = 0.203125000. shift = 83966239.
[2020-03-09 19:42:51] M103928393 Iter# = 72000000 Res64: FB96E02765CAE316. MaxErr = 0.218750000. shift = 67499467.
Gerbicz check failed! Restarting from last-good-Gerbicz-check data, or from scratch if iteration < 1000000
[2020-03-10 02:44:11] M103928393 Iter# = 73000000 Res64: 684D004C6CB2BE79. MaxErr = 0.203125000. shift = 89238658.
[2020-03-10 07:58:53] M103928393 Iter# = 73000000 Res64: E618711B71EB2D54. MaxErr = 0.218750000. shift = 89238658.
[2020-03-10 13:41:19] M103928393 Iter# = 74000000 Res64: 17DDE850894B6914. MaxErr = 0.218750000. shift = 5272140.
[2020-03-10 19:02:27] M103928393 Iter# = 75000000 Res64: 32B399B8B93D9210. MaxErr = 0.218750000. shift = 45348029.
Gerbicz check failed! Restarting from last-good-Gerbicz-check data, or from scratch if iteration < 1000000
[2020-03-11 01:25:59] M103928393 Iter# = 76000000 Res64: 541D72E853D20C99. MaxErr = 0.218750000. shift = 93566275.
[2020-03-11 13:02:27] M103928393 Iter# = 76000000 Res64: C98E4FDC71CBB181. MaxErr = 0.218750000. shift = 93566275.
[2020-03-11 18:15:54] M103928393 Iter# = 77000000 Res64: 3D4B7BCF6B1A8368. MaxErr = 0.203125000. shift = 22817827.
[2020-03-11 23:27:29] M103928393 Iter# = 78000000 Res64: 7E17BD3314DBBEE4. MaxErr = 0.218750000. shift = 25349038.
[2020-03-12 11:59:14] M103928393 Iter# = 79000000 Res64: D4179244E4CEC500. MaxErr = 0.203125000. shift = 35774788.
[2020-03-12 17:35:40] M103928393 Iter# = 80000000 Res64: F91127EB7F404C68. MaxErr = 0.187500000. shift = 33102786.
[2020-03-12 23:08:51] M103928393 Iter# = 81000000 Res64: 8A0E250E067312BA. MaxErr = 0.203125000. shift = 98858751.
[2020-03-13 11:16:35] M103928393 Iter# = 82000000 Res64: 6C5E069A5B68F3A0. MaxErr = 0.203125000. shift = 56389958.
[2020-03-13 16:33:22] M103928393 Iter# = 83000000 Res64: 95B59DE3DC7CCF89. MaxErr = 0.218750000. shift = 7674443.
[2020-03-13 21:52:00] M103928393 Iter# = 84000000 Res64: 594F77368081548A. MaxErr = 0.203125000. shift = 15739078.
[2020-03-14 03:11:26] M103928393 Iter# = 85000000 Res64: 0D438FDBFFA56306. MaxErr = 0.234375000. shift = 21319387.
[2020-03-14 08:33:02] M103928393 Iter# = 86000000 Res64: 05C10F4FEA66F3EF. MaxErr = 0.218750000. shift = 14519481.
[2020-03-14 13:54:05] M103928393 Iter# = 87000000 Res64: E53A17F251CB45DE. MaxErr = 0.218750000. shift = 76083446.
[2020-03-14 19:14:57] M103928393 Iter# = 88000000 Res64: C1C67FC031E0C7C2. MaxErr = 0.218750000. shift = 59669909.
[2020-03-15 00:34:47] M103928393 Iter# = 89000000 Res64: 4C71AE552A4C1882. MaxErr = 0.203125000. shift = 5762860.
[2020-03-15 11:00:37] M103928393 Iter# = 90000000 Res64: 4FEC488B88836FAE. MaxErr = 0.203125000. shift = 58418774.
[2020-03-15 16:17:27] M103928393 Iter# = 91000000 Res64: F769669BA8561D38. MaxErr = 0.203125000. shift = 19813633.
[2020-03-15 21:34:46] M103928393 Iter# = 92000000 Res64: 6D5B294ED5503BF6. MaxErr = 0.207031250. shift = 37832227.
[2020-03-16 02:53:00] M103928393 Iter# = 93000000 Res64: 3ACB5D989A7F0AFF. MaxErr = 0.218750000. shift = 21992959.
[2020-03-16 08:14:54] M103928393 Iter# = 94000000 Res64: B33A4FFDC97BEA3F. MaxErr = 0.203125000. shift = 59446616.
[2020-03-16 13:34:35] M103928393 Iter# = 95000000 Res64: C9E06367147C632E. MaxErr = 0.203125000. shift = 49196708.
[2020-03-16 18:56:31] M103928393 Iter# = 96000000 Res64: 111AAB6935CA25B3. MaxErr = 0.218750000. shift = 9592755.
[2020-03-17 00:15:11] M103928393 Iter# = 97000000 Res64: 5DF28D3B980EDDC1. MaxErr = 0.218750000. shift = 100780262.
[2020-03-17 05:35:12] M103928393 Iter# = 98000000 Res64: 61712C5A42CD31F5. MaxErr = 0.218750000. shift = 76892552.
[2020-03-17 10:53:45] M103928393 Iter# = 99000000 Res64: 3C5CBC3DC2FE5DAF. MaxErr = 0.218750000. shift = 38276972.
[2020-03-17 16:11:59] M103928393 Iter# = 100000000 Res64: C8A48D71DC6613FD. MaxErr = 0.218750000. shift = 73920976.
[2020-03-17 21:19:25] M103928393 Iter# = 101000000 Res64: F6C5F459CD73AFFC. MaxErr = 0.218750000. shift = 40478354.
[2020-03-18 02:28:11] M103928393 Iter# = 102000000 Res64: 1A67D6A14E2C10C8. MaxErr = 0.203125000. shift = 56433856.
[2020-03-18 18:36:41] M103928393 Iter# = 103000000 Res64: D01FC617E6CD6FAB. MaxErr = 0.218750000. shift = 76799979.
Last fiddled with by ewmayer on 2020-03-22 at 23:38 |
|
|
|
|
|
|
#29 |
|
Romulan Interpreter
Jun 2011
Thailand
7×1,373 Posts |
What do I see there? Do you have some kind of "random" shift for every iteration?
|
|
|
|
|
|
#30 | |
|
"Robert Gerbicz"
Oct 2005
Hungary
1484₁₀ Posts |
Quote:
You can do this in multiple ways. Say, forcing an error check at 103928000 and then (boringly) double-checking the last 393 iterations. Another way is gpuowl's idea: go past, do 929000 iterations and then force an error check. George's code is a little more complicated on this (not even using a fixed block size=1000). LaurV, the whole shifting idea isn't really necessary (but it is still good to have shifting). |
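The first option (force a check at the last full block, then redo the leftover iterations twice) can be sketched in Python as follows. This is only a toy illustration under stated assumptions: it uses the usual Gerbicz product recurrence d(t+1) = x0 · d(t)^(2^L) for a plain squaring chain starting at 3, ignores residue shifts and FFT arithmetic, and every name in it (`gerbicz_prp`, `block`, `check_every`, `fault_at`) is hypothetical rather than taken from Mlucas, gpuowl or Prime95.

```python
def gerbicz_prp(p, block, check_every, fault_at=None):
    """Compute 3^(2^p) mod (2^p - 1) with a Gerbicz-style error check.

    block       -- iterations between updates of the running product d
    check_every -- verify d once per this many blocks (and at the last full block)
    fault_at    -- optional iteration at which to flip one bit, once, to
                   simulate a hardware error
    Returns (final_residue, number_of_rollbacks).
    """
    n = (1 << p) - 1
    x0 = 3
    x, d = x0, x0              # d = product of residues at block boundaries
    d_prev = d
    good = (0, x, d)           # last state that passed a check
    rollbacks = 0
    last_full = (p // block) * block
    i = 0
    while i < last_full:
        x = x * x % n
        i += 1
        if i == fault_at:
            x ^= 1             # inject a single-bit error
            fault_at = None    # one-shot: the retry runs clean
        if i % block == 0:
            d_prev, d = d, d * x % n
            if (i // block) % check_every == 0 or i == last_full:
                # Gerbicz identity: d(t) == x0 * d(t-1)^(2^block) (mod n)
                if d == x0 * pow(d_prev, 1 << block, n) % n:
                    good = (i, x, d)
                else:
                    i, x, d = good   # roll back to the last verified state
                    rollbacks += 1
    # Leftover p - last_full iterations: no full block, so simply do them twice.
    tail = p - last_full
    first = pow(x, 1 << tail, n)
    second = x
    for _ in range(tail):
        second = second * second % n
    if first != second:
        raise RuntimeError("tail double-check mismatch")
    return first, rollbacks


if __name__ == "__main__":
    # M127 is prime, so 3^(2^127) mod M127 must equal 9.
    print(gerbicz_prp(127, block=10, check_every=3))
    print(gerbicz_prp(127, block=10, check_every=3, fault_at=47))
```

With p = 103928393 and block = 1000 the last full block ends at 103928000 and the tail is 393 iterations, matching the numbers in the post; the toy run uses p = 127 only so that it finishes instantly.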
|
|
|
|
|
|
#31 |
|
Romulan Interpreter
Jun 2011
Thailand
9611₁₀ Posts |
My argument was that having the shift value changed randomly every time is stupid. You must start with a shift, and keep it to the end. Otherwise, reproducing any test is impossible (like for hunting bugs, fixing FFT mess, etc.). I used to find bugs in cudaLucas that appeared for a particular shift but not for another, i.e. for particular values in the FFT, and those bugs could be traced by repeating the test with the same shift and watching where the residues start to differ (do a binary search with the checkpoint value for residues, etc). Maybe my understanding of the shift column is wrong, but with random shifts not only does such "bug fixing" become impossible, but the tests themselves also become unsure: how do you know that, even if you started with two different shifts, it didn't happen that at exactly the same point of both tests (FC and DC) the shifts were the same, and both tests are screwed? (Probabilistically impossible, but the mother life is a bitch..., if you catch my point...)
Last fiddled with by LaurV on 2020-03-23 at 16:19 |
|
|
|
|
|
#32 | ||
|
∂²ω=0
Sep 2002
República de California
2D77₁₆ Posts |
Quote:
Quote:
Code:
define shift_update(s0,niters,p) {
	auto s,i;
	s = s0;
	/* each mod-squaring doubles the shift count (mod p) */
	for(i = 0; i < niters; i++) {
		s = 2*s % p;
	}
	print s0," * 2^",niters," (mod ",p,") = ",s,".\n";
}
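For readers without bc at hand, the same shift update can be written as a short Python sketch. The closed form via `pow(2, niters, p)` is an illustrative shortcut and not something from the post; the loop version mirrors the bc routine step by step.

```python
def shift_update(s0: int, niters: int, p: int) -> int:
    """Shift count after niters mod-squarings modulo 2^p - 1, starting from s0.

    Each squaring doubles the shift (mod p), so the result is s0 * 2^niters mod p.
    """
    return s0 * pow(2, niters, p) % p


def shift_update_loop(s0: int, niters: int, p: int) -> int:
    """Same computation as a literal doubling loop, mirroring the bc routine."""
    s = s0
    for _ in range(niters):
        s = 2 * s % p
    return s


if __name__ == "__main__":
    p = 103928393
    # Inputs taken from the bc run in the post, which reports 72344423.
    print(shift_update(13429491, 10**6, p))
```

Because the update is deterministic, any logged shift can be recomputed from the initial shift and the iteration count alone.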
Code:
p=103928393
shift_update(13429491,10^6,p)
13429491 * 2^1000000 (mod 103928393) = 72344423.
Interestingly, though, adding randomization to this process proves crucial for doing *Fermat* number testing-with-shift. Here's why: for squaring chains modulo a Fermat number Fm, each subsequent squaring doubles the shift count (mod 2^m), so we have the problem that any nonzero initial shift s0 will lead to a shift s = 0 after precisely m square-mods, and the shift will remain 0 throughout the remainder of the squaring chain. One way to get around this problem is to define an auxiliary random-bit array, and on the (i)th square-mod, if bit[i] = 1, do a further mod-doubling of the residue, thus yielding the shift-count update s[i] = 2.s[i-1] + bit[i]. For a pair of Pepin tests of Fm such as are typically done to allow for real-time cross-validation of interim residues, it is also advisable to seed the bit-arrays differently for each run. W.r.to debug-purposes reproducibility, the key is pseudorandomness in all this, i.e. that one's various 'random' numbers are 'reproducibly random' as the admittedly oxymoronic term of art has it. |
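The Fermat-number shift bookkeeping described here can be illustrated with a small Python sketch. It follows the post's description only (shift doubled mod 2^m, plus an optional pseudorandom bit per squaring); the function name, the use of Python's `random.Random` as the bit source, and the seed handling are assumptions for illustration, not Mlucas's implementation.

```python
import random


def fermat_shift_chain(s0, m, niters, seed=None):
    """Return the list of shift counts after each of niters square-mods.

    Without a seed the update is s -> 2*s (mod 2^m), which collapses to 0.
    With a seed, a reproducible pseudorandom bit is added on every step:
    s -> 2*s + bit (mod 2^m).
    """
    mod = 1 << m
    rng = random.Random(seed) if seed is not None else None
    s = s0
    history = []
    for _ in range(niters):
        bit = rng.getrandbits(1) if rng is not None else 0
        s = (2 * s + bit) % mod
        history.append(s)
    return history


if __name__ == "__main__":
    m = 14
    plain = fermat_shift_chain(12345, m, 20)
    print("no random bits:", plain[m - 2:m + 2])      # hits 0 at step m and stays there
    run_a = fermat_shift_chain(12345, m, 20, seed=1)
    run_b = fermat_shift_chain(12345, m, 20, seed=1)
    print("same seed reproduces the chain:", run_a == run_b)
```

Using a seeded generator is what makes the shifts "reproducibly random": rerunning with the same seed replays the same shift sequence for debugging, while a second run can simply be given a different seed.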
||
|
|
|
|
|
#33 |
|
Romulan Interpreter
Jun 2011
Thailand
7·1,373 Posts |
Exactly. We now talk the same language. Then my understanding was wrong; I was put off by that random-looking string of residues on each line (which didn't really need to be printed, you could just print the shift at the beginning). In the past there was a talk on the forum about "random shifts" and I was afraid that you subscribed to that silly idea. False alarm. Sorry.
Last fiddled with by LaurV on 2020-03-24 at 02:50 |
|
|
|