#45 | |
|
Sep 2002
Database er0rr
3,739 Posts |
Quote:
Last fiddled with by paulunderwood on 2019-06-07 at 13:40 |
|
|
|
|
|
|
#46 | |
|
∂²ω=0
Sep 2002
República de California
103×113 Posts |
Quote:
Yes, it's very easy to get up and running: download the ARMv8 binary linked at the readme page and set up a pair of rundirs, one each for the jobs which will run on the big and little CPUs. On the N2, the two a53 cores are numbered 0-1 in /proc/cpuinfo and the four a73 cores are 2-5. I suggest you double-check that in your own copy of said file, because it's crucial to getting the most out of your Mlucas runs.

So say you call the a73-rundir 'run0' and the a53 one 'run1'. To create the optimal-FFT-config files in each:

In run0: [path to exec] -s m -iters 100 -cpu 2:5
In run1: [path to exec] -s m -iters 100 -cpu 0:1

I suggest doing these sequentially, to avoid any timing weirdness from the short timing subtests done by each self-test somehow throwing each other off. By way of reference, here is the 4a73 mlucas.cfg from my N2 self-tests:
Code:
18.0
2048 msec/iter = 37.48 ROE[avg,max] = [0.325223214, 0.375000000] radices = 256 16 16 16 0 0 0 0 0 0
2304 msec/iter = 43.51 ROE[avg,max] = [0.287946429, 0.343750000] radices = 288 16 16 16 0 0 0 0 0 0
2560 msec/iter = 48.17 ROE[avg,max] = [0.275669643, 0.312500000] radices = 160 16 16 32 0 0 0 0 0 0
2816 msec/iter = 54.51 ROE[avg,max] = [0.259933036, 0.312500000] radices = 352 16 16 16 0 0 0 0 0 0
3072 msec/iter = 60.39 ROE[avg,max] = [0.316294643, 0.400000000] radices = 192 16 16 32 0 0 0 0 0 0
3328 msec/iter = 65.47 ROE[avg,max] = [0.280580357, 0.375000000] radices = 208 16 16 32 0 0 0 0 0 0
3584 msec/iter = 69.10 ROE[avg,max] = [0.325000000, 0.375000000] radices = 224 16 16 32 0 0 0 0 0 0
3840 msec/iter = 75.63 ROE[avg,max] = [0.275892857, 0.312500000] radices = 240 16 16 32 0 0 0 0 0 0
4096 msec/iter = 79.60 ROE[avg,max] = [0.267633929, 0.343750000] radices = 256 16 16 32 0 0 0 0 0 0
4608 msec/iter = 91.84 ROE[avg,max] = [0.284375000, 0.375000000] radices = 288 16 16 32 0 0 0 0 0 0
5120 msec/iter = 104.83 ROE[avg,max] = [0.323437500, 0.406250000] radices = 320 16 16 32 0 0 0 0 0 0
5632 msec/iter = 114.77 ROE[avg,max] = [0.228450230, 0.250000000] radices = 352 16 16 32 0 0 0 0 0 0
6144 msec/iter = 134.42 ROE[avg,max] = [0.240848214, 0.281250000] radices = 768 16 16 16 0 0 0 0 0 0
6656 msec/iter = 149.31 ROE[avg,max] = [0.266964286, 0.343750000] radices = 208 32 32 16 0 0 0 0 0 0
7168 msec/iter = 159.99 ROE[avg,max] = [0.228906250, 0.281250000] radices = 224 32 32 16 0 0 0 0 0 0
7680 msec/iter = 174.89 ROE[avg,max] = [0.252455357, 0.312500000] radices = 240 32 32 16 0 0 0 0 0 0
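The core-numbering check suggested above can be scripted. What follows is a minimal sketch, not part of Mlucas: it assumes the ARM64 Linux layout of /proc/cpuinfo, where each "processor" entry carries a "CPU part" field, and it relies on the part IDs 0xd03 (Cortex-A53) and 0xd09 (Cortex-A73). The helper names cores_by_part and cpu_range are hypothetical, and SAMPLE is an illustrative stand-in for the real file.

```python
from collections import defaultdict

# "CPU part" IDs for the two core types discussed in the post.
KNOWN_PARTS = {"0xd03": "Cortex-A53", "0xd09": "Cortex-A73"}

# Illustrative stand-in for /proc/cpuinfo (assumed layout, trimmed to two fields).
SAMPLE = "\n".join(
    "processor\t: %d\nCPU part\t: %s\n" % (i, "0xd03" if i < 2 else "0xd09")
    for i in range(6)
)


def cores_by_part(cpuinfo_text):
    """Map each core type to the sorted list of processor indices of that type."""
    groups = defaultdict(list)
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without a "key : value" pair
        key, value = key.strip(), value.strip()
        if key == "processor":
            current = int(value)
        elif key == "CPU part" and current is not None:
            groups[KNOWN_PARTS.get(value.lower(), value)].append(current)
    return {name: sorted(ids) for name, ids in groups.items()}


def cpu_range(ids):
    """Format contiguous core indices as lo:hi, the form passed to -cpu above."""
    if ids != list(range(ids[0], ids[-1] + 1)):
        raise ValueError("core indices are not contiguous: %r" % (ids,))
    return "%d:%d" % (ids[0], ids[-1])


if __name__ == "__main__":
    # On a real board, replace SAMPLE with open("/proc/cpuinfo").read().
    for name, ids in cores_by_part(SAMPLE).items():
        print(name, cpu_range(ids))
```

If the grouping on your own board differs from 0:1 and 2:5, adjust the -cpu arguments of the two self-test commands accordingly.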
Once you're up and running and have a few checkpoints under your belt, I'll post an "advance peek" v19 binary - same build I recently switched all my ARMs to - which still lacks the PRP support which will go into the final v19 release, but has some speedups related to relaxing the floating-point accuracy requirements for exponents not close to the p_max for each FFT length. I'm getting 2-8% speedup (depending on FFT length, exponent and random run-to-run timing variations) from using the new code, for the ~90% of exponents at each FFT length which are eligible for the accuracy-for-speed tradeoff. From the user perspective, it's a simple drop-in binary replacement, though.
Last fiddled with by ewmayer on 2019-06-07 at 20:17 |
|
|
|
|
|
|
#47 | |
|
Sep 2002
Database er0rr
3,739 Posts |
Quote:
Nice website btw. |
|
|
|
|
|
|
#48 | |
|
∂²ω=0
Sep 2002
República de California
103·113 Posts |
Quote:
BTW, for my phones, I am requiring each one to produce 2 matching DC results prior to letting it start first-time-LL-test work. I do that via the primenet.py script. The first time I run it I use

./*py -d -t 0 -T DoubleCheck -u [uid] -p [pwd]

which creates a 2-entry worktodo.ini file. Then on subsequent invocations (whenever the device in question completes an LL-job of either kind) I use

./*py -d -t 0 -T SmallestAvail -u [uid] -p [pwd]

("-d" enables debug, causing the script to provide some basic informational printing of work-submit and assignment-fetch. "-t 0" means run in single-shot once-only mode, as opposed to the automated every-6-hours mode which is the default.)

Thanks for the thumbs-up on the Readme page - it's a continuing struggle to strike a balance between providing enough info and not overwhelming the new user, so I rely on user feedback to help me maintain said balance. |
|
|
|
|
|
|
#49 | |
|
Sep 2002
Database er0rr
3,739 Posts |
Quote:
|
|
|
|
|
|
|
#50 | |
|
∂²ω=0
Sep 2002
República de California
103·113 Posts |
Quote:
|
|
|
|
|
|
|
#51 | |
|
Sep 2002
Database er0rr
7233₈ Posts |
Quote:
Code:
18.0
2048 msec/iter = 39.39 ROE[avg,max] = [0.003125000, 0.375000000] radices = 128 16 16 32 0 0 0 0 0 0
2304 msec/iter = 44.54 ROE[avg,max] = [0.002785714, 0.375000000] radices = 144 16 16 32 0 0 0 0 0 0
2560 msec/iter = 48.91 ROE[avg,max] = [0.002387312, 0.281250000] radices = 160 16 16 32 0 0 0 0 0 0
2816 msec/iter = 55.53 ROE[avg,max] = [0.002627232, 0.312500000] radices = 176 16 16 32 0 0 0 0 0 0
3072 msec/iter = 61.21 ROE[avg,max] = [0.002651786, 0.375000000] radices = 192 16 16 32 0 0 0 0 0 0
3328 msec/iter = 65.64 ROE[avg,max] = [0.002812500, 0.312500000] radices = 208 16 16 32 0 0 0 0 0 0
3584 msec/iter = 70.97 ROE[avg,max] = [0.002535714, 0.281250000] radices = 224 16 16 32 0 0 0 0 0 0
3840 msec/iter = 76.91 ROE[avg,max] = [0.002471819, 0.281250000] radices = 240 16 16 32 0 0 0 0 0 0
4096 msec/iter = 81.47 ROE[avg,max] = [0.002280134, 0.281250000] radices = 256 16 16 32 0 0 0 0 0 0
4608 msec/iter = 94.10 ROE[avg,max] = [0.002476144, 0.281250000] radices = 288 16 16 32 0 0 0 0 0 0
5120 msec/iter = 107.07 ROE[avg,max] = [0.003209821, 0.375000000] radices = 320 16 16 32 0 0 0 0 0 0
5632 msec/iter = 129.73 ROE[avg,max] = [0.002598214, 0.312500000] radices = 176 32 32 16 0 0 0 0 0 0
6144 msec/iter = 141.46 ROE[avg,max] = [0.002475446, 0.281250000] radices = 192 32 32 16 0 0 0 0 0 0
6656 msec/iter = 152.57 ROE[avg,max] = [0.002642857, 0.312500000] radices = 208 32 32 16 0 0 0 0 0 0
7168 msec/iter = 164.32 ROE[avg,max] = [0.002260045, 0.250000000] radices = 224 32 32 16 0 0 0 0 0 0
7680 msec/iter = 178.51 ROE[avg,max] = [0.002350551, 0.281250000] radices = 240 32 32 16 0 0 0 0 0 0
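To compare two mlucas.cfg files like the ones posted in this thread, the lines can be parsed into a small table. This is a rough sketch that assumes only the line layout visible above; parse_cfg, radix_product and timing_ratio are hypothetical helper names, not part of Mlucas. It also checks a pattern that holds for every line shown here: the product of the nonzero radices equals half the FFT length in doubles.

```python
import re

# One data line of the cfg format shown above; the leading "18.0" version
# line does not match and is skipped.
CFG_LINE = re.compile(
    r"^\s*(\d+)\s+msec/iter\s*=\s*([\d.]+)\s+"
    r"ROE\[avg,max\]\s*=\s*\[\s*([\d.]+)\s*,\s*([\d.]+)\s*\]\s+"
    r"radices\s*=\s*((?:\d+\s+)*\d+)\s*$"
)


def parse_cfg(text):
    """Return {FFT length in K: timing/ROE/radix info} for each data line."""
    entries = {}
    for line in text.splitlines():
        m = CFG_LINE.match(line)
        if not m:
            continue
        radices = [int(r) for r in m.group(5).split() if int(r) != 0]
        entries[int(m.group(1))] = {
            "msec_per_iter": float(m.group(2)),
            "roe_avg": float(m.group(3)),
            "roe_max": float(m.group(4)),
            "radices": radices,
        }
    return entries


def radix_product(radices):
    """Product of the nonzero radices of one FFT configuration."""
    product = 1
    for r in radices:
        product *= r
    return product


def timing_ratio(cfg_a, cfg_b):
    """For FFT lengths present in both cfgs, return msec/iter of b over a."""
    return {
        k: cfg_b[k]["msec_per_iter"] / cfg_a[k]["msec_per_iter"]
        for k in sorted(set(cfg_a) & set(cfg_b))
    }


if __name__ == "__main__":
    # The 5632K lines from the two cfg files posted in this thread.
    first = parse_cfg("5632 msec/iter = 114.77 ROE[avg,max] = [0.228450230, 0.250000000] radices = 352 16 16 32 0 0 0 0 0 0")
    second = parse_cfg("5632 msec/iter = 129.73 ROE[avg,max] = [0.002598214, 0.312500000] radices = 176 32 32 16 0 0 0 0 0 0")
    print(timing_ratio(first, second))
```

A ratio noticeably above 1 at a given length flags an FFT size where the second file's radix set is the slower one, as at 5632K here.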
|
|
|
|
|
|
|
#52 | |
|
∂²ω=0
Sep 2002
República de California
11639₁₀ Posts |
Quote:
Are you running a full-blown GIMPS assignment now? I'd be interested in seeing a sample of the typical checkpoint timing line from the p*.stat file. (And if you started said run using v18, what effect ctrl-c and restart using the v19 binary has - you could just use the above cfg-file for that, unless you are testing @2816K or 5632K, in which case radix-352 will likely help, timing-wise). Will you be using the N2 for development work of your own? Last fiddled with by ewmayer on 2019-06-08 at 21:33 |
|
|
|
|
|
|
#53 | |
|
Sep 2002
Database er0rr
3739₁₀ Posts |
Quote:
Code:
INFO: no restart file found...starting run from scratch.
M9141xxxxx: using FFT length 5120K = 5242880 8-byte floats, initial residue shift count = 6324947
this gives an average 17.435125541687011 bits per digit
Using complex FFT radices 320 16 16 32
[Jun 08 19:23:42] M914xxxxx Iter# = 10000 [ 0.01% complete] clocks = 00:18:13.727 [109.3727 msec/iter] Res64: 771472D5BD75657A. AvgMaxErr = 0.062755114. MaxErr = 0.085937500. Residue shift count = 24468007.
[Jun 08 19:41:24] M914xxxxx Iter# = 20000 [ 0.02% complete] clocks = 00:17:40.616 [106.0617 msec/iter] Res64: 6A3AB4D6D38D864F. AvgMaxErr = 0.062874775. MaxErr = 0.093750000. Residue shift count = 41145087.
[Jun 08 19:59:16] M914xxxxx Iter# = 30000 [ 0.03% complete] clocks = 00:17:50.354 [107.0355 msec/iter] Res64: 6A42564A06E2381C. AvgMaxErr = 0.062869088. MaxErr = 0.085937500. Residue shift count = 28935869.
[Jun 08 20:17:42] M914xxxxx Iter# = 40000 [ 0.04% complete] clocks = 00:18:21.247 [110.1247 msec/iter] Res64: 4F6CF208BAE55456. AvgMaxErr = 0.062931570. MaxErr = 0.085937500. Residue shift count = 80180192.
[Jun 08 20:35:48] M914xxxxx Iter# = 50000 [ 0.05% complete] clocks = 00:18:03.309 [108.3310 msec/iter] Res64: C9F7EABB3783A435. AvgMaxErr = 0.062861539. MaxErr = 0.085937500. Residue shift count = 84044778.
[Jun 08 20:55:01] M914xxxxx Iter# = 60000 [ 0.07% complete] clocks = 00:19:10.389 [115.0389 msec/iter] Res64: DB534D6782A1A68E. AvgMaxErr = 0.062953338. MaxErr = 0.093750000. Residue shift count = 15509133.
[Jun 08 21:12:44] M914xxxxx Iter# = 70000 [ 0.08% complete] clocks = 00:17:38.155 [105.8155 msec/iter] Res64: 82734EA25CAAE188. AvgMaxErr = 0.062877951. MaxErr = 0.085937500. Residue shift count = 59420150.
[Jun 08 21:30:15] M914xxxxx Iter# = 80000 [ 0.09% complete] clocks = 00:17:26.342 [104.6343 msec/iter] Res64: B3DDB30E8EA490B8. AvgMaxErr = 0.062913278. MaxErr = 0.093750000. Residue shift count = 36766103.
[Jun 08 21:47:43] M914xxxxx Iter# = 90000 [ 0.10% complete] clocks = 00:17:25.125 [104.5126 msec/iter] Res64: 9F0AAFBA656E82BC. AvgMaxErr = 0.062897896. MaxErr = 0.093750000. Residue shift count = 87580274.
Last fiddled with by paulunderwood on 2019-06-08 at 22:10 |
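Checkpoint lines like those in the log above can be summarized with a few lines of Python. This is only a sketch based on the line layout visible in this post; checkpoint_timings, mean_msec_per_iter and days_remaining are hypothetical helpers, and the remaining-time estimate assumes an LL test needs roughly as many iterations as the exponent, which the caller must supply since it is masked in the log.

```python
import re

# Pull the iteration count and the bracketed msec/iter figure from each
# checkpoint line; the lazy ".*?" skips the "[ 0.01% complete]" bracket.
CHECKPOINT = re.compile(r"Iter# = (\d+) .*?\[\s*([\d.]+) msec/iter\]")


def checkpoint_timings(log_text):
    """Return a list of (iteration, msec_per_iter) pairs found in the log."""
    return [(int(it), float(ms)) for it, ms in CHECKPOINT.findall(log_text)]


def mean_msec_per_iter(timings):
    """Unweighted mean of the per-checkpoint msec/iter figures."""
    return sum(ms for _, ms in timings) / len(timings)


def days_remaining(exponent, last_iter, msec_per_iter):
    """Rough time left in days, assuming about `exponent` iterations in total."""
    return (exponent - last_iter) * msec_per_iter / 1000.0 / 86400.0


if __name__ == "__main__":
    # Two checkpoint lines from the log above, trimmed after the timing field.
    sample = (
        "[Jun 08 19:23:42] M914xxxxx Iter# = 10000 [ 0.01% complete] "
        "clocks = 00:18:13.727 [109.3727 msec/iter]\n"
        "[Jun 08 19:41:24] M914xxxxx Iter# = 20000 [ 0.02% complete] "
        "clocks = 00:17:40.616 [106.0617 msec/iter]\n"
    )
    timings = checkpoint_timings(sample)
    print(timings, mean_msec_per_iter(timings))
```

Averaging over several checkpoints smooths out the run-to-run variation visible in the log, where individual checkpoints range from about 104.5 to 115.0 msec/iter.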
|
|
|
|
|
|
#54 |
|
∂²ω=0
Sep 2002
República de California
103×113 Posts |
Mainly just interested to hear from other folks who may be doing ARM-oriented code development. Thanks for the data!
|
|
|
|
|
|
#55 | |
|
Sep 2002
Database er0rr
3739₁₀ Posts |
Quote:
p.s. How often should the client report into PrimeNet? Last fiddled with by paulunderwood on 2019-06-08 at 22:59 |
|
|
|
|