mersenneforum.org  


Old 2019-06-09, 01:07   #56
nomead
 
"Sam Laur"
Dec 2018
Turku, Finland

2×3×53 Posts

Quote:
Originally Posted by ewmayer
[Edit: I just realized that the cfg-file data I copied below are from my "advance peek" v19 binary, so you might as well just use that one from the get-go - cf. the attachment at bottom.]
The attachment seems to be a source code package though, not a precompiled binary...
Old 2019-06-09, 18:41   #57
ewmayer
2ω=0
 
Sep 2002
República de California

10110101110111₂ Posts

Quote:
Originally Posted by nomead
The attachment seems to be a source code package though, not a precompiled binary...
My mistake - the previous attachment is indeed the source tarball from which I built the advanced-look v19 binary I mentioned. Here is the latter (SIMD binary only) together with the resulting cfg-file from running 4-threaded on the a73 cores of my N2 (-cpu 2:5), and a copy of the primenet.py script, i.e. all the files needed for someone to get up and running on a fresh device with a similar CPU. md5sum = 7b5850114211d68234c391ff1a3d62eb:
Attached Files
File Type: xz mlucas_v19_asimd.tar.xz (1.15 MB, 89 views)
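
For anyone who wants to check the download before unpacking it, here is a minimal Python sketch of the comparison. It assumes the md5sum quoted above refers to the attached archive and that the file sits in the current directory under its attachment name; the function names are made up for illustration.

```python
import hashlib
from pathlib import Path

# Checksum as quoted in the post above (assumed to apply to the attached archive).
EXPECTED_MD5 = "7b5850114211d68234c391ff1a3d62eb"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 equals the expected hex string (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("mlucas_v19_asimd.tar.xz")  # assumed local file name
    if archive.exists():
        print("checksum OK" if matches_expected(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found")
```

On a Linux box the `md5sum` command-line tool does the same job; the sketch only spells out what is being compared.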
Old 2019-06-09, 18:55   #58
paulunderwood
 
Sep 2002
Database er0rr

3,739 Posts

Quote:
Originally Posted by ewmayer
My mistake - the previous attachment is indeed the source tarball from which I built the advanced-look v19 binary I mentioned. Here is the latter (SIMD binary only) together with the resulting cfg-file from running 4-threaded on the a73 cores of my N2 (-cpu 2:5), and a copy of the primenet.py script, i.e. all the files needed for someone to get up and running on a fresh device with a similar CPU. md5sum = 7b5850114211d68234c391ff1a3d62eb:
Would I only need to drop the above 3 files into the directory currently running v18 to get v19 running?

Last fiddled with by paulunderwood on 2019-06-09 at 18:57
Old 2019-06-09, 19:04   #59
ewmayer
2ω=0
 
Sep 2002
República de California

103·113 Posts

Quote:
Originally Posted by paulunderwood
Would I only need to drop the above 3 files into the directory currently running v18 to get v19 running?
Yep! And if you 'fg' your current v18 job and ctrl-c it, it should write a checkpoint file for the current iteration, i.e. you won't lose any work due to a partially-completed checkpoint interval. That signal-handling code is still not fully reliable across all Linux platforms, but I haven't encountered any issues with it on my various ARM devices, including the N2.
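
To illustrate the idea of checkpointing on ctrl-c, here is a toy Python sketch. It is not the Mlucas signal-handling code (Mlucas is written in C); the class and function names are hypothetical, and the "checkpoint" is only a recorded iteration number. The handler merely sets a flag, and the main loop saves its state at the current iteration before exiting.

```python
import signal


class CheckpointingLoop:
    """Toy iteration loop that saves its state when a stop is requested."""

    def __init__(self, total_iterations, checkpoint_interval):
        self.total = total_iterations
        self.interval = checkpoint_interval
        self.iteration = 0
        self.saved_at = []           # iterations at which a checkpoint was "written"
        self.stop_requested = False

    def request_stop(self, signum=None, frame=None):
        # Signal handlers should do as little as possible: just set a flag.
        self.stop_requested = True

    def write_checkpoint(self):
        # Stand-in for writing a savefile; here we only record the iteration.
        self.saved_at.append(self.iteration)

    def run(self):
        while self.iteration < self.total:
            self.iteration += 1      # stand-in for one LL iteration
            if self.stop_requested:
                self.write_checkpoint()  # save mid-interval, then quit
                return
            if self.iteration % self.interval == 0:
                self.write_checkpoint()  # regular end-of-interval checkpoint


def run_until_interrupted(loop):
    """Install the ctrl-c (SIGINT) handler, then run the loop."""
    signal.signal(signal.SIGINT, loop.request_stop)
    loop.run()
    return loop.saved_at
```

The terminal delivers ctrl-c only to the foreground job, which is why the 'fg' step comes first.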

Last fiddled with by ewmayer on 2019-06-09 at 19:05
Old 2019-06-09, 19:20   #60
paulunderwood
 
Sep 2002
Database er0rr

3,739 Posts

Quote:
Originally Posted by ewmayer
Yep! And if you 'fg' your current v18 job and ctrl-c it, it should write a checkpoint file for the current iteration, i.e. you won't lose any work due to a partially-completed checkpoint interval. That signal-handling code is still not fully reliable across all Linux platforms, but I haven't encountered any issues with it on my various ARM devices, including the N2.
v18 was running in the foreground in a terminal. ^C did not kill it. I killed the process from top. Thanks!

Oops. It was running in the background and I forgot to do "fg"

Last fiddled with by paulunderwood on 2019-06-09 at 19:22
Old 2019-06-11, 19:23   #61
ewmayer
2ω=0
 
Sep 2002
República de California

103×113 Posts

Paul, did you notice any change in timing/ROE-levels after switching your current LL test to the v19 build?
Old 2019-06-11, 20:13   #62
paulunderwood
 
Sep 2002
Database er0rr

3,739 Posts

Quote:
Originally Posted by ewmayer
Paul, did you notice any change in timing/ROE-levels after switching your current LL test to the v19 build?
The Av/MaxROE seem about the same -- but it is difficult to say by inspection. The timings have improved from about 106.7 ms to 103.5 ms per iteration -- minimum values. The N2 run is at 2.67% -- that is after 3 days. I have a patient nature! I have no number crunching on the a53 -- I just use it for desktop work. I am running Skype from an Intel box with ssh -X over 100 Mb/s, soon to be upgraded to 1000 Mb/s.
Old 2019-06-11, 21:31   #63
ewmayer
2ω=0
 
Sep 2002
República de California

2D77₁₆ Posts

Cool - 3% speedup translates to 3 days sooner. Though note this is within the level of one-run-to-the-next timing variability, which in my experience can be as much as 5%. But 3% is roughly what I saw on average when I switched all my Galaxy S7s to v19 a few weeks ago.

Yah, this kind of ARM micro-PC hardware is not for the results-greedy; in my case it helps to have a whole mess of such devices (Odroid C2 and N2, 12 cellphones, plus my Intel 2-core Broadwell NUC running an Mlucas avx2 build) patiently and quietly working away.
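
The "3% is about 3 days" estimate can be checked with back-of-envelope arithmetic using the figures Paul reported (106.7 ms to 103.5 ms per iteration, 2.67% done after 3 days). This is a rough sketch that assumes a constant iteration rate over the whole run; the function names are made up for illustration.

```python
def percent_speedup(old_ms, new_ms):
    """Reduction in per-iteration time, as a percentage of the old time."""
    return 100.0 * (old_ms - new_ms) / old_ms


def projected_total_days(elapsed_days, percent_done):
    """Linear extrapolation of the total run time from progress so far."""
    return elapsed_days * 100.0 / percent_done


speedup = percent_speedup(106.7, 103.5)        # roughly 3%
total_days = projected_total_days(3.0, 2.67)   # roughly 112 days
days_saved = total_days * speedup / 100.0      # roughly 3.4 days

print(f"speedup: {speedup:.2f}%")
print(f"projected run length: {total_days:.1f} days")
print(f"time saved: {days_saved:.1f} days")
```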
Old 2019-06-11, 21:50   #64
paulunderwood
 
Sep 2002
Database er0rr

3,739 Posts

Quote:
Originally Posted by ewmayer
Cool - 3% speedup translates to 3 days sooner. Though note this is within the level of one-run-to-the-next timing variability, which in my experience can be as much as 5%. But 3% is roughly what I saw on average when I switched all my Galaxy S7s to v19 a few weeks ago.

Yah, this kind of ARM micro-PC hardware is not for the results-greedy; in my case it helps to have a whole mess of such devices (Odroid C2 and N2, 12 cellphones, plus my Intel 2-core Broadwell NUC running an Mlucas avx2 build) patiently and quietly working away.
Plus I have none of this on my Arm machines:

Code:
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds
Old 2019-06-18, 16:35   #65
paulunderwood
 
Sep 2002
Database er0rr

3,739 Posts

The good news is that when the N2 is idle the iteration time is ~102.5 ms per iteration.

A little disconcerting is that I occasionally get a crash of the open tab in Firefox ESR, and when this happens the Max ROE jumps to 0.125. Is this a software phenomenon or hardware related?

Last fiddled with by paulunderwood on 2019-06-18 at 16:36
Old 2019-06-18, 18:57   #66
ewmayer
2ω=0
 
Sep 2002
República de California

26567₈ Posts

Quote:
Originally Posted by paulunderwood
The good news is that when the N2 is idle the iteration time is ~102.5 ms per iteration.

A little disconcerting is that I occasionally get a crash of the open tab in FireFox-esr and when this happens the Max ROE jumps to 0.125. Is this a software phenomenon or hardware related?
Would you be so kind as to post your p*.stat file and list at least one approximate iteration interval where the phenomenon you describe occurred?

After successfully completing a pair of DCs, the first-set-up of my S7 compute-o-phones has been crunching away on an exponent ~87M for several weeks. As I noted previously, the quad-core Snapdragon CPU in the S7 is roughly equal to the 4xa73 portion of the N2. This exponent is sufficiently close to the upper limit for 4608K FFT that on a half-dozen occasions it's hit ROE = 0.4375, causing it to restart from the most-recent savefile and resume @5120K, with a resulting 10-15% performance hit, from 95-100ms/iter @4608K to 112-117ms/iter @5120K. Whenever I see that such a jump has occurred, I kill the run and force resumption @4608K via "nohup nice ./Mlucas -cpu 0:3 -fftlen 4608 &". This points to another desirable feature-add for the next release: each run should gather running statistics about such FFT-length-increasing ROEs, and if their frequency is sufficiently low, the program should simply re-do the iteration interval in which the ROE >= 0.4375 occurred at the next-larger FFT length and then drop back down to the original default length. (This assumes the error is repeatable on retry at the same length, a check which is already done first.)
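
The proposed policy could be sketched roughly as follows. This is only an illustration in Python, not Mlucas code: the `max_roe` callback, the function name, and the `max_fallbacks` budget (a crude stand-in for "frequency is sufficiently low") are all hypothetical.

```python
ROE_LIMIT = 0.4375  # roundoff-error level that triggers a restart, per the post


def choose_fft_lengths(n_intervals, default_fft, larger_fft, max_roe, max_fallbacks=2):
    """Return the FFT length (in K) used for each checkpoint interval.

    max_roe(interval, fft_len, attempt) is a hypothetical callback returning the
    largest roundoff error seen when running that interval at that FFT length.
    """
    used = []
    current = default_fft
    fallbacks = 0
    for interval in range(n_intervals):
        if current == larger_fft:
            used.append(larger_fft)      # already switched for good
            continue
        if max_roe(interval, default_fft, 0) < ROE_LIMIT:
            used.append(default_fft)     # normal case
            continue
        # Retry at the same length to see whether the error is repeatable.
        if max_roe(interval, default_fft, 1) < ROE_LIMIT:
            used.append(default_fft)     # not repeatable: keep the default length
            continue
        # Repeatable: redo just this interval at the next-larger length.
        fallbacks += 1
        used.append(larger_fft)
        if fallbacks > max_fallbacks:
            current = larger_fft         # too frequent: stay at the larger length
    return used


def demo_roe(interval, fft_len, attempt):
    # Made-up data: interval 3 always fails at the default length.
    return 0.4375 if interval == 3 else 0.25


print(choose_fft_lengths(6, 4608, 5120, demo_roe))
```

With a budget like this, an exponent that trips the limit only a handful of times per run would pay the slower 5120K timing on just those intervals rather than for the rest of the test.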

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.