mersenneforum.org  

Old 2021-09-23, 18:20   #441
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

171E₁₆ Posts

Quote:
Originally Posted by chalsall
Coordination of the concurrency of processes is a non-trivial problem space.
... Talented humans are ***very*** expensive...

Except when they're free. George does what he does not for the money.

After watching George deliver very well for a quarter century, it seems clear to me he's up to the task. Prime95 already runs multiple workers using multiple cores each, plus a PrimeNet communications thread. Maybe it just goes on a to-do-someday list beneath some other priorities. Or maybe there are good reasons not to try it that I'm unaware of.

The Gpuowl source provides an example of how the GCD parallelism may be handled. The situation is different there (GPU and CPU combined), but still.

For proof space preallocation, the potential time saving is smaller. One could compute a time estimate for the space preallocation and a time estimate for when the first proof residue will need to be deposited, parallelize only when there's a comfortable time margin, and also ensure that the deposit waits for completion of the preallocation.
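A minimal sketch of that scheduling idea in Python. This is not prime95 code; the function names, the callables, and the margin factor of 2 are all hypothetical and only illustrate the logic described above.

```python
import threading


def maybe_preallocate_in_background(preallocate, est_prealloc_s,
                                    est_first_residue_s, margin=2.0):
    """Run `preallocate` on a helper thread only if there is a comfortable margin.

    est_prealloc_s      -- estimated seconds the preallocation will take
    est_first_residue_s -- estimated seconds until the first proof residue is due
    margin              -- hypothetical safety factor, not a prime95 setting
    Returns the started thread, or None if preallocation ran synchronously.
    """
    if est_first_residue_s >= margin * est_prealloc_s:
        thread = threading.Thread(target=preallocate, name="proof-prealloc")
        thread.start()
        return thread
    # Not enough margin: do it up front, as happens today.
    preallocate()
    return None


def deposit_first_residue(prealloc_thread, deposit):
    """Wait for any background preallocation to finish, then deposit."""
    if prealloc_thread is not None:
        prealloc_thread.join()
    deposit()
```

The join before the deposit is what guarantees the residue is never written into space that is still being allocated, whichever branch was taken.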

V30.7 is in preparation. AFAIK this includes P-1 speed improvements in prime pairing, and Alder Lake support. Not sure what else.

Last fiddled with by kriesel on 2021-09-23 at 18:28
Old 2021-09-23, 18:36   #442
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

273C₁₆ Posts

Quote:
Originally Posted by kriesel
Except when they're free. George does what he does not for the money.

Time is the fundamental currency. Perfect is the enemy of good. Much like a poem, software is never finished. Simply abandoned.

George /might/ have made a conscious decision that the effort required (including all the "in the wild" debugging) was not worth the tiny amount of throughput which /might/ be gained.

Or, maybe, he's just busy with other stuff...

Last fiddled with by chalsall on 2021-09-23 at 18:40 Reason: s/gained which/ which/; # OCD
Old 2021-09-23, 19:12   #443
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2·11·269 Posts

Let's ballpark these for ppm of system productivity.

P-1 GCD: 1 hour at 880M on Xeon Phi 7210. I have another similar-exponent P-1 that's projecting a week left to go for about half of stage 2. So let's assume 30 days for both stages on 880M, 7210: 60 minutes x 2 stages / (30 x 24 x 60 minutes) x 15/16 ~ 2600 ppm = 0.26% of P-1 time. P-1 is ~1/40 of PRP time, so that is ~62 ppm of exponent (TF + P-1 + PRP) time. That might become worthwhile to pursue at some point, depending on what other optimization opportunities remain and the effort needed.

Preallocate PRP proof space: 3 minutes at 500M on Xeon Phi 7210.
Forecast PRP time 328.5 days ~ 473040 minutes. 3 / 473040 x 15 cores / 16 cores ~ 6 ppm of PRP time.
That would need to be a very quick modification to be worth the programming and test time. Seems unlikely.
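
For anyone who wants to redo the ballpark with their own numbers, here is a small Python sketch of the same arithmetic. The helper name is made up, and it reads the 15/16 factor as 15 of a worker's 16 cores sitting idle during the single-threaded step, which is an assumption about the figures above.

```python
def ppm(serial_minutes, total_minutes, idle_core_fraction=15 / 16):
    """Parts per million of run time lost while other cores wait on a serial step."""
    return serial_minutes / total_minutes * idle_core_fraction * 1e6


# P-1 GCD: 60 minutes per GCD, 2 stages, assumed 30-day P-1 run.
gcd_ppm_of_pm1 = ppm(60 * 2, 30 * 24 * 60)

# Proof space preallocation: 3 minutes out of a forecast 328.5-day PRP.
prealloc_ppm_of_prp = ppm(3, 328.5 * 24 * 60)

print(f"GCD: {gcd_ppm_of_pm1:.0f} ppm of P-1 time")
print(f"Preallocation: {prealloc_ppm_of_prp:.1f} ppm of PRP time")
```

This reproduces the roughly 2600 ppm and roughly 6 ppm figures; the further step from P-1 time to whole-exponent time is not modelled here.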
Old 2021-09-23, 19:35   #444
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2²×3⁴×31 Posts

Quote:
Originally Posted by kriesel
Let's ballpark these for ppm of system productivity.

Let's...

You are working at the extreme edge. I understand the reasoning, but I would argue this should not inform "general policy".

My P-1'ers (using mprime (Linux64,Prime95,v30.5,build 2)) are currently taking about 5 seconds for the GCDs (single-threaded). Not a problem, in my Universe.
Old 2021-09-23, 19:59   #445
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2·11·269 Posts

Assuming the ~p^2.1 scaling also applies to GCD operations, and you're doing ~106M P-1, there's a factor of ~4.2 unexplained difference in GCD speed in your favor. Maybe faster cores giving faster GCDs, and correspondingly faster stages too.

The timing I gave for the large exponent was using ~10 GB in stage 2, prime95 V30.6b4.


edit: chalsall's small exponent, ~27.4M, more than explains the rest of the speed ratio. 5.05 sec x 2 / 2 hr 29 min = 0.11% potential speedup for him. Except that the i3-9100 is 4-core with no hyperthreading. Gpuowl's parallelism came about because Mihai took pity on my multi-Radeon VII, slow-CPU-for-GCD P-1 factory, which spent ~5 minutes of a 40-minute wavefront P-1 factoring in single-CPU-core GCD with the GPU idle and waiting. The system didn't have enough maximum RAM to support dual-instance P-1 on its GPUs to mitigate it. 40/35 = 14% P-1 speedup via speculative parallelism. As always, it's George's call what is worth George's time and what is not.
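
A rough Python sketch of the scaling estimate used here. It assumes, as above, that GCD time scales as ~p^2.1; the function name is made up, and it ignores the difference between a Xeon Phi core and an i3-9100 core.

```python
def scaled_gcd_seconds(ref_seconds, ref_exponent, exponent, power=2.1):
    """Scale a measured GCD time to another exponent, assuming time ~ p**power."""
    return ref_seconds * (exponent / ref_exponent) ** power


# Reference point: ~1 hour per GCD at 880M (Xeon Phi 7210).
estimate_27m = scaled_gcd_seconds(3600, 880e6, 27.43e6)
print(f"Estimated GCD time at 27.43M: {estimate_27m:.1f} sec")
```

The estimate comes out below the 5.05 sec chalsall measured, which is the sense in which the small exponent more than explains the ratio.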

Last fiddled with by kriesel on 2021-09-23 at 20:30
Old 2021-09-23, 20:11   #446
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2²·3⁴·31 Posts

Quote:
Originally Posted by kriesel
...there's nearly a factor of 5 unexplained difference in GCD speed in your favor.

All I can do is give you my empirical.

Code:
[Work thread Sep 23 09:30] M27430621 stage 1 complete. 2997862 transforms. Time: 2750.641 sec.
[Work thread Sep 23 09:30] Starting stage 1 GCD - please be patient.
[Work thread Sep 23 09:30] Stage 1 GCD complete. Time: 5.052 sec.
[Work thread Sep 23 09:30] D: 462, relative primes: 857, stage 2 primes: 3303121, pair%=90.33
[Work thread Sep 23 09:30] Using 9996MB of memory.
[Work thread Sep 23 09:30] Stage 2 init complete. 7751 transforms. Time: 15.059 sec.

[Work thread Sep 23 11:12] M27430621 stage 2 complete. 4016210 transforms. Time: 6135.924 sec.
[Work thread Sep 23 11:12] Starting stage 2 GCD - please be patient.
[Work thread Sep 23 11:12] Stage 2 GCD complete. Time: 5.054 sec.
[Work thread Sep 23 11:12] M27430621 completed P-1, B1=1039000, B2=56821000, Wi8: C6D8FB56
[Comm thread Sep 23 11:12] Sending result to server: UID: [redacted]/usbenv, M27430621 completed P-1, B1=1039000, B2=56821000, Wi8: C6D8FB56, AID: 8B45B0E3C88E84E8B42236C07C5F070A

[Work thread Sep 23 11:58] M27430643 stage 1 complete. 2997862 transforms. Time: 2749.861 sec.
[Work thread Sep 23 11:58] Starting stage 1 GCD - please be patient.
[Work thread Sep 23 11:58] Stage 1 GCD complete. Time: 5.043 sec.
[Work thread Sep 23 11:58] D: 462, relative primes: 857, stage 2 primes: 3303121, pair%=90.33
[Work thread Sep 23 11:58] Using 9996MB of memory.
[Work thread Sep 23 11:59] Stage 2 init complete. 7751 transforms. Time: 15.055 sec.

[Work thread Sep 23 13:41] M27430643 stage 2 complete. 4016210 transforms. Time: 6143.559 sec.
[Work thread Sep 23 13:41] Starting stage 2 GCD - please be patient.
[Work thread Sep 23 13:41] Stage 2 GCD complete. Time: 5.052 sec.
[Work thread Sep 23 13:41] M27430643 completed P-1, B1=1039000, B2=56821000, Wi8: C6B2FB4A
[Comm thread Sep 23 13:41] Sending result to server: UID: [redacted]/usbenv, M27430643 completed P-1, B1=1039000, B2=56821000, Wi8: C6B2FB4A, AID: 30BC556ED1625FFF02A0B1960F00B038
Code:
[chalsall@usbwalker prime]$ cat /proc/cpuinfo | grep name
model name	: Intel(R) Core(TM) i3-9100 CPU @ 3.60GHz
model name	: Intel(R) Core(TM) i3-9100 CPU @ 3.60GHz
model name	: Intel(R) Core(TM) i3-9100 CPU @ 3.60GHz
model name	: Intel(R) Core(TM) i3-9100 CPU @ 3.60GHz
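
To put a number on the GCD share of a run like the first one above, a quick Python sketch that sums the "Time: ... sec." fields. The sample lines are copied from the log above; the function name and the simple string matching are just for illustration and are not part of mprime.

```python
import re

TIME_RE = re.compile(r"Time: ([0-9.]+) sec\.")


def gcd_share(log_text):
    """Fraction of the logged seconds that were spent in GCD steps."""
    gcd = total = 0.0
    for line in log_text.splitlines():
        match = TIME_RE.search(line)
        if not match:
            continue
        seconds = float(match.group(1))
        total += seconds
        if "GCD complete" in line:
            gcd += seconds
    return gcd / total if total else 0.0


sample = """\
[Work thread Sep 23 09:30] M27430621 stage 1 complete. 2997862 transforms. Time: 2750.641 sec.
[Work thread Sep 23 09:30] Stage 1 GCD complete. Time: 5.052 sec.
[Work thread Sep 23 09:30] Stage 2 init complete. 7751 transforms. Time: 15.059 sec.
[Work thread Sep 23 11:12] M27430621 stage 2 complete. 4016210 transforms. Time: 6135.924 sec.
[Work thread Sep 23 11:12] Stage 2 GCD complete. Time: 5.054 sec.
"""
print(f"GCD share of logged time: {gcd_share(sample):.2%}")
```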
Old 2021-09-23, 20:37   #447
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

10044₁₀ Posts

Quote:
Originally Posted by kriesel
edit: chalsall's small exponent ~27.4M more than explains the rest of the speed ratio. 5.05sec x 2 /2hr29min = 0.11% potential speedup for him.

I would argue that for future readers it might have been more valuable for you to quote my message in a new post, rather than editing your earlier post to respond to my subsequent one.

I deeply appreciate your curation skills, Ken.

It's a job description that few appreciate. And those who do would only take it on if the subject domain were important enough...
Old 2021-09-30, 07:33   #448
S485122
 
"Jacob"
Sep 2006
Brussels, Belgium

11011011001₂ Posts
"Sending interim residue" Mxxx / AID

Prime95 30.6 b4
Nothing dramatic but an inconsistency nevertheless ;-)

When interim residues are sent to the server, between or during the periodic communications, the output to the screen and to the prime.log file has the following format:
Code:
[Comm thread Sep 16 19:22] Sending interim residue 40000000 for M58193041
But when the residues are sent together with a result, the AID is used instead of the M followed by the exponent:
Code:
[Comm thread Sep 16 23:26] Sending interim residue 55000000 for assignment 172076D8AD6993D981F397637613B8DC
[Comm thread Sep 16 23:26] Sending result to server: UID: S485122/i9-10920X, M58193041 is not prime. Res64: 1B5E1783A3861E57. Wh4: 67E20740,22995864,00000000, AID: 172076D8AD6993D981F397637613B8DC
(Never mind the AID in the clear: the assignment has been completed, so the assignment and its ID are bygones.)
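
For anyone scraping prime.log in the meantime, a Python sketch that accepts both forms and maps the AID back to the exponent using the result line. The regular expressions are based only on the three lines quoted above, so treat them as assumptions about the log format rather than a specification.

```python
import re

INTERIM_RE = re.compile(
    r"Sending interim residue (\d+) for (?:M(\d+)|assignment ([0-9A-F]{32}))")
RESULT_RE = re.compile(
    r"Sending result to server: .*?\bM(\d+) .*\bAID: ([0-9A-F]{32})")


def interim_residues(lines):
    """Return (exponent, iteration) pairs; exponent is None if the AID is unknown."""
    lines = list(lines)
    # First pass: learn which exponent each AID belongs to from result lines.
    aid_to_exponent = {}
    for line in lines:
        match = RESULT_RE.search(line)
        if match:
            aid_to_exponent[match.group(2)] = int(match.group(1))
    # Second pass: normalize both interim-residue formats.
    found = []
    for line in lines:
        match = INTERIM_RE.search(line)
        if match:
            iteration, exponent, aid = match.groups()
            resolved = int(exponent) if exponent else aid_to_exponent.get(aid)
            found.append((resolved, int(iteration)))
    return found
```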
Old 2021-09-30, 12:21   #449
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

171E₁₆ Posts

Found the following in an mprime run log immediately after starting mprime v30.6b4:
Code:
[Main thread Sep 30 12:04] Mersenne number primality test program version 30.6
[Main thread Sep 30 12:04] Optimizing for CPU architecture: Core i3/i5/i7, L2 cache size: 256 KB, L3 cache size: 55 MB
[Main thread Sep 30 12:04] Starting worker.
[Main thread Sep 30 12:04] Stopping all worker windows.
[Work thread Sep 30 12:04] Worker starting
[Work thread Sep 30 12:04] Worker stopped.
[Main thread Sep 30 12:04] Execution halted.
[Main thread Sep 30 12:04] Choose Test/Continue to restart
That's hard to do when it's a Google Colab background process: no menu, no keyboard, no means of input.
Stopping and continuing the notebook section seems to have worked.
No idea what caused the immediate stop.

Last fiddled with by kriesel on 2021-09-30 at 12:24
Old 2021-09-30, 19:40   #450
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

975₁₆ Posts

Any chance we could write PRP results to results.txt too?

I understand that results.txt has been deprecated in favor of the JSON file, but it would be nice to have data that is more human-readable. Or as a compromise, could we have an option to "pretty print" the JSON strings?
Old 2021-09-30, 19:45   #451
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

2·5·353 Posts

Quote:
Originally Posted by ixfd64
Any chance we could write PRP results to results.txt too?

I understand that results.txt has been deprecated in favor of the JSON file, but it would be nice to have data that is more human-readable. Or as a compromise, could we have an option to "pretty print" the JSON strings?

If by pretty-print you mean presenting JSON over multiple lines with indenting and such, then no, as this would break manual results, which is based on the assumption that one line = one result.

I have no objection if George wants to add output to the non-JSON output, but support for any new format will not be added to manual results parsing (we don't want users submitting less data).

I'm curious what part you find less-than-readable about the JSON results? If it would be universally considered helpful the JSON elements could be re-ordered without causing any problems.
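
In the meantime, a results file that keeps one JSON object per line can be pretty-printed for reading without touching the file itself. A minimal Python sketch, assuming only that each non-empty line is a complete JSON object; the function name is made up and nothing here depends on the actual result fields.

```python
import json


def pretty_print_results(lines):
    """Yield an indented, multi-line rendering of each one-line JSON result."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        yield json.dumps(json.loads(line), indent=2)


if __name__ == "__main__":
    import sys
    # Usage: python this_script.py path/to/results-file
    with open(sys.argv[1], encoding="utf-8") as handle:
        for block in pretty_print_results(handle):
            print(block)
```

The file submitted to manual results stays in its one-line-per-result form; only the on-screen view changes.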

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.