mersenneforum.org  

Old 2020-03-05, 17:31   #485
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2·4,643 Posts
Default Weird slowdown on mprime P-1'ing.

Hey George et al.

I noticed something which seemed a bit strange to me in the log file of one of my Colab jobs:
Code:
[Work thread Mar 5 11:01] Using 10235MB of memory.  Processing 237 relative primes (237 of 480 already processed).
[Work thread Mar 5 11:13] M95xxxxxx stage 2 is 80.04% complete.
[Work thread Mar 5 11:22] M95xxxxxx stage 2 is 80.56% complete. Time: 543.684 sec.
...
[Work thread Mar 5 16:19] M95xxxxxx stage 2 is 97.81% complete. Time: 540.886 sec.
[Work thread Mar 5 16:28] M95xxxxxx stage 2 is 98.33% complete. Time: 540.351 sec.
[Work thread Mar 5 16:37] M95xxxxxx stage 2 is 98.86% complete. Time: 544.413 sec.
[Work thread Mar 5 16:44] Using 859MB of memory.  Processing 6 relative primes (474 of 480 already processed). 
[Work thread Mar 5 16:48] M95xxxxxx stage 2 is 99.25% complete. Time: 633.379 sec.
[Work thread Mar 5 16:55] M95xxxxxx stage 2 is 99.30% complete. Time: 434.847 sec.
[Work thread Mar 5 17:02] M95xxxxxx stage 2 is 99.34% complete. Time: 431.063 sec.
It seems like mprime slows down considerably while working on the smaller number of relative primes right at the end of the run. Is this expected?

Thanks.
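
A rough way to quantify the slowdown from the log above is to divide each interval's reported time by the progress it made. The sketch below does that on a few lines copied from the excerpt; the regular expression and the function name `seconds_per_percent` are illustrative choices, not part of mprime. Because the log rounds progress to two decimals, the figures for the 6-relative-prime batch are coarse.

```python
import re

# Log lines copied from the excerpt above (exponent masked as in the original).
LOG = """\
[Work thread Mar 5 16:28] M95xxxxxx stage 2 is 98.33% complete. Time: 540.351 sec.
[Work thread Mar 5 16:37] M95xxxxxx stage 2 is 98.86% complete. Time: 544.413 sec.
[Work thread Mar 5 16:44] Using 859MB of memory.  Processing 6 relative primes (474 of 480 already processed).
[Work thread Mar 5 16:48] M95xxxxxx stage 2 is 99.25% complete. Time: 633.379 sec.
[Work thread Mar 5 16:55] M95xxxxxx stage 2 is 99.30% complete. Time: 434.847 sec.
[Work thread Mar 5 17:02] M95xxxxxx stage 2 is 99.34% complete. Time: 431.063 sec.
"""

# Matches only progress lines that report both a percentage and an interval time.
PROGRESS = re.compile(r"stage 2 is ([\d.]+)% complete\. Time: ([\d.]+) sec\.")


def seconds_per_percent(log_text):
    """Return (percent_reached, seconds_per_percentage_point) for each
    interval between consecutive progress lines."""
    rows = []
    prev_pct = None
    for line in log_text.splitlines():
        match = PROGRESS.search(line)
        if not match:
            continue  # e.g. the "Using ...MB of memory" lines
        pct, secs = float(match.group(1)), float(match.group(2))
        if prev_pct is not None and pct > prev_pct:
            rows.append((pct, secs / (pct - prev_pct)))
        prev_pct = pct
    return rows


for pct, rate in seconds_per_percent(LOG):
    print(f"up to {pct:.2f}%: {rate:.0f} sec per percentage point")
```

On these lines the cost is roughly 1,000 seconds per percentage point with 237 relative primes in memory and several times that once only 6 are being processed. The interval ending at 99.25% straddles the batch switch, so it mixes both regimes.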
Old 2020-03-05, 18:06   #486
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7138₁₀ Posts
Default

Quote:
Originally Posted by chalsall View Post
It seems like mprime slows down considerably while working on the smaller number of relative primes right at the end of the run. Is this expected?
Probably normal. I'm guessing the 633.379 number is due to some extra overhead in setting up the next batch of relative primes.
Old 2020-03-05, 18:45   #487
petrw1
1976 Toyota Corona years forever!
 
"Wayne"
Nov 2006
Saskatchewan, Canada

2·2,213 Posts
Default

Quote:
Originally Posted by chalsall View Post
Hey George et al.

I noticed something which seemed a bit strange to me in the log file of one of my Colab jobs:

It seems like mprime slows down considerably while working on the smaller number of relative primes right at the end of the run. Is this expected?

Thanks.
Having done lots of P1 I can tell you this is absolutely normal.
The more RPs at a time the better.
In the worst case I've seen, the last 1 RP took almost as long as 10.
One actual example:
42 RPs in 63 minutes
207 RPs in 225 minutes.

That's why, if I have PCs on the low end of RAM, I'll run fewer workers, each with 2 or more cores.
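
The example above can be reduced to a per-RP cost with a couple of lines of arithmetic. This is only a sketch using the two figures quoted; the helper name `minutes_per_rp` is made up for illustration.

```python
# Figures from the example above: (relative primes processed, minutes taken).
batches = [(42, 63), (207, 225)]


def minutes_per_rp(rps, minutes):
    """Average wall-clock minutes spent per relative prime in one batch."""
    return minutes / rps


for rps, minutes in batches:
    print(f"{rps} RPs: {minutes_per_rp(rps, minutes):.2f} min per RP")
```

That works out to 1.50 minutes per RP for the 42-RP batch against about 1.09 for the 207-RP batch, so the smaller batch costs roughly 38% more per RP.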
Old 2020-03-05, 19:48   #488
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2·4,643 Posts
Default

Quote:
Originally Posted by petrw1 View Post
Having done lots of P1 I can tell you this is absolutely normal.
Copy. Thanks, guys. Totally messes with the estimated completion though. Probably can't easily be algorithmically envisioned. Not a big deal.
Old 2020-03-07, 22:00   #489
franz
 
Feb 2020

2·3 Posts
Default

Quick update about Prime95 failing on AMD Threadripper 3970X:

AMD was able to reproduce the issue and has engineered a fix that is now available in the form of new BIOS versions, at least for GIGABYTE motherboards (TRX40 Aorus Master and TRX40 Aorus Xtreme).

I've tried an alpha version of the BIOS that AMD provided me with (which should be similar to the latest official BIOS that was released today) and Prime95 has been error-free on a 9-hour torture test run at the most intense preset.
Old 2020-03-07, 23:02   #490
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2·4,643 Posts
Default

Quote:
Originally Posted by franz View Post
AMD was able to reproduce the issue and has engineered a fix that is now available in the form of new BIOS versions, at least for GIGABYTE motherboards (TRX40 Aorus Master and TRX40 Aorus Xtreme).
Yet another example of George's extreme code helping the CPU manufacturers!
Old 2020-03-08, 10:50   #491
franz
 
Feb 2020

6₈ Posts
Default

Quote:
Yet another example of George's extreme code helping the CPU manufacturers
Absolutely. And AMD confirmed to me that Prime95 is part of their (and undoubtedly Intel's) validation test suite.
Old 2020-03-09, 02:33   #492
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

8736₁₀ Posts
Default


Old 2020-03-17, 21:13   #493
hansl
 
Apr 2019

5·41 Posts
Default

Hi, I wanted to test out some different work configurations recently, and in doing so managed to mess up my settings quite a bit. Looking for any advice to get things running properly again, or how to go about trying different configs in general.

What I did (which in hindsight I realize was a bad idea) was copy a running mprime directory, clear the worktodo and checkpoints from the new copy, and try setting up a second simultaneous instance on the same computer, with different work types on its worker threads, as I wanted to test performance.

This ended up confusing the first running instance about its work type preference, and now I'm not entirely sure of the best way to fix it. I guess that means that, as far as the server is concerned, each Worker# per computer is unique, and it doesn't expect multiple instances to be run?
So I think I need to merge my worker thread settings and worktodo's back into a single instance now, but the setup is still confusing to me.

I'm not sure if I need to fix some of the various SrvrPO* options in my local.txt
Could anyone explain what they mean? It would be nice to have this info added to undoc.txt at the very least, I think. Or are these intentionally obfuscated to prevent messing with the server?
It seems that SrvrPO1 is the preferred work type, but I'm not sure about the others.

Also what is the difference in [Worker #X] settings under prime.txt vs local.txt?

And lastly, a small (I think) feature request:
When running "mprime -m", could it possibly wait until command: "4. Test/Continue" before it grabs new work from the server?
This would make it easier to test out (re)configuring workers on a copied folder etc. without getting a bunch of new unintended work.
Old 2020-03-17, 21:46   #494
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2×43×83 Posts
Default

Quote:
Originally Posted by hansl View Post
This ended up confusing the first running instance about its work type preference, and now I'm not entirely sure of the best way to fix it. I guess that means that, as far as the server is concerned, each Worker# per computer is unique, and it doesn't expect multiple instances to be run?
So I think I need to merge my worker thread settings and worktodo's back into a single instance now, but the setup is still confusing to me.
You can run multiple mprimes on the same machine (though there is no need to do this). However, each must have its own ComputerGUID, which is stored in prime.txt or local.txt. This value is automatically generated on a new install. In other words, what you tried would have worked if you had also deleted prime.txt and local.txt.

To repair, try going to https://www.mersenne.org/cpus/ and clicking on the CPU. Change the settings there and see if they stick.
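
For anyone wanting to set up a second test instance along those lines, here is a minimal sketch that copies an mprime directory while leaving out prime.txt and local.txt, so the copy starts as a new install with its own ComputerGUID. The function name `clone_mprime_dir` is hypothetical, and `worktodo.txt` is an assumed name for the work queue file. Checkpoint files are not handled, since their naming isn't covered in this thread, so they would still need clearing by hand.

```python
import shutil
from pathlib import Path

# prime.txt and local.txt are named in the reply above.
# worktodo.txt is an ASSUMED file name for the work queue.
# Checkpoint files are NOT excluded here and must be removed manually.
EXCLUDE = ("prime.txt", "local.txt", "worktodo.txt")


def clone_mprime_dir(src, dst):
    """Copy the directory src to dst, skipping the files listed in EXCLUDE.

    dst must not exist yet; shutil.copytree creates it.
    """
    src, dst = Path(src), Path(dst)
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns(*EXCLUDE))
    return dst


# Example (paths are placeholders):
# clone_mprime_dir("/home/user/mprime", "/home/user/mprime-test")
```

Skipping the files at copy time rather than deleting them afterwards means the copy never contains the original instance's identity, even briefly.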
Old 2020-04-01, 15:12   #495
jdhedden
 
Mar 2020

2₈ Posts
Default AMD Ryzen 7 Throughput

Quote:
Originally Posted by franz View Post
Absolutely. And AMD confirmed to me that Prime95 is part of their (and undoubtedly Intel's) validation test suite.
I'm trying to run mprime on an AMD Ryzen 7 3800X 8-Core Processor. I've run with different numbers of threads with the following approximate throughputs:
1 thread : 20ms/iter
2 threads: 31ms/iter
4 threads: 60ms/iter
7 threads: 100ms/iter

This seems to show that there is no advantage to running more than 2 threads. I've even tried running multiple instances of mprime, but with similarly dismal results.

Suggestions on how to 'fix' this would be most welcome.
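
One way to read those numbers is as aggregate throughput. The sketch below assumes (this is not stated in the post) that each figure is the per-worker iteration time while that many single-threaded workers run at once. If the figures instead come from one worker using several threads, the calculation does not apply. The function name `aggregate_iters_per_sec` is illustrative.

```python
# ms/iter figures from the post above, keyed by number of threads.
timings_ms = {1: 20, 2: 31, 4: 60, 7: 100}


def aggregate_iters_per_sec(n_workers, ms_per_iter):
    """Total iterations per second, ASSUMING n_workers independent workers
    each take ms_per_iter milliseconds per iteration."""
    return n_workers * 1000.0 / ms_per_iter


for n, ms in timings_ms.items():
    print(f"{n} thread(s): {aggregate_iters_per_sec(n, ms):.1f} iter/sec total")
```

Under that assumption the total goes from 50 iterations per second with 1 thread to about 65 with 2, 67 with 4 and 70 with 7, which matches the observation that most of the gain is already there at 2 threads.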

All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.