mersenneforum.org  

Old 2017-10-10, 03:08   #1
bplenhart
 
"Brian Lenhart"
Oct 2013

7 Posts
Question Worker iteration speed slow down

I am seeing workers slow down significantly after completing one exponent and starting the next. The worker runs at almost double the ms/iter on the new exponent. I believe the FFT size is the same for both exponents. Stopping and restarting the worker has no impact. Stopping all workers, exiting the program, and then restarting lets all workers run at the 'normal' ms/iter. The difference is an increase from 30 ms/iter to 48 ms/iter for the same FFT size.
This is happening on 8 different machines.

I have attached screenshots from 2 machines.

I apologize if this behavior is documented elsewhere in the forum; I was unable to find any information.

bplenhart
Attached Thumbnails: Prime95_1aa.png (75.2 KB), Prime95_2aa.png (74.4 KB)

Last fiddled with by bplenhart on 2017-10-10 at 03:15 Reason: fix attachments
Old 2017-10-10, 06:18   #2
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

101101110010₂ Posts

Both of your previous exponents barely fit into a 4096k FFT with an acceptable round-off error. Your new ones require a larger FFT, and Prime95 chose a 4480k FFT.
Old 2017-10-10, 11:16   #3
bplenhart
 
"Brian Lenhart"
Oct 2013

7 Posts

OK, the FFT size may be the next size larger. I would expect a modest increase in ms/iter from that, not almost double.

Also, exiting the Prime95 program and restarting clears this behavior -- meaning I get the expected ms/iter throughput for that worker.
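
For a rough sense of what a "modest increase" would look like, here is a minimal sketch that compares the two FFT sizes mentioned above under a plain N·log2(N) cost model. The model is an assumption for illustration only; it is not how Prime95 selects or times its FFTs, and real timings also depend on memory bandwidth and the specific FFT implementation. The function names are hypothetical.

```python
import math

def fft_cost(n_points: int) -> float:
    """Relative cost of one length-n FFT under a simple N*log2(N) model (an assumption)."""
    return n_points * math.log2(n_points)

def expected_slowdown(old_k: int, new_k: int) -> float:
    """Modeled per-iteration cost ratio when moving from old_k to new_k (sizes in K = 1024 points)."""
    return fft_cost(new_k * 1024) / fft_cost(old_k * 1024)

ratio = expected_slowdown(4096, 4480)
print(f"modeled slowdown 4096K -> 4480K: {ratio:.3f}x")
```

Under this model the step from 4096K to 4480K costs on the order of 10% more per iteration, well short of the jump reported in this thread.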
Old 2017-10-10, 11:49   #4
bplenhart
 
"Brian Lenhart"
Oct 2013

7 Posts

Data

Machine 1:
LL exponent M77896541, 29.4 ms/iter --> complete LL & start next exponent --> LL exponent M78591281, 47.7 ms/iter --> stop & exit Prime95 --> restart Prime95 --> LL exponent M78591281, 31.5 ms/iter.

Machine 2:
LL exponent M77842553, 30.7 ms/iter --> complete LL & start next exponent --> LL exponent M78564151, 46.7 ms/iter --> stop & exit Prime95 --> restart Prime95 --> LL exponent M78564151, 32.1 ms/iter.


Restarting the program seems to fix the throughput.
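
The ratios implied by the figures above can be worked out with a short script. This is only arithmetic on the ms/iter values quoted in this post; the dictionary layout and function name are illustrative.

```python
# ms/iter readings copied from the data above:
# (previous exponent, new exponent before restart, new exponent after restart)
readings = {
    "Machine 1": (29.4, 47.7, 31.5),
    "Machine 2": (30.7, 46.7, 32.1),
}

def slowdown(baseline: float, observed: float) -> float:
    """Ratio of an observed ms/iter to the baseline ms/iter."""
    return observed / baseline

for name, (before, after_switch, after_restart) in readings.items():
    print(f"{name}: {slowdown(before, after_switch):.2f}x after switching exponents, "
          f"{slowdown(before, after_restart):.2f}x after restarting Prime95")
```

On both machines the new exponent runs roughly 1.5x to 1.6x slower before the restart, and only about 5% to 7% slower than the previous exponent after it.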
Attached Thumbnails: Prime95_1b.PNG (90.6 KB), Prime95_2b.PNG (80.6 KB)
Old 2017-10-10, 12:32   #5
VictordeHolland
 
"Victor de Hollander"
Aug 2011
the Netherlands

498₁₆ Posts

Is this with the new Prime95 version?
Old 2017-10-10, 13:01   #6
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

5562₈ Posts

Quote:
Originally Posted by VictordeHolland
Is this with the new Prime95 version?

Yes, it has the Jacobi check.


bplenhart, what CPUs are you running?
Old 2017-10-10, 21:25   #7
bplenhart
 
"Brian Lenhart"
Oct 2013

111₂ Posts

Machine 1: Haswell i5-4570
Machine 2: Haswell i5-4430

Prime95 v29.3.
I noticed this behavior in v28.10 also.

All of my machines do this, including an i5-4690K and an i5-6600K.
Old 2017-10-13, 18:50   #8
Madpoo
Serpentine Vermin Jar
 
Madpoo's Avatar
 
Jul 2014

3,313 Posts

If I had to guess, after extended runtime the CPU is heating up and clock throttling. Then you stop it, the CPU gets a chance to cool off so when you start it again, it goes fast, then heats up, throttles, etc.

Use something like CPU-Z to look at the actual CPU clock rate while running and also when idle, and also note the rate when Prime95 first starts up compared to after it's been running a while and the iteration times are slower.

Note that your idle clock rates may be low thanks to power savings. If your BIOS has the option, you can set your system to run at full turbo speed all the time, but otherwise the default will slow the clock rate if nothing's happening.

On a few servers, I see this happening... they're all set for max performance but when Prime95 starts, it will throttle by a few turbo boosts due to thermal or TDP issues. On dual socket systems sometimes that means one socket runs faster than the other, or because of thermal interactions, starting up one worker makes that CPU run nice and fast, but firing up another worker using the 2nd socket will cause that first worker to slow down because its CPU got a little warmer and clocked down a bit.
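
To know what to look for in CPU-Z, it may help to estimate how far the clock would have to fall for throttling alone to account for the slowdown. The sketch below assumes ms/iter scales inversely with core clock, which is a simplification: large FFTs can be limited by memory bandwidth rather than clock speed. The timings are the ones reported in post #4; the 3200 MHz reference clock and the function names are hypothetical placeholders, so substitute the clock you actually observe while the worker is running fast.

```python
def clock_fraction_needed(fast_ms: float, slow_ms: float) -> float:
    """Fraction of the original clock that would explain the slowdown,
    assuming ms/iter is inversely proportional to core clock."""
    return fast_ms / slow_ms

def implied_clock_mhz(reference_mhz: float, fast_ms: float, slow_ms: float) -> float:
    """Clock speed implied by the slowdown, given the clock seen at the fast rate."""
    return reference_mhz * clock_fraction_needed(fast_ms, slow_ms)

REFERENCE_MHZ = 3200  # hypothetical; replace with the reading taken at the fast rate

for name, fast, slow in [("Machine 1", 29.4, 47.7), ("Machine 2", 30.7, 46.7)]:
    frac = clock_fraction_needed(fast, slow)
    mhz = implied_clock_mhz(REFERENCE_MHZ, fast, slow)
    print(f"{name}: clock would need to drop to about {frac:.0%} "
          f"(~{mhz:.0f} MHz from a hypothetical {REFERENCE_MHZ} MHz)")
```

If the clock reported while the worker is slow stays well above that level, throttling by itself is unlikely to be the whole explanation.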
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.