mersenneforum.org  

2010-12-10, 20:29   #1
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

2276₁₀ Posts
Prime95 slowed down after I restarted computer

As I mentioned in another thread, I recently got access to a laptop with an i7-720QM processor. I installed Prime95 and configured it to do P-1 factoring on four worker windows, one for each physical core. I set the number of iterations between screen outputs to 500, with each output taking about 65 seconds.

However, it went up to about 115 seconds after I restarted the computer. At one point, it even slowed down to about ten minutes! Restarting the computer again seemed to solve the second problem, although the time between outputs is still about 115 seconds.

I am using Windows 7, and I never changed the priority of Prime95 (either through the task manager or Prime95 itself). Interestingly, the CPU usage was around 50% the whole time. Does anyone know what might be happening?

When I first ran Prime95, I let it use eight worker threads (one for each thread/logical core) by default. However, I later reduced it to four following Mini-Geek's suggestion. Could this have something to do with the slowdown?

edit: local.txt has the following line:

Code:
ThreadsPerTest=1
Should I change it?

Last fiddled with by ixfd64 on 2010-12-10 at 20:39
2010-12-10, 21:45   #2
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2×3×1,151 Posts

Quote:
Originally Posted by ixfd64 View Post
However, it went up to about 115 seconds after I restarted the computer.
I've seen this kind of thing before. It seems that there is a problem in figuring out which logical processors are real processors and which are the hyperthreaded ones. This results in two workers getting assigned to the same core.

The easiest workaround may be to change the affinity settings letting each worker run on any CPU. Hopefully, the OS will do a better job distributing the workload among the CPUs than Prime95 is doing.
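To make the collision concrete, here is a minimal Python sketch. It assumes logical CPUs are numbered so that 2k and 2k+1 are the two hyperthreads of physical core k; that numbering is only an assumption for illustration (the whole problem is that the real layout is hard to determine), and the function names are hypothetical, not anything from Prime95.

```python
from collections import defaultdict

def physical_core(logical_cpu, threads_per_core=2):
    # Assumed numbering: logical CPUs 2k and 2k+1 share physical core k.
    return logical_cpu // threads_per_core

def find_collisions(worker_to_cpu, threads_per_core=2):
    """Return {physical core: [worker indices]} for cores hosting more than one worker."""
    by_core = defaultdict(list)
    for worker, cpu in enumerate(worker_to_cpu):
        by_core[physical_core(cpu, threads_per_core)].append(worker)
    return {core: workers for core, workers in by_core.items() if len(workers) > 1}

# Four workers pinned to logical CPUs 0-3 on a 4-core/8-thread chip:
# under the assumed numbering, workers share cores in pairs.
print(find_collisions([0, 1, 2, 3]))

# Four workers pinned to every other logical CPU: no core is shared.
print(find_collisions([0, 2, 4, 6]))
```

If the numbering assumption holds, the first layout leaves two physical cores idle while the other two each run a pair of workers, which would be consistent with outputs taking nearly twice as long.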
2010-12-10, 21:56   #3
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

2²·569 Posts

I changed the CPU affinity from "Smart assignment" to "Run on any CPU," and the average time between outputs is now even longer!

Would it help to disable hyperthreading?

Last fiddled with by ixfd64 on 2010-12-10 at 21:57
2010-12-10, 21:59   #4
garo
 
Aug 2002
Termonfeckin, IE

2,753 Posts

Yes, I bet disabling hyper-threading will help!
2010-12-11, 00:50   #5
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

2²×569 Posts

OK, I went to BIOS and turned off hyperthreading.

For a short time, Prime95 was faster than when I first installed it. However, the time between outputs is now several hundred seconds! Not to mention, my other programs are now lagging. What now?
2010-12-11, 00:57   #6
Uncwilly
6809 > 6502
 
 
"""""""""""""""""""
Aug 2003
101×103 Posts

2005₁₆ Posts

Quote:
Originally Posted by ixfd64 View Post
OK, I went to BIOS and turned off hyperthreading.

For a short time, Prime95 was faster than when I first installed it. However, the time between outputs is now several hundred seconds! Not to mention, my other programs are now lagging. What now?
Is it thermal throttling?
2010-12-11, 00:58   #7
Mini-Geek
Account Deleted
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17·251 Posts

Maybe you should try re-enabling hyper-threading, run 4 workers on 1 thread each, and set the affinities using this option:
Code:
You can arbitrarily change how the program assigns affinity to CPUs.
The program makes its best guess at assigning workers and helper threads
to CPUs for optimal speed.  However, new architectures or situations we
haven't considered may make different affinity setting desirable.  In
local.txt set
	AffinityScramble=string
Where the string "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz()"
is the 64-core "make no changes" string.  For example, let's say you have a system
with 8 logical cores with 4 workers each using a helper thread.  The program would
ordinarily assign the worker and helper threads to [0,1], [2,3], [4,5], [6,7].
However, if you think [0,2], [1,3], [4,6], [5,7] would give better performance,
you would set AffinityScramble=02134657 to test out your theory.
I think this would be what you want:
Code:
AffinityScramble=0246
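For anyone who wants to check what a given string would do, here is a small Python sketch of my reading of the documentation quoted above. It assumes each character stands for the logical CPU at that position in the "make no changes" string, and that threads are handed out in order, worker by worker. How the program actually treats a string shorter than the number of logical CPUs (like 0246) is an assumption on my part, and the function names are made up.

```python
# The 64-character "make no changes" string from the quoted documentation.
NO_CHANGE = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz()"

def scramble_to_cpus(scramble):
    # Each character names a logical CPU by its position in NO_CHANGE.
    return [NO_CHANGE.index(ch) for ch in scramble]

def assign(scramble, workers, threads_per_worker):
    """Group the scrambled CPU list into one list of logical CPUs per worker."""
    cpus = scramble_to_cpus(scramble)
    needed = workers * threads_per_worker
    if len(cpus) < needed:
        raise ValueError("scramble string is too short for this many threads")
    return [
        cpus[w * threads_per_worker:(w + 1) * threads_per_worker]
        for w in range(workers)
    ]

# The documentation's example: 4 workers, each with one helper thread.
print(assign("01234567", 4, 2))  # default pairing
print(assign("02134657", 4, 2))  # scrambled pairing

# The suggestion above: 4 single-threaded workers on every other logical CPU.
print(assign("0246", 4, 1))
```

The second call reproduces the [0,2], [1,3], [4,6], [5,7] pairing from the documentation's example, which is the only case the quoted text actually confirms.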
Quote:
Originally Posted by Uncwilly View Post
Is it thermal throttling?
Good point. Another "obvious" thing we haven't asked (and maybe you haven't fully checked out): is anything else using large amounts of CPU, RAM, or hard drive access that might be interfering? Do you see similar slowdowns if you boot into safe mode?
Quote:
Originally Posted by ixfd64 View Post
I am using Windows 7, and I never changed the priority of Prime95 (either through the task manager or Prime95 itself). Interestingly, the CPU usage was around 50% the whole time. Does anyone know what might be happening?
Windows thinks of the HT cores the same as the physical ones (for that purpose anyway), so when it sees that they aren't being used, it thinks the CPU usage is 50% when it's really at max capacity for Prime95. Showing 50% usage is normal for HT-enabled machines.
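The arithmetic behind that 50% figure, as a tiny Python sketch. It assumes the usage figure is simply busy logical CPUs divided by total logical CPUs, which is a simplification of whatever Task Manager really computes; the function name is made up.

```python
def reported_usage(busy_logical_cpus, total_logical_cpus):
    # Simplified model: every logical CPU counts equally, hyperthreaded or not.
    return 100.0 * busy_logical_cpus / total_logical_cpus

# i7-720QM: 4 physical cores, 8 logical CPUs, 4 single-threaded workers.
print(reported_usage(4, 8))
```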

Last fiddled with by Mini-Geek on 2010-12-11 at 01:01
2010-12-11, 02:42   #8
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

2²×569 Posts

OK, I re-enabled hyperthreading and added "AffinityScramble=0246" to local.txt. The time between screen outputs is now close to when I first ran Prime95. I still get spikes of over 100 seconds from time to time, but I guess it's better than the earlier average of 115!

By the way, is it a bad idea to run P-1 on all four worker windows?
2010-12-11, 03:22   #9
Mini-Geek
Account Deleted
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

10253₈ Posts

Quote:
Originally Posted by ixfd64 View Post
OK, I re-enabled hyperthreading and added "AffinityScramble=0246" to local.txt. The time between screen outputs is now close to when I first ran Prime95. I still get spikes of over 100 seconds from time to time, but I guess it's better than the earlier average of 115!
Sounds like the suggestion helped. Not perfect, but better...
Quote:
Originally Posted by ixfd64 View Post
By the way, is it a bad idea to run P-1 on all four worker windows?
Hmm...in stage 2 you might get a big slow-down due to memory bandwidth. But then LL needs fast bandwidth too, just not as much storage, so it might be a non-issue. Plus, there's the fact that if you have X MB of RAM for Prime95, each worker can only get between X and X/4 MB of RAM, (closer to X/4 than X, usually) so the P-1 tests aren't as efficient as they could be. If you want to do it, and you first test it out to see how it works out, (like you're doing now - I just mean don't leave it on that 24/7 without seeing whether it works well or not) there's nothing wrong with doing P-1 on all four workers, but it's simplest to just put one worker on P-1 and the rest on other tasks.

Last fiddled with by Mini-Geek on 2010-12-11 at 03:27
2010-12-11, 03:40   #10
mdettweiler
A Sunny Moo
 
Aug 2007
USA (GMT-5)

3×2,083 Posts

Quote:
Originally Posted by Mini-Geek View Post
Hmm...in stage 2 you might get a big slow-down due to memory bandwidth. But then LL needs fast bandwidth too, just not as much storage, so it might be a non-issue. Plus, there's the fact that if you have X MB of RAM for Prime95, each worker can only get between X and X/4 MB of RAM, (closer to X/4 than X, usually) so the P-1 tests aren't as efficient as they could be. If you want to do it, and you first test it out to see how it works out, (like you're doing now - I just mean don't leave it on that 24/7 without seeing whether it works well or not) there's nothing wrong with doing P-1 on all four workers, but it's simplest to just put one worker on P-1 and the rest on other tasks.
Note that you can also set MaxHighMemWorkers=1 in prime.txt to ensure that only one worker runs stage 2 at any given time. (If two workers need to run stage 2, the second one will put its stage 2 on hold and move on to stage 1 of the next assignment until the first worker's stage 2 finishes.) With this, you can "safely" run all four cores on P-1.
2010-12-11, 03:43   #11
Mini-Geek
Account Deleted
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17·251 Posts

Quote:
Originally Posted by mdettweiler View Post
Note that you can also set MaxHighMemWorkers=1 in prime.txt to ensure that only one worker runs stage 2 at any given time. (If two workers need to run stage 2, the second one will put its stage 2 on hold and move on to stage 1 of the next assignment until the first worker's stage 2 finishes.) With this, you can "safely" run all four cores on P-1.
MaxHighMemWorkers=2 would be better, because otherwise you'd always have 3 cores running stage 1 and 1 core running stage 2. The two stages take roughly the same time (in a test of CPU credit just now, the whole thing came to about 2.076 times as long as stage 1 alone, so that's a pretty close "roughly"), which means a single core on stage 2 would continually fall behind the three cores feeding it from stage 1. Obviously, that is not a good situation.
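A rough back-of-the-envelope check of that reasoning, as a Python sketch. It assumes an idealised steady state in which every worker is always busy, stage 1 takes one time unit, and stage 2 takes (2.076 - 1) units per the CPU credit figure above. The function names are made up for illustration, and none of this reflects how Prime95 schedules work internally.

```python
def balanced_high_mem_workers(workers, total_over_stage1):
    """Number of workers that must be in stage 2 for both stages to keep pace."""
    stage2 = total_over_stage1 - 1.0  # stage 2 time, in units of stage 1 time
    return workers * stage2 / (1.0 + stage2)

def stage_rates(workers, high_mem_workers, total_over_stage1):
    """Return (stage 1 completions, stage 2 completions) per unit of stage 1 time."""
    stage2 = total_over_stage1 - 1.0
    stage1_rate = float(workers - high_mem_workers)  # each stage 1 takes 1 unit
    stage2_rate = high_mem_workers / stage2
    return stage1_rate, stage2_rate

print(balanced_high_mem_workers(4, 2.076))
for cap in (1, 2):
    print(cap, stage_rates(4, cap, 2.076))
```

Under these assumptions, a cap of 1 has stage 1 finishing more than three times as fast as stage 2, so the backlog keeps growing, while a cap of 2 sits close to the balance point of roughly 2.07 workers.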
Note that "Default [setting for MaxHighMemWorkers] is available memory / 200MB."
Also remember that Memory can be worker-specific: (particularly useful for a situation where you want each worker to use up to, say, 250 MB, regardless of what the other workers are doing; this didn't work correctly in the past, but newer versions have it fixed)
Code:
The Memory=n setting in local.txt refers to the total amount of memory the
program can use.  You can also put this in the [Worker #n] section to place
a maximum amount of memory that one particular worker can use.

Last fiddled with by Mini-Geek on 2010-12-11 at 03:54

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.