mersenneforum.org  

2011-10-24, 23:12   #1
bcp19
 
Oct 2011

7·97 Posts
A CPU Question

I have an Intel Core i7 Q740 @ 1.73 GHz and with Prime95 running I have 4 worker windows. When I go into task manager, it shows 8 CPUs and that I am running at 50%. I realize the i7 is hyperthreaded which is why I have '8' cores, but I guess my question is, am I losing out on computational capacity if my system says I am running at 50% and if so, what can be done?
2011-10-25, 03:12   #2
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Quote:
Originally Posted by bcp19
I have an Intel Core i7 Q740 @ 1.73 GHz and with Prime95 running I have 4 worker windows. When I go into task manager, it shows 8 CPUs and that I am running at 50%. I realize the i7 is hyperthreaded which is why I have '8' cores, but I guess my question is, am I losing out on computational capacity if my system says I am running at 50% and if so, what can be done?
No, you're not losing out, but what I do on my quad core Intel is add a few options to local.txt:

NumCPUs=4
AffinityScramble2=01234567
ThreadsPerTest=2

ThreadsPerTest=2 says to use two logical "cores" per worker. NumCPUs tells Prime95 that there are only 4 physical cores, regardless of what Windows reports. OTOH, P95 should also detect hyperthreading automatically. Try adding just these two lines (NumCPUs and ThreadsPerTest) and see what happens.
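
To see where the 50% in Task Manager comes from, here is a rough Python sketch of the arithmetic. It assumes every Prime95 thread keeps one logical CPU fully busy; the function name is made up for illustration and is not part of Prime95.

```python
def expected_cpu_usage(workers: int, threads_per_test: int, logical_cpus: int) -> float:
    """Upper bound on the fraction of logical CPUs kept busy.

    Assumes each thread saturates exactly one logical CPU, which is a
    simplification: real usage will usually sit somewhat below this.
    """
    busy = min(workers * threads_per_test, logical_cpus)
    return busy / logical_cpus


# Figures from the original question: 4 workers on 8 logical CPUs.
before = expected_cpu_usage(workers=4, threads_per_test=1, logical_cpus=8)
after = expected_cpu_usage(workers=4, threads_per_test=2, logical_cpus=8)

print(f"one thread per worker:  {before:.0%}")
print(f"two threads per worker: {after:.0%}")
```

Note that this only describes how busy the logical CPUs look to the OS. It says nothing about how much extra work actually gets done, which has to be measured.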

AffinityScramble2 tells Prime95 where to run the second thread for each worker. If there's a worker with affinity set to 0, then the helper thread will be set to 1. If there's a worker on 2, the helper thread will be on 3.

For instance, to set 0,4 1,5 2,6 3,7 worker/helper pairs, you would set AffinityScramble2=04152637. Try changing the other settings without this one first, though; Prime95 ought to be able to tell which logical core goes with each physical core.

Note: Each OS numbers the logical cores differently. In Windows, I use AffinityScramble2=01234567, but in Linux I use AffinityScramble2=04152637. But again, Prime95 should be able to detect this.

The key thing here is ThreadsPerTest=2. Prime95 should be able to do the rest, though obviously the tools to override it are in this post.
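
A minimal Python sketch of how the string reads, going only by the description in this post (characters taken two at a time, worker first, helper second). This is an illustration of that reading, not Prime95's own parsing code, and the function name is hypothetical.

```python
def worker_helper_pairs(scramble: str) -> list[tuple[str, str]]:
    """Split an AffinityScramble2-style string into (worker, helper) pairs.

    Assumption: with ThreadsPerTest=2 the string is read two characters
    at a time, as described in the post above.
    """
    if len(scramble) % 2 != 0:
        raise ValueError("expected an even number of characters")
    return [(scramble[i], scramble[i + 1]) for i in range(0, len(scramble), 2)]


# The two strings mentioned in the post.
print(worker_helper_pairs("01234567"))
print(worker_helper_pairs("04152637"))
```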

Edit: Some passages from undoc.txt:
Quote:
The program automatically computes the number of CPUs, hyperthreading, and speed.
This information is used to calculate how much work to get.
If the program did not correctly figure out your CPU information,
you can override the info in local.txt:
NumCPUs=n
CpuNumHyperthreads=1 or 2
CpuSpeed=s
Where n is the number of physical CPUs or cores, not logical CPUs created by
hyperthreading. Choose 1 for non-hyperthreaded and 2 for hyperthreaded. Finally,
s is the speed in MHz.

As an alternative to the above, one can set NumPhysicalCores=n in local.txt.
This is useful on machines that are sometimes booted with hyperthreading enabled
and sometimes without. Normally, the program can detect this situation, but one
notable problem case is a dual-CPU hyperthreaded machine. For example, take a
dual-CPU quad-core hyperthreaded machine. When booted with hyperthreading enabled
this is properly detected as an 8-core hyperthreaded machine. When booted
with hyperthreading disabled, this is improperly detected as a 4-core hyperthreaded
machine. If you set NumPhysicalCores=8, then the program will set the
hyperthreading state properly no matter how the machine is booted.
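
The dual-CPU example above can be checked with a little arithmetic. The Python sketch below assumes the hyperthreading state follows from dividing the logical CPU count the OS reports by NumPhysicalCores; that is a guess at the logic, not taken from the Prime95 source, and the function name is hypothetical.

```python
def hyperthreads_per_core(logical_cpus: int, num_physical_cores: int) -> int:
    """Logical CPUs per physical core: 1 = no hyperthreading, 2 = hyperthreaded."""
    if num_physical_cores <= 0 or logical_cpus % num_physical_cores != 0:
        raise ValueError("logical CPU count must be a multiple of the core count")
    return logical_cpus // num_physical_cores


# Dual-CPU quad-core machine: 8 physical cores in total.
# Booted with hyperthreading on, the OS reports 16 logical CPUs; off, 8.
print(hyperthreads_per_core(logical_cpus=16, num_physical_cores=8))
print(hyperthreads_per_core(logical_cpus=8, num_physical_cores=8))
```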
Quote:
The program makes its best guess at how the OS maps hyperthreaded logical CPU
numbers to physical CPUs. It also assigns workers and helper threads
to CPUs for optimal speed. However, bugs, new architectures, or situations we
haven't considered may make different affinity settings desirable. In
local.txt set
AffinityScramble2=string
Where the characters in "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz()"
represent 64 logical CPU numbers. For example, let's say you have a system
with 8 logical cores with 4 workers each using a helper thread. Also, assume
your system has logical CPUs 0 & 4 on the same physical CPU core, 1 & 5, etc.
If the program is properly determining which logical CPUs share the same physical
CPU, then the program internally generates an affinity scramble string of "04152637".
The program's default policy is to assign the worker and helper threads to the same
physical CPU. If the program is not properly determining which logical CPUs share the
same physical CPU, or you think a different affinity policy would result in better
performance, then set AffinityScramble2 accordingly. Let's say you think
running the helper threads on a different physical core would be better, then
you might set AffinityScramble2=02134657 to test out your theory.
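
As a rough illustration of the excerpt above, this Python sketch decodes the 64-character alphabet into logical CPU numbers and checks whether each worker/helper pair lands on the same physical core. It uses the excerpt's example layout (logical CPUs 0 & 4 share a core, 1 & 5, and so on); the function names and the `physical_cores` parameter are made up for the example, and none of this is Prime95's actual code.

```python
# The alphabet quoted in undoc.txt: position in the string = logical CPU number.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz()"


def decode_scramble(scramble: str) -> list[int]:
    """Map each character to its logical CPU number (0..63)."""
    return [ALPHABET.index(ch) for ch in scramble]  # ValueError on unknown chars


def pairs_share_core(scramble: str, physical_cores: int = 4) -> list[bool]:
    """For each (worker, helper) pair, report whether both are on one core.

    Assumption: logical CPUs n and n + physical_cores share a physical
    core, as in the excerpt's 0 & 4, 1 & 5 example.
    """
    cpus = decode_scramble(scramble)
    if len(cpus) % 2 != 0:
        raise ValueError("expected an even number of characters")
    return [
        cpus[i] % physical_cores == cpus[i + 1] % physical_cores
        for i in range(0, len(cpus), 2)
    ]


print(pairs_share_core("04152637"))  # the default policy from the excerpt
print(pairs_share_core("02134657"))  # the "different core" experiment
```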

Last fiddled with by Dubslow on 2011-10-25 at 03:22
2011-10-25, 03:40   #3
Christenson
 
Dec 2010
Monticello

5×359 Posts

What hyperthreading does is reduce the cost of a context switch, by making the jump from one set of registers to another relatively cheap in time. It might be better to just run 4 straight workers. Look at your throughput and see.
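
One way to compare the two setups is to add up iterations per second across all workers. The Python sketch below assumes you note the per-iteration time (in ms) that each worker reports; the timings in it are hypothetical placeholders, not measurements from anyone's machine.

```python
def throughput(ms_per_iteration: list[float]) -> float:
    """Total iterations per second summed over all workers."""
    return sum(1000.0 / ms for ms in ms_per_iteration)


# Hypothetical per-iteration times in ms, one entry per worker.
four_plain_workers = [40.0, 40.0, 40.0, 40.0]
four_workers_with_helpers = [36.0, 36.0, 36.0, 36.0]

plain = throughput(four_plain_workers)
helped = throughput(four_workers_with_helpers)
gain = helped / plain - 1.0

print(f"plain:  {plain:.1f} iterations/s")
print(f"helped: {helped:.1f} iterations/s")
print(f"change: {gain:+.1%}")
```

Compare workers running similar exponents, since per-iteration time depends on the size of the number being tested.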
2011-10-25, 04:40   #4
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Way back over the summer when I got my proc, I found (with a very minimal amount of testing) that the helper threads helped. (I'd estimate from the measly data set a 10, maybe 15% gain.)
I'd guess that it's because if only one thread is being used, it's real easy for something that doesn't actually need time to take time. Or it could be that one thread is never able to fully use the proc's resources (because of OS overhead and having tens of threads available, or just because the proc's design isn't built around maximizing single-thread output). Or some combination of the two; they're complementary. OTOH, it is still worth bcp19 testing it.
2011-10-25, 16:46   #5
bcp19
 
Oct 2011

1247₈ Posts

Quote:
Originally Posted by Dubslow
No, you're not losing out, but what I do on my quad core Intel is add a few options to local.txt:

The key thing here is ThreadsPerTest=2. Prime95 should be able to do the rest, though obviously the tools to override it are in this post.

Edit: Some passages from undoc.txt:
That did the trick: CPU usage is running around 92% instead of 50%. I'll see what difference this makes.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.