#23 |
|
Sep 2002
17·47 Posts |
Yes, it is supposed to be that way until version 27.x is confirmed stable, needs no further changes or enhancements, and its creators say so.
|
|
|
|
|
|
#24 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts |
Code:
bill@Gravemind:~/MPrime∰∂ mprime -d
[Main thread May 22 04:52] Mersenne number primality test program version 27.7
[Main thread May 22 04:52:35] Optimizing for CPU architecture: Core i3/i5/i7, L2 cache size: 256 KB, L3 cache size: 8 MB
[Main thread May 22 04:52:35] Logical CPUs 1,5 form one physical CPU.
[Main thread May 22 04:52:35] Logical CPUs 2,6 form one physical CPU.
[Main thread May 22 04:52:35] Logical CPUs 3,7 form one physical CPU.
[Main thread May 22 04:52:35] Logical CPUs 4,8 form one physical CPU.
[Main thread May 22 04:52:35] Starting workers.
[Comm thread May 22 04:52:35] Exchanging program options with server
[Worker #1 May 22 04:52:35] Worker starting
[Worker #1 May 22 04:52:35] Setting affinity to run worker on logical CPU #1
[Worker #2 May 22 04:52:35] Waiting 5 seconds to stagger worker starts.
[Worker #3 May 22 04:52:35] Waiting 10 seconds to stagger worker starts.
[Worker #4 May 22 04:52:35] Waiting 15 seconds to stagger worker starts.
[Comm thread May 22 04:52:35] Done communicating with server.
[Worker #1 May 22 04:52:36] Setting affinity to run helper thread 1 on logical CPU #5
[Worker #1 May 22 04:52:36] Resuming primality test of M54197029 using AVX FFT length 2880K, Pass1=384, Pass2=7680, 2 threads
[Worker #1 May 22 04:52:36] Iteration: 44942274 / 54197029 [82.9238%].
[Worker #2 May 22 04:52:40] Worker starting
[Worker #2 May 22 04:52:40] Setting affinity to run worker on logical CPU #5
[Worker #2 May 22 04:52:41] Setting affinity to run helper thread 1 on logical CPU #2
[Worker #2 May 22 04:52:41] Resuming primality test of M25318487 using AVX FFT length 1344K, Pass1=448, Pass2=3K, 2 threads
[Worker #2 May 22 04:52:41] Iteration: 175637 / 25318487 [0.6937%].
[Worker #3 May 22 04:52:45] Worker starting
[Worker #3 May 22 04:52:45] Setting affinity to run worker on logical CPU #2
[Worker #3 May 22 04:52:45] Setting affinity to run helper thread 1 on logical CPU #6
[Worker #3 May 22 04:52:46] Resuming primality test of M25572683 using AVX FFT length 1344K, Pass1=448, Pass2=3K, 2 threads
[Worker #3 May 22 04:52:46] Iteration: 14725704 / 25572683 [57.5837%].
[Worker #4 May 22 04:52:50] Worker starting
[Worker #4 May 22 04:52:50] Setting affinity to run worker on logical CPU #6
[Worker #4 May 22 04:52:50] Setting affinity to run helper thread 1 on logical CPU #3
[Worker #4 May 22 04:52:51] Resuming primality test of M25353589 using AVX FFT length 1344K, Pass1=448, Pass2=3K, 2 threads
[Worker #4 May 22 04:52:51] Iteration: 11108729 / 25353589 [43.8152%].

Code:
bill@Gravemind:~/MPrime∰∂ cat local.txt
<snip>
WorkerThreads=4
NumCPUs=4
ThreadsPerTest=2
<snip>
[Worker #1]
Affinity=0
[Worker #2]
Affinity=1
[Worker #3]
Affinity=2
[Worker #4]
Affinity=3

I remember this once happened when I first installed MPrime (v27) on my laptop, and it was very frustrating; however, I also recall figuring out some stupid user error after which it started working again, so I never said anything. But, this is exactly the same sort of symptoms, and I can't figure out for the life of me what I'm doing wrong this time.

Edit: Another example:
Code:
bill@Gravemind:~/MPrime∰∂ cat local.txt
<snip>
WorkerThreads=4
NumCPUs=4
ThreadsPerTest=2
<snip>
[Worker #1]
Affinity=1
[Worker #2]
Affinity=2
[Worker #3]
Affinity=3
[Worker #4]
Affinity=4
bill@Gravemind:~/MPrime∰∂ mprime -d
[Main thread May 22 04:58] Mersenne number primality test program version 27.7
[Main thread May 22 04:58:44] Optimizing for CPU architecture: Core i3/i5/i7, L2 cache size: 256 KB, L3 cache size: 8 MB
[Main thread May 22 04:58:44] Unable to detect some of the hyperthreaded logical CPUs.
[Main thread May 22 04:58:44] Enough information obtained to make a reasonable guess.
[Main thread May 22 04:58:44] Logical CPUs 1,5 form one physical CPU.
[Main thread May 22 04:58:44] Logical CPUs 2,6 form one physical CPU.
[Main thread May 22 04:58:44] Logical CPUs 3,7 form one physical CPU.
[Main thread May 22 04:58:44] Logical CPUs 4,8 form one physical CPU.
[Main thread May 22 04:58:44] Starting workers.
[Worker #1 May 22 04:58:44] Worker starting
[Worker #3 May 22 04:58:44] Waiting 10 seconds to stagger worker starts.
[Worker #1 May 22 04:58:44] Setting affinity to run worker on logical CPU #5
[Worker #2 May 22 04:58:44] Waiting 5 seconds to stagger worker starts.
[Worker #4 May 22 04:58:44] Waiting 15 seconds to stagger worker starts.
[Worker #1 May 22 04:58:45] Setting affinity to run helper thread 1 on logical CPU #2
[Worker #1 May 22 04:58:45] Resuming primality test of M54197029 using AVX FFT length 2880K, Pass1=384, Pass2=7680, 2 threads
[Worker #1 May 22 04:58:45] Iteration: 44947544 / 54197029 [82.9335%].
[Worker #2 May 22 04:58:49] Worker starting
[Worker #2 May 22 04:58:49] Setting affinity to run worker on logical CPU #2
[Worker #2 May 22 04:58:50] Setting affinity to run helper thread 1 on logical CPU #6
[Worker #2 May 22 04:58:50] Resuming primality test of M25318487 using AVX FFT length 1344K, Pass1=448, Pass2=3K, 2 threads
[Worker #2 May 22 04:58:50] Iteration: 183276 / 25318487 [0.7238%].
[Worker #3 May 22 04:58:54] Worker starting
[Worker #3 May 22 04:58:54] Setting affinity to run worker on logical CPU #6
[Worker #3 May 22 04:58:55] Setting affinity to run helper thread 1 on logical CPU #3
[Worker #3 May 22 04:58:55] Resuming primality test of M25572683 using AVX FFT length 1344K, Pass1=448, Pass2=3K, 2 threads
[Worker #3 May 22 04:58:55] Iteration: 14733012 / 25572683 [57.6123%].
[Worker #4 May 22 04:58:59] Worker starting
[Worker #4 May 22 04:58:59] Setting affinity to run worker on logical CPU #3
[Worker #4 May 22 04:59:00] Setting affinity to run helper thread 1 on logical CPU #7
[Worker #4 May 22 04:59:00] Resuming primality test of M25353589 using AVX FFT length 1344K, Pass1=448, Pass2=3K, 2 threads
[Worker #4 May 22 04:59:00] Iteration: 11123987 / 25353589 [43.8753%].

If I set Affinity(1)=0, Affinity(2)=5, Affinity(3)=2, Affinity(4)=7, then what I wind up getting is 04 (correct), 62 (should be 51), 15 (should be 26), 8* (should be 73) respectively, where * is "[Worker #4 May 22 05:03:04] Setting affinity to run helper thread 1 on any logical CPU." I'll leave it like this overnight, since at least there are no overlaps as in the first two examples.
Last fiddled with by Dubslow on 2012-05-22 at 10:07 |
|
|
|
|
|
#25 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
7,537 Posts |
Here is what is happening. First, consider a CPU with 8 physical cores and no hyperthreading. These cores are numbered from 1 to 8. If you use two threads for your four workers, worker 1 gets assigned CPUs 1&2, worker 2 gets assigned CPUs 3&4, etc.
Now assume you have 4 cores with hyperthreading and logical CPUs 1 & 2 form physical CPU 1. Again, using two threads for your four workers, worker 1 gets assigned logical CPUs 1&2, worker 2 gets assigned logical CPUs 3&4, etc.
Now assume you have 4 cores with hyperthreading and logical CPUs 1 & 5 form physical CPU 1. An affinity scramble mask of "05162738" is generated. Again, using two threads for your four workers, worker 1 gets assigned scrambled CPUs 1&2, which map to logical CPUs 0&5, worker 2 gets assigned scrambled CPUs 3&4, which map to logical CPUs 1&6, etc.
Now look at your case. An affinity scramble mask of "05162738" was generated. You specifically told prime95 to have worker 1 use scrambled CPU 1 (add 1 to the Affinity= setting) and 2 (hyperthreading always uses the next scrambled CPU number). Worker 2 was told to use scrambled CPUs 2 & 3, etc. This explains the assignments you are seeing.
In local.txt, remove all the Affinity= settings. Things should get better. |
|
|
|
|
|
#26 | |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
16065₈ Posts |
Quote:
The thing is, I've had it exactly like this before and it worked just fine. I did this, and got things like "Setting affinity to run worker on logical CPUs 3,7" and "Setting affinity to run helper thread 1 on logical CPUs 3,7", with each worker getting a different physical core; however, CPU usage is not quite at 100% anymore, presumably because threads are occasionally still switching between the two logical CPUs of the pair they're assigned. That's why I used Affinity= for each thread before. As I said above, it worked fine like that before. |
|
|
|
|
|
|
#27 | ||
|
P90 years forever!
Aug 2002
Yeehaw, FL
16561₈ Posts |
Quote:
Any use of multithreading may well cause CPU usage to drop below 100% as occasionally one thread must wait on the other to finish up. BTW, if you are doing P-1, prime95 will do both big multiplies and big adds. The multiplies are multithreaded, the adds are not (further degrading the CPU utilization figure).
Quote:
You'll probably get best throughput by not using multithreading at all.
Last fiddled with by Prime95 on 2012-05-22 at 20:16 |
||
|
|
|
|
|
#28 |
|
Romulan Interpreter
Jun 2011
Thailand
7²·197 Posts |
I would agree with George. For me the best setup is 4 workers in 4 threads, no helpers (on a machine with 8 hyper-threaded logical cores). Hyper-threading generally generates too much heat and uses too much energy for the performance gain it brings, especially for programs as cache-optimized as p95. Running 8 workers (or 4 workers on 8 cores, one main plus one helper thread for each worker) generally brings about 20% more performance for 50-80% more energy (and heat!). Additionally, running 4 single-threaded workers leaves some spare firepower for other daily work (no, I'm not talking about writing/sending mails and browsing the forum).
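Taking those rough figures at face value (they are estimates from this post, not measurements), the work done per unit of energy with hyper-threading, relative to running without it, works out to:

```latex
\[
\frac{1.20}{1.50} = 0.80
\qquad\text{and}\qquad
\frac{1.20}{1.80} \approx 0.67
\]
```

That is roughly 20% to 33% less work per unit of energy, assuming "performance" means throughput and "energy" means power draw over the same period.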
|
|
|
|
|
|
#29 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts |
I figured out (thanks to fivemack) that the other stuff to do "south of here" is generally not optimized like P95, so HT does help there. I realized, though, that ATM I'm not running any of those, so I did turn off HT for now. (My statement about it working before, some months ago, still stands though.)
|
|
|
|
|
|
#30 |
|
Aug 2002
North San Diego County
1010101101₂ Posts |
Windows is not 100% consistent in enumerating cores. For example, I have a dual Opteron 6128 box (16 physical cores). Prime64 runs 16 individual worker threads, each assigned to a unique core. Windows/Prime64 cores correlate like this under the current Win7Pro install: 1-8 match, Windows 9-12 => Prime 13-16, Windows 13-16 => Prime 9-12.
On different OSes (Win2k3 server, Win7Pro without SP1, and W2K8R2 [IIRC, might have been plain W2K8]) but the EXACT same hardware and BIOS settings, they correlate exactly. Another Win7 install swapped 5-8 and 1-4. With hyperthreading, the permutations get even weirder. Luckily, the enumerations appear to remain consistent once established; once you figure out the particular setup it doesn't change until a new OS is installed. Ubuntu 10.4(? LTS) enumeration matched Mprime numbering the one time I installed it.
Last fiddled with by sdbardwick on 2012-05-23 at 05:43 Reason: Ubuntu info. |
|
|
|
|
|
#31 |
|
"Oliver"
Mar 2005
Germany
457₁₆ Posts |
Some affinity "fun" on some big iron...
prime95 v27.7 x86-64, Linux
Wrong usage of Affinity + AffinityScramble2 or bug? There seems to be an issue with the AffinityScramble2 and small letters, capital letters work fine!

local.txt
Code:
WorkerThreads=1
ThreadsPerTest=10
Affinity=0
AffinityScramble2=0123456789
[...]
Pid=30642
Code:
pid 30642's current affinity mask: ffffffffffffffffffff # main thread
pid 30806's current affinity mask: ffffffffffffffffffff # communication thread
pid 30807's current affinity mask: 1
pid 30869's current affinity mask: 2
pid 30870's current affinity mask: 4
pid 30871's current affinity mask: 8
pid 30872's current affinity mask: 10
pid 30873's current affinity mask: 20
pid 30874's current affinity mask: 40
pid 30875's current affinity mask: 80
pid 30876's current affinity mask: 100
pid 30877's current affinity mask: 200

local.txt
Code:
WorkerThreads=1
ThreadsPerTest=10
Affinity=0
AffinityScramble2=UVWXYZabcd
[...]
Pid=31147
Code:
pid 31147's current affinity mask: ffffffffffffffffffff # main thread
pid 31238's current affinity mask: ffffffffffffffffffff # communication thread
pid 31239's current affinity mask: 40000000
pid 31382's current affinity mask: 80000000
pid 31383's current affinity mask: 100000000 # not limited to 32 cores anymore!
pid 31384's current affinity mask: 200000000 # not limited to 32 cores anymore!
pid 31385's current affinity mask: 400000000 # not limited to 32 cores anymore!
pid 31386's current affinity mask: 800000000 # not limited to 32 cores anymore!
pid 31387's current affinity mask: 40000000
pid 31388's current affinity mask: 40000000
pid 31389's current affinity mask: 40000000
pid 31390's current affinity mask: 40000000

local.txt
Code:
WorkerThreads=1
ThreadsPerTest=10
Affinity=0
AffinityScramble2=efghijklmn
[...]
Pid=31317
Code:
pid 31317's current affinity mask: ffffffffffffffffffff # main thread
pid 31408's current affinity mask: ffffffffffffffffffff # communication thread
pid 31409's current affinity mask: ffffffffffffffffffff
pid 31552's current affinity mask: ffffffffffffffffffff
pid 31553's current affinity mask: ffffffffffffffffffff
pid 31554's current affinity mask: ffffffffffffffffffff
pid 31555's current affinity mask: ffffffffffffffffffff
pid 31556's current affinity mask: ffffffffffffffffffff
pid 31557's current affinity mask: ffffffffffffffffffff
pid 31558's current affinity mask: ffffffffffffffffffff
pid 31559's current affinity mask: ffffffffffffffffffff
pid 31560's current affinity mask: ffffffffffffffffffff |
|
|
|
|
|
#32 |
|
"Oliver"
Mar 2005
Germany
1111₁₀ Posts |
In commonb.c, lines 576 to 589:
Code:
for (i = 0; i < MAX_NUM_WORKER_THREADS; i++) {
	if (scramble[i] >= '0' && scramble[i] <= '9')
		AFFINITY_SCRAMBLE[i] = scramble[i] - '0';
	else if (scramble[i] >= 'A' && scramble[i] <= 'Z')
		AFFINITY_SCRAMBLE[i] = scramble[i] - 'A' + 10;
	else if (scramble[i] >= 'a' && scramble[i] <= 'z')
		AFFINITY_SCRAMBLE[i] = scramble[i] - 'A' + 36;
	else if (scramble[i] == '(')
		AFFINITY_SCRAMBLE[i] = 62;
	else if (scramble[i] == ')')
		AFFINITY_SCRAMBLE[i] = 63;
	else
		AFFINITY_SCRAMBLE[i] = i; /* Illegal entry = no mapping */
}
Oliver |
|
|
|
|
|
#33 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
7,537 Posts |
|
|
|
|