|
|
#826 | |
|
Oct 2011
7·97 Posts |
Quote:
Hmm, is there a recommended type of work for each core to minimize problems like this? My other system seems to run faster with cores 1 and 3 doing LL/DC and cores 2 and 4 doing TF: with all 4 cores running LL the time per iteration was ~.090 on each, while with the LL/TF/LL/TF setup the LLs are at ~.060. I know that when the P-1 finishes and core 1 moves on to LL it won't be as much of a memory hog (plus the ECM will be done), but would running a P-1 be better on core 2, 3 or 4? From my other system it seems cores 1/2 and 3/4 are kinda linked, so any thoughts would be appreciated. |
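To put the two timings above side by side, here is a rough sketch of the combined LL rate in each setup. It assumes the quoted figures are seconds per iteration and that the exponents are of similar size; the function name `ll_throughput` is made up for illustration, and the comparison ignores whatever TF work cores 2 and 4 complete in the mixed setup.

```python
def ll_throughput(num_ll_workers, sec_per_iter):
    """Combined LL iterations per second, assuming every LL worker
    reports the same per-iteration time (in seconds)."""
    return num_ll_workers / sec_per_iter


# Figures quoted in the post above (assumed to be seconds per iteration).
all_ll = ll_throughput(4, 0.090)   # LL on all four cores
mixed = ll_throughput(2, 0.060)    # LL/TF/LL/TF: only two cores do LL

print(f"4x LL:       {all_ll:.1f} iter/s")   # 44.4
print(f"LL/TF/LL/TF: {mixed:.1f} iter/s")    # 33.3
```

Under those assumptions each LL worker is faster in the mixed setup, but the total LL rate is lower, so the comparison depends on how much the TF output of the other two cores is worth.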
|
|
|
|
|
|
#827 |
|
"James Heinrich"
May 2004
ex-Northern Ontario
110101011101₂ Posts |
|
|
|
|
|
|
#828 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
16065₈ Posts |
Ah, the GPU FAQ PDF guide that Brain created, available in the GPU FAQ threads.
|
|
|
|
|
|
#829 |
|
Oct 2011
Maryland
2·5·29 Posts |
|
|
|
|
|
|
#830 | |
|
Nov 2002
Anchorage, AK
357₁₀ Posts |
Quote:
So mfakto 0.09 should be the latest version and fixed.
Last fiddled with by delta_t on 2011-11-04 at 21:24. Reason: Added links to mfakto thread |
|
|
|
|
|
|
#831 | |||
|
Jun 2003
7·167 Posts |
Quote:
The second issue I mentioned (which is probably the more significant one in causing the effects you have mentioned) doesn't depend much at all upon the specific amount of memory any particular thread is using. Rather, it depends upon the nature of the work that thread is doing.
Quote:
GIMPS worktypes fall into three categories:
Low Bandwidth: TF - all types.
Medium Bandwidth: LL - all types, ECM Stage 1, P-1 Stage 1.
High Bandwidth: ECM Stage 2, P-1 Stage 2.

I wouldn't recommend doing TF at all on a CPU any more. GPUs are so much faster at this type of work that doing it with a CPU is a waste of a core.

What I would recommend you do is put MaxHighMemWorkers=1 into local.txt. (You need to shut down P95 before you make changes to local.txt or they will be reverted.) Then run the program with P-1 on all four cores. As soon as you have one core running stage 2, note the timings of both the stage 2 core and the stage 1 cores. Change MaxHighMemWorkers to 2. Wait for a second core to go to stage 2, and again note the timings. Decide if you are willing to take the hit.

If yes, then run ECM/P-1 on all four cores, with MaxHighMemWorkers equal to 2. If not, then run ECM/P-1 on two cores, and LLs/doublechecks on the other two, with MaxHighMemWorkers equal to 1.

This assumes you have high memory available all the time. If you don't, then you are likely to quickly accumulate a backlog of uncompleted stage 2. Even if you do, with twice as many cores running P-1 as MaxHighMemWorkers, you will slowly accumulate uncompleted stage 2. Clear the backlog by occasionally running an LL test on one of your P-1 cores.
Quote:
|
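The "note the timings, then decide if you are willing to take the hit" step above can be reduced to a small calculation. This is only a sketch: the helper names are invented for illustration and the timings are hypothetical placeholders, not measurements from this thread. Substitute the per-iteration times you note with MaxHighMemWorkers at 1 and then at 2.

```python
def slowdown_percent(baseline, observed):
    """Percent increase in per-iteration time relative to the baseline."""
    return (observed - baseline) / baseline * 100.0


def worst_slowdown(before, after):
    """Largest slowdown across workers; both dicts map worker name -> seconds."""
    return max(slowdown_percent(before[w], after[w]) for w in before)


# Hypothetical stage 1 timings (seconds per iteration), NOT real data:
# 'before' noted with MaxHighMemWorkers=1, 'after' with MaxHighMemWorkers=2.
stage1_before = {"worker 2": 0.060, "worker 3": 0.060, "worker 4": 0.060}
stage1_after = {"worker 2": 0.066, "worker 3": 0.063, "worker 4": 0.061}

hit = worst_slowdown(stage1_before, stage1_after)
print(f"worst stage 1 slowdown: {hit:.1f}%")
```

How large a slowdown is acceptable is a personal call; the sketch only makes the size of the hit explicit.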
|||
|
|
|
|
|
#832 |
|
"James Heinrich"
May 2004
ex-Northern Ontario
11×311 Posts |
|
|
|
|
|
|
#833 |
|
Oct 2011
7×97 Posts |
|
|
|
|
|
|
#834 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
1110000110101₂ Posts |
Not hyperthreaded.
http://ark.intel.com/products/36547/...z-1333-MHz-FSB) |
|
|
|
|
|
#835 | ||
|
"James Heinrich"
May 2004
ex-Northern Ontario
11×311 Posts |
Ah, I thought so. The early Intel Quads (including your Q8200 and my slightly older Q6600) are actually dual-dual-core CPUs rather than true quad-cores:
Quote:
Quote:
|
||
|
|
|
|
|
#836 | |
|
Oct 2011
7×97 Posts |
Quote:
I tried testing an LL/ECM/LL/TF and an LL/P-1/LL/TF setup, but in each of those, core 1 would run at ~.090 per iteration while core 3 was at .063, and the TF was about 2 seconds slower (237 to 239 sec per .14%) during S1. Interestingly, during S2 (on the ECM), core 1 would drop to .084 while core 3 had a few extra blips at .064 and core 4 jumped to about 248 sec. The ECM took 11 hours to run, compared to 9 hours with all 4 cores doing ECM. I did not run the P-1 to completion; I moved the assignment to another machine, as it was looking like a 3-4 day run. The only time all 4 cores ran P-1 was when I first started and got 4 LLs that needed P-1, so I have no idea how long they took compared with running P-1 alongside something else. |
|
|
|
|