#1 |
|
Feb 2012
Athens, Greece
47 Posts |
What are the optimal worker settings for LL testing on Sandy Bridge Core i7-2600? And what would be a sensible upgrade path for me? (SNB-E or Ivy Bridge?)
I overclock to 3.9 GHz (the max for a non-K CPU). I use mprime on Debian GNU/Linux, and my CPU has 4 hyperthreaded cores for 8 threads total, with 8 MB of L3 cache (95 W TDP). I use 16 GB of dual-channel RAM (at 1333 MHz), allowing up to 13 GB for P-1, and a 500 MB/s SSD for the swap partition. The PC is also used for other tasks and runs 24/7 (drawing 350 W of power).

My goal is to maximize my chances of finding a prime as soon as possible while minimizing power usage and keeping my CPU temperature low. At the same time, I want to be able to use the PC for other tasks without mprime affecting me much. The other tasks include CPU-intensive graphics/video work but not gaming, as well as web browsing.

Here is what I have measured so far:

With 8 workers (1 thread each) doing LL testing on 8 exponents, my per-iteration times are in the 0.060s and my CPU temperature is up to 71-73 C with CPU usage at 100% (it's 40 C at 0% usage). When I use the PC for other tasks, mprime slows down a bit (iteration time 0.075).
With 4 workers (2 threads each) doing LL on 4 exponents, the per-iteration time is 0.034 and the CPU temperature is 68 C with CPU usage at 97-99%.
With 4 workers (1 thread each) doing LL on 4 exponents, the per-iteration time is 0.031 and CPU usage is 55%. This frees up 4 threads on the processor for other tasks.
With 2 workers (2 threads each) doing LL on 2 exponents, the per-iteration time is 0.025.
With 2 workers (4 threads each) doing LL on 2 exponents, the per-iteration time is 0.018 at 86% CPU usage.

In all cases each worker runs on a different core. What settings should I use?

What upgrade path would you recommend for me? There is the 4-core Sandy Bridge E (SNB-E), with increased memory bandwidth (quad-channel vs. dual-channel) and more overclockability at 130 W TDP (300 Euro). There's also the 6-core/12-thread SNB-E (same watts and memory), but it's too pricey (500 Euro). If I wait, I could get an Ivy Bridge (higher IPC at 77 W TDP), or wait even longer for Ivy Bridge E. (I don't use Intel HD graphics.)

How much would the quad-channel memory bandwidth of SNB-E help me for mprime? Is it worth getting the 4-core/quad-channel 3820, or should I go for the 6-core 3860? I'll overclock in any case. Any idea what the TDP of Ivy Bridge E is going to be and when it will be available in the EU? I believe AMD doesn't have anything to offer me as an upgrade for the i7, right?
Last fiddled with by emily on 2012-02-18 at 20:11 |
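One way to compare the configurations in this post is by aggregate throughput: the number of workers divided by the per-iteration time. The sketch below is only an illustration. It reads the quoted times as seconds, takes 0.060 as the figure for the 8-worker case, and assumes all exponents are of similar size so that iterations are comparable between workers.

```python
# Aggregate LL throughput for each configuration reported in the post.
# Times are the poster's figures, read here as seconds per iteration
# (an assumption; the post gives most of them without a unit).
configs = [
    # (label, workers, seconds per iteration per worker)
    ("8 workers x 1 thread", 8, 0.060),
    ("4 workers x 2 threads", 4, 0.034),
    ("4 workers x 1 thread", 4, 0.031),
    ("2 workers x 2 threads", 2, 0.025),
    ("2 workers x 4 threads", 2, 0.018),
]


def throughput(workers, sec_per_iter):
    """Iterations completed per second, summed over all workers."""
    return workers / sec_per_iter


results = {label: throughput(w, t) for label, w, t in configs}

# Print the configurations from highest to lowest aggregate throughput.
for label, rate in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{label:24s} {rate:6.1f} iter/s")
```

By this measure, 4 single-thread workers reach roughly 97% of the throughput of 8 single-thread workers while leaving half of the logical CPUs free, which is relevant to the goal of keeping the machine usable for other tasks.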
|
|
|
|
|
#2 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
16561₈ Posts |
First step is to go here to get v27.3: http://mersenneforum.org/showthread.php?t=16535
Next up, try disabling hyperthreading in the BIOS. You will very likely run cooler, maybe a tiny bit faster, and with a little less power draw. If you can, try running memory at 1600 MHz. (Can a non-K Sandy Bridge do that?)
As for upgrading, you have a great machine right now. An SNB-E is an expensive upgrade, as you'll need a new motherboard too. Faster memory is your cheapest upgrade (if your machine can run memory faster). Ivy Bridge may not be a cost-effective upgrade, as mprime may be memory-bandwidth limited. The thread above is trying to figure that out. Forget about AMD. |
|
|
|
|
|
#3 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts |
Running more than one worker per physical core is usually less efficient. The thread mentioned above is mostly about memory bandwidth limitations (besides the new MPrime version), and has quite a few SB-E benchmarks comparing different settings.
Also, if you have a discrete GPU, consider running mfakto/mfaktc/CUDALucas (see the GPU subforum here).
Last fiddled with by Dubslow on 2012-02-18 at 21:54 |
|
|
|
|
|
#4 |
|
Feb 2012
Athens, Greece
47 Posts |
Non-SNB-E i7 CPUs officially support up to 1333 MHz RAM, but very often they can go higher. SNB-E supports 1600 MHz. Thanks for the suggestions to use the discrete GPU and mprime 27.3, I'll try both!
As for the HT, MPrime might prefer it disabled but won't this slow down other tasks and the OS? If I run 4 workers on 4 threads on the 4 cores, wouldn't it help the other tasks to have 4 more threads available? |
|
|
|
|
|
#5 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts |
When I had hyperthreading active, I'd do as I just posted in the other thread:
In Linux (Ubuntu at least), logical cores pair to physical cores as [0,4] [1,5] [2,6] [3,7]. So I run four workers with two threads each: worker 1 thread 1 runs on logical core 0, worker 1 thread 2 runs on logical core 4, worker 2 thread 1 runs on logical core 1, etc. I get about the same throughput as if HT were off. On the other hand, all logical cores are occupied, so it won't increase responsiveness. However, I've generally found everything to be responsive anyway. Various things you can try are listed in undoc.txt; in particular, I'd point you to the PauseWhileRunning, LowMemWhileRunning, and Nice settings (for the last one it'd be Nice=19). |
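The pairing of logical to physical cores described in this post can differ between machines, so it may be worth checking before assigning worker threads. The following is a minimal sketch, assuming a Linux system that exposes CPU topology under /sys/devices/system/cpu; the helper names are made up for illustration and are not part of mprime.

```python
import glob


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as '0,4' or '0-1' into a sorted list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        elif part:
            cpus.append(int(part))
    return sorted(cpus)


def sibling_groups(sibling_lists):
    """Deduplicate per-CPU sibling lists into one group per physical core."""
    return sorted({tuple(parse_cpu_list(s)) for s in sibling_lists})


def read_sibling_lists():
    """Read thread_siblings_list for every logical CPU from Linux sysfs."""
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    lists = []
    for path in glob.glob(pattern):
        with open(path) as f:
            lists.append(f.read())
    return lists


if __name__ == "__main__":
    # Each printed tuple lists the logical CPUs that share one physical core.
    for group in sibling_groups(read_sibling_lists()):
        print(group)
```

On a machine laid out as described in the post, each printed group would hold two logical CPUs, such as 0 and 4. Other systems may number siblings consecutively, which is why reading the topology is safer than assuming it.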
|
|
|
|
|
#6 |
|
Feb 2012
Athens, Greece
47 Posts |
Thanks for the useful answers. I wonder, what does the helper thread do in mprime's workers?
I know hyperthreading is about allowing two instructions to use the processor pipeline concurrently as long as they need different execution resources on the pipeline. But I thought all mprime's calculation iterations do the same thing, no? |
|
|
|
|
|
#7 |
|
Feb 2012
Athens, Greece
57₈ Posts |
By the way, with mprime 27.3, which uses AVX, on my 3.9 GHz i7 with 4 workers (1 thread each, HT still enabled, RAM still at 1333), the per-iteration time is 0.027-0.033 s :)
|
|
|
|
|
|
#8 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
1110000110101₂ Posts |
Quote:
|
|
|
|
|
|
|
#9 |
|
Jan 2008
France
2·5²·11 Posts |
In fact, hyperthreading will switch threads when an operation stalls the processor for a rather long time, for instance when there's a need to fetch data from main memory. That's the reason why highly tuned programs such as mprime don't benefit from it.
|
|
|
|
|
|
#10 |
|
Feb 2012
Athens, Greece
47₁₀ Posts |
Attention MSI Z68A-GD65 Gen3 motherboard and Corsair Vengeance DDR3-1600/C8 owners: BIOS 23.4 and lower won't let you run this RAM at 1600 MHz, you'll need BIOS 23.6!
I have now disabled hyperthreading and run the RAM at 1600 MHz (CPU still at 3.9 GHz). Running mprime (AVX) with 4 single-thread workers spread across the four cores, the per-iteration time is 0.021-0.024 s :D (down from 0.027-0.033 s when HT was enabled and RAM was at 1333 MHz). Looks like I'll leave HT disabled :)
The temperature without HT is 65 C now: cores 1 and 4 are at 62-63 C and cores 2 and 3 are at 65-66 C. I think this is a little lower than when running 4 LL threads with HT on, and a lot lower than when running 8 LL threads with HT (73 C in that case!).
Tip: to see temps on an i7 under GNU/Linux, use the i7z program.
Last fiddled with by emily on 2012-02-20 at 19:27 |
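To put the before and after figures in this post in perspective, here is a rough sketch of the speedup and of what it means for the duration of one LL test. It reads the per-iteration times as seconds, consistent with the figures in the first post, and uses the midpoint of each reported range. The exponent is hypothetical, since the thread does not say which exponents were being tested. An LL test of exponent p takes p - 2 squarings.

```python
def midpoint(lo, hi):
    """Middle of a reported range of per-iteration times."""
    return (lo + hi) / 2


# Per-iteration times from the post, read as seconds (an assumption).
before = midpoint(0.027, 0.033)  # HT enabled, RAM at 1333 MHz
after = midpoint(0.021, 0.024)   # HT disabled, RAM at 1600 MHz
speedup = before / after


def ll_days(exponent, sec_per_iter):
    """Days for one LL test: p - 2 squarings at a constant per-iteration time."""
    return (exponent - 2) * sec_per_iter / 86400


# Hypothetical exponent, chosen only for illustration.
EXPONENT = 50_000_000

print(f"speedup: {speedup:.2f}x")
print(f"before: {ll_days(EXPONENT, before):.1f} days per test")
print(f"after:  {ll_days(EXPONENT, after):.1f} days per test")
```

With the midpoints used here the speedup comes out at about one third. Since the post changed both the hyperthreading setting and the RAM speed at once, the numbers alone do not show how much each change contributed.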
|
|
|
|
|
#11 |
|
(loop (#_fork))
Feb 2006
Cambridge, England
3×2,141 Posts |
Quote:
|
|
|
|
|