Best settings and upgrade path for i7-2600
What are the optimal worker settings for LL testing on Sandy Bridge Core i7-2600? And what would be a sensible upgrade path for me? (SNB-E or Ivy Bridge?)
I overclock to 3.9 GHz (the max for a non-K CPU). I use mprime on Debian GNU/Linux, and my CPU has 4 hyperthreaded cores for 8 threads total, with 8 MB of L3 cache (95 W TDP). I use 16 GB of dual-channel RAM (at 1333 MHz), allowing up to 13 GB for P-1, and a 500 MB/s SSD for the swap partition. The PC is also used for other tasks and runs 24/7 (drawing 350 W of power).
My goal is to maximize my chances of finding a prime as soon as possible while minimizing power usage and keeping my CPU temperature low. At the same time, I want to be able to use the PC for other tasks without mprime affecting me much. The other tasks include CPU-intensive graphics/video work and web browsing, but not gaming.
With 8 workers (1 thread each) doing LL testing on 8 exponents, my per-iteration times are around 0.060 s and my CPU temperature is up to 71-73 C with CPU usage at 100% (it's 40 C at 0% usage). When I use the PC for other tasks, mprime slows down a bit (iteration time 0.075 s).
With 4 workers (2 threads each) doing LL on 4 exponents, the per-iteration time is 0.034 s and the CPU temperature is 68 C with CPU usage at 97-99%.
With 4 workers (1 thread each) doing LL on 4 exponents, the per-iteration time is 0.031 s and CPU usage is 55%. This frees up 4 threads on the processor for other tasks.
With 2 workers (2 threads each) doing LL on 2 exponents, the per-iteration time is 0.025 s.
With 2 workers (4 threads each) doing LL on 2 exponents, the per-iteration time is 0.018 s at 86% CPU usage.
In all cases each worker runs on a different core. What settings should I use?
What upgrade path would you recommend for me? There is the 4-core Sandy Bridge E (SNB-E), with more memory bandwidth (quad-channel vs dual-channel) and more overclocking headroom at 130 W TDP (300 Euro). There's also the 6-core/12-thread SNB-E (same TDP and memory), but it's too pricey (500 Euro). If I wait, I could get an Ivy Bridge (higher IPC at 77 W TDP), or wait even longer for Ivy Bridge E. (I don't use Intel HD graphics.)
How much would the quad-channel memory bandwidth of SNB-E help me for mprime? Is it worth getting the 4-core/quad-channel 3820, or should I go for the 6-core 3860? I'll overclock in any case. Any idea what the TDP of Ivy Bridge E is going to be and when it will be available in the EU? I believe AMD doesn't have anything to offer me as an upgrade for the i7, right? |
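To compare the configurations listed in the post above, it can help to convert each per-iteration time into total throughput (workers divided by seconds per iteration). The sketch below does only that arithmetic on the figures from the post. It assumes that all exponents are of similar size, so that iterations are comparable across workers; the function and variable names are made up for illustration.

```python
# Figures copied from the post above: (label, workers, seconds per iteration).
# Assumption: all exponents are of similar size, so iterations are comparable.
configs = [
    ("8 workers x 1 thread", 8, 0.060),
    ("4 workers x 2 threads", 4, 0.034),
    ("4 workers x 1 thread", 4, 0.031),
    ("2 workers x 2 threads", 2, 0.025),
    ("2 workers x 4 threads", 2, 0.018),
]


def throughput(workers, seconds_per_iteration):
    """Total LL iterations per second, summed over all workers."""
    return workers / seconds_per_iteration


# Rank the configurations from highest to lowest combined throughput.
ranked = sorted(configs, key=lambda c: throughput(c[1], c[2]), reverse=True)

for label, workers, seconds in ranked:
    print(f"{label:<22} {throughput(workers, seconds):6.1f} iterations/s")
```

On these numbers, 4 single-thread workers reach roughly 97% of the throughput of 8 single-thread workers while reporting 55% CPU usage, which is worth weighing against the temperature and responsiveness goals stated in the post.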
First step is to go here to get v27.3: [url]http://mersenneforum.org/showthread.php?t=16535[/url]
Next up, try disabling hyperthreading in the BIOS. You will very likely run cooler, maybe a tiny bit faster, and with a little less power draw. If you can, try running memory at 1600 MHz. (Can a non-K Sandy Bridge do that?) As for upgrading, you have a great machine right now. An SNB-E is an expensive upgrade, as you'll need a new motherboard too. Faster memory is your cheapest upgrade (if your machine can run memory faster). Ivy Bridge may not be a cost-effective upgrade, as mprime may be memory-bandwidth limited. The thread above is trying to figure that out. Forget about AMD. |
Running more than one worker per physical core is usually less efficient. The thread mentioned above is mostly about memory bandwidth limitations (besides the new mprime version), and has quite a few SNB-E benchmarks comparing different settings.
Also, if you have a discrete GPU, consider running mfakto/mfaktc/CUDALucas (see the GPU subforum [url=http://www.mersenneforum.org/forumdisplay.php?f=92]here[/url]). |
Non-SNB-E i7 CPUs officially support up to 1333 MHz RAM, but very often they can go higher. SNB-E supports 1600 MHz. Thanks for the suggestions to use the discrete GPU and mprime 27.3, I'll try both!
As for the HT, MPrime might prefer it disabled but won't this slow down other tasks and the OS? If I run 4 workers on 4 threads on the 4 cores, wouldn't it help the other tasks to have 4 more threads available? |
When I had hyperthreading active, I'd do as I just posted in the other thread:
In Linux (Ubuntu at least), logical CPUs pair to physical cores as [0,4] [1,5] [2,6] [3,7]. So I run four workers with two threads each: worker 1 thread 1 runs on logical CPU 0, worker 1 thread 2 runs on logical CPU 4, worker 2 thread 1 runs on logical CPU 1, etc. I get about the same throughput as with HT off. On the other hand, all logical cores are occupied, so it won't increase responsiveness. However, I've generally found everything to be responsive anyway. Various things you can try are listed in undoc.txt; in particular, I'd point you to the PauseWhileRunning, LowMemWhileRunning, and Nice settings (for the last one it'd be Nice=19). |
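The logical-to-physical pairing described above can differ between machines and kernels, so it may be worth checking before pinning worker threads. A minimal sketch, assuming a Linux system that exposes the usual sysfs topology files (`thread_siblings_list` under each `cpuN/topology` directory); the helper names are invented for this example and are not part of mprime.

```python
import glob


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as '0,4' or '0-1' into a sorted tuple of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return tuple(sorted(cpus))


def sibling_groups(
    pattern="/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list",
):
    """Return the distinct groups of logical CPUs that share a physical core."""
    groups = set()
    for path in glob.glob(pattern):
        with open(path) as handle:
            groups.add(parse_cpu_list(handle.read()))
    return sorted(groups)


if __name__ == "__main__":
    # Prints one tuple per physical core; prints nothing if sysfs is unavailable.
    for group in sibling_groups():
        print(group)
```

If the pairing from the post holds on a given machine, each printed group would contain two logical CPUs four apart, such as 0 and 4. With hyperthreading disabled in the BIOS, each group should contain a single CPU.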
Thanks for the useful answers. I wonder, what does the helper thread do in mprime's workers?
I know hyperthreading is about allowing two instructions to use the processor pipeline concurrently as long as they need different execution resources on the pipeline. But I thought all mprime's calculation iterations do the same thing, no? |
By the way, with mprime 27.3, which uses AVX on my 3.9 GHz i7, with 4 workers (1 thread each, HT still enabled, RAM still at 1333 MHz) the per-iteration time is 0.027-0.033 s :)
|
[QUOTE=emily;289954]Thanks for the useful answers. I wonder, what does the helper thread do in mprime's workers?
I know hyperthreading is about allowing two instructions to use the processor pipeline concurrently as long as they need different execution resources on the pipeline. But I thought all mprime's calculation iterations do the same thing, no?[/QUOTE] I can't really answer these questions directly, but as a general comment, I personally believe the extra thread is more for the benefit of the OS scheduler than for actually speeding up the test, at least when it comes to HT. If you have 2 threads for one worker on two cores, in other words actual multithreading, then someone else will have to answer your questions. |
[QUOTE=emily;289954]I know hyperthreading is about allowing two instructions to use the processor pipeline concurrently as long as they need different execution resources on the pipeline. But I thought all mprime's calculation iterations do the same thing, no?[/QUOTE]
In fact, hyperthreading will switch threads when an operation stalls the processor for a rather long time, for instance when there's a need to fetch data from main memory. That's the reason why highly tuned programs such as mprime don't benefit from it. |
Attention MSI Z68A-GD65 Gen3 motherboard and Corsair Vengeance DDR3-1600/C8 owners: BIOS 23.4 and lower won't let you run this RAM at 1600MHz, you'll need BIOS 23.6!
I have now disabled hyperthreading and run the RAM at 1600 MHz (CPU still at 3.9 GHz). Trying mprime (AVX) with 4 single-thread workers spread across the four cores, the per-iteration time is 0.021-0.024 s :D (down from 0.027-0.033 s when HT was enabled and RAM was at 1333 MHz). Looks like I'll leave HT disabled :) And the temperature without HT is 65 C now: cores 1 and 4 are at 62-63 C and cores 2 and 3 are at 65-66 C. I think this is a little lower than when running 4 LL threads with HT on, and a lot lower than when running 8 LL threads with HT (73 C in that case!). Tip: to see temps on an i7 under GNU/Linux, use the i7z program. |
[QUOTE=emily;290128]I have now disabled hyperthreading and run the RAM at 1600 MHz (CPU still at 3.9 GHz). Trying mprime (AVX) with 4 single-thread workers spread across the four cores, the per-iteration time is 0.021-0.024 s :D (down from 0.027-0.033 s when HT was enabled and RAM was at 1333 MHz).
Looks like I'll leave HT disabled :)[/QUOTE] Umm, you have changed two things at once, and I suspect it's the faster RAM that makes most of the difference. What changes when you turn HT back on? |
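The point about changing two things at once can be made concrete with a little arithmetic. The sketch below compares the observed speedup (using the midpoints of the quoted ranges, an arbitrary choice) with the RAM clock ratio, and lists the two combinations that were not measured. It assumes the same exponents and FFT sizes were used before and after, which the posts do not confirm.

```python
from itertools import product

# Per-iteration time ranges from the quoted post (same unit for both).
before = (0.027, 0.033)  # HT on, RAM at 1333 MHz
after = (0.021, 0.024)   # HT off, RAM at 1600 MHz


def midpoint(time_range):
    """Midpoint of a (low, high) range; an arbitrary summary of the range."""
    return sum(time_range) / 2


observed_speedup = midpoint(before) / midpoint(after)
ram_clock_ratio = 1600 / 1333

# The two runs so far cover only the diagonal of a 2x2 experiment.
measured = {("on", 1333), ("off", 1600)}
missing = [c for c in product(("on", "off"), (1333, 1600)) if c not in measured]

print(f"observed speedup: {observed_speedup:.2f}x")
print(f"RAM clock ratio:  {ram_clock_ratio:.2f}x")
print("not yet measured (HT, RAM MHz):", missing)
```

The midpoints give a speedup of about 1.33x against a RAM clock ratio of about 1.20x. Since the quoted ranges are wide, this alone cannot separate the two effects; measuring HT on at 1600 MHz and HT off at 1333 MHz would.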