mersenneforum.org  

Old 2012-02-18, 19:43   #1
emily
 
Feb 2012
Athens, Greece

47 Posts
Best settings and upgrade path for i7-2600

What are the optimal worker settings for LL testing on a Sandy Bridge Core i7-2600? And what would be a sensible upgrade path for me (SNB-E or Ivy Bridge)?

I overclock to 3.9 GHz (the max for a non-K CPU). I use mprime on Debian GNU/Linux, and my CPU has 4 hyperthreaded cores for 8 threads total, with 8 MB of L3 cache (95 W TDP). I use 16 GB of dual-channel RAM (at 1333 MHz), allowing up to 13 GB for P-1, and a 500 MB/s SSD for the swap partition. The PC is also used for other tasks and operates 24/7 (eating up 350 W of power).

My goal is to maximize my chances of finding a prime as soon as possible while minimizing power usage and keeping my CPU temperature low. At the same time, I want to be able to use the PC for other tasks without mprime affecting me much. The other tasks include CPU-intensive graphics/video tasks but not gaming, as well as web browsing.

With 8 workers (1 thread each) doing LL testing on 8 exponents, my per-iteration times are in the 0.060s and my CPU temperature reaches 71-73 C with CPU usage at 100% (it's 40 C when idle at 0% usage). When I use the PC for other tasks, mprime slows down a bit (iteration time 0.075).

With 4 workers (2 threads each) doing LL on 4 exponents the per-iteration time is 0.034 and CPU temp is 68 C with CPU usage at 97-99%.

With 4 workers (1 thread each) doing LL on 4 exponents the per-iteration time is 0.031 and CPU usage is 55%. This frees up 4 threads on the processor for other tasks.

With 2 workers (2 threads each) doing LL on 2 exponents the per-iteration time is 0.025.

With 2 workers (4 threads each) doing LL on 2 exponents the per-iteration time is 0.018 at 86% CPU usage.

In all cases each worker runs on a different core. What settings should I use?
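One way to compare these configurations is by aggregate throughput rather than per-iteration time: each worker tests its own exponent, so total iterations per second is roughly the number of workers divided by the per-iteration time. The sketch below uses the timings listed above, assuming they are all in seconds, that all exponents are of similar size, and taking "in the 0.060s" as 0.060 (so the 8-worker figure is an upper bound). The labels and the `throughput` helper are only illustrative.

```python
# Aggregate LL throughput = workers / seconds per iteration.
# Assumption: every timing is in seconds and each worker runs its own exponent.
configs = [
    ("8 workers x 1 thread", 8, 0.060),   # "in the 0.060s", taken as 0.060
    ("4 workers x 2 threads", 4, 0.034),
    ("4 workers x 1 thread", 4, 0.031),
    ("2 workers x 2 threads", 2, 0.025),
    ("2 workers x 4 threads", 2, 0.018),
]


def throughput(workers, sec_per_iter):
    """Total LL iterations completed per second across all workers."""
    return workers / sec_per_iter


# Print the configurations from highest to lowest aggregate throughput.
for label, workers, t in sorted(configs, key=lambda c: -throughput(c[1], c[2])):
    print(f"{label:24s} {throughput(workers, t):6.1f} iterations/s")
```

On these figures, 4 single-thread workers come out at roughly 129 iterations/s against roughly 133 for 8 workers, while leaving about half the CPU free for other tasks.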

What upgrade path would you recommend for me? There is the 4-core Sandy Bridge E (SNB-E) with increased memory bandwidth (quad-channel vs dual-channel) and more overclockability at 130W TDP (300 Euro). There's also the 6-core/12-thread SNB-E (same watts and memory) but it's too pricey (500 Euro). If I wait, I could get an Ivy Bridge (higher IPC at 77W TDP), or wait even longer for Ivy Bridge E. (I don't use Intel HD graphics.)

How much would the quad-channel memory bandwidth of SNB-E help me for mprime? Is it worth getting the 4-core/quad-channel 3820, or should I go for the 6-core 3860? I'll overclock in any case. Any idea what the TDP of Ivy Bridge E is going to be and when it will be available in the EU?

I believe AMD doesn't have anything to offer to me as an upgrade for the i7, right?

Last fiddled with by emily on 2012-02-18 at 20:11
Old 2012-02-18, 20:19   #2
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7526₁₀ Posts

First step is to go here to get v27.3: http://mersenneforum.org/showthread.php?t=16535

Next up, try disabling hyperthreading in the BIOS. You will very likely run cooler, maybe a tiny bit faster, and with a little less power draw.

If you can, try running memory at 1600 MHz. (Can a non-K Sandy Bridge do that?)

As for upgrading, you have a great machine right now. A SNB-E is an expensive upgrade as you'll need a new motherboard too. Faster memory is your cheapest upgrade (if your machine can run memory faster). Ivy Bridge may not be a cost-effective upgrade as mprime may be memory bandwidth limited. The thread above is trying to figure that out. Forget about AMD.
Old 2012-02-18, 21:53   #3
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

1110000110101₂ Posts

Doing more than one worker per physical core is usually less efficient. The thread mentioned above is mostly about memory bandwidth limitations (besides the new MPrime version), and has quite a few SB-E benchmarks comparing different settings.

Also, if you have a discrete GPU, consider running mfakto/mfaktc/CUDALucas (see the GPU subforum here).

Last fiddled with by Dubslow on 2012-02-18 at 21:54
Old 2012-02-19, 00:06   #4
emily
 
Feb 2012
Athens, Greece

57₈ Posts

Non-SNB-E i7 CPUs officially support up to 1333 MHz RAM, but very often they can go higher. SNB-E supports 1600 MHz. Thanks for the suggestions to use the discrete GPU and mprime 27.3, I'll try both!

As for the HT, MPrime might prefer it disabled but won't this slow down other tasks and the OS? If I run 4 workers on 4 threads on the 4 cores, wouldn't it help the other tasks to have 4 more threads available?
Old 2012-02-19, 00:53   #5
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

1C35₁₆ Posts

When I had hyperthreading active, I'd do as I just posted in the other thread:

In Linux (Ubuntu at least) logical cores pair to physical cores as [0,4] [1,5] [2,6] [3,7]. So I run four workers with two threads each. Worker 1 thread 1 runs on logical core 0, worker 1 thread 2 runs on logical core 4, worker 2 thread 1 runs on logical core 1, etc. I get about the same throughput as if HT were off. On the other hand, all logical cores are occupied, so it won't increase responsiveness. However, I've generally found everything to be responsive anyway. Various things you can try are listed in undoc.txt: in particular, I'd point you to the PauseWhileRunning, LowMemWhileRunning, and Nice settings (for the last one it'd be Nice=19).
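The logical-to-physical pairing can differ between machines and kernels, so it may be worth checking before assigning worker threads. A minimal sketch, assuming a Linux system that exposes the usual sysfs topology files (`/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`); the helper names `parse_cpu_list` and `sibling_groups` are made up for this example:

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a Linux CPU list such as '0,4' or '0-3' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def sibling_groups(sysfs_root="/sys/devices/system/cpu"):
    """Return one tuple of logical CPU numbers per physical core."""
    groups = set()
    pattern = "cpu[0-9]*/topology/thread_siblings_list"
    for f in Path(sysfs_root).glob(pattern):
        groups.add(tuple(parse_cpu_list(f.read_text())))
    return sorted(groups)


if __name__ == "__main__":
    # With the pairing described above this would list (0, 4), (1, 5), ...
    for group in sibling_groups():
        print(group)
```

If the groups come out as (0, 1), (2, 3), and so on instead, the worker-to-core assignment would need to be adjusted accordingly.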
Old 2012-02-19, 11:08   #6
emily
 
Feb 2012
Athens, Greece

47₁₀ Posts

Thanks for the useful answers. I wonder, what does the helper thread do in mprime's workers?

I know hyperthreading is about allowing two instructions to use the processor pipeline concurrently as long as they need different execution resources on the pipeline. But I thought all mprime's calculation iterations do the same thing, no?
Old 2012-02-19, 12:39   #7
emily
 
Feb 2012
Athens, Greece

47 Posts

By the way, with mprime 27.3, which uses AVX on my 3.9GHz i7, with 4 workers (1 thread each, HT still enabled, RAM still at 1333) the per-iteration time is 0.027-0.033s :)
Old 2012-02-19, 18:50   #8
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Quote:
Originally Posted by emily
Thanks for the useful answers. I wonder, what does the helper thread do in mprime's workers?

I know hyperthreading is about allowing two instructions to use the processor pipeline concurrently as long as they need different execution resources on the pipeline. But I thought all mprime's calculation iterations do the same thing, no?
I can't really answer these questions directly, but I can make the general comment that I personally believe the extra thread is more for the benefit of the OS's scheduler than for actually speeding up the test, at least when it comes to HT. If you have 2 threads for one worker on two cores, in other words actual multithreading, then someone else will have to answer your questions.
Old 2012-02-19, 23:10   #9
ldesnogu
 
Jan 2008
France

2×5²×11 Posts

Quote:
Originally Posted by emily
I know hyperthreading is about allowing two instructions to use the processor pipeline concurrently as long as they need different execution resources on the pipeline. But I thought all mprime's calculation iterations do the same thing, no?
In fact, hyperthreading will switch threads when an operation stalls the processor for a rather long time, for instance when there's a need to fetch data from main memory. That's the reason why highly tuned programs such as mprime don't benefit from it.
Old 2012-02-20, 18:41   #10
emily
 
Feb 2012
Athens, Greece

47 Posts

Attention MSI Z68A-GD65 Gen3 motherboard and Corsair Vengeance DDR3-1600/C8 owners: BIOS 23.4 and lower won't let you run this RAM at 1600MHz, you'll need BIOS 23.6!

Now I have disabled hyperthreading and run the RAM at 1600MHz (CPU still 3.9GHz). Trying mprime (AVX) with 4 single-thread workers spread across the four cores, the per-iteration time is 0.021-0.024s :D (down from 0.027-0.033s when HT was enabled and RAM was at 1333MHz).

Looks like I'll leave HT disabled :)

And the temperature without HT is 65 C now... cores 1 and 4 are 62-63 C and cores 2 and 3 are 65-66 C. This is, I think, a little bit lower than when running 4 LL threads with HT on, and a lot lower than when running 8 LL threads with HT (73 C in that case!)

Tip: to see temps on i7 under GNU/Linux use the i7z program.

Last fiddled with by emily on 2012-02-20 at 19:27
Old 2012-02-21, 00:11   #11
fivemack
(loop (#_fork))
 
Feb 2006
Cambridge, England

6419₁₀ Posts

Quote:
Originally Posted by emily
Now I have disabled hyperthreading and run the RAM at 1600MHz (CPU still 3.9GHz). Trying mprime (AVX) with 4 single-thread workers spread across the four cores, the per-iteration time is 0.021-0.024s :D (down from 0.027-0.033s when HT was enabled and RAM was at 1333MHz).

Looks like I'll leave HT disabled :)
Umm, you have changed two things at once, and I suspect it's the faster RAM that makes most of the difference - what changes when you turn HT back on?
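To separate the two effects, each factor would need to be varied on its own. A rough sketch of that bookkeeping, listing all four HT/RAM combinations and flagging the ones not yet measured; the factor names and helper functions are made up for illustration, and the only numbers used are the midpoints of the two ranges reported above:

```python
from itertools import product

# The two settings that were changed at the same time.
factors = {"hyperthreading": ["on", "off"], "ram_mhz": [1333, 1600]}


def all_runs(factors):
    """Every combination of factor levels (a 2x2 grid here)."""
    names = list(factors)
    levels = (factors[n] for n in names)
    return [dict(zip(names, combo)) for combo in product(*levels)]


def percent_change(before, after):
    """Relative change in per-iteration time; negative means faster."""
    return (after - before) / before * 100.0


# Only these two corners of the grid have timings in the thread so far.
measured = [
    {"hyperthreading": "on", "ram_mhz": 1333},
    {"hyperthreading": "off", "ram_mhz": 1600},
]
missing = [run for run in all_runs(factors) if run not in measured]

before = (0.027 + 0.033) / 2  # midpoint, HT on, RAM at 1333
after = (0.021 + 0.024) / 2   # midpoint, HT off, RAM at 1600
print(f"combined change: {percent_change(before, after):.1f}%")
for run in missing:
    print("still to measure:", run)
```

Timing either of the two missing combinations would show how much of the combined improvement comes from the RAM speed and how much from disabling HT.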
All times are UTC. The time now is 15:45.



Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.