mersenneforum.org  

Old 2015-12-30, 15:46   #12
henryzz
Just call me Henry
 
 
"David"
Sep 2007
Liverpool (GMT/BST)

3×23×89 Posts

Xeon would have advantages. It would allow you to get a head start on developing for AVX512.
You could potentially use one motherboard for the same number of cores: one power supply, one OS, a simpler case, etc.
You have looked at an 8x2 arrangement. What about 4x4?
Are 4 socket systems really expensive?
Old 2015-12-30, 16:35   #13
fivemack
(loop (#_fork))
 
 
Feb 2006
Cambridge, England

2×7×461 Posts

Four-socket Xeon systems are amazingly expensive, and the processors available for them are quite slow (e.g. 12 2.1GHz HSW cores for $3800).

The Skylake Xeons that you'd need for AVX512 development are unlikely to be available before Spring 2017.

I have an account somewhere containing some money intended to turn into a dual 12-core SKL Xeon machine in mid-2017, but will have to save fairly intently to get there by then.
Old 2015-12-30, 17:46   #14
VBCurtis
 
 
"Curtis"
Feb 2005
Riverside, CA

13336₈ Posts

Quote:
Originally Posted by bgbeuning
Add the cost of power from your electric company.
Someone else suggested this, and it changed my plans.
In George's OP, he lists a cost for parts, followed by a cost for power. In his location, a watt-year happens to be close to $1.
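
For anyone wanting to check the watt-year figure against their own bill, here is a minimal sketch. The $0.114/kWh rate is a hypothetical value picked for illustration, not George's actual rate, and the function names are made up for this example.

```python
# Cost of running a constant load for a year, given an electricity price.
# The example rate below is a hypothetical value chosen for illustration.

HOURS_PER_YEAR = 24 * 365.25            # about 8766 hours
KWH_PER_WATT_YEAR = HOURS_PER_YEAR / 1000.0


def watt_year_cost(price_per_kwh):
    """Dollars needed to draw 1 W continuously for one year."""
    return KWH_PER_WATT_YEAR * price_per_kwh


def break_even_price(target_cost=1.0):
    """Price per kWh at which one watt-year costs `target_cost` dollars."""
    return target_cost / KWH_PER_WATT_YEAR


example_rate = 0.114                     # $/kWh, assumed for illustration
print(f"1 W for a year at ${example_rate}/kWh: ${watt_year_cost(example_rate):.2f}")
print(f"Rate at which a watt-year costs $1: ${break_even_price():.4f}/kWh")
```

In other words, a watt-year costs about a dollar when power runs a little over 11 cents per kWh. With a different rate, scale the power term of any ownership estimate accordingly.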
Old 2015-12-30, 19:53   #15
ATH
Einyen
 
 
Dec 2003
Denmark

2²×863 Posts

Did you consider Haswell-E like the 5820K, or the equivalent Xeon LGA 2011-v3 processors? The quad-channel DDR4 is really good for LL testing, as my dual- vs. quad-channel tests showed.

On my 5960X it is most efficient to use all 8 cores on a single exponent rather than running several at once. 62M exponent at 3360K FFT took 38.5 hours.
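
To compare that figure with benchmark output quoted in ms per iteration, the wall time can be converted as in the sketch below. It assumes "62M" means an exponent of roughly 62,000,000 (the exact exponent is not given) and uses the fact that a Lucas-Lehmer test of M_p takes p - 2 squarings.

```python
# Convert a full Lucas-Lehmer test time into time per iteration.
# The exponent is an approximation: the post only says "62M".


def ms_per_iteration(exponent, hours):
    """Average milliseconds per LL iteration for a test of 2**exponent - 1."""
    iterations = exponent - 2            # LL test performs p - 2 squarings
    return hours * 3600.0 * 1000.0 / iterations


def iterations_per_second(exponent, hours):
    return 1000.0 / ms_per_iteration(exponent, hours)


approx_exponent = 62_000_000             # assumed value for "62M"
print(f"{ms_per_iteration(approx_exponent, 38.5):.3f} ms/iter")
print(f"{iterations_per_second(approx_exponent, 38.5):.1f} iter/sec")
```

That works out to roughly 2.2 ms per iteration at the 3360K FFT size with all 8 cores on one exponent.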

Last fiddled with by ATH on 2015-12-30 at 19:54
Old 2015-12-30, 21:15   #16
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

17·487 Posts
Pick the sweet spot (CPU speed vs. RAM speed)

For your consideration, here is some raw data: 11 different CPU speeds vs. 2 different RAM speeds.
Attached Files
File Type: txt results.txt (24.3 KB, 363 views)
Old 2015-12-30, 22:18   #17
Madpoo
Serpentine Vermin Jar
 
 
Jul 2014

110101001110₂ Posts

Quote:
Originally Posted by Prime95
For your consideration, here is some raw data 11 different cpu speeds vs 2 different ram speeds
Those are with a 4M FFT size... have you tried it with something like 2M FFT?

That equates to exponents in the 36M'ish range which is in the middle of that "sweet spot" that I've found for running multiple workers with minimal memory thrashing.

I'm not sure what exponent range the 4M FFT size is used for... I guess 70-72M? That's well beyond where I start to see serious degradation when multiple workers are using that size (for me it was generally anything above 58M, whatever FFT size that would be).

In other words, what works well for optimal throughput at one FFT size won't be as good for other FFT sizes. You may get better throughput using multiple cores on one worker, so that you're stressing the CPU more and the memory won't start bottlenecking.
Old 2015-12-30, 22:54   #18
chappy
 
 
"Jeff"
Feb 2012
St. Louis, Missouri, USA

13×89 Posts

Have you thought about harnessing the power of the Dark Side?


http://robot6.comicbookresources.com...s-a-gaming-pc/
Old 2015-12-31, 03:37   #19
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

2057₁₆ Posts

Quote:
Originally Posted by Madpoo
Those are with a 4M FFT size... have you tried it with something like 2M FFT?
2M FFTs should behave almost identically, but I did not actually measure it. The 2M FFT does slightly less FPU activity for each floating point value, but the difference is pretty small.

4M FFTs are used for 73M to 78M. This is near where my itx box would be doing a lot of first time LL testing.
Old 2015-12-31, 22:18   #20
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

8279₁₀ Posts

Quote:
Originally Posted by Prime95
For your consideration, here is some raw data 11 different cpu speeds vs 2 different ram speeds
I've studied the data and it seems there is no "sweet spot" for CPU speed vs. RAM speed. It is simply the case that applying more and more CPU speed leads to ever-diminishing returns.

The good news is that I believe I can extrapolate from the data the most cost effective build at today's hardware prices and my estimated electric rates. Details to follow.
Old 2015-12-31, 23:35   #21
bgbeuning
 
Dec 2014

255₁₀ Posts

Quote:
Originally Posted by Prime95
For your consideration, here is some raw data 11 different cpu speeds vs 2 different ram speeds
Results for HP 8300 (Small Form Factor) with DDR3-1600, cost $200 on ebay, power usage = 80W (Measured by Kill-A-Watt)

[Thu Dec 10 17:40:00 2015]
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz
CPU speed: 2464.28 MHz, 4 cores
CPU features: Prefetch, SSE, SSE2, SSE4, AVX
L1 cache size: 32 KB
L2 cache size: 256 KB, L3 cache size: 6 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Prime95 64-bit version 28.7, RdtscTiming=

Timings for 4096K FFT length (1 cpu, 1 worker): 24.18 ms. Throughput: 41.36 iter/sec.
Timings for 4096K FFT length (2 cpus, 2 workers): 24.70, 24.51 ms. Throughput: 81.28 iter/sec.
Timings for 4096K FFT length (3 cpus, 3 workers): 26.83, 26.82, 26.83 ms. Throughput: 111.84 iter/sec.
Timings for 4096K FFT length (4 cpus, 4 workers): 30.77, 30.77, 30.88, 30.87 ms. Throughput: 129.77 iter/sec.
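
The throughput column is just the sum of each worker's iterations per second (1000 / ms per iteration). Here is a quick sketch using the timings above. The scaling ratio against a single worker is an added illustration, not something Prime95 prints, and small differences from the printed totals come from the timings being rounded to two decimals.

```python
# Recompute benchmark throughput from the per-worker timings quoted above.

RUNS = {
    1: [24.18],
    2: [24.70, 24.51],
    3: [26.83, 26.82, 26.83],
    4: [30.77, 30.77, 30.88, 30.87],
}


def throughput(timings_ms):
    """Total iterations per second for workers running at the same time."""
    return sum(1000.0 / t for t in timings_ms)


def scaling(workers):
    """Fraction of ideal scaling, relative to the single-worker run."""
    return throughput(RUNS[workers]) / (workers * throughput(RUNS[1]))


for workers, timings in RUNS.items():
    print(f"{workers} worker(s): {throughput(timings):7.2f} iter/sec, "
          f"{scaling(workers):.0%} of ideal")
```

With four workers the box delivers about 78% of four times the single-worker rate, which fits the memory bottleneck discussed earlier in the thread.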
Old 2016-01-01, 02:15   #22
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

20127₈ Posts

This is how I went about deciding my optimal dream build. Let's start with a baseline 5-CPU system using overclocked memory:

5 ASRock Z170M-ITX/ac motherboards @ $130 = $650
5 2x4GB DDR4-3200 @ $60 = $300
5 I5-6600 CPUs (3.3GHz, 65W) @ $230 = $1150
1 Samsung 850 EVO SSD @ $90 = $90
4 PicoPSU picoPSU-120 @ $40 = $160
1 Case, power supply, network switch -- approximate value $100
Each of the 5 units will consume 65W CPU, 4W memory, and 15W(?) mobo, or about 425W total. Add in 15% power supply inefficiency (425W / 0.85) for a total of 500W at the wall.
Total cost of 3 year ownership = 2450 parts + 3 * 500 = 3950
Total cost of 4 year ownership = 2450 parts + 4 * 500 = 4450
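
The ownership numbers can be reproduced with a short sketch. Two assumptions are baked in and labeled in the code: power costs about $1 per watt-year (the figure mentioned earlier in the thread for George's location), and "15% inefficiency" is read as an 85% efficient power supply, which is what turns 425W into 500W.

```python
# Parts and power cost for the baseline build, using the figures in the post.

# (description, quantity, unit price in dollars)
PARTS = [
    ("ASRock Z170M-ITX/ac motherboard", 5, 130),
    ("2x4GB DDR4-3200", 5, 60),
    ("I5-6600 CPU", 5, 230),
    ("Samsung 850 EVO SSD", 1, 90),
    ("picoPSU-120", 4, 40),
    ("Case, power supply, network switch", 1, 100),
]

DOLLARS_PER_WATT_YEAR = 1.0   # assumption: approximate local rate from the thread
PSU_EFFICIENCY = 0.85         # assumption: "15% inefficiency" = 85% efficient

parts_cost = sum(qty * price for _, qty, price in PARTS)

summed_dc_watts = 5 * (65 + 4 + 15)   # 420 W by straight addition
dc_watts = 425                         # the post rounds this to "about 425W"
wall_watts = dc_watts / PSU_EFFICIENCY


def ownership_cost(years):
    """Parts plus electricity over the given number of years, in dollars."""
    return parts_cost + years * wall_watts * DOLLARS_PER_WATT_YEAR


print(f"parts ${parts_cost}, wall power {wall_watts:.0f} W")
for years in (3, 4):
    print(f"{years} year ownership: ${ownership_cost(years):.0f}")
```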

Now let's guess the throughput of this system using the Haswell data posted earlier. A 2.2GHz Haswell with DDR3-2133 gets 131.8 throughput. In this system, each CPU will run 50% faster (3.3GHz vs. 2.2GHz) with 50% faster memory (DDR4-3200 vs. DDR3-2133). Thus 131.8 + 50% = 197.7. It should actually be better than that, since a Skylake CPU is slightly more efficient than a Haswell CPU, but we'll leave the expected throughput number at 197.7.

Now let's define a metric to optimize -- expected throughput per dollar (TPD).
3 year TPD = 5 CPUs * 197.7 * 3 years / 3950 = 0.7508
4 year TPD = 5 CPUs * 197.7 * 4 years / 4450 = 0.8885

Let's compare that to a second system built with cheaper motherboards that do not allow overclocking. We will save $60 for each motherboard and $20 for each RAM pair, for a total of $400. Expected throughput for each CPU is 165.4 (that is what a 3.4GHz Haswell gets using DDR3-2133). Now let's look at our TPD metric:
3 year TPD = 5 CPUs * 165.4 * 3 years / 3550 = 0.6989
4 year TPD = 5 CPUs * 165.4 * 4 years / 4050 = 0.8168
Not nearly as good as the previous system.

Now we'll try a cheaper 3.2GHz CPU in the base system. This saves $25 per CPU. Expected throughput won't go down much, probably to 193 or 194.
3 year TPD = 5 CPUs * 193.5 * 3 years / 3825 = 0.7588
4 year TPD = 5 CPUs * 193.5 * 4 years / 4325 = 0.8948
That's better than the first system
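
The three comparisons so far can be checked with a small sketch of the TPD metric. All throughput and price figures are copied from the post; the build labels and function names are just for this example, and the 193.5 throughput for the 3.2GHz CPU is the post's own estimate.

```python
# Throughput per dollar (TPD) for the three builds compared so far.
# Power is taken as 500 W at the wall at $1 per watt-year, as in the post.

WALL_WATTS = 500
DOLLARS_PER_WATT_YEAR = 1.0
CPUS = 5
BASE_PARTS = 2450


def total_cost(parts_cost, years):
    return parts_cost + years * WALL_WATTS * DOLLARS_PER_WATT_YEAR


def tpd(throughput_per_cpu, parts_cost, years):
    """Expected throughput-years per dollar of ownership cost."""
    return CPUS * throughput_per_cpu * years / total_cost(parts_cost, years)


# label: (throughput per CPU, parts cost in dollars)
BUILDS = {
    "baseline (I5-6600, DDR4-3200)": (197.7, BASE_PARTS),
    "cheaper non-overclocking boards": (165.4, BASE_PARTS - 400),
    "cheaper 3.2GHz CPU": (193.5, BASE_PARTS - 5 * 25),
}

for name, (throughput, parts) in BUILDS.items():
    print(f"{name:32s} 3yr {tpd(throughput, parts, 3):.4f}  "
          f"4yr {tpd(throughput, parts, 4):.4f}")
```

Because the power bill is the same in all three builds, the ranking is driven by how much throughput each dollar of parts buys.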

How about overclocking? Each K-series CPU will cost $50 more. I assume power draw is proportional to frequency and the square of the voltage. As an example, let's target a 200MHz frequency increase. I'll assume a frequency increase requires a tiny voltage bump. Thus, CPU power goes from 65W to 65 * 3.5/3.3 * (1.17/1.15)^2, or an increase of 7.5W when taking power supply inefficiency into account. This is less than the 91W TDP listed for K-series CPUs. A 5% increase in throughput is a fairly generous assumption -- 197.7 * 1.05 is 207.6.
3 year TPD = 5 CPUs * 207.6 * 3 years / (3950+5*50+5*7.5*3) = 0.7221
4 year TPD = 5 CPUs * 207.6 * 4 years / (4450+5*50+5*7.5*4) = 0.8561
Not worth the money.
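
The overclocking estimate can be sketched the same way. The power model is the one stated above (power proportional to frequency times voltage squared); reading the power supply loss as 85% efficiency is an assumption, chosen because it reproduces the 7.5W figure. Function names are made up for this example.

```python
# Extra wall power and TPD for the 200MHz overclock described above.

PSU_EFFICIENCY = 0.85   # assumption: 15% inefficiency read as 85% efficient


def scaled_cpu_watts(base_watts, f_old_ghz, f_new_ghz, v_old, v_new):
    """CPU power after a frequency and voltage change (P ~ f * V^2)."""
    return base_watts * (f_new_ghz / f_old_ghz) * (v_new / v_old) ** 2


oc_watts = scaled_cpu_watts(65, 3.3, 3.5, 1.15, 1.17)
extra_wall_watts = (oc_watts - 65) / PSU_EFFICIENCY


def overclock_tpd(years, extra_watts=7.5, extra_cpu_cost=50, cpus=5):
    """TPD for the overclocked build, using the post's rounded 7.5 W."""
    base_cost = 2450 + 500 * years                    # baseline parts + power
    extra_cost = cpus * extra_cpu_cost + cpus * extra_watts * years
    return cpus * 207.6 * years / (base_cost + extra_cost)


print(f"CPU power after overclock: {oc_watts:.1f} W")
print(f"extra power at the wall:   {extra_wall_watts:.2f} W per CPU")
for years in (3, 4):
    print(f"{years} year TPD: {overclock_tpd(years):.4f}")
```

Most of the penalty comes from the $50 per CPU rather than the extra electricity, so the conclusion holds even if the voltage bump is somewhat larger than assumed.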


Conclusion: It is best to create an overclocked memory system using the cheaper I5-6500 locked processor.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
