mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Hardware (https://www.mersenneforum.org/forumdisplay.php?f=9)
-   -   Which DDR4 RAM is best for LL on intel CPUs? (https://www.mersenneforum.org/showthread.php?t=23718)

simon389 2018-10-17 10:53

Which DDR4 RAM is best for LL on intel CPUs?
 
I’m seeing a lot of options and am getting confused about which to buy, from expensive 4700mhz DDR4 RAM with terrible CAS latency (28-45-45) to affordable DDR4 RAM at 3000mhz with great CAS latency (15-15-15). Is it all the same for LL crunching? Or is there a sweet spot for MHz/CAS?

tServo 2018-10-17 14:09

[QUOTE=simon389;498174]I’m seeing a lot of options and am getting confused about which to buy, from expensive 4700mhz DDR4 RAM with terrible CAS latency (28-45-45) to affordable DDR4 RAM at 3000mhz with great CAS latency (15-15-15). Is it all the same for LL crunching? Or is there a sweet spot for MHz/CAS?[/QUOTE]

I read a paper sometime last year written by a ram manufacturer ( Micron, I think ) that said faster speed ALWAYS is better than lower CAS.

For LL crunching, assuming you have a decent number of fast cores, since memory access is the bottleneck, the faster the better. I believe George wrote a post within the past 10 days that his 7820x ( 8 cores ) using fast memory ( somewhere in the 3200+ range ) completely saturates the memory. This was to point out that additional cores wouldn't help thruput much.

Every post about memory I have ever read in the last x years ( 3 < x < 10 ) has reiterated that Samsung has THE best chips so look for dimms made using their chips.

Actually using 4700mhz memory would require other system components, particularly the motherboard, that support it. Case ventilation to keep those dimms cool would also help.
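
To put the speed versus CAS trade-off in numbers, here is a minimal sketch that converts a CAS latency from clock cycles to nanoseconds. The function name is made up for illustration, and the only inputs are the two kits from the opening post. It says nothing about bandwidth, which is what matters most for LL.

```python
def first_word_latency_ns(data_rate_mts: float, cas_latency: int) -> float:
    """Convert CAS latency from clock cycles to nanoseconds.

    DDR memory transfers data twice per clock, so one clock period
    lasts 2000 / data_rate nanoseconds (data rate given in MT/s).
    """
    return cas_latency * 2000.0 / data_rate_mts


# The two kits mentioned in the opening post (first timing value only).
for name, rate, cl in [("DDR4-3000 CL15", 3000, 15), ("DDR4-4700 CL28", 4700, 28)]:
    print(f"{name}: {first_word_latency_ns(rate, cl):.2f} ns")
```

On these figures the "terrible" CL28 kit is only about 2 ns slower to first word than the CL15 kit, while moving far more data per second.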

Batalov 2018-10-17 14:36

[QUOTE=tServo;498182]... look for dimms made using [B]their [/B]chips.[/QUOTE]
The companies that make chips are largely unknown to the public and are frequently exactly the same for quite a few large brand names.

<<Put your pet company name here>> just slaps a sticker on them.


The real question that good reviews are addressing is how a particular memory model is organized. Most of the time, though, this is rather useless information, because many sellers post a generic picture and description and don't pass the detailed information on to buyers. They know that 90%+ of buyers will buy anyway, based on whim and rumors.

ATH 2018-10-17 14:37

Remember to run dual-channel or quad-channel (or 6-channel for high-end Xeons) RAM, whichever is the most your motherboard and CPU support. That is the biggest performance gain (not to be confused with [B]dual-rank[/B] RAM).

mackerel 2018-10-17 15:59

In order of priority (highest first), if money is no object:
1, fill the ram channels the system allows
2, get the fastest ram that is compatible
3a, if possible, get dual-rank ram.
3b, if you have single rank ram, aim for 2 modules per channel
4, timings - it makes a small difference compared to the above
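
A rough sketch of why items 1 and 2 come first: theoretical peak bandwidth scales with both channel count and data rate. The function name is invented for illustration, and the figures are theoretical maxima, not what Prime95 will actually achieve. Rank and timings are not modelled at all.

```python
def peak_bandwidth_gbs(data_rate_mts: float, channels: int) -> float:
    """Theoretical peak bandwidth in GB/s.

    Each DDR4 channel is 64 bits (8 bytes) wide, so the peak is
    data rate (MT/s) x 8 bytes x number of channels.
    """
    return data_rate_mts * 8 * channels / 1000.0


for rate, channels in [(3200, 1), (3200, 2), (4700, 2), (3200, 4)]:
    print(f"DDR4-{rate} x {channels} channel(s): "
          f"{peak_bandwidth_gbs(rate, channels):.1f} GB/s")
```

Note that four channels of 3200 beat two channels of 4700 on paper (102.4 vs 75.2 GB/s).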

The problem with 2 above, apart from cost, is knowing if your system will run stably with high speed ram. Compatibility is a minefield.

At a practical level, I'd just get something nice around 3200 and you get whatever performance you get.

My 6700k system has G.Skill Ripjaws V 3200C16 2x8GB, which are dual rank modules. That gave ~25% more throughput than my other system which only had 3000 single rank modules. I spent weeks questioning my sanity when I saw that difference, before I figured out it was due to rank. Note the difference won't always be of that magnitude, but it definitely helps a lot where ram bandwidth is the limiting factor.

I believe the 3200C14 or C15 in that same series use Samsung B-die in single rank configuration though (on 8GB modules). Ram manufacturers may also change the chips used without changing the marketing name, so that's an extra complication. 4GB DDR4 modules are almost certainly all single rank. 8GB modules may be single or dual rank, but the trend has been towards single rank for a long time. I don't have any 16GB modules but suspect there is a good chance of dual rank there.

Mark Rose 2018-10-17 17:20

All the 16 GB modules I have are dual rank. But that's an expensive way to buy ranks.

simon389 2018-10-17 18:10

[QUOTE]get the fastest ram that is compatible[/QUOTE]

So, basically, ALWAYS prioritize MHz over CAS. So 4700mhz CAS 45 is the best?

Prime95 2018-10-17 18:19

[QUOTE=simon389;498201]So, basically, ALWAYS prioritize MHz over CAS. So 4700mhz CAS 45 is the best?[/QUOTE]

Yes, but if your motherboard or CPU will not run stably with memory that fast then you are just wasting money.

Another thing to consider is that memory prices escalate quite quickly once you get past the "sweet spot". I haven't looked at today's prices, but you might find that you could build two complete systems with DDR4-3200 for the same price as one system with DDR4-4700.

simon389 2018-10-17 18:40

[QUOTE=Prime95;498202]Yes, but if your motherboard or CPU will not run stably with memory that fast then you are just wasting money.

Another thing to consider is that memory prices escalate quite quickly once you get past the "sweet spot". I haven't looked at today's prices, but you might find that you could build two complete systems with DDR4-3200 for the same price as one system with DDR4-4700.[/QUOTE]

If I get the Corsair 4700mhz 19-19-19 on NewEgg (and compatible MSI Z370i Gaming Pro Carbon AC mobo), will the i3 8300 CPU be enough to saturate it, or should I get a faster CPU?

paulunderwood 2018-10-17 18:55

[QUOTE=simon389;498204]If I get the Corsair 4700mhz 19-19-19 on NewEgg (and compatible MSI Z370i Gaming Pro Carbon AC mobo), will the i3 8300 CPU be enough to saturate it, or should I get a faster CPU?[/QUOTE]

Consider Xeons if you have the money. More cores, bigger cache, ECC ram and more memory channels, and maybe more sockets!

mackerel 2018-10-17 19:06

Is the system primarily for doing work like this, or will it also be used for general tasks?

For the 8300 (4 cores, 3.7 GHz) if you want 16GB of ram, I'd go 4x4GB of 3200 of a recently launched set from a well known brand. Maybe a higher speed grade if the price difference isn't too much. 3600 would be better but it'll be a question of diminishing returns much above that.

You can look at the mobo's QVL if you want to try and check ram compatibility, but their lists will always be limited.

I would add, I'd be extremely cautious about the support and stability of 4700 speed ram. For this particular CPU I'd keep below 4000. I previously worked out a rule of thumb that for a quad core Intel, aim for a ram speed comparable to the CPU clock to be largely not limited by ram (>90% of potential). So for a 3.7 GHz CPU, look around 3700 ram speed. I forgot if that took into consideration rank or not though, too long ago. Since most of my quads run 3200 or slower, I knew they were all losing some performance from the ram but even 3200 speed was expensive then.
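
As a sketch of that rule of thumb only: for a quad core Intel on dual channel, aim for a RAM data rate roughly equal to the CPU clock in MHz, then pick the closest speed grade you can buy. The function names and the list of grades are assumptions for illustration, and the rule is ballpark at best.

```python
# Illustrative list of DDR4 speed grades, not an exhaustive catalogue.
COMMON_GRADES = [2133, 2400, 2666, 3000, 3200, 3600, 3866, 4000]


def rule_of_thumb_ram_speed(cpu_ghz: float) -> int:
    """Quad core Intel, dual channel: target RAM speed ~ CPU clock in MHz."""
    return round(cpu_ghz * 1000)


def nearest_grade(target: int) -> int:
    """Pick the listed speed grade closest to the target."""
    return min(COMMON_GRADES, key=lambda grade: abs(grade - target))


target = rule_of_thumb_ram_speed(3.7)  # the i3 8300 discussed here
print(target, nearest_grade(target))
```

This does not account for rank, and it should not be extrapolated to CPUs with more cores.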

Note the overall throughput is dependent on the balance between the CPU and ram, and we're kinda in that transition state from one to the other, so don't over-think about getting the fastest ram.

simon389 2018-10-17 20:02

[QUOTE=mackerel;498206]Is the system primarily for doing work like this, or will it also be used for general tasks[/QUOTE]

It will be 1-4 machines 100% dedicated to LL

[QUOTE]For this particular CPU I'd keep below 4000. [/QUOTE]

What about a faster CPU? By your example would 4700mhz RAM be best for a 4.7Ghz CPU?

science_man_88 2018-10-17 20:28

[QUOTE=simon389;498207]It will be 1-4 machines 100% dedicated to LL



What about a faster CPU? By your example would 4700mhz RAM be best for a 4.7Ghz CPU?[/QUOTE]
computation section of:
[url]https://en.m.wikipedia.org/wiki/Memory_bandwidth[/url]

and performance section of:
[url]https://en.m.wikipedia.org/wiki/Central_processing_unit[/url]

plus the types of algorithms run by GIMPS, all need to be taken into account. If memory can't keep up with what the CPU can throw at it, memory speed affects performance. If you use a non-parallel algorithm, that also affects performance.

mackerel 2018-10-17 21:41

[QUOTE=simon389;498207]What about a faster CPU? By your example would 4700mhz RAM be best for a 4.7Ghz CPU?[/QUOTE]

I said for a quad core CPU (with dual channel ram). If you went 6 core CPU for example, you're going to want much faster ram if running dual channel.

Also, don't take it too exactly, it is ball park indicative of where things switch from being more CPU limited to more ram limited. It isn't linearly related to one or the other in that region, so even if you're not at the ideal point, you don't lose as much going either side. As you're building for this task, you'd probably be better off optimising for overall system performance vs cost. I can't recommend really high speed ram (4000+) due to the uncertainty of its compatibility.

I still think if you went with the quad core, look around 3200 or a bit faster if the price difference isn't too much. Presuming you won't really need more than 8GB of ram, 2x4GB in dual channel would keep costs down. It will probably be difficult to find higher speed modules in 4GB capacity anyway.

kladner 2018-10-18 00:12

I don't know what those high-end DRAM sticks cost. I have gotten the impression that you are willing to go more upscale on the memory and CPU. Consider boards and CPUs which support 4 RAM channels. RAM running at 3200mhz takes on new meaning with twice the channels.

There has to be a cross-over point for the performance/price of ultra-fast RAM, versus the performance/price of going to an x99 family chipset, with 'normal' high speed RAM (3200-3600mhz), and an appropriate CPU. I bet the average hex-core Intel chip with quad channel would not be as RAM-limited as a fast quad core with dual channel.

Prime95 2018-10-18 01:09

Seriously, you are better off with "commodity parts" rather than "enthusiast parts". You will get much more throughput buying six mundane systems rather than four high-end systems.

I highly recommend reading the thread on George's dream build. Memory prices have doubled since then but the principles still apply. Today that translates into 4-core i3 parts with memory in the 3200 to 3600 range.


If you insist on buying 4700 memory you should pair that with a six-core CPU.

irowiki 2018-10-18 16:57

[QUOTE=Prime95;498215]Seriously, you are better off with "commodity parts" rather than "enthusiast parts". You will get much more throughput buying six mundane systems rather than four high-end systems.
[/QUOTE]

This is interesting for me, as the wife is putting together a new gaming system, and wanted to go with a fairly recent AMD Ryzen, and I was trying to talk her into a high end i5/i7 since the Intels seem to run prime95 better.

So basically I should take the money saved from her going AMD and just buy an "older" i5/i7 system and it would probably combined do more than just one beefy computer.

retina 2018-10-18 17:16

[QUOTE=irowiki;498243]So basically I should take the money saved from her going AMD and just buy an "older" i5/i7 system and it would probably combined do more than just one beefy computer.[/QUOTE][url=https://en.wikipedia.org/wiki/Diminishing_returns]Yup[/url]

science_man_88 2018-10-18 17:54

[QUOTE=irowiki;498243]This is interesting for me, as the wife is putting together a new gaming system, and wanted to go with a fairly recent AMD Ryzen, and I was trying to talk her into a high end i5/i7 since the Intels seem to run prime95 better.

So basically I should take the money saved from her going AMD and just buy an "older" i5/i7 system and it would probably combined do more than just one beefy computer.[/QUOTE]

[url]https://en.m.wikipedia.org/wiki/List_of_interface_bit_rates[/url] may also be an interesting read for you.

M344587487 2018-10-19 11:32

[QUOTE=irowiki;498243]This is interesting for me, as the wife is putting together a new gaming system, and wanted to go with a fairly recent AMD Ryzen, and I was trying to talk her into a high end i5/i7 since the Intels seem to run prime95 better.

So basically I should take the money saved from her going AMD and just buy an "older" i5/i7 system and it would probably combined do more than just one beefy computer.[/QUOTE]


A new Ryzen or 2nd hand older i5/i7 are your best options. In all cases you'll be limited by dual channel RAM and the new intel parts are far too expensive to be a bang for buck option (even the previous gen new parts are expensive, more expensive than they already were due to recent price hikes).


As for your wife's gaming PC Ryzen is a good choice, most recommend 2600(X) for gaming but there's an argument for a 2700(X) if the price difference is not large. Higher models come with better stock coolers which makes a difference in how high and long the chip can boost. If you were planning to get a third party cooler for a 2600 I'd think about just getting a 2600X or 2700X instead (I think 2600 comes with low profile wraith spire, 2600X and 2700 come with normal wraith spire, 2700X comes with wraith prism which is comparable to a cheap third party solution).

irowiki 2018-10-20 00:38

[QUOTE=M344587487;498281]

As for your wife's gaming PC Ryzen is a good choice, most recommend 2600(X) for gaming but there's an argument for a 2700(X) if the price difference is not large. Higher models come with better stock coolers which makes a difference in how high and long the chip can boost. If you were planning to get a third party cooler for a 2600 I'd think about just getting a 2600X or 2700X instead (I think 2600 comes with low profile wraith spire, 2600X and 2700 come with normal wraith spire, 2700X comes with wraith prism which is comparable to a cheap third party solution).[/QUOTE]

I have a giant cooler ( [URL="https://smile.amazon.com/Cooler-Master-Hyper-212-RR-212E-20PK-R2/dp/B005O65JXI/ref=sr_1_1?s=pc&ie=UTF8&qid=1451098299&sr=1-1&keywords=Cooler+Master+212+EVO"]link[/URL] ) on my FX-8350, running flat out with prime95 it never gets above 100F, so I'll probably get her the same unless there's something better now. Unless we get the 2700x, then we'll try that one!

Thanks for the Ryzen feedback, it helps a lot!

Mark Rose 2018-10-22 17:02

[QUOTE=irowiki;498243]So basically I should take the money saved from her going AMD and just buy an "older" i5/i7 system and it would probably combined do more than just one beefy computer.[/QUOTE]

Haswell (4000 series) or better though. You'll get much more throughput/watt.

simon389 2018-10-22 18:41

Let me clarify why I'm going with 3 enthusiast machines and not 5-8 consumer machines. I don't want a ton of heat being produced in my basement. I just want to keep things small and orderly. So I'd rather have three really fast machines even if it costs more.

I ended up buying:

Case: DIYPC DIY-F2-O Black/Orange USB 3.0 Micro-ATX Mini Tower ([URL="https://www.newegg.com/Product/Product.aspx?Item=N82E16811353095"]link[/URL])

CPU: Intel Core i7-9700K Coffee Lake 8-Core 3.6 GHz ([URL="https://www.newegg.com/Product/Product.aspx?Item=N82E16819117958"]link[/URL])

Mobo: MSI MPG Z390I GAMING EDGE AC ([URL="https://www.newegg.com/Product/Product.aspx?Item=N82E16813144215"]link[/URL])

RAM: G.SKILL TridentZ RGB Series 32GB (2 x 16GB) DDR4 3866Mhz ([URL="https://www.newegg.com/Product/Product.aspx?Item=N82E16820232606"]link[/URL])

Note: 3866mhz RAM has been approved for this mobo

PSU: SeaSonic 400W Platinum ([URL="https://www.newegg.com/Product/Product.aspx?Item=N82E16817151097"]link[/URL])

Cooler: ArcticCooling Freezer i32 ([URL="https://www.newegg.com/Product/Product.aspx?Item=9SIA4RE4M24012"]link[/URL])

SSD (w/ Windows 7): Old 240GB I have laying around

Total cost: $1300


Problem: I realize now that I got a dual-channel RAM setup, when Skylake-X has quad-channel RAM. If this will result in slower LL crunching, should I return everything and get a quad-channel setup considering RAM is the biggest bottleneck?

VBCurtis 2018-10-22 19:14

Two channels of DDR4 are maxed out by 4 cores. So, if you have something that isn't memory-bandwidth-intensive for the other 4 cores to do, your setup is fine. If you plan on Prime95 or LLR, you'll have 4 idle cores and an utterly wasted expensive CPU.
Your memory speed might allow you to get 4.5 cores worth of production, but you really really need quad channel and 4 memory sticks for that CPU.
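
To illustrate the point, a quick sketch of theoretical peak bandwidth per core for a few configurations from this thread. The function name is invented, the numbers are paper maxima (8 bytes per transfer per channel), and nothing here measures what Prime95 actually draws per core.

```python
def bandwidth_per_core_gbs(data_rate_mts: float, channels: int, cores: int) -> float:
    """Theoretical peak bandwidth (GB/s) divided evenly across the cores."""
    peak = data_rate_mts * 8 * channels / 1000.0  # 8 bytes per channel per transfer
    return peak / cores


configs = [
    ("4 cores, dual channel DDR4-3200", 3200, 2, 4),
    ("8 cores, dual channel DDR4-3866", 3866, 2, 8),
    ("8 cores, quad channel DDR4-3200", 3200, 4, 8),
]
for label, rate, channels, cores in configs:
    print(f"{label}: {bandwidth_per_core_gbs(rate, channels, cores):.2f} GB/s per core")
```

The 8-core dual channel build ends up with well under two thirds of the per-core bandwidth of the plain quad core.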

irowiki 2018-10-22 19:25

[QUOTE=Mark Rose;498509]Haswell (4000 series) or better though. You'll get much more throughput/watt.[/QUOTE]

It figures you would say that as I was looking at a bunch of cheap ivy bridge i5-3470s!

My favorite in my whole lineup are my machines with the i5-4590, that thing moves!

mackerel 2018-10-22 19:52

[QUOTE=VBCurtis;498527]If you plan on Prime95 or LLR, you'll have 4 idle cores and an utterly wasted expensive CPU.[/QUOTE]

If LLR, the CPU could still be interesting. "Smaller" tasks could fit in the L3 cache and not be ram bandwidth limited. Above that, bandwidth is king; even a 6 core quad channel would likely outperform the 8 core dual channel setup.

Mark Rose 2018-10-22 20:13

[QUOTE=irowiki;498529]It figures you would say that as I was looking at a bunch of cheap ivy bridge i5-3470s![/quote]

That chip shouldn't be too bad, actually, if it's coupled with DDR3-1600. But keep in mind a new i3-8100 with dual rank DDR4-2400 will be something like twice as power efficient.

[quote]
My favorite in my whole lineup are my machines with the i5-4590, that thing moves![/QUOTE]

That one is certainly bottlenecked by its maximum clocked DDR3-1600 RAM. I'd look at underclocking/undervolting it.

irowiki 2018-10-23 18:19

[QUOTE=Mark Rose;498533]That chip shouldn't be too bad, actually, if it's coupled with DDR3-1600. But keep in mind a new i3-8100 with dual rank DDR4-2400 will be something like twice as power efficient.[/quote]

Ah, dude has a few dozen dell mini towers leftover from bitcoin mining and he wants $100 for each, was going to try and strike a deal for five or so if they were worth it.


[quote]
That one is certainly bottlenecked by its maximum clocked DDR3-1600 RAM. I'd look at underclocking/undervolting it.[/QUOTE]

Interesting. So even though it's maxing every core out, it's "wasting" energy so to speak?

It's interesting because I have a new Dell with an i5 6500 and the 4590 beats it handily.

Mark Rose 2018-10-23 20:03

[QUOTE=irowiki;498592]Ah, dude has a few dozen dell mini towers leftover from bitcoin mining and he wants $100 for each, was going to try and strike a deal for five or so if they were worth it.[/quote]

Not a bad price.

[quote]
Interesting. So even though it's maxing every core out, it's "wasting" energy so to speak?[/QUOTE]

Yeah. Basically, the CPUs are busy enough to stay fully clocked, but are still waiting on memory.

[quote]
It's interesting because I have a new Dell with an i5 6500 and the 4590 beats it handily.
[/quote]

Almost certainly comes down to either insufficient cooling on the 6500, or the memory configuration. I have a bunch of i5-6600 that I disable turbo on, then undervolt a lot. Why? They're limited to 4 ranks of DDR4-2133, so going any faster than 3.3 GHz is pointless.

I also have a 4770k system with 8 ranks of DDR3-2400 that's faster than the 6600's.

irowiki 2018-10-24 04:18

[QUOTE=Mark Rose;498604]Yeah. Basically, the CPUs are busy enough to stay fully clocked, but are still waiting on memory.[/quote]

Ah, I shall look into it, but not sure if you can undervolt a stock Dell.





[quote]
Almost certainly comes down to either insufficient cooling on the 6500, or the memory configuration. I have a bunch of i5-6600 that I disable turbo on, then undervolt a lot. Why? They're limited to 4 ranks of DDR4-2133, so going any faster than 3.3 GHz is pointless.

I also have a 4770k system with 8 ranks of DDR3-2400 that's faster than the 6600's.[/QUOTE]

Not a cooling issue I wouldn't think, it was doing the same when it first started and an hour later, also it's an Optiplex and has overkill cooling. I'm guessing it's just cheap motherboard/memory issues.

Brandon 2018-10-25 21:59

So this seems like a good place to ask a RAM related question at the moment.

I have a Dell Inspiron 15 7000 Gaming laptop, running an i7-7700HQ, with 8GB DDR4 2400, just one stick, so single channel.

I've noticed when running an LL exponent (1 exponent, 4 cores), in any range, the difference in ms/iter between 2.8 GHz (power saving mode) and 3.8 GHz (high performance mode) is practically none. 3.8 GHz is no faster, and of course just gets hotter (though I have done some undervolting to help thermals).

Is this simply a case of RAM being a bottleneck?
Probably a really obvious question, I just want to be sure. I see no other explanation...
(sorry if this is not the appropriate place to ask this, 'tis my first post!)

Prime95 2018-10-25 23:52

[QUOTE=Brandon;498752]Is this simply a case of RAM being a bottleneck? [/QUOTE]

Yes.

I think it is a case of gross negligence that Dell would sell a laptop with that memory configuration (esp. calling it a "Gaming" laptop). Single channel RAM will impact all applications you run.

Brandon 2018-10-26 03:27

Well darn. I'll have to order a matching stick at some point. Thanks for confirming that though.

mackerel 2018-10-26 08:30

Because laptops have limited upgrade potential, I think they want to leave the other slot free so the user can expand without having to remove the existing module. If custom configured, I think many companies allow the option of dual channel vs single.

In my personal laptop it came with 1x8GB module. I put in another one taken from a deceased laptop, gives a nice boost. Modules are mismatched but still give dual channel operation (2400 single rank + 2133 dual rank!).

simon389 2018-10-26 22:28

With quad channel RAM, how much faster percentage-wise is LL world record crunching on Skylake X vs something like Coffee Lake?

Mark Rose 2018-10-27 20:17

[QUOTE=simon389;498854]With quad channel RAM, how much faster percentage-wise is LL world record crunching on Skylake X vs something like Coffee Lake?[/QUOTE]

Depends on RAM speed, ranks, core speed, and core count.

But given that RAM is usually the bottleneck, double the RAM channels should give about double the throughput with double the cores.

The best value is still the low cost quad core systems with fast dual rank RAM: it's cheaper to build two quad core systems than one octocore system with quad channel RAM due to motherboard and CPU costs.

retina 2018-10-28 08:07

[QUOTE=Mark Rose;498930]it's cheaper to build two quad core systems than one octocore system with quad channel RAM due to motherboard and CPU costs.[/QUOTE]To [i]build[/i], okay. To [i]run[/i]? Don't forget about the electricity cost. If you run it 24/7 at 100% then those costs add up. It's double for two systems.

Prime95 2018-10-28 13:16

[QUOTE=retina;498954]To [i]build[/i], okay. To [i]run[/i]? Don't forget about the electricity cost. If you run it 24/7 at 100% then those costs add up. It's double for two systems.[/QUOTE]

Actually SkylakeX burns more than twice as much electricity as two of my 4-core systems.
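
For anyone weighing build cost against running cost, a minimal sketch of the yearly electricity bill for a machine running 24/7. The wattages and the price per kWh below are placeholders, not measurements from any system in this thread, so plug in your own wall-meter readings and tariff.

```python
HOURS_PER_YEAR = 24 * 365


def annual_cost(watts: float, price_per_kwh: float) -> float:
    """Yearly electricity cost of a constant load running 24/7."""
    return watts / 1000.0 * HOURS_PER_YEAR * price_per_kwh


# Hypothetical draws at the wall and a hypothetical tariff.
PRICE = 0.12  # currency units per kWh, placeholder
for label, watts in [("one small quad core box", 100), ("one big HEDT box", 250)]:
    print(f"{label}: {annual_cost(watts, PRICE):.2f} per year")
```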

simon389 2018-10-30 04:01

Strange benchmark scores
 
A few RAM benchmarks of interest:

[B]OLD i5 4690K @ stock 3.5Ghz with 16GB (4GBx4) 2400Mhz DDR3 RAM[/B]

Timings for 2048K FFT length (4 cores, 1 worker): 4.35 ms. Throughput: 230.05 iter/sec.
Timings for 2304K FFT length (4 cores, 1 worker): 4.78 ms. Throughput: 209.00 iter/sec.
Timings for 2400K FFT length (4 cores, 1 worker): 4.90 ms. Throughput: 203.89 iter/sec.
Timings for 2560K FFT length (4 cores, 1 worker): 5.21 ms. Throughput: 191.78 iter/sec.
Timings for 2688K FFT length (4 cores, 1 worker): 5.76 ms. Throughput: 173.55 iter/sec.
Timings for 2880K FFT length (4 cores, 1 worker): 5.93 ms. Throughput: 168.57 iter/sec.
Timings for 3072K FFT length (4 cores, 1 worker): 6.61 ms. Throughput: 151.36 iter/sec.
Timings for 3200K FFT length (4 cores, 1 worker): 6.75 ms. Throughput: 148.04 iter/sec.
Timings for 3360K FFT length (4 cores, 1 worker): 7.27 ms. Throughput: 137.64 iter/sec.
Timings for 3456K FFT length (4 cores, 1 worker): 7.47 ms. Throughput: 133.79 iter/sec.
Timings for 3584K FFT length (4 cores, 1 worker): 7.77 ms. Throughput: 128.73 iter/sec.
Timings for 3840K FFT length (4 cores, 1 worker): 8.45 ms. Throughput: 118.30 iter/sec.
Timings for 4096K FFT length (4 cores, 1 worker): 8.57 ms. Throughput: 116.64 iter/sec.
Timings for 4480K FFT length (4 cores, 1 worker): 9.36 ms. Throughput: 106.81 iter/sec.
Timings for 4608K FFT length (4 cores, 1 worker): 9.96 ms. Throughput: 100.41 iter/sec.
Timings for 4800K FFT length (4 cores, 1 worker): 10.19 ms. Throughput: 98.12 iter/sec.
Timings for 5120K FFT length (4 cores, 1 worker): 11.15 ms. Throughput: 89.69 iter/sec.
Timings for 5376K FFT length (4 cores, 1 worker): 11.97 ms. Throughput: 83.53 iter/sec.
Timings for 5760K FFT length (4 cores, 1 worker): 12.94 ms. Throughput: 77.29 iter/sec.
Timings for 6144K FFT length (4 cores, 1 worker): 13.08 ms. Throughput: 76.46 iter/sec.
Timings for 6400K FFT length (4 cores, 1 worker): 14.38 ms. Throughput: 69.54 iter/sec.
Timings for 6720K FFT length (4 cores, 1 worker): 14.89 ms. Throughput: 67.18 iter/sec.
Timings for 6912K FFT length (4 cores, 1 worker): 15.10 ms. Throughput: 66.21 iter/sec.
Timings for 7168K FFT length (4 cores, 1 worker): 15.91 ms. Throughput: 62.86 iter/sec.
Timings for 7680K FFT length (4 cores, 1 worker): 16.93 ms. Throughput: 59.07 iter/sec.
Timings for 8064K FFT length (4 cores, 1 worker): 18.22 ms. Throughput: 54.87 iter/sec.
Timings for 8192K FFT length (4 cores, 1 worker): 18.80 ms. Throughput: 53.20 iter/sec.

[B]NEW i5 9600K @ stock 3.7Ghz with 32GB (16GBx2) of DDR4 3600Mhz:[/B]

Timings for 2048K FFT length (4 cores, 1 worker): 2.07 ms. Throughput: 483.82 iter/sec.
Timings for 2304K FFT length (4 cores, 1 worker): 2.30 ms. Throughput: 434.59 iter/sec.
Timings for 2400K FFT length (4 cores, 1 worker): 2.43 ms. Throughput: 411.18 iter/sec.
Timings for 2560K FFT length (4 cores, 1 worker): 2.67 ms. Throughput: 374.39 iter/sec.
Timings for 2688K FFT length (4 cores, 1 worker): 2.79 ms. Throughput: 358.41 iter/sec.
Timings for 2880K FFT length (4 cores, 1 worker): 3.02 ms. Throughput: 330.90 iter/sec.
Timings for 3072K FFT length (4 cores, 1 worker): 3.26 ms. Throughput: 306.33 iter/sec.
Timings for 3200K FFT length (4 cores, 1 worker): 3.37 ms. Throughput: 296.88 iter/sec.
Timings for 3360K FFT length (4 cores, 1 worker): 3.68 ms. Throughput: 271.65 iter/sec.
Timings for 3456K FFT length (4 cores, 1 worker): 3.71 ms. Throughput: 269.45 iter/sec.
Timings for 3584K FFT length (4 cores, 1 worker): 3.84 ms. Throughput: 260.69 iter/sec.
Timings for 3840K FFT length (4 cores, 1 worker): 4.17 ms. Throughput: 239.54 iter/sec.
Timings for 4096K FFT length (4 cores, 1 worker): 4.45 ms. Throughput: 224.57 iter/sec.
Timings for 4480K FFT length (4 cores, 1 worker): 4.87 ms. Throughput: 205.14 iter/sec.
Timings for 4608K FFT length (4 cores, 1 worker): 4.96 ms. Throughput: 201.74 iter/sec.
Timings for 4800K FFT length (4 cores, 1 worker): 5.33 ms. Throughput: 187.67 iter/sec.
Timings for 5120K FFT length (4 cores, 1 worker): 5.64 ms. Throughput: 177.28 iter/sec.
Timings for 5376K FFT length (4 cores, 1 worker): 5.99 ms. Throughput: 167.06 iter/sec.
Timings for 5760K FFT length (4 cores, 1 worker): 6.45 ms. Throughput: 155.06 iter/sec.
Timings for 6144K FFT length (4 cores, 1 worker): 6.86 ms. Throughput: 145.75 iter/sec.
Timings for 6400K FFT length (4 cores, 1 worker): 7.20 ms. Throughput: 138.97 iter/sec.
Timings for 6720K FFT length (4 cores, 1 worker): 7.60 ms. Throughput: 131.62 iter/sec.
Timings for 6912K FFT length (4 cores, 1 worker): 7.97 ms. Throughput: 125.52 iter/sec.
Timings for 7168K FFT length (4 cores, 1 worker): 8.16 ms. Throughput: 122.56 iter/sec.
Timings for 7680K FFT length (4 cores, 1 worker): 8.68 ms. Throughput: 115.25 iter/sec.
Timings for 8064K FFT length (4 cores, 1 worker): 9.27 ms. Throughput: 107.92 iter/sec.
Timings for 8192K FFT length (4 cores, 1 worker): 9.51 ms. Throughput: 105.17 iter/sec.

[B]NEW i7 9700K @ stock 3.6Ghz with 16GB (2x8GB) DDR4 4500Mhz (downclocked to 4360 Mhz for stability)[/B]

Timings for 2048K FFT length (4 cores, 1 worker): 2.12 ms. Throughput: 472.02 iter/sec.
Timings for 2304K FFT length (4 cores, 1 worker): 2.45 ms. Throughput: 408.10 iter/sec.
Timings for 2400K FFT length (4 cores, 1 worker): 2.57 ms. Throughput: 388.40 iter/sec.
Timings for 2560K FFT length (4 cores, 1 worker): 3.00 ms. Throughput: 333.33 iter/sec.
Timings for 2688K FFT length (4 cores, 1 worker): 3.14 ms. Throughput: 318.78 iter/sec.
Timings for 2880K FFT length (4 cores, 1 worker): 3.47 ms. Throughput: 288.44 iter/sec.
Timings for 3072K FFT length (4 cores, 1 worker): 3.78 ms. Throughput: 264.21 iter/sec.
Timings for 3200K FFT length (4 cores, 1 worker): 4.03 ms. Throughput: 248.41 iter/sec.
Timings for 3360K FFT length (4 cores, 1 worker): 4.33 ms. Throughput: 230.71 iter/sec.
Timings for 3456K FFT length (4 cores, 1 worker): 4.44 ms. Throughput: 225.21 iter/sec.
Timings for 3584K FFT length (4 cores, 1 worker): 4.55 ms. Throughput: 219.63 iter/sec.
Timings for 3840K FFT length (4 cores, 1 worker): 5.22 ms. Throughput: 191.70 iter/sec.
Timings for 4096K FFT length (4 cores, 1 worker): 5.40 ms. Throughput: 185.13 iter/sec.
Timings for 4480K FFT length (4 cores, 1 worker): 6.13 ms. Throughput: 163.15 iter/sec.
Timings for 4608K FFT length (4 cores, 1 worker): 6.40 ms. Throughput: 156.16 iter/sec.
Timings for 4800K FFT length (4 cores, 1 worker): 6.88 ms. Throughput: 145.42 iter/sec.
Timings for 5120K FFT length (4 cores, 1 worker): 6.81 ms. Throughput: 146.79 iter/sec.
Timings for 5376K FFT length (4 cores, 1 worker): 7.70 ms. Throughput: 129.84 iter/sec.
Timings for 5760K FFT length (4 cores, 1 worker): 7.49 ms. Throughput: 133.53 iter/sec.
Timings for 6144K FFT length (4 cores, 1 worker): 7.92 ms. Throughput: 126.26 iter/sec.
Timings for 6400K FFT length (4 cores, 1 worker): 9.55 ms. Throughput: 104.75 iter/sec.
Timings for 6720K FFT length (4 cores, 1 worker): 9.02 ms. Throughput: 110.83 iter/sec.
Timings for 6912K FFT length (4 cores, 1 worker): 9.41 ms. Throughput: 106.26 iter/sec.
Timings for 7168K FFT length (4 cores, 1 worker): 9.49 ms. Throughput: 105.43 iter/sec.
Timings for 7680K FFT length (4 cores, 1 worker): 11.58 ms. Throughput: 86.34 iter/sec.
Timings for 8064K FFT length (4 cores, 1 worker): 11.43 ms. Throughput: 87.51 iter/sec.
Timings for 8192K FFT length (4 cores, 1 worker): 11.06 ms. Throughput: 90.43 iter/sec.

It is interesting to me that the 9600K actually performs 10-20% [U]faster[/U] than the 9700K, even though the RAM is actually 800Mhz slower. That 4500Mhz RAM was expensive, and I thought it would make everything faster. But it did not. Is the 9600K faster because it's using dual rank RAM (16GB sticks)? Or is it the 100Mhz slower processor? Or both?
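
A small sketch to quantify the gap, using three rows copied from the 9600K and 9700K tables above. The helper name is made up, and the choice of three FFT lengths is just a sample. The gap is not uniform across FFT lengths.

```python
# iter/sec copied from the posted tables: FFT length -> (9600K, 9700K)
posted = {
    "2048K": (483.82, 472.02),
    "4096K": (224.57, 185.13),
    "8192K": (105.17, 90.43),
}


def percent_faster(a: float, b: float) -> float:
    """How much faster a is than b, in percent."""
    return (a / b - 1.0) * 100.0


for fft, (i5_9600k, i7_9700k) in posted.items():
    print(f"{fft}: 9600K is {percent_faster(i5_9600k, i7_9700k):.1f}% faster")
```

For these three rows that works out to roughly 2.5%, 21.3% and 16.3%.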

Mark Rose 2018-10-30 15:14

[QUOTE=simon389;499079]It is interesting to me that the 9600K actually performs 10-20% [U]faster[/U] than the 9700K, even though the RAM is actually 800Mhz slower. That 4500Mhz RAM was expensive, and I thought it would make everything faster. But it did not. [b]Is the 9600K faster because it's using dual rank RAM (16GB sticks)?[/b] Or is it the 100Mhz slower processor? Or both?[/QUOTE]

Probably. Dual rank usually makes a 10% to 15% difference from what I've seen.

When Prime95/mprime hits RAM, it hits it hard, faster than a single rank can operate.

Useful numbers for future reference. Thanks!

ATH 2018-10-30 17:08

Try with CPU-Z to make sure both are running dual channel RAM:
[url]https://www.cpuid.com/softwares/cpu-z.html[/url]

On the "Memory" tab there is a "Channel #" that should say "Dual".

It seems weird to me that dual rank should be that much faster when the other RAM is higher clock rate, I read previously on the forum that dual rank only gave a few percent extra.

Did you try 6 cores 1 worker on both?

simon389 2018-10-30 22:28

CPU-Z says it is dual. And yes, I tested it with 6 cores.

Lorenzo 2018-11-02 08:55

To me your result is no surprise. Latency matters! I'm sure that the 4500 RAM has huge values for CL and so on ...

But try downclocking your 4500 kit to 3600 with the lowest CL you can run stably. It would be interesting to see the difference in that case.
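To put "latency" in absolute terms: CL is counted in memory clock cycles, so a kit with a higher data rate can carry a larger CL and still have a similar latency in nanoseconds. A minimal Python sketch, using the two kits named in the opening post (4700 at CL28 and 3000 at CL15); the function name is made up, and this covers CAS latency only, not the other timings or total access latency.

```python
def cas_latency_ns(data_rate_mts, cas_cycles):
    """CAS latency in nanoseconds.

    DDR memory transfers twice per clock, so the clock runs at
    data_rate / 2 MHz and one cycle lasts 2000 / data_rate ns.
    """
    return cas_cycles * 2000.0 / data_rate_mts

# Kits quoted in the opening post of this thread.
kits = [("DDR4-4700 CL28", 4700, 28), ("DDR4-3000 CL15", 3000, 15)]
for name, rate, cl in kits:
    print(f"{name}: {cas_latency_ns(rate, cl):.2f} ns")
```

By this measure the 4700 CL28 kit comes out a little under 12 ns and the 3000 CL15 kit at 10 ns, so the two are much closer than the raw CL numbers suggest.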

Trilo 2019-12-26 19:09

I'm planning a build to run LLR tests on numbers of 1.5M to 5M bits. With FFTs of this size (much smaller than world-record sizes), would memory bandwidth be as much of a concern? Could FFTs of this size fit in L3 cache?

I've decided on a hexacore i5-9600K with an MSI MPG Z390 motherboard, which supports quad channel and up to 4400 MHz RAM.

I have two 1060 6GB cards and a 980 Ti lying around and would like to connect them to the build as well, for crypto mining (I don't pay electricity costs), LLR testing, and neural network training.

What should I get? I was looking at 4x 8GB 3600 MHz RAM sticks for $180 total. Certainly pricey for RAM. Thoughts? Overkill for my processor? Or still too slow?

scan80269 2019-12-26 19:33

Z390 motherboards all support dual channel memory, not quad channel. Most ATX form factor motherboards feature 4 DIMM sockets, thus 2 DIMMs per memory channel.

Options for true quad channel memory:
- Intel X99 motherboard supporting Intel Haswell-E / Broadwell-E Core X series CPUs
- Intel X299 motherboard supporting Intel Skylake-X / Cascade Lake-X Core X series CPUs
- AMD X399 motherboard supporting AMD TR4 socket Ryzen Threadripper X series CPUs
- AMD TRX40 motherboard supporting AMD 3rd gen Ryzen Threadripper series CPUs

Other motherboards featuring Intel Z390, Z370, etc. and AMD X570, X470 etc. are all dual-channel designs.

My two fastest systems for LL/PRP are both X99 with i7-5960X Haswell-E 8-core CPUs and quad channel DDR4 dual-rank memory configurations, at 3000MHz CL15 and 2133MHz CL13 respectively (the second X99 board lacks XMP support so cannot run DDR4 memory above 2133). Several other systems based on Z390 / Coffee Lake 6/8-core CPUs and Z270 / Kaby Lake 4-core CPUs with dual-channel DDR4 memory are all significantly slower.

Trilo 2019-12-26 21:17

[QUOTE=scan80269;533604]Z390 motherboards all support dual channel memory, not quad channel. Most ATX form factor motherboards feature 4 DIMM sockets, thus 2 DIMMs per memory channel.

Options for true quad channel memory:
- Intel X99 motherboard supporting Intel Haswell-E / Broadwell-E Core X series CPUs
- Intel X299 motherboard supporting Intel Skylake-X / Cascade Lake-X Core X series CPUs
- AMD X399 motherboard supporting AMD TR4 socket Ryzen Threadripper X series CPUs
- AMD TRX40 motherboard supporting AMD 3rd gen Ryzen Threadripper series CPUs

Other motherboards featuring Intel Z390, Z370, etc. and AMD X570, X470 etc. are all dual-channel designs.

My two fastest systems for LL/PRP are both X99 with i7-5960X Haswell-E 8-core CPUs and quad channel DDR4 dual-rank memory configurations, at 3000MHz CL15 and 2133MHz CL13 respectively (the second X99 board lacks XMP support so cannot run DDR4 memory above 2133). Several other systems based on Z390 / Coffee Lake 6/8-core CPUs and Z270 / Kaby Lake 4-core CPUs with dual-channel DDR4 memory are all significantly slower.[/QUOTE]

Thank you for the clarification. I'm new to building PCs and assumed the number of DIMM sockets equals the number of channels.

My original basis for choosing the i5-9600K is its high GFLOPS/core score in the PrimeGrid and GIMPS benchmarks.

[url]https://www.primegrid.com/cpu_list.php[/url]

As for the 9600K build, suppose I only fill 2 DIMM slots with RAM, saving about $80. Would this cause a non-negligible drop in speed versus filling all 4 slots? Both would be dual channel regardless.

Would memory still be a hurdle for LLR tests of numbers between 1.5M and 5M bits? This is an order of magnitude smaller than the numbers used in LL tests. The 9600K has 9 MiB of L3 cache. Could the FFTs of smaller candidates be stored in there? If so, then perhaps the bottleneck for candidates of these sizes would once again be CPU speed. Or is it too small for FFTs of those sizes?
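A back-of-the-envelope way to frame the cache question, as a Python sketch. The only firm part is that an FFT of length N on double-precision values holds N x 8 bytes of data. The `bits_per_word` default of 18 is an assumption for illustration, not a figure from this thread; real programs pick from a fixed set of FFT lengths, the packing density varies with the form of the number, and extra tables and multiple workers add to the footprint. The function names are made up.

```python
MIB = 1024 * 1024

def fft_data_mib(fft_length_k):
    """Size of the FFT data array alone: fft_length_k * 1024 doubles at 8 bytes each."""
    return fft_length_k * 1024 * 8 / MIB

def estimated_fft_length_k(number_bits, bits_per_word=18.0):
    """Rough FFT length in K. bits_per_word is an ASSUMED packing density."""
    return number_bits / bits_per_word / 1024

L3_MIB = 9  # i5-9600K L3 size as stated in the post above

for bits in (1_500_000, 5_000_000):
    length_k = estimated_fft_length_k(bits)
    size = fft_data_mib(length_k)
    print(f"{bits} bits: ~{length_k:.0f}K FFT, ~{size:.2f} MiB of data, "
          f"smaller than L3: {size < L3_MIB}")

# For comparison, the 6144K FFT from the benchmark earlier in the thread.
print(f"6144K FFT: {fft_data_mib(6144):.0f} MiB of data")
```

Under that assumption a single test in the 1.5M to 5M bit range needs only a few MiB for its data, while the 6144K FFT needs 48 MiB, far more than any L3 here. Whether several simultaneous tests still fit is a separate question, since the L3 is shared.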

VBCurtis 2019-12-27 05:53

I run LLR on exponents from 3M to 7M on Haswell-era quad-core dual-channel machines, and I do run into memory saturation on the 4th test. That is, 3 copies of LLR running is nearly as fast as 4, when all run single-threaded. That's with an office-machine-grade Dell, with stock DDR4 at stock speed (2133?). If you run 3200 memory, you're getting 50% more bandwidth, and it's possible that running each test 2- or 3-threaded will not saturate the memory (that is, enough data will stay in cache).
Without relying on cache to save you, I'd think 3600 memory and that 6-core would match up well for tests around 3-5 Mbits each. As your tests grow in the future, you can go more multithreaded.
Hopefully, someone with more recent/similar hardware can amplify my hand-waving about bandwidth...
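The "50% more bandwidth" figure can be checked with the usual peak-bandwidth formula: data rate x 8 bytes per 64-bit channel x number of channels. A minimal Python sketch; these are theoretical peaks, real sustained bandwidth is lower, and the function name is made up.

```python
def peak_bandwidth_gbs(data_rate_mts, channels, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s (10^9 bytes/s).

    data_rate_mts: transfers per second in millions (e.g. 3200 for DDR4-3200)
    channels:      number of populated memory channels
    bus_bytes:     width of one channel in bytes (64-bit channel = 8)
    """
    return data_rate_mts * bus_bytes * channels / 1000.0

configs = [
    ("DDR4-2133 dual channel", 2133, 2),
    ("DDR4-3200 dual channel", 3200, 2),
    ("DDR4-3600 dual channel", 3600, 2),
    ("DDR4-3000 quad channel", 3000, 4),
]
for name, rate, ch in configs:
    print(f"{name}: {peak_bandwidth_gbs(rate, ch):.1f} GB/s")

ratio = peak_bandwidth_gbs(3200, 2) / peak_bandwidth_gbs(2133, 2)
print(f"3200 vs 2133: {100 * (ratio - 1):.0f}% more")
```

The quad-channel row shows why the X99 systems mentioned earlier in the thread pull ahead: doubling the channels raises the peak by more than any realistic clock increase on a dual-channel board.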

