#1 |
"David Kirkby"
Jan 2021
Althorne, Essex, UK
111000000₂ Posts |
Due to a fault on my motherboard, my computer (Dell 7920) would not power on if RAM modules were inserted in certain sockets. The CPUs have 6 memory channels, but the only way to use 6 DIMMs on the first CPU was to place them in sockets which resulted in only 4 memory channels being used. (There are 12 DIMM sockets per CPU, so I had the flexibility to use the wrong sockets.) The second CPU had no such issues, but the BIOS indicated only 4 of the 12 memory channels were being used, despite there being 12 DIMMs. (I would have expected 4+4=8 or 4+6=10 memory channels, but the BIOS said only 4.) I would expect this to be a non-optimal configuration. With this sub-optimal memory configuration, I ran the benchmarks on mprime and found the optimal throughput was with 2 workers. With 2 workers, each iteration of a PRP test of a 104 million exponent was taking about 1.5 ms.
The Dell motherboard was changed for an identical model and the fault went away. With the same 12 DIMMs as before, the number of memory channels in use has increased from 4 to 12. Much to my surprise, this increased the time per iteration of mprime 30.6b4 from around 1.5 to 2.0 ms for the same exponents! Essentially the throughput had decreased. Can anyone explain why this might happen?
I re-ran the benchmarks and found the configuration giving optimal throughput had changed from 2 workers to 4 workers, so I increased the number of workers from 2 to 4. Unsurprisingly, reducing the number of cores per worker from 26 to 13 increased the iteration time further. The iteration time is now 3 ms. Since 4 exponents are now being tested simultaneously at 3 ms/iteration, this is essentially the same total throughput as testing 2 exponents at 1.5 ms/iteration. So the increase in memory channels from 4 to 12 has not resulted in any change in total throughput, but has resulted in the optimal number of workers changing from 2 to 4.
The CPUs are only clocking the RAM at 2400 MHz, not the 2933 MHz that the motherboard is capable of, but that's due to the CPUs I have. I'm assuming this means that the performance of the computer is not set by memory bandwidth, but by CPU speed (only 2.0 GHz). But I'm still puzzled why changing the memory channels from 4 to 12 resulted in decreased throughput until I increased the number of workers from 2 to 4. Any thoughts?
Last fiddled with by drkirkby on 2021-07-09 at 09:36 |
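To make the throughput comparison in this post concrete, here is a minimal sketch of the arithmetic. It assumes total throughput is simply the number of workers divided by the time per iteration; the function name is illustrative only and is not part of mprime.

```python
def iterations_per_second(workers: int, ms_per_iteration: float) -> float:
    """Aggregate iterations per second summed over all workers."""
    return workers * 1000.0 / ms_per_iteration

# Figures quoted in the post above.
before = iterations_per_second(2, 1.5)         # faulty board, 2 workers
after_2_workers = iterations_per_second(2, 2.0)  # new board, 2 workers
after_4_workers = iterations_per_second(4, 3.0)  # new board, 4 workers

print(f"before:          {before:.0f} iterations/s")
print(f"after, 2 workers: {after_2_workers:.0f} iterations/s")
print(f"after, 4 workers: {after_4_workers:.0f} iterations/s")
```

On these numbers the 2-worker setup on the new board delivers about three quarters of the original throughput, and the 4-worker setup matches the original.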
#2 |
Sep 2002
Database er0rr
4142₁₀ Posts |
I doubt your Dell runs "12 channel" -- most likely it will be a quad channel box. Check your motherboard manual for optimal channel settings. (If you find a manual online please provide a link to it.)
Last fiddled with by paulunderwood on 2021-07-09 at 09:53 |
#3 |
Jun 2003
5·29·37 Posts |
How much L3 do your CPUs have? The more L3 you have, the less the impact of poor memory bandwidth.
Also, previously, was the memory running at 2400 only or was it faster? Finally, if truly 12 RAM channels are active, you might look at other configurations with more workers. |
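For a rough sense of scale on the bandwidth question, the theoretical peak of one DDR4 channel is its transfer rate times the 8-byte (64-bit) bus width. The sketch below assumes DDR4 and reads the 2400 and 2933 figures from this thread as MT/s; the function name is illustrative only, and sustained bandwidth in practice will be lower than these peaks.

```python
BYTES_PER_TRANSFER = 8  # a DDR4 channel has a 64-bit data bus

def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: float) -> float:
    """Theoretical peak bandwidth in GB/s (10^9 bytes/s) for the given channel count."""
    return channels * mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER / 1e9

# Channel counts and memory speeds mentioned in the thread.
for channels in (4, 6, 12):
    for rate in (2400, 2933):
        gbs = peak_bandwidth_gbs(channels, rate)
        print(f"{channels:2d} channels @ {rate} MT/s: {gbs:6.1f} GB/s")
```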
#4 | |
Einyen
Dec 2003
Denmark
3,313 Posts |
I have only heard about up to 8-channel memory so far on the newest servers.
Searching for Dell 7920: https://i.dell.com/sites/csdocuments...Spec-Sheet.pdf Quote:
Try CPU-Z: https://www.cpuid.com/
You do not have to install anything, there is a zip file with CPU-Z: https://download.cpuid.com/cpu-z/cpu-z_1.96-en.zip
Just run the "cpuz_x64.exe" and on the Memory tab it will show you how many memory channels you use, mine is running 4-channel memory (quad):
Last fiddled with by ATH on 2021-07-09 at 10:23 |
#5 | |
"David Kirkby"
Jan 2021
Althorne, Essex, UK
2⁶·7 Posts |
Quote:
https://www.cpu-world.com/CPUs/Xeon/...n%208167M.html
Here's the user manual. https://dl.dell.com/topicspdf/precis...nual_en-us.pdf
Page 98 says up to 6 memory channels per CPU. Someone kindly sent me a technical reference on this workstation, which Dell don't make public. I can't send it at the moment, but I will send a link later. But I think the photograph shows it is using 12 memory channels. Note there's a rackmount version of this, so if you Google it, ignore the rackmount version, although I think they are pretty similar. The rackmount version has dual PSUs and enhanced security features, whereas the tower does not.
Last fiddled with by drkirkby on 2021-07-09 at 10:36 |
#6 | |
"David Kirkby"
Jan 2021
Althorne, Essex, UK
2⁶·7 Posts |
Quote:
#7 | |
Jun 2003
14F5₁₆ Posts |
Quote:
#8 | |
"David Kirkby"
Jan 2021
Althorne, Essex, UK
2⁶×7 Posts |
Quote:
Edit - I will try those utilities later, but I need to reboot into Windows, and I have some real work to do just now.
Last fiddled with by drkirkby on 2021-07-09 at 11:06 |
#9 |
"David Kirkby"
Jan 2021
Althorne, Essex, UK
2⁶·7 Posts |
#10 |
Sep 2002
Database er0rr
2×19×109 Posts |
It is two hex-channel chips. It looks like you have installed the DIMMs correctly. You can run mprime/Prime95's benchmarking to automatically get the maximum throughput based on the number of workers and the current wavefront FFT sizes.
Last fiddled with by paulunderwood on 2021-07-09 at 13:01 |
#11 | |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2·7·463 Posts |
Quote:
Dmidecode appears to provide the necessary info, at least indirectly. "Bank locator: CHAN A DIMM 0"
https://www.cyberciti.biz/faq/check-ram-speed-linux/
Last fiddled with by kriesel on 2021-07-09 at 13:47 |
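As a possible way to use that from Linux, here is a minimal sketch that groups populated DIMM slots by their "Bank Locator" field in saved `dmidecode --type 17` output. The sample text and slot names are made up for illustration, not taken from the Dell 7920, and the bank locator format is vendor-specific, so whether it maps cleanly onto memory channels on a given board is an assumption to check by eye.

```python
import re
from typing import Dict, List

# Hypothetical sample, shaped like `sudo dmidecode --type 17` output.
SAMPLE = """\
Handle 0x1100, DMI type 17, 40 bytes
Memory Device
\tSize: 16384 MB
\tLocator: DIMM1_CPU1
\tBank Locator: CHAN A DIMM 0
\tSpeed: 2400 MT/s

Handle 0x1101, DMI type 17, 40 bytes
Memory Device
\tSize: No Module Installed
\tLocator: DIMM2_CPU1
\tBank Locator: CHAN A DIMM 1

Handle 0x1102, DMI type 17, 40 bytes
Memory Device
\tSize: 16384 MB
\tLocator: DIMM3_CPU1
\tBank Locator: CHAN B DIMM 0
"""

def populated_banks(dmidecode_text: str) -> Dict[str, List[str]]:
    """Map each Bank Locator string to the slot locators that hold a module."""
    banks: Dict[str, List[str]] = {}
    # Records are separated by blank lines.
    for block in re.split(r"\n\s*\n", dmidecode_text):
        if "Memory Device" not in block:
            continue
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.strip().partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        size = fields.get("Size", "")
        if not size or size.startswith("No Module"):
            continue  # empty socket
        bank = fields.get("Bank Locator", "unknown")
        banks.setdefault(bank, []).append(fields.get("Locator", "unknown"))
    return banks

for bank, slots in sorted(populated_banks(SAMPLE).items()):
    print(f"{bank}: {', '.join(slots)}")
```

To try it on a real machine, save the output of `sudo dmidecode --type 17` to a file and pass its contents to `populated_banks` instead of `SAMPLE`.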