mersenneforum.org

RickC 2003-03-15 01:53

RDRAM vs. DDR SDRAM
 
I have a P4 with PC1066 RDRAM. My benchmarks are below. Judging by the Mersenne benchmark page, high-bandwidth RDRAM doesn't really have any advantage over DDR SDRAM for Prime95. Does anyone know how much memory bandwidth Prime95 uses for 18M exponents?

Interestingly, it looks like Intel is moving away from RDRAM in favor of dual-channel DDR400 SDRAM with ECC to match its new 800 MHz bus.
http://www.extremetech.com/article2/0,3973,907061,00.asp


My system:
Bus Speed: 533 MHz
L2 cache speed: Full
Memory Type: PC1066 RDRAM
OS: Windows 2000 SP3
Intel(R) Pentium(R) 4 CPU 2.40GHz
CPU speed: 2386.69 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Prime95 version 22.12, RdtscTiming=1
Best time for 256K FFT length: 9.813 ms.
Best time for 320K FFT length: 13.089 ms.
Best time for 384K FFT length: 15.910 ms.
Best time for 448K FFT length: 19.076 ms.
Best time for 512K FFT length: 21.482 ms.
Best time for 640K FFT length: 27.875 ms.
Best time for 768K FFT length: 34.025 ms.
Best time for 896K FFT length: 41.684 ms.
Best time for 1024K FFT length: 45.211 ms.
Best time for 1280K FFT length: 63.859 ms.
Best time for 1536K FFT length: 79.060 ms.
Best time for 1792K FFT length: 98.161 ms.

xtreme2k 2003-03-15 05:35

PC1066 RDRAM will give you the fastest performance at a given clock speed. If you run DDR at stock 266/333, it is quite a bit slower than PC1066 RDRAM. Only when you run DDR at a faster clock or in dual channel does it come close. Basically, if you don't overclock, RDRAM will give you the best performance, but at a high cost.

However, the problem with RDRAM is that it doesn't tend to overclock well. For example, my Kingston PC1066 ECC RDRAM doesn't even like being overclocked past 142 FSB. That is a very poor overclock of only 6.7% over the stock 133. On the other hand, if you get a DDR board with DDR333/400, you will have NO PROBLEM running at 166 FSB+ as long as your CPU allows. With the CPU running at such a higher speed, RDRAM has no chance.

outlnder 2003-03-15 07:02

Rick, one thing I would suggest is to get the Prime95 ver. 23.2.

Here are my benchmarks running DDR400.

Intel(R) Pentium(R) 4 CPU 2.40GHz
CPU speed: 2400.80 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Prime95 version 23.2, RdtscTiming=1
Best time for 384K FFT length: 15.480 ms.
Best time for 448K FFT length: 18.248 ms.
Best time for 512K FFT length: 21.118 ms.
Best time for 640K FFT length: 26.871 ms.
Best time for 768K FFT length: 33.090 ms.
Best time for 896K FFT length: 39.989 ms.
Best time for 1024K FFT length: 43.713 ms.
Best time for 1280K FFT length: 58.318 ms.
Best time for 1536K FFT length: 71.319 ms.
Best time for 1792K FFT length: 86.908 ms.
Best time for 2048K FFT length: 95.344 ms.

Notice that mine start at the 384K FFT length while yours start at the 256K FFT length.

Your system seems to be running rather slowly. Update the version and post your new benchmarks. My system should not be that much faster than yours.

xtreme2k 2003-03-15 08:00

His system should kick your system's ass ;) ;)

arjanscholl 2003-03-15 10:56

Isn't it true that DDR memory has lower latency than RDRAM? Maybe that causes a slowdown too. I don't know what's more important for Prime95, bandwidth or latency.

xtreme2k 2003-03-15 12:04

Only very slightly lower. But RDRAM really excels at lots of random reads and writes.

RickC 2003-03-15 13:59

outlnder, I did take the lead over your 2.4 when I upgraded Prime95. I have a stock Dell 8250. It's interesting that my 2.4 is a few MHz short of 2400.

Intel(R) Pentium(R) 4 CPU 2.40GHz
CPU speed: 2386.70 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Prime95 version 23.2, RdtscTiming=1
Best time for 384K FFT length: 15.179 ms.
Best time for 448K FFT length: 17.883 ms.
Best time for 512K FFT length: 20.774 ms.
Best time for 640K FFT length: 26.421 ms.
Best time for 768K FFT length: 32.468 ms.
Best time for 896K FFT length: 39.261 ms.
Best time for 1024K FFT length: 43.023 ms.
Best time for 1280K FFT length: 57.438 ms.
Best time for 1536K FFT length: 70.477 ms.
Best time for 1792K FFT length: 86.128 ms.
Best time for 2048K FFT length: 94.622 ms.
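
To put numbers on the comparison, here is a minimal sketch in Python that parses the "Best time for ..." lines from the two Prime95 23.2 runs above (the DDR400 box and the RDRAM Dell) and prints how much faster the RDRAM run is at each FFT length. The function names are made up for illustration, and this only compares two single runs, not a controlled test. The two machines also report slightly different CPU clocks (2400.80 vs. 2386.70 MHz).

```python
import re

# Benchmark lines copied from the two Prime95 23.2 posts in this thread.
DDR400_RUN = """
Best time for 384K FFT length: 15.480 ms.
Best time for 448K FFT length: 18.248 ms.
Best time for 512K FFT length: 21.118 ms.
Best time for 640K FFT length: 26.871 ms.
Best time for 768K FFT length: 33.090 ms.
Best time for 896K FFT length: 39.989 ms.
Best time for 1024K FFT length: 43.713 ms.
Best time for 1280K FFT length: 58.318 ms.
Best time for 1536K FFT length: 71.319 ms.
Best time for 1792K FFT length: 86.908 ms.
Best time for 2048K FFT length: 95.344 ms.
"""

RDRAM_RUN = """
Best time for 384K FFT length: 15.179 ms.
Best time for 448K FFT length: 17.883 ms.
Best time for 512K FFT length: 20.774 ms.
Best time for 640K FFT length: 26.421 ms.
Best time for 768K FFT length: 32.468 ms.
Best time for 896K FFT length: 39.261 ms.
Best time for 1024K FFT length: 43.023 ms.
Best time for 1280K FFT length: 57.438 ms.
Best time for 1536K FFT length: 70.477 ms.
Best time for 1792K FFT length: 86.128 ms.
Best time for 2048K FFT length: 94.622 ms.
"""

LINE = re.compile(r"Best time for (\d+)K FFT length: ([\d.]+) ms\.")


def parse_times(text):
    """Return {FFT length in K: best time in ms} for every benchmark line found."""
    return {int(k): float(ms) for k, ms in LINE.findall(text)}


def percent_faster(baseline, other):
    """Percent by which `other` beats `baseline` at each FFT length both runs share."""
    shared = sorted(baseline.keys() & other.keys())
    return {k: 100.0 * (baseline[k] - other[k]) / baseline[k] for k in shared}


ddr = parse_times(DDR400_RUN)
rdram = parse_times(RDRAM_RUN)
gain = percent_faster(ddr, rdram)

for k, pct in gain.items():
    print(f"{k:5d}K  DDR400 {ddr[k]:7.3f} ms  RDRAM {rdram[k]:7.3f} ms  RDRAM faster by {pct:.2f}%")
```

A positive percentage means the RDRAM run was faster at that FFT length. The gap is small, so it says more about "roughly equal" than about a clear winner.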

RickC 2003-03-15 14:39

My RDRAM has a maximum transfer rate of 4.2 GB/s. I find it hard to believe that Prime95 would use that much. I think the bottleneck on my system would be the number crunching. I'm thinking the reason that RDRAM performs better is a matter of efficiency: the CPU has to do less work to get data in and out of memory, so it can spend more cycles on math.
Any thoughts? I could be wrong.
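
For a rough sanity check on the 4.2 GB/s figure and on how much of it Prime95 might need, here is a back-of-the-envelope sketch in Python. Peak bandwidth is just channels x bytes per transfer x transfers per second. The channel count for this machine (two 16-bit RDRAM channels) is an assumption, chosen because it reproduces the quoted 4.2 GB/s. The traffic estimate assumes the FFT data is an array of 8-byte doubles that is read and written once in each of two passes per iteration. That is a simplification, not a measured property of Prime95, so treat the result as an order-of-magnitude guess.

```python
def peak_bandwidth(channels, bytes_per_transfer, transfers_per_sec):
    """Theoretical peak memory bandwidth in bytes per second."""
    return channels * bytes_per_transfer * transfers_per_sec


# PC1066 RDRAM: 16-bit (2-byte) channel at 1066 MT/s; two channels are ASSUMED.
rdram_pc1066 = peak_bandwidth(2, 2, 1066e6)
# Dual-channel DDR400: two 64-bit (8-byte) channels at 400 MT/s.
ddr400_dual = peak_bandwidth(2, 8, 400e6)
# 800 MHz P4 bus: 64 bits (8 bytes) wide at 800 MT/s.
fsb_800 = peak_bandwidth(1, 8, 800e6)


def fft_traffic(fft_len_k, iter_ms, passes=2, bytes_per_element=8):
    """Rough bytes/second moved if each pass reads and writes the whole array once.

    passes=2 and bytes_per_element=8 are ASSUMPTIONS, not measured Prime95 behavior.
    """
    array_bytes = fft_len_k * 1024 * bytes_per_element
    bytes_per_iteration = passes * 2 * array_bytes  # one read + one write per pass
    return bytes_per_iteration / (iter_ms / 1000.0)


# 1024K FFT at 43.023 ms per iteration, from the Prime95 23.2 benchmark above.
estimate = fft_traffic(1024, 43.023)

print(f"PC1066 RDRAM peak (2 channels assumed): {rdram_pc1066 / 1e9:.2f} GB/s")
print(f"Dual-channel DDR400 peak:               {ddr400_dual / 1e9:.2f} GB/s")
print(f"800 MHz bus peak:                       {fsb_800 / 1e9:.2f} GB/s")
print(f"Estimated FFT traffic at 1024K:         {estimate / 1e9:.2f} GB/s")
```

Under these assumptions the estimate comes out well below the theoretical peak, which fits the hunch that raw bandwidth is not the whole story. Latency and access pattern are not captured here.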

RickC 2003-03-15 14:50

I have 4 RDRAM slots configured like this:

1. 256MB module
2. 256MB module
3. continuity module
4. continuity module

Something I will try if the price goes down is to fill the other two slots and see the impact on performance, not due to total MB but due to more modules in use.

Xyzzy 2003-03-16 04:13

[quote="RickC"]Something I will try if the price goes down is to fill the other two slots and see the impact on performance, not due to total MB but due to more modules in use.[/quote]

More devices/modules in a RDRAM chain means more latency...

http://arstechnica.com/paedia/r/ram_guide/ram_guide.part3-1.html

xtreme2k 2003-03-19 20:55

I tend to agree that more chips means higher latency.

But I remember reading on some reputable sites that more sticks do offer MORE performance, which is weird. I think it was aceshardware.com, but I really can't be sure.

wfzelle 2003-03-21 00:55

[quote="xtreme2k"]I tend to agree that more chips means higher latency.

But I remember reading on some reputable sites that more sticks do offer MORE performance, which is weird. I think it was aceshardware.com, but I really can't be sure.[/quote]
That happens when you do interleaving. Two sticks work in concert to give a higher throughput. Sort of like RAID 0. It isn't used in RDRAM machines though, but the newest DDR PIV chipsets are using it.

sdbardwick 2003-03-21 06:59

[quote="wfzelle"]That happens when you do interleaving. Two sticks work in concert to give a higher throughput. Sort of like RAID 0. It isn't used in RDRAM machines though, but the newest DDR PIV chipsets are using it.[/quote]

Others can undoubtedly provide a more accurate and detailed explanation, but I need a CNN break, so here goes:

Interleaving reduces latency, making more efficient use of bandwidth.
Multiple [i.e. dual] memory channels (alluded to above with regard to the newest PIV chipsets) increase bandwidth.

Interleaving has been around for a long time, but dual memory channels [in consumer PCs] are a new development.

Interleaving does not necessarily require multiple physical modules; it just needs multiple memory banks, and a single DIMM can have multiple banks. It allows one bank to be accessed while another bank is simultaneously being refreshed and thus cannot be accessed. However, it doesn't make the pipe bigger, just prevents air bubbles.

Dual channel memory does make the pipe bigger; interleaving can be implemented on each of the channels as well.
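
The difference between the two ideas can be illustrated with a toy model in Python. It is only a sketch: the timing rule (a bank is busy for a fixed number of cycles after each access starts) and the numbers (4 busy cycles, 8 bytes per access) are invented for illustration and do not describe any real DRAM or chipset. Accesses walk through consecutive addresses, and interleaving maps consecutive addresses to different banks.

```python
def cycles_for_stream(n_accesses, n_banks, bank_busy):
    """Cycles needed to issue n_accesses sequential accesses in the toy model.

    At most one access starts per cycle. A bank cannot start a new access
    until bank_busy cycles after the start of its previous one.
    """
    ready_at = [0] * n_banks  # earliest cycle at which each bank is free again
    now = 0
    for addr in range(n_accesses):
        bank = addr % n_banks            # interleave consecutive addresses
        start = max(now, ready_at[bank])  # stall (an "air bubble") if the bank is busy
        ready_at[bank] = start + bank_busy
        now = start + 1
    return now


def bytes_per_cycle(n_accesses, n_banks, bank_busy, channels, bytes_per_access=8):
    """Average throughput; each extra channel moves another bytes_per_access in parallel."""
    cycles = cycles_for_stream(n_accesses, n_banks, bank_busy)
    return channels * bytes_per_access * n_accesses / cycles


N, BUSY = 1000, 4  # made-up workload size and bank busy time
for banks, channels in [(1, 1), (2, 1), (4, 1), (4, 2)]:
    rate = bytes_per_cycle(N, banks, BUSY, channels)
    print(f"{banks} bank(s), {channels} channel(s): {rate:.2f} bytes/cycle")
```

In this model, adding banks only removes stalls until one access can start every cycle, while adding a channel raises that ceiling itself.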

outlnder 2003-03-21 19:29

I have 4-way interleaving on 3 machines. It doesn't make any difference in LL speeds.

RickC 2004-01-03 22:44

[quote="RickC"]I have 4 RDRAM slots configured like this:

1. 256MB module
2. 256MB module
3. continuity module
4. continuity module

Something I will try if the price goes down is to fill the other two slots and see the impact on performance, not due to total MB but due to more modules in use.[/quote]


I just ordered two more 256MB modules. I will run some benchmarks out of curiosity to see if there's any difference with two RDRAM modules vs. four RDRAM modules. The two I ordered are identical to the two that are already in there: SAMSUNG PC1066 256MB RAMBUS non-ECC 16-Bit 184-Pin (MR16R1628DF0-CT9).

Angular 2004-01-14 02:28

What happened to the 32-bit Rambus RAM that was hyped in the news?

RickC 2004-01-19 00:27

SiS has an RDRAM chipset:

http://www.sis.com/products/chipsets/oa/pentium4/r659.htm

It will be interesting to see if Intel ever goes back to RDRAM. It's too bad it hasn't caught on enough to bring the price down.

Rambus came out with a new kind of memory called XDR DRAM:

http://www.rambus.com/products/xdr/

RickC 2004-01-24 13:47

hard to find
 
The first 3 places I ordered from ended up not being able to get them to me even though they claimed to have them in stock. They then took them off their websites and don't sell them anymore. I ended up getting them from zipzoomfly.com for $95 each.

Interesting that no timings are reported in SPD info for RDRAM.

SPD info:

CPU-Z version 1.20a
Memory Modules Serial Presence Detect (SPD)

Module #1

General
Memory type RDRAM
Manufacturer (ID) Samsung (CE59414532453030)
Max bandwidth PC1066 (533 MHz)
Part number MR16R 1628DF0-CT9

Attributes


Module #2

General
Memory type RDRAM
Manufacturer (ID) Samsung (CE59414532453030)
Max bandwidth PC1066 (533 MHz)
Part number MR16R 1628DF0-CT9

Attributes


Module #3

General
Memory type RDRAM
Manufacturer (ID) Samsung (CE48414330463059)
Max bandwidth PC1066 (533 MHz)
Part number MR16R 1628DF0-CT9

Attributes


Module #4

General
Memory type RDRAM
Manufacturer (ID) Samsung (CE48414330463059)
Max bandwidth PC1066 (533 MHz)
Part number MR16R 1628DF0-CT9

Attributes

RickC 2004-01-24 13:50

2 benchmarks with only two modules in (and two continuity modules)
 
Intel(R) Pentium(R) 4 CPU 2.40GHz
CPU speed: 2386.59 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 128 bytes
TLBS: 64
Prime95 version 23.8, RdtscTiming=1
Best time for 384K FFT length: 14.891 ms.
Best time for 448K FFT length: 17.628 ms.
Best time for 512K FFT length: 20.082 ms.
Best time for 640K FFT length: 24.039 ms.
Best time for 768K FFT length: 29.315 ms.
Best time for 896K FFT length: 34.824 ms.
Best time for 1024K FFT length: 39.079 ms.
Best time for 1280K FFT length: 51.067 ms.
Best time for 1536K FFT length: 63.074 ms.
Best time for 1792K FFT length: 74.860 ms.
Best time for 2048K FFT length: 84.957 ms.

Intel(R) Pentium(R) 4 CPU 2.40GHz
CPU speed: 2386.24 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 128 bytes
TLBS: 64
Prime95 version 23.8, RdtscTiming=1
Best time for 384K FFT length: 14.906 ms.
Best time for 448K FFT length: 17.644 ms.
Best time for 512K FFT length: 20.052 ms.
Best time for 640K FFT length: 24.023 ms.
Best time for 768K FFT length: 29.213 ms.
Best time for 896K FFT length: 34.742 ms.
Best time for 1024K FFT length: 38.989 ms.
Best time for 1280K FFT length: 50.953 ms.
Best time for 1536K FFT length: 62.899 ms.
Best time for 1792K FFT length: 74.811 ms.
Best time for 2048K FFT length: 84.821 ms.

RickC 2004-01-24 13:51

2 benchmarks with all 4 modules in
 
Intel(R) Pentium(R) 4 CPU 2.40GHz
CPU speed: 2386.60 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 128 bytes
TLBS: 64
Prime95 version 23.8, RdtscTiming=1
Best time for 384K FFT length: 14.981 ms.
Best time for 448K FFT length: 17.761 ms.
Best time for 512K FFT length: 20.210 ms.
Best time for 640K FFT length: 24.258 ms.
Best time for 768K FFT length: 29.358 ms.
Best time for 896K FFT length: 34.891 ms.
Best time for 1024K FFT length: 39.276 ms.
Best time for 1280K FFT length: 51.338 ms.
Best time for 1536K FFT length: 63.220 ms.
Best time for 1792K FFT length: 75.079 ms.
Best time for 2048K FFT length: 85.432 ms.

Intel(R) Pentium(R) 4 CPU 2.40GHz
CPU speed: 2386.47 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 128 bytes
TLBS: 64
Prime95 version 23.8, RdtscTiming=1
Best time for 384K FFT length: 14.957 ms.
Best time for 448K FFT length: 17.731 ms.
Best time for 512K FFT length: 20.179 ms.
Best time for 640K FFT length: 24.205 ms.
Best time for 768K FFT length: 29.366 ms.
Best time for 896K FFT length: 34.912 ms.
Best time for 1024K FFT length: 39.215 ms.
Best time for 1280K FFT length: 51.401 ms.
Best time for 1536K FFT length: 63.358 ms.
Best time for 1792K FFT length: 74.921 ms.
Best time for 2048K FFT length: 85.429 ms.

RickC 2004-01-24 13:59

prime slower but machine runs better
 
It looks like more RDRAM modules might make Prime95 a little slower. I ran all the benchmarks exactly the same way: rebooted Windows 2000, made sure the boot and login had completely finished, made sure nothing else was running, and then ran the benchmark.

The computer is much more responsive now. It looks like any unused memory gets used as "System Cache" according to Windows Task Manager.
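
To see how big the two-module vs. four-module difference is, here is a small sketch in Python that averages the two runs of each configuration (times copied from the Prime95 23.8 benchmarks above) and prints the slowdown with four modules next to the run-to-run spread of the two-module runs. Variable and function names are made up for illustration, and with only two runs per configuration this is an indication, not a statistically solid result.

```python
FFT_K = [384, 448, 512, 640, 768, 896, 1024, 1280, 1536, 1792, 2048]

# Best times in ms, two runs per configuration, in FFT_K order.
TWO_MODULES = [
    [14.891, 17.628, 20.082, 24.039, 29.315, 34.824, 39.079, 51.067, 63.074, 74.860, 84.957],
    [14.906, 17.644, 20.052, 24.023, 29.213, 34.742, 38.989, 50.953, 62.899, 74.811, 84.821],
]
FOUR_MODULES = [
    [14.981, 17.761, 20.210, 24.258, 29.358, 34.891, 39.276, 51.338, 63.220, 75.079, 85.432],
    [14.957, 17.731, 20.179, 24.205, 29.366, 34.912, 39.215, 51.401, 63.358, 74.921, 85.429],
]


def mean_per_length(runs):
    """Average the best times across runs for each FFT length."""
    return [sum(times) / len(times) for times in zip(*runs)]


two_avg = mean_per_length(TWO_MODULES)
four_avg = mean_per_length(FOUR_MODULES)

# Slowdown of the four-module average relative to the two-module average, in percent.
slowdown = [100.0 * (f - t) / t for t, f in zip(two_avg, four_avg)]

# Difference between the two two-module runs, in percent of their average.
spread = [100.0 * abs(a - b) / m
          for a, b, m in zip(TWO_MODULES[0], TWO_MODULES[1], two_avg)]

for k, t, f, s, n in zip(FFT_K, two_avg, four_avg, slowdown, spread):
    print(f"{k:5d}K  2 modules {t:7.3f} ms  4 modules {f:7.3f} ms  "
          f"slowdown {s:.2f}%  (2-module run-to-run spread {n:.2f}%)")
```

With these numbers the four-module runs come out slower at every FFT length, but by less than 1%, and at some lengths that is about the same size as the difference between two back-to-back runs. More runs would be needed to be sure.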


All times are UTC.