mersenneforum.org  

Old 2017-04-13, 05:29   #751
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

2·5·293 Posts

The @ 4.0GHz is part of the model name when queried from the processor. It has nothing to do with the actual running processor frequency. It has confused me in the past as well.


Kieren, are you running 12 GB of RAM? Kind of an odd amount. Not having matched sticks is probably hampering performance. That being said, you're getting 32% more throughput with a 27% higher CPU clock and a 50% higher memory clock compared to my systems (for 4 cores, 1 worker, 4096K FFT). That extra memory bandwidth is helping.

Your benchmark also tells me I'm still memory constrained at 2133 with 4 cores at 3.3 GHz. I may try poking around the BIOS to see if there's a way to underclock a locked CPU besides disabling turbo.
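
A quick way to sanity-check those percentages, as a minimal sketch only: the 4.2 GHz and DDR4-3200 figures are the ones quoted for Kieren's i7-6700K elsewhere in this thread, 3.3 GHz and 2133 are the figures given above, and `percent_higher` is a helper name made up for the example.

```python
def percent_higher(new: float, old: float) -> float:
    """Return how much higher `new` is than `old`, in percent."""
    return (new / old - 1.0) * 100.0

# Clock figures as mentioned in the thread (assumed to be the ones compared).
cpu_gain = percent_higher(4.2, 3.3)      # CPU clock in GHz
mem_gain = percent_higher(3200, 2133)    # memory clock rating

print(f"CPU clock: +{cpu_gain:.0f}%, memory clock: +{mem_gain:.0f}%")
```

A throughput gain that lands above the CPU clock gain but below the memory clock gain is at least consistent with memory bandwidth contributing to the difference.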
Old 2017-04-13, 12:00   #752
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

2·3·1,693 Posts

Quote:
Originally Posted by Mark Rose View Post
The @ 4.0GHz is part of the model name when queried from the processor. It has nothing to do with the actual running processor frequency. It has confused me in the past as well.


Kieren, are you running 12 GB of RAM? Kind of an odd amount. Not having matched sticks is probably hampering performance. That being said, you're getting 32% more throughput with a 27% higher CPU clock and a 50% higher memory clock compared to my systems (for 4 cores, 1 worker, 4096K FFT). That extra memory bandwidth is helping.

Your benchmark also tells me I'm still memory constrained at 2133 with 4 cores at 3.3 GHz. I may try poking around the BIOS to see if there's a way to underclock a locked CPU besides disabling turbo.
Does it indicate 12 GB somewhere? It should be 16GB, dual channel, dual rank. They are rated at 2666MHz, running at 3200.
Attached image: 16GB-dual rank.JPG
Old 2017-04-13, 13:23   #753
VictordeHolland
 
"Victor de Hollander"
Aug 2011
the Netherlands

23·3·72 Posts

Quote:
Originally Posted by db597 View Post
So from the benchmarks it looks like 8 Ryzen cores is still slower than 4 Skylake/Kabylake cores:

Ryzen @ 3.3GHz:
Code:
Timings for 4096K FFT length (8 cpus, 1 worker): 6.92 ms. Throughput: 144.58 iter/sec.
i7-6700K @ 4.2GHz:
Code:
Timings for 4096K FFT length (4 cpus, 1 worker):  4.07 ms.  Throughput: 245.91 iter/sec.
Ryzen is about as fast as my 5?6? year old SandyBridge

i5-2500k @4.0GHz DDR3-2133
Code:
Best time for 4096K FFT length: 6.839 ms., avg: 7.155 ms.
Timings for 4096K FFT length (4 cpus, 4 workers): 27.12, 26.85, 27.58, 27.00 ms.  Throughput: 147.41 iter/sec.
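
The throughput figures in these benchmark lines appear to be the sum of each worker's iterations per second (1000 divided by its time in ms). That is an inference from the numbers posted here, not a documented Prime95 formula, and the helper name `throughput` is made up for this sketch.

```python
def throughput(timings_ms):
    """Total iterations per second for workers running concurrently."""
    return sum(1000.0 / t for t in timings_ms)

# Timings copied from the benchmark lines quoted above.
ryzen = throughput([6.92])                          # 8 cores, 1 worker
sandy = throughput([27.12, 26.85, 27.58, 27.00])    # 4 cores, 4 workers

print(f"Ryzen: {ryzen:.2f} iter/sec, Sandy Bridge: {sandy:.2f} iter/sec")
```

The recomputed single-worker value differs slightly from the logged 144.58 because the logged time is rounded to two decimals; the four-worker sum reproduces the logged 147.41.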
Old 2017-04-13, 15:36   #754
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

B72₁₆ Posts

Quote:
Originally Posted by kladner View Post
Does it indicate 12 GB somewhere? It should be 16GB, dual channel, dual rank. They are rated at 2666MHz, running at 3200.
Yeah, it does:

Code:
 Machine#0 (total=12649168KB, Backend=Windows, hwlocVersion=1.11.6, ProcessName=prime95.exe)
I doubt it will affect Prime95 though, now that you've confirmed the installed RAM.
Old 2017-04-13, 18:08   #755
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

D63₁₆ Posts

Quote:
Originally Posted by Mark Rose View Post
The @ 4.0GHz is part of the model name when queried from the processor. It has nothing to do with the actual running processor frequency.
I don't think it's the "@ 4.00 GHz" that was concerning, it was the "CPU speed: 4008.14 MHz".

Quote:
Originally Posted by Mark Rose View Post
are you running 12 GB of RAM? Kind of an odd amount. Not having matched sticks is probably hampering performance.
That depends on the system configuration. My i7-920 system has 12GB, but it's triple-channel so 12GB is a balanced configuration. Granted, most systems are dual-channel (I'm odd and my two systems are 3-channel [i7-920] and 4-channel [i7-3930K]).

As for the RAM reported, I believe that's what's available to Prime95, not total system RAM. In my case I have 64GB installed and it logs as
Code:
Machine#0 (total=54609356KB)
which is 52GB. Although I'm not entirely sure how it pulled up that number, since I have 5 workers, specified at 11000MB each, maximum 4 high-memory workers, and an overall maximum of 44000MB. But in any case, it's clearly not the installed system RAM amount.
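
The conversion from the logged `total=...KB` field to gigabytes can be sketched as below. It assumes the "KB" in the log means 1024 bytes, which is an assumption and not something stated in this thread; `hwloc_total_gib` is a hypothetical helper name.

```python
import re

def hwloc_total_gib(line: str) -> float:
    """Extract total=<n>KB from a Machine#0 line and convert to GiB.

    Assumes 1 KB = 1024 bytes (not confirmed by the log itself).
    """
    match = re.search(r"total=(\d+)KB", line)
    if match is None:
        raise ValueError("no total=...KB field found")
    return int(match.group(1)) / 1024**2

# Lines copied from the posts above.
for line in (
    "Machine#0 (total=12649168KB, Backend=Windows, hwlocVersion=1.11.6, ProcessName=prime95.exe)",
    "Machine#0 (total=54609356KB)",
):
    print(f"{hwloc_total_gib(line):.2f} GiB")
```

Under that assumption the two lines work out to roughly 12 and 52, matching the figures discussed above.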
Old 2017-04-13, 21:08   #756
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

10011110101110₂ Posts

Quote:
I don't think it's the "@ 4.00 GHz" that was concerning, it was the "CPU speed: 4008.14 MHz".
I see numbers like that when running stock. I think it must come from variations in the base clock.
Old 2017-04-13, 21:16   #757
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

23×149 Posts

Quote:
Originally Posted by kladner View Post
I see numbers like that when running stock. I think it must come from variations in the base clock.
It's not that it wasn't exactly 4000.00MHz, but rather that kladner was expecting ~4.2GHz, not ~4.0GHz, hence my suggestion to monitor the frequency in realtime.
Old 2017-04-14, 00:41   #758
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

2×3×1,693 Posts

Quote:
Originally Posted by James Heinrich View Post
It's not that it wasn't exactly 4000.00MHz, but rather that kladner was expecting ~4.2GHz, not ~4.0GHz, hence my suggestion to monitor the frequency in realtime.
I did try watching CPU-Z when starting the benchmark. It takes thinning out other CPU users to get a baseline. As mentioned, the "Sync all cores" option on the Asus board seems to make the frequency (multiplier) jump around a lot. When things were quiet, and the core clock was only occasionally hitting 4200 MHz, moving the mouse or clicking on something would make it peak.

The jump to 42x seems virtually simultaneous with clicking to start the benchmark, at least to human-scaled perceptions.
Old 2017-04-17, 14:12   #759
FSund
 
Apr 2017

2 Posts

Quote:
Originally Posted by db597 View Post
The 8192K FFT performance looks incredible on this version of Prime95, especially when all 8 cores are thrown at it. Would be good if someone can post results from a similarly priced Intel i7 7700K on Prime95 v29.1 Build 15 for comparison (I expect the i7 is a lot faster per core, but at the end of the day having double the cores may make it a rather close competition).
I have just gotten a 7700k, and I'm in the process of overclocking it now. Will try to report back with benchmarks when I'm done.

At the moment it's looking like I'll have to accept 4.7 GHz for Prime95 runs; any higher and my temperatures get too high. Or to be more specific, at 4.7 GHz I have to increase the voltage to 1.280 V to run the Prime95 torture tests without errors, and at that voltage I get peak temperatures of 85 C, which is a bit too high for my comfort.
Old 2017-04-17, 15:50   #760
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

2·5·293 Posts

Do remember that Haswell and later bump the VCore by +0.1 V when running AVX2/FMA3 code (which Prime95 will use).
Old 2017-04-18, 06:53   #761
db597
 
Jan 2003

7×29 Posts

Quote:
Originally Posted by VictordeHolland View Post
Ryzen is about as fast as my 5?6? year old SandyBridge
The current Core architecture has been around for 6-7 years and has had lots of time for software optimisations to be done for it. Ryzen has been out for 1 month... I remember back in the v26.xx days, Prime95 was a lot slower before optimisation.

My Ryzen is mainly working on World Community Grid projects (generally not highly optimised code). There I see 2.5x the points per day compared to my 2600K @ stock clockspeeds. At the same clockspeed, the IPC is around Haswell level and it has double the number of cores.
Old 2017-04-23, 11:23   #762
FSund
 
Apr 2017

2 Posts

Quote:
Originally Posted by db597 View Post
@LaurV... thanks for the comparison benchmark.

So for the case of both systems running on 8 physical cores, it's 7.136ms for the i7-6950X @ 3.0GHz vs 12.69ms for the Ryzen 1700 @ 3.3GHz. Looks like Intel wins big in terms of IPC.

Would still be interesting to see the results from an i7-7700K (half the cores, but higher IPC and higher clockspeed)... to compare at a similar cost level (a Ryzen 1700 system being still a bit cheaper than a comparable i7-7700K system).
Here are my results:

Hardware:
CPU: i7-7700K - at 4.9 GHz with AVX Core Ratio Negative Offset of 2 (which reduces the multiplier to 47 when running Prime95, giving a clock speed of 4.7 GHz)
CPU Cooler: Noctua NH-D15
Motherboard: ASUS TUF Z270 Mark 1 - BIOS version 0906
PSU: EVGA SuperNOVA 550 G3
RAM: 2x Corsair Vengeance LPX DDR4 3200MHz 8GB - at XMP 3200 MHz
OS Drive: Corsair Force MP500 240GB M.2 PCIe SSD
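
The AVX offset arithmetic in the CPU line can be written out as a small sketch. The 100 MHz base clock is an assumption (it is not stated in the post), and `core_clock_mhz` is a hypothetical helper name.

```python
BCLK_MHZ = 100.0  # assumed base clock; the post does not state it

def core_clock_mhz(multiplier: int, avx_offset: int = 0,
                   bclk_mhz: float = BCLK_MHZ) -> float:
    """Core clock for a given multiplier, minus any AVX negative offset."""
    return (multiplier - avx_offset) * bclk_mhz

print(core_clock_mhz(49))       # non-AVX load, multiplier 49
print(core_clock_mhz(49, 2))    # AVX load such as Prime95, multiplier 47
```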

Code:
[Sun Apr 23 11:02:28 2017]
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
CPU speed: 4551.15 MHz, 4 hyperthreaded cores
CPU features: Prefetchw, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 32 KB
L2 cache size: 256 KB, L3 cache size: 8 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Prime95 64-bit version 29.1, RdtscTiming=1
Timings for 2048K FFT length (4 cpus, 1 worker):  2.19 ms.  Throughput: 457.47 iter/sec.
Timings for 2048K FFT length (4 cpus, 4 workers): 10.28,  9.28,  9.05,  9.78 ms.  Throughput: 417.80 iter/sec.
Timings for 2048K FFT length (4 cpus hyperthreaded, 1 worker):  2.50 ms.  Throughput: 400.18 iter/sec.
Timings for 2048K FFT length (4 cpus hyperthreaded, 4 workers): 10.57, 10.02, 13.69, 10.16 ms.  Throughput: 365.92 iter/sec.
Timings for 2560K FFT length (4 cpus, 1 worker):  2.82 ms.  Throughput: 354.13 iter/sec.
Timings for 2560K FFT length (4 cpus, 4 workers): 12.16, 11.62, 11.63, 13.58 ms.  Throughput: 327.87 iter/sec.
Timings for 2560K FFT length (4 cpus hyperthreaded, 1 worker):  3.18 ms.  Throughput: 314.12 iter/sec.
Timings for 2560K FFT length (4 cpus hyperthreaded, 4 workers): 13.08, 12.59, 16.81, 12.99 ms.  Throughput: 292.38 iter/sec.
Timings for 3072K FFT length (4 cpus, 1 worker):  3.45 ms.  Throughput: 289.75 iter/sec.
Timings for 3072K FFT length (4 cpus, 4 workers): 15.42, 14.62, 14.27, 15.11 ms.  Throughput: 269.51 iter/sec.
Timings for 3072K FFT length (4 cpus hyperthreaded, 1 worker):  3.96 ms.  Throughput: 252.73 iter/sec.
Timings for 3072K FFT length (4 cpus hyperthreaded, 4 workers): 16.08, 15.22, 20.47, 15.33 ms.  Throughput: 242.01 iter/sec.
Timings for 3584K FFT length (4 cpus, 1 worker):  4.11 ms.  Throughput: 243.02 iter/sec.
Timings for 3584K FFT length (4 cpus, 4 workers): 19.07, 17.39, 16.95, 17.41 ms.  Throughput: 226.40 iter/sec.
Timings for 3584K FFT length (4 cpus hyperthreaded, 1 worker):  4.73 ms.  Throughput: 211.63 iter/sec.
Timings for 3584K FFT length (4 cpus hyperthreaded, 4 workers): 18.80, 18.03, 24.42, 17.97 ms.  Throughput: 205.23 iter/sec.
Timings for 4096K FFT length (4 cpus, 1 worker):  4.79 ms.  Throughput: 208.79 iter/sec.
Timings for 4096K FFT length (4 cpus, 4 workers): 21.20, 19.72, 19.60, 20.85 ms.  Throughput: 196.85 iter/sec.
[Sun Apr 23 11:07:29 2017]
Timings for 4096K FFT length (4 cpus hyperthreaded, 1 worker):  5.46 ms.  Throughput: 183.31 iter/sec.
Timings for 4096K FFT length (4 cpus hyperthreaded, 4 workers): 21.19, 20.58, 27.73, 21.08 ms.  Throughput: 179.28 iter/sec.
Timings for 5120K FFT length (4 cpus, 1 worker):  6.09 ms.  Throughput: 164.13 iter/sec.
Timings for 5120K FFT length (4 cpus, 4 workers): 28.35, 24.43, 24.09, 24.33 ms.  Throughput: 158.81 iter/sec.
Timings for 5120K FFT length (4 cpus hyperthreaded, 1 worker):  6.97 ms.  Throughput: 143.50 iter/sec.
Timings for 5120K FFT length (4 cpus hyperthreaded, 4 workers): 27.05, 25.88, 34.61, 25.76 ms.  Throughput: 143.33 iter/sec.
Timings for 6144K FFT length (4 cpus, 1 worker):  7.94 ms.  Throughput: 125.99 iter/sec.
Timings for 6144K FFT length (4 cpus, 4 workers): 33.12, 30.44, 30.03, 34.13 ms.  Throughput: 125.65 iter/sec.
Timings for 6144K FFT length (4 cpus hyperthreaded, 1 worker):  8.88 ms.  Throughput: 112.61 iter/sec.
Timings for 6144K FFT length (4 cpus hyperthreaded, 4 workers): 33.69, 32.84, 46.51, 32.49 ms.  Throughput: 112.42 iter/sec.
Timings for 7168K FFT length (4 cpus, 1 worker):  9.31 ms.  Throughput: 107.45 iter/sec.
Timings for 7168K FFT length (4 cpus, 4 workers): 35.75, 35.32, 35.11, 45.61 ms.  Throughput: 106.69 iter/sec.
Timings for 7168K FFT length (4 cpus hyperthreaded, 1 worker): 10.54 ms.  Throughput: 94.88 iter/sec.
Timings for 7168K FFT length (4 cpus hyperthreaded, 4 workers): 39.98, 38.35, 54.51, 40.08 ms.  Throughput: 94.39 iter/sec.
Timings for 8192K FFT length (4 cpus, 1 worker): 10.75 ms.  Throughput: 93.03 iter/sec.
Timings for 8192K FFT length (4 cpus, 4 workers): 43.70, 41.72, 40.36, 46.95 ms.  Throughput: 92.93 iter/sec.
Timings for 8192K FFT length (4 cpus hyperthreaded, 1 worker): 12.24 ms.  Throughput: 81.67 iter/sec.
Timings for 8192K FFT length (4 cpus hyperthreaded, 4 workers): 47.77, 45.73, 62.07, 45.13 ms.  Throughput: 81.07 iter/sec.
Old 2017-04-23, 13:26   #763
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

2×5×293 Posts

If there ever were a case for more memory bandwidth...

It's nice to see a 4 core with over 200 iter/ms at 4096K FFT though!
Old 2017-04-23, 14:27   #764
retina
Undefined
 
"The unspeakable one"
Jun 2006
My evil lair

13·479 Posts

Quote:
Originally Posted by Mark Rose View Post
It's nice to see a 4 core with over 200 iter/ms at 4096K FFT though!
I think you have an extra 'm' in there. Maybe with the magical optical computers (using yellow of course) we will see 200 iter/ms. But as of today, I think not.
Old 2017-04-23, 15:42   #765
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

2·5·293 Posts

Quote:
Originally Posted by retina View Post
I think you have an extra 'm' in there. Maybe with the magical optical computers (using yellow of course) we will see 200 iter/ms. But as of today, I think not.
lol yes. Good eyes.
Old 2017-05-28, 19:21   #766
Dresdenboy
 
Apr 2003
Berlin, Germany

169₁₆ Posts

CPU: Ryzen 7 1700X stock (mostly running 3.5GHz on all cores with Prime95)
CPU Cooler: Noctua NH-D15 SE-AM4
Motherboard: ASRock Taichi X370
BIOS version: 2.34 (Beta with AGESA 1.0.0.6)
RAM: 2x16GB G.Skill TridentZ DDR4 3200MHz 14-14-14-36-1T dual rank
OS: Win10Pro x64

Code:
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
AMD Ryzen 7 1700X Eight-Core Processor         
CPU speed: 3400.29 MHz, 8 hyperthreaded cores
CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 32 KB
L2 cache size: 512 KB, L3 cache size: 16 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 64
L2 TLBS: 1536
Prime95 64-bit version 29.1, RdtscTiming=1
Best time for 2048K FFT length: 13.522 ms., avg: 13.721 ms.
Best time for 2560K FFT length: 17.576 ms., avg: 18.112 ms.
Best time for 3072K FFT length: 21.111 ms., avg: 21.577 ms.
Best time for 3584K FFT length: 25.393 ms., avg: 26.027 ms.
Best time for 4096K FFT length: 28.571 ms., avg: 30.210 ms.
Best time for 5120K FFT length: 36.353 ms., avg: 36.910 ms.
Best time for 6144K FFT length: 43.307 ms., avg: 43.586 ms.
Best time for 7168K FFT length: 51.546 ms., avg: 51.943 ms.
Best time for 8192K FFT length: 58.803 ms., avg: 60.538 ms.
Timing FFTs using 2 threads on 1 core.
Best time for 2048K FFT length: 15.871 ms., avg: 16.169 ms.
Best time for 2560K FFT length: 20.416 ms., avg: 20.792 ms.
Best time for 3072K FFT length: 24.707 ms., avg: 25.089 ms.
Best time for 3584K FFT length: 30.205 ms., avg: 30.626 ms.
Best time for 4096K FFT length: 34.890 ms., avg: 35.156 ms.
Best time for 5120K FFT length: 42.726 ms., avg: 44.333 ms.
Best time for 6144K FFT length: 50.388 ms., avg: 51.474 ms.
Best time for 7168K FFT length: 59.980 ms., avg: 60.897 ms.
Best time for 8192K FFT length: 70.055 ms., avg: 70.686 ms.
Timing FFTs using 8 threads on 8 cores.
Best time for 2048K FFT length: 2.107 ms., avg: 2.420 ms.
Best time for 2560K FFT length: 3.006 ms., avg: 3.540 ms.
Best time for 3072K FFT length: 3.573 ms., avg: 4.106 ms.
Best time for 3584K FFT length: 4.235 ms., avg: 4.553 ms.
Best time for 4096K FFT length: 4.794 ms., avg: 5.213 ms.
Best time for 5120K FFT length: 5.536 ms., avg: 5.879 ms.
Best time for 6144K FFT length: 6.778 ms., avg: 6.975 ms.
Best time for 7168K FFT length: 7.995 ms., avg: 8.094 ms.
Best time for 8192K FFT length: 9.107 ms., avg: 9.352 ms.
Timing FFTs using 16 threads on 8 cores.
Best time for 2048K FFT length: 2.381 ms., avg: 2.501 ms.
Best time for 2560K FFT length: 3.175 ms., avg: 3.610 ms.
Best time for 3072K FFT length: 3.882 ms., avg: 4.355 ms.
Best time for 3584K FFT length: 4.646 ms., avg: 5.146 ms.
Best time for 4096K FFT length: 5.256 ms., avg: 5.451 ms.
Best time for 5120K FFT length: 6.213 ms., avg: 6.490 ms.
Best time for 6144K FFT length: 7.405 ms., avg: 7.650 ms.
Best time for 7168K FFT length: 8.841 ms., avg: 8.956 ms.
Best time for 8192K FFT length: 10.258 ms., avg: 10.349 ms.
Throughput results (w/o SMT):
Code:
Prime95 64-bit version 29.1, RdtscTiming=1
Timings for 2048K FFT length (1 cpu, 1 worker): 14.30 ms.  Throughput: 69.93 iter/sec.
Timings for 2048K FFT length (8 cpus, 1 worker):  2.14 ms.  Throughput: 467.81 iter/sec.
Timings for 2048K FFT length (8 cpus, 8 workers): 16.24, 16.03, 15.99, 16.03, 16.02, 15.95, 16.18, 16.03 ms.  Throughput: 498.14 iter/sec.
Timings for 2560K FFT length (1 cpu, 1 worker): 18.11 ms.  Throughput: 55.23 iter/sec.
Timings for 2560K FFT length (8 cpus, 1 worker):  3.03 ms.  Throughput: 330.57 iter/sec.
Timings for 2560K FFT length (8 cpus, 8 workers): 22.72, 22.41, 22.23, 22.18, 22.13, 22.05, 22.34, 22.21 ms.  Throughput: 359.04 iter/sec.
Timings for 3072K FFT length (1 cpu, 1 worker): 21.96 ms.  Throughput: 45.54 iter/sec.
Timings for 3072K FFT length (8 cpus, 1 worker):  3.63 ms.  Throughput: 275.83 iter/sec.
Timings for 3072K FFT length (8 cpus, 8 workers): 27.70, 27.15, 26.61, 26.67, 26.34, 26.48, 26.85, 26.81 ms.  Throughput: 298.28 iter/sec.
Timings for 3584K FFT length (1 cpu, 1 worker): 26.18 ms.  Throughput: 38.20 iter/sec.
Timings for 3584K FFT length (8 cpus, 1 worker):  4.28 ms.  Throughput: 233.54 iter/sec.
Timings for 3584K FFT length (8 cpus, 8 workers): 32.17, 31.36, 31.35, 31.24, 31.32, 31.04, 31.43, 31.45 ms.  Throughput: 254.64 iter/sec.
[Sun May 28 21:29:40 2017]
Timings for 4096K FFT length (1 cpu, 1 worker): 29.84 ms.  Throughput: 33.51 iter/sec.
Timings for 4096K FFT length (8 cpus, 1 worker):  4.87 ms.  Throughput: 205.25 iter/sec.
Timings for 4096K FFT length (8 cpus, 8 workers): 36.36, 35.91, 35.76, 35.53, 35.79, 35.88, 36.13, 36.07 ms.  Throughput: 222.68 iter/sec.
Timings for 5120K FFT length (1 cpu, 1 worker): 37.73 ms.  Throughput: 26.50 iter/sec.
Timings for 5120K FFT length (8 cpus, 1 worker):  5.53 ms.  Throughput: 180.79 iter/sec.
Timings for 5120K FFT length (8 cpus, 8 workers): 43.10, 42.08, 42.24, 41.99, 41.81, 41.76, 41.92, 42.09 ms.  Throughput: 189.93 iter/sec.
Timings for 6144K FFT length (1 cpu, 1 worker): 44.95 ms.  Throughput: 22.25 iter/sec.
Timings for 6144K FFT length (8 cpus, 1 worker):  6.86 ms.  Throughput: 145.67 iter/sec.
Timings for 6144K FFT length (8 cpus, 8 workers): 51.06, 50.97, 50.29, 50.15, 50.18, 50.24, 50.41, 50.41 ms.  Throughput: 158.54 iter/sec.
Timings for 7168K FFT length (1 cpu, 1 worker): 53.47 ms.  Throughput: 18.70 iter/sec.
Timings for 7168K FFT length (8 cpus, 1 worker):  8.10 ms.  Throughput: 123.49 iter/sec.
Timings for 7168K FFT length (8 cpus, 8 workers): 61.59, 60.51, 59.82, 60.19, 60.22, 59.97, 60.62, 60.58 ms.  Throughput: 132.37 iter/sec.
Timings for 8192K FFT length (1 cpu, 1 worker): 60.89 ms.  Throughput: 16.42 iter/sec.
Timings for 8192K FFT length (8 cpus, 1 worker):  9.13 ms.  Throughput: 109.52 iter/sec.
Timings for 8192K FFT length (8 cpus, 8 workers): 70.80, 69.62, 69.27, 70.24, 69.69, 68.67, 69.21, 69.33 ms.  Throughput: 114.95 iter/sec.

Last fiddled with by Dresdenboy on 2017-05-28 at 19:58
Old 2017-05-28, 21:41   #767
Dresdenboy
 
Apr 2003
Berlin, Germany

169₁₆ Posts

With new data at hand, we could do some more comparisons.

Quote:
Originally Posted by db597 View Post
I posted the below results from my Ryzen 1700 (non-X) in the AMD Zen speculation thread earlier. Just thought I'd consolidate the results together with all the other benchmarks in this thread and also add a bit more detail on the setup.

CPU: AMD Ryzen 1700 (non-X)
Frequency: 3.32GHz @ 1.031V (stock rating 3GHz / Turbo 3.7GHz)
Heatsink: AMD Wraith Spire
Memory: Corsair 8GBx2 @ 2933MHz CAS16 (single rank)
Motherboard: Asus X370-Pro
BIOS: 0604 (AGESA 1.0.0.4a)
Operating system: Windows 10 x64 Creators Update
Prime95 version: 29.1 Build 15

Code:
[snip]
Timings for 8192K FFT length (8 cpus, 8 workers): 99.83, 99.12, 96.13, 97.41, 96.20, 96.03, 96.76, 96.01 ms. Throughput: 82.33 iter/sec.
The 8192K FFT performance looks incredible on this version of Prime95, especially when all 8 cores are thrown at it. Would be good if someone can post results from a similarly priced Intel i7 7700K on Prime95 v29.1 Build 15 for comparison (I expect the i7 is a lot faster per core, but at the end of the day having double the cores may make it a rather close competition).
It seems memory plays an important role. I got about 30% lower times with the new beta BIOS, 3200-14-14-14-36 DR, and running at 3.5GHz (stock CPB):
Code:
Timings for 8192K FFT length (8 cpus, 8 workers): 70.80, 69.62, 69.27, 70.24, 69.69, 68.67, 69.21, 69.33 ms.  Throughput: 114.95 iter/sec.
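
The "about 30%" figure can be checked from the two 8192K, 8-worker lines. A minimal sketch using only the timings and throughputs posted in this thread; the variable names are made up for the example.

```python
from statistics import mean

# Per-worker timings (ms) copied from the two 8192K lines in this thread.
ryzen_1700_ms = [99.83, 99.12, 96.13, 97.41, 96.20, 96.03, 96.76, 96.01]
ryzen_1700x_ms = [70.80, 69.62, 69.27, 70.24, 69.69, 68.67, 69.21, 69.33]

# Relative reduction in mean iteration time, in percent.
time_reduction = (1 - mean(ryzen_1700x_ms) / mean(ryzen_1700_ms)) * 100

# Relative gain in the logged throughput figures, in percent.
throughput_gain = (114.95 / 82.33 - 1) * 100

print(f"times lower by {time_reduction:.1f}%, throughput higher by {throughput_gain:.1f}%")
```

Note that the two runs also differ in CPU clock (3.32 GHz versus 3.5 GHz), so the whole difference cannot be attributed to memory alone.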

Quote:
Originally Posted by LaurV View Post
Well, not exactly the same price range, but as a point of comparison: i7-6950X @ 3.00GHz (yes, underclocked, momentarily having problems with cooling; April is Thai summer, the hottest period of the year, ~45°C outside), with a single worker running on 8 cores (of 10), at the required FFT size, Prime95 64-bit version 28.10:

<snip>
Timing FFTs using 8 threads on 8 physical CPUs.
<snip>
Best time for 8192K FFT length: 7.136 ms., avg: 7.291 ms.
<snip>
From my Ryzen 7 result:
Code:
Timing FFTs using 8 threads on 8 cores.
<snip>
Best time for 8192K FFT length: 9.107 ms., avg: 9.352 ms.
Normalized to 3GHz this would be 10.625 ms, or 67% of the speed of your result. That's the penalty for half the AVX and cache throughput and half the memory channels.
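
The normalization above can be reproduced with a short sketch. It assumes iteration time scales inversely with core clock, which is a simplification (it ignores memory-bound effects), and `normalize_time` is a hypothetical helper name.

```python
def normalize_time(time_ms: float, actual_ghz: float, target_ghz: float) -> float:
    """Scale a timing to another clock, assuming time is proportional to 1/clock."""
    return time_ms * actual_ghz / target_ghz

# 9.107 ms at about 3.5 GHz (Ryzen 7 1700X) scaled to 3.0 GHz,
# then compared with 7.136 ms on the i7-6950X at 3.0 GHz.
ryzen_at_3ghz = normalize_time(9.107, 3.5, 3.0)
relative_speed = 7.136 / ryzen_at_3ghz

print(f"{ryzen_at_3ghz:.3f} ms, {relative_speed:.0%} of the i7-6950X speed")
```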

Last fiddled with by Dresdenboy on 2017-05-28 at 21:44
Old 2017-06-20, 22:26   #768
storm5510
Random Account
 
Aug 2009

13·151 Posts

Below are the timings for my new build. The hardware specs are as follows:

Intel i7-7700, 3.6 GHz, Turbo 4.2 GHz.
RAM: Kingston Hyper Fury DDR-4 2400 2*4GB
Main Board: Asus Prime B250M-A
CPU Cooler: Arctic i11 (Recommended by Mark Rose.)
Prime95 v29.1 Build 14


Quote:
[Tue Jun 20 18:01:25 2017]
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
CPU speed: 4078.95 MHz, 4 hyperthreaded cores
CPU features: Prefetchw, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 32 KB
L2 cache size: 256 KB, L3 cache size: 8 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Machine topology as determined by hwloc library:
Machine#0 (total=6990040KB, Backend=Windows, hwlocVersion=1.11.6, ProcessName=prime95.exe)
  NUMANode#0 (local=6990040KB, total=6990040KB)
    Package#0 (CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=158, CPUModel="Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz", CPUStepping=9)
      L3 (size=8192KB, linesize=64, ways=16, Inclusive=1)
        L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
          L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
            Core (cpuset: 0x00000003)
              PU#0 (cpuset: 0x00000001)
              PU#1 (cpuset: 0x00000002)
        L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
          L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
            Core (cpuset: 0x0000000c)
              PU#2 (cpuset: 0x00000004)
              PU#3 (cpuset: 0x00000008)
        L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
          L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
            Core (cpuset: 0x00000030)
              PU#4 (cpuset: 0x00000010)
              PU#5 (cpuset: 0x00000020)
        L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
          L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
            Core (cpuset: 0x000000c0)
              PU#6 (cpuset: 0x00000040)
              PU#7 (cpuset: 0x00000080)
Prime95 64-bit version 29.1, RdtscTiming=1
Timings for 2048K FFT length (4 cpus, 1 worker): 2.43 ms. Throughput: 412.20 iter/sec.
Timings for 2048K FFT length (4 cpus, 4 workers): 10.87, 10.83, 10.88, 10.84 ms. Throughput: 368.51 iter/sec.
Timings for 2048K FFT length (4 cpus hyperthreaded, 1 worker): 2.64 ms. Throughput: 379.40 iter/sec.
Timings for 2048K FFT length (4 cpus hyperthreaded, 4 workers): 11.82, 11.53, 11.55, 11.65 ms. Throughput: 343.76 iter/sec.
Timings for 2560K FFT length (4 cpus, 1 worker): 3.16 ms. Throughput: 316.92 iter/sec.
Timings for 2560K FFT length (4 cpus, 4 workers): 13.89, 13.85, 13.92, 13.88 ms. Throughput: 288.05 iter/sec.
Timings for 2560K FFT length (4 cpus hyperthreaded, 1 worker): 3.38 ms. Throughput: 295.71 iter/sec.
Timings for 2560K FFT length (4 cpus hyperthreaded, 4 workers): 14.69, 14.55, 14.54, 14.65 ms. Throughput: 273.83 iter/sec.
Timings for 3072K FFT length (4 cpus, 1 worker): 3.88 ms. Throughput: 257.76 iter/sec.
Timings for 3072K FFT length (4 cpus, 4 workers): 16.83, 16.78, 16.81, 16.90 ms. Throughput: 237.67 iter/sec.
Timings for 3072K FFT length (4 cpus hyperthreaded, 1 worker): 4.20 ms. Throughput: 238.23 iter/sec.
Timings for 3072K FFT length (4 cpus hyperthreaded, 4 workers): 17.86, 17.48, 17.81, 17.42 ms. Throughput: 226.76 iter/sec.
Timings for 3584K FFT length (4 cpus, 1 worker): 4.65 ms. Throughput: 214.84 iter/sec.
Timings for 3584K FFT length (4 cpus, 4 workers): 19.77, 19.75, 19.85, 19.80 ms. Throughput: 202.10 iter/sec.
Timings for 3584K FFT length (4 cpus hyperthreaded, 1 worker): 5.01 ms. Throughput: 199.58 iter/sec.
Timings for 3584K FFT length (4 cpus hyperthreaded, 4 workers): 20.65, 20.55, 20.56, 20.64 ms. Throughput: 194.19 iter/sec.
Timings for 4096K FFT length (4 cpus, 1 worker): 5.32 ms. Throughput: 188.08 iter/sec.
Timings for 4096K FFT length (4 cpus, 4 workers): 22.75, 22.28, 22.42, 22.61 ms. Throughput: 177.67 iter/sec.
[Tue Jun 20 18:06:28 2017]
Timings for 4096K FFT length (4 cpus hyperthreaded, 1 worker): 5.81 ms. Throughput: 172.10 iter/sec.
Timings for 4096K FFT length (4 cpus hyperthreaded, 4 workers): 23.71, 23.68, 23.61, 23.63 ms. Throughput: 169.10 iter/sec.
Timings for 5120K FFT length (4 cpus, 1 worker): 6.79 ms. Throughput: 147.20 iter/sec.
Timings for 5120K FFT length (4 cpus, 4 workers): 28.19, 27.94, 28.18, 28.03 ms. Throughput: 142.44 iter/sec.
Timings for 5120K FFT length (4 cpus hyperthreaded, 1 worker): 7.37 ms. Throughput: 135.77 iter/sec.
Timings for 5120K FFT length (4 cpus hyperthreaded, 4 workers): 29.78, 29.50, 29.67, 29.47 ms. Throughput: 135.12 iter/sec.
Timings for 6144K FFT length (4 cpus, 1 worker): 8.58 ms. Throughput: 116.61 iter/sec.
Timings for 6144K FFT length (4 cpus, 4 workers): 35.04, 35.72, 35.79, 35.68 ms. Throughput: 112.50 iter/sec.
Timings for 6144K FFT length (4 cpus hyperthreaded, 1 worker): 9.65 ms. Throughput: 103.64 iter/sec.
Timings for 6144K FFT length (4 cpus hyperthreaded, 4 workers): 38.22, 35.56, 40.69, 39.95 ms. Throughput: 103.89 iter/sec.
Timings for 7168K FFT length (4 cpus, 1 worker): 10.03 ms. Throughput: 99.73 iter/sec.
Timings for 7168K FFT length (4 cpus, 4 workers): 40.22, 39.93, 40.05, 40.20 ms. Throughput: 99.75 iter/sec.
Timings for 7168K FFT length (4 cpus hyperthreaded, 1 worker): 11.24 ms. Throughput: 88.94 iter/sec.
Timings for 7168K FFT length (4 cpus hyperthreaded, 4 workers): 45.00, 44.19, 44.71, 44.65 ms. Throughput: 89.62 iter/sec.
Timings for 8192K FFT length (4 cpus, 1 worker): 11.71 ms. Throughput: 85.42 iter/sec.
Timings for 8192K FFT length (4 cpus, 4 workers): 46.02, 45.88, 45.80, 45.96 ms. Throughput: 87.11 iter/sec.
Timings for 8192K FFT length (4 cpus hyperthreaded, 1 worker): 13.02 ms. Throughput: 76.83 iter/sec.
Timings for 8192K FFT length (4 cpus hyperthreaded, 4 workers): 52.32, 51.69, 52.79, 51.94 ms. Throughput: 76.65 iter/sec.
Old 2017-06-21, 00:25   #769
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

101101110010₂ Posts

Quote:
Originally Posted by storm5510 View Post
Below are the timings for my new build. The hardware specs are as follows:

Intel i7-7700, 3.6 GHz, Turbo 4.2 GHz.
RAM: Kingston Hyper Fury DDR-4 2400 2*4GB
Main Board: Asus Prime B250M-A
CPU Cooler: Arctic i11 (Recommended by Mark Rose.)
Prime95 v29.1 Build 14
Timings look good!

If you want to save a little power, you can go into your BIOS and tweak the 4-core Turbo speed and lower it to 3.6 GHz. You should still get almost the same throughput. That way if you're running mprime the power usage and heat will be lower, but when running anything that doesn't use all four cores it will get the full turbo speed.
Old 2017-06-21, 02:54   #770
storm5510
Random Account
 
Aug 2009

11110101011₂ Posts

Quote:
Originally Posted by Mark Rose View Post
Timings look good!

If you want to save a little power, you can go into your BIOS and tweak the 4-core Turbo speed and lower it to 3.6 GHz. You should still get almost the same throughput. That way if you're running mprime the power usage and heat will be lower, but when running anything that doesn't use all four cores it will get the full turbo speed.
I had a battle on my hands for a while. It was pulling 350 watts into the PSU. CPU cores running above 90°C. It was blowing hot air like a furnace. This was with Prime95 only!

I've made several trips back into the BIOS since. I loaded the default settings and everything started running much cooler. The power pull now is 142 watts and core temps are running in the low to mid 60's. Again, with Prime95 only, running a P-1. I don't know what settings the BIOS originally had. The first time I powered up, I just sat here and looked at it for a while. There is a lot in there.

The CPU is hovering between 3.96 and 4.02 GHz. However, and this is something I will probably need to post in the Prime95 area: it's only using four of the eight threads available, and will not allow me to use more than one worker. Curious!
Old 2017-06-21, 04:31   #771
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

2×5×293 Posts

Quote:
Originally Posted by storm5510 View Post
The CPU is hovering between 3.96 and 4.02 GHz. However, and this is something I will probably need to post in the Prime95 area: it's only using four of the eight threads available, and will not allow me to use more than one worker. Curious!
That's working as expected. The CPU can't hit the top turbo speed when using all four cores. Also, Prime95 is so efficiently coded that using hyperthreads (which is basically two threads taking turns on the core) doesn't speed things up.

With regard to only one worker, I believe it will force that if you're working on very large exponents.
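
The hyperthreading point can be checked against the benchmark lines storm5510 posted earlier in the thread. A minimal sketch: the regular expression is inferred only from the line format visible in this thread, not from any documented Prime95 format, and `parse_timing_line` is a hypothetical helper.

```python
import re

# Pattern inferred from the "Timings for ..." lines posted in this thread.
LINE_RE = re.compile(
    r"Timings for (\d+)K FFT length "
    r"\((\d+) cpus?( hyperthreaded)?, (\d+) workers?\):"
    r".*?Throughput: ([\d.]+) iter/sec"
)

def parse_timing_line(line):
    """Return (fft_k, cpus, hyperthreaded, workers, throughput) or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    fft_k, cpus, ht, workers, thr = m.groups()
    return int(fft_k), int(cpus), ht is not None, int(workers), float(thr)

# Two 4096K lines copied from storm5510's i7-7700 benchmark.
lines = [
    "Timings for 4096K FFT length (4 cpus, 1 worker): 5.32 ms. Throughput: 188.08 iter/sec.",
    "Timings for 4096K FFT length (4 cpus hyperthreaded, 1 worker): 5.81 ms. Throughput: 172.10 iter/sec.",
]
plain, with_ht = (parse_timing_line(l) for l in lines)

# Percentage change in throughput when hyperthreads are used.
ht_change = (with_ht[4] / plain[4] - 1) * 100
print(f"Hyperthreading changes 4096K throughput by {ht_change:.1f}%")
```

On these two lines the hyperthreaded run comes out slower, not faster, which matches the explanation above.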
Old 2017-11-12, 13:41   #772
bayanne
 
"Tony Gott"
Aug 2002
Yell, Shetland, UK

14C₁₆ Posts

Intel(R) Core(TM) i7-4771 CPU @ 3.50GHz
CPU speed: 3491.94 MHz, 4 hyperthreaded cores
CPU features: Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 32 KB
L2 cache size: 256 KB, L3 cache size: 8 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Attached file: results.txt (13.4 KB)
Old 2017-11-27, 03:34   #773
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

10011110101110₂ Posts

Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
CPU speed: 4034.80 MHz, 4 hyperthreaded cores
CPU features: Prefetchw, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 32 KB
L2 cache size: 256 KB, L3 cache size: 8 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
[Actual CPU speed 4300MHz]
Attached file: bench results.txt (23.4 KB)
Old 2017-12-31, 23:03   #774
obiwantoby
 
Dec 2017

110 Posts

Quote:
Originally Posted by James Heinrich View Post
Almost useful to me, except you only posted timing for 2 cores, not the single-thread test I need for benchmarks.
Is there a standardized prime.txt? I'll run my machine through it.
Old 2018-01-07, 20:15   #775
charliedill
 
Jan 2018

2 Posts

Quote:
Originally Posted by James Heinrich View Post
Almost useful to me, except you only posted timing for 2 cores, not the single-thread test I need for benchmarks.
Hello.

As a brand shiny new noob, should I run the benchmarks non-HT and single core for best and most useful results?

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.