mersenneforum.org  

Old 2017-03-11, 20:56   #738
thedigitalone
 
Mar 2017
PNW

1 Posts
Default AMD Ryzen 7 1800X Eight-Core Benchmark

AMD Ryzen 7 1800X Eight-Core Processor
CPU speed: 3447.35 MHz, 16 cores
CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 32 KB
L2 cache size: 512 KB, L3 cache size: 16 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 64
L2 TLBS: 1536
AMD Ryzen 7 1800X Eight-Core Processor
CPU speed: 3816.00 MHz, 16 cores
CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 32 KB
L2 cache size: 512 KB, L3 cache size: 16 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 64
L2 TLBS: 1536
Prime95 64-bit version 28.10, RdtscTiming=1
AMD Ryzen 7 1800X Eight-Core Processor
CPU speed: 3545.99 MHz, 16 cores
CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 32 KB
L2 cache size: 512 KB, L3 cache size: 16 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 64
L2 TLBS: 1536
AMD Ryzen 7 1800X Eight-Core Processor
CPU speed: 3816.00 MHz, 16 cores
CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 32 KB
L2 cache size: 512 KB, L3 cache size: 16 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 64
L2 TLBS: 1536
Prime95 64-bit version 28.10, RdtscTiming=1
AMD Ryzen 7 1800X Eight-Core Processor
CPU speed: 3592.12 MHz, 16 cores
CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 32 KB
L2 cache size: 512 KB, L3 cache size: 16 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 64
L2 TLBS: 1536
Prime95 64-bit version 28.10, RdtscTiming=1
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
AMD Ryzen 7 1800X Eight-Core Processor
CPU speed: 3422.28 MHz, 16 cores
CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 32 KB
L2 cache size: 512 KB, L3 cache size: 16 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 64
L2 TLBS: 1536
Prime95 64-bit version 28.10, RdtscTiming=1
Best time for 1024K FFT length: 13.807 ms., avg: 14.273 ms.
Best time for 1280K FFT length: 18.159 ms., avg: 18.537 ms.
Best time for 1536K FFT length: 22.682 ms., avg: 23.107 ms.
Best time for 1792K FFT length: 26.061 ms., avg: 26.566 ms.
Best time for 2048K FFT length: 29.822 ms., avg: 30.188 ms.
Best time for 2560K FFT length: 37.698 ms., avg: 39.824 ms.
Best time for 3072K FFT length: 45.986 ms., avg: 46.468 ms.
Best time for 3584K FFT length: 53.277 ms., avg: 53.670 ms.
Best time for 4096K FFT length: 61.075 ms., avg: 61.465 ms.
Best time for 5120K FFT length: 79.219 ms., avg: 79.853 ms.
Best time for 6144K FFT length: 97.915 ms., avg: 98.519 ms.
Best time for 7168K FFT length: 114.295 ms., avg: 115.312 ms.
Best time for 8192K FFT length: 126.941 ms., avg: 127.963 ms.
Timing FFTs using 2 threads.
Best time for 1024K FFT length: 7.076 ms., avg: 7.169 ms.
Best time for 1280K FFT length: 9.170 ms., avg: 9.299 ms.
Best time for 1536K FFT length: 11.285 ms., avg: 11.411 ms.
Best time for 1792K FFT length: 13.462 ms., avg: 13.579 ms.
Best time for 2048K FFT length: 14.929 ms., avg: 15.151 ms.
Best time for 2560K FFT length: 18.810 ms., avg: 19.041 ms.
Best time for 3072K FFT length: 23.086 ms., avg: 23.249 ms.
Best time for 3584K FFT length: 27.528 ms., avg: 27.643 ms.
Best time for 4096K FFT length: 30.559 ms., avg: 30.740 ms.
Best time for 5120K FFT length: 39.988 ms., avg: 40.245 ms.
Best time for 6144K FFT length: 48.979 ms., avg: 49.186 ms.
Best time for 7168K FFT length: 58.807 ms., avg: 59.122 ms.
Best time for 8192K FFT length: 63.872 ms., avg: 64.181 ms.
Timing FFTs using 3 threads.
Best time for 1024K FFT length: 4.809 ms., avg: 4.861 ms.
Best time for 1280K FFT length: 6.261 ms., avg: 6.294 ms.
Best time for 1536K FFT length: 7.697 ms., avg: 7.875 ms.
Best time for 1792K FFT length: 9.046 ms., avg: 9.085 ms.
Best time for 2048K FFT length: 10.120 ms., avg: 10.228 ms.
Best time for 2560K FFT length: 12.718 ms., avg: 13.066 ms.
Best time for 3072K FFT length: 15.739 ms., avg: 15.773 ms.
Best time for 3584K FFT length: 18.524 ms., avg: 18.633 ms.
Best time for 4096K FFT length: 20.753 ms., avg: 20.859 ms.
Best time for 5120K FFT length: 27.089 ms., avg: 27.224 ms.
Best time for 6144K FFT length: 33.261 ms., avg: 33.396 ms.
Best time for 7168K FFT length: 39.614 ms., avg: 39.760 ms.
Best time for 8192K FFT length: 43.169 ms., avg: 43.302 ms.
Timing FFTs using 4 threads.
Best time for 1024K FFT length: 3.602 ms., avg: 3.636 ms.
Best time for 1280K FFT length: 4.677 ms., avg: 4.888 ms.
Best time for 1536K FFT length: 5.749 ms., avg: 5.790 ms.
Best time for 1792K FFT length: 6.857 ms., avg: 7.013 ms.
Best time for 2048K FFT length: 7.604 ms., avg: 7.692 ms.
Best time for 2560K FFT length: 9.624 ms., avg: 9.911 ms.
Best time for 3072K FFT length: 11.773 ms., avg: 11.844 ms.
Best time for 3584K FFT length: 14.034 ms., avg: 14.151 ms.
Best time for 4096K FFT length: 15.621 ms., avg: 15.658 ms.
Best time for 5120K FFT length: 20.389 ms., avg: 20.476 ms.
Best time for 6144K FFT length: 25.047 ms., avg: 25.197 ms.
Best time for 7168K FFT length: 30.019 ms., avg: 30.175 ms.
Best time for 8192K FFT length: 32.537 ms., avg: 32.675 ms.
Timing FFTs using 5 threads.
Best time for 1024K FFT length: 2.925 ms., avg: 2.953 ms.
Best time for 1280K FFT length: 3.802 ms., avg: 3.868 ms.
Best time for 1536K FFT length: 4.691 ms., avg: 4.757 ms.
Best time for 1792K FFT length: 5.526 ms., avg: 5.581 ms.
Best time for 2048K FFT length: 6.164 ms., avg: 6.211 ms.
Best time for 2560K FFT length: 7.778 ms., avg: 7.811 ms.
Best time for 3072K FFT length: 9.525 ms., avg: 10.635 ms.
Best time for 3584K FFT length: 11.330 ms., avg: 11.412 ms.
Best time for 4096K FFT length: 12.643 ms., avg: 12.736 ms.
Best time for 5120K FFT length: 16.489 ms., avg: 17.282 ms.
Best time for 6144K FFT length: 20.238 ms., avg: 20.950 ms.
Best time for 7168K FFT length: 24.345 ms., avg: 24.513 ms.
Best time for 8192K FFT length: 26.286 ms., avg: 26.744 ms.
Timing FFTs using 6 threads.
Best time for 1024K FFT length: 2.470 ms., avg: 2.858 ms.
Best time for 1280K FFT length: 3.207 ms., avg: 3.360 ms.
Best time for 1536K FFT length: 3.912 ms., avg: 3.999 ms.
Best time for 1792K FFT length: 4.741 ms., avg: 5.378 ms.
Best time for 2048K FFT length: 5.243 ms., avg: 5.828 ms.
Best time for 2560K FFT length: 6.570 ms., avg: 7.218 ms.
Best time for 3072K FFT length: 8.085 ms., avg: 8.629 ms.
Best time for 3584K FFT length: 9.679 ms., avg: 11.073 ms.
Best time for 4096K FFT length: 10.725 ms., avg: 11.134 ms.
Best time for 5120K FFT length: 13.989 ms., avg: 14.172 ms.
Best time for 6144K FFT length: 17.201 ms., avg: 17.386 ms.
Best time for 7168K FFT length: 21.077 ms., avg: 21.710 ms.
Best time for 8192K FFT length: 22.249 ms., avg: 22.662 ms.
Timing FFTs using 7 threads.
Best time for 1024K FFT length: 2.160 ms., avg: 2.268 ms.
Best time for 1280K FFT length: 3.178 ms., avg: 4.154 ms.
Best time for 1536K FFT length: 3.406 ms., avg: 3.569 ms.
Best time for 1792K FFT length: 4.156 ms., avg: 4.415 ms.
Best time for 2048K FFT length: 4.602 ms., avg: 4.904 ms.
Best time for 2560K FFT length: 5.741 ms., avg: 6.230 ms.
Best time for 3072K FFT length: 7.109 ms., avg: 7.462 ms.
Old 2017-03-23, 05:47   #739
Mark Rose
 
 
"/X\(‘-‘)/X\"
Jan 2013

101101110001₂ Posts
Default

Clear Linux gives a small boost in mprime throughput.

I've recently been playing around with Intel's Clear Linux distribution. It's compiled with optimizations and built specifically for Intel's latest processors. Given that mprime's LL code is mostly hand-tuned assembly, I wasn't expecting a performance difference compared to Ubuntu 16.04, but there is one.

I'm running my cluster of i5-6600s at 3.3 GHz, as the dual-rank, dual-channel DDR3-2133 makes it not worth the watts to run the CPUs any faster.

That being said, Clear Linux at 3.3 GHz is up to 3% faster than Ubuntu at 3.6 GHz.

I've updated my benchmark spreadsheet.

My guess is the difference comes down to different kernels and fewer background tasks running.
Old 2017-04-05, 14:39   #740
Mark Rose
 
 
"/X\(‘-‘)/X\"
Jan 2013

29·101 Posts
Default

I finally got around to experimenting with undervolting. So far I've lowered VCore by 0.10 volts and passed a 7-hour stress test. The result? I saved another 12.5 watts per node, so my 4-node cluster is now consuming only 270 watts at the wall, or 243 watts from the nodes (at 3.3 GHz on all cores). With a 4096K FFT, the 4 cores of each node take 5.37 ms/iter, for 2.76 iter/sec/watt at the wall, or 3.06 iter/sec/watt from the nodes.

Compare that to the GTX 1080 Ti, which consumes 180 watts from the card to get 2.63 ms/iter, for 2.12 iter/sec/watt.

I wasn't expecting CPUs to be 44% more efficient.
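
For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes the 5.37 ms/iter figure applies to each of the 4 nodes and that efficiency is simply total iterations per second divided by watts; the function and variable names are illustrative only. The results land within rounding of the figures quoted above.

```python
# Figures are the ones quoted in the post; names are illustrative.

def iters_per_sec_per_watt(ms_per_iter, watts, units=1):
    """Efficiency of `units` identical devices that together draw `watts`."""
    iters_per_sec = units * 1000.0 / ms_per_iter
    return iters_per_sec / watts

# 4 nodes, each finishing one 4096K FFT iteration in 5.37 ms
cluster_wall = iters_per_sec_per_watt(5.37, 270, units=4)
cluster_nodes = iters_per_sec_per_watt(5.37, 243, units=4)

# One GTX 1080 Ti at 2.63 ms/iter, 180 W measured at the card
gpu_card = iters_per_sec_per_watt(2.63, 180)

# Relative advantage of the CPU nodes over the card, in percent
advantage_pct = (cluster_nodes / gpu_card - 1.0) * 100.0

print(f"cluster, at the wall:    {cluster_wall:.2f} iter/sec/watt")
print(f"cluster, from the nodes: {cluster_nodes:.2f} iter/sec/watt")
print(f"GTX 1080 Ti, at the card: {gpu_card:.2f} iter/sec/watt")
print(f"CPU advantage: {advantage_pct:.0f}%")
```

Note that the 44% figure compares node power with card power; comparing wall power for the cluster against card power would give a smaller margin.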

I'm going to try lowering VCore more soon. I might have to add more nodes to this power supply.
Old 2017-04-10, 06:53   #741
ET_
Banned
 
 
"Luigi"
Aug 2002
Team Italia

1001011001011₂ Posts
Default Intel(R) Celeron(R) CPU N2840 @ 2.16GHz

Code:
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Celeron(R) CPU  N2840  @ 2.16GHz
CPU speed: 2557.70 MHz, 2 cores
CPU features: Prefetchw, SSE, SSE2, SSE4
L1 cache size: 24 KB
L2 cache size: 1 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 128
Machine topology as determined by hwloc library:
 Machine#0 (total=2492796KB, Backend=Windows, hwlocVersion=1.11.6, ProcessName=prime95.exe)
  NUMANode#0 (local=2492796KB, total=2492796KB)
    Package#0 (CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=55, CPUModel="Intel(R) Celeron(R) CPU  N2840  @ 2.16GHz", CPUStepping=8)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=24KB, linesize=64, ways=6, Inclusive=0)
          Core (cpuset: 0x00000001)
            PU#0 (cpuset: 0x00000001)
        L1d (size=24KB, linesize=64, ways=6, Inclusive=0)
          Core (cpuset: 0x00000002)
            PU#1 (cpuset: 0x00000002)
Prime95 64-bit version 29.1, RdtscTiming=1
Timing FFTs using 2 cores.
Best time for 1024K FFT length: 22.048 ms., avg: 23.196 ms.
Best time for 1280K FFT length: 29.125 ms., avg: 30.133 ms.
Best time for 1536K FFT length: 35.795 ms., avg: 36.288 ms.
Best time for 1792K FFT length: 45.152 ms., avg: 46.324 ms.
Best time for 2048K FFT length: 47.919 ms., avg: 49.040 ms.
Best time for 2560K FFT length: 60.895 ms., avg: 64.173 ms.
Best time for 3072K FFT length: 77.295 ms., avg: 80.964 ms.
Best time for 3584K FFT length: 97.452 ms., avg: 98.772 ms.
Best time for 4096K FFT length: 117.728 ms., avg: 118.825 ms.
Best time for 5120K FFT length: 144.734 ms., avg: 148.510 ms.
Best time for 6144K FFT length: 186.521 ms., avg: 188.309 ms.
Best time for 7168K FFT length: 282.553 ms., avg: 284.238 ms.
Best time for 8192K FFT length: 302.990 ms., avg: 306.707 ms.
Old 2017-04-10, 07:08   #742
James Heinrich
 
 
"James Heinrich"
May 2004
ex-Northern Ontario

3,319 Posts
Default

Quote:
Originally Posted by ET_
Intel(R) Celeron(R) CPU N2840 @ 2.16GHz
Timing FFTs using 2 cores.
Almost useful to me, except you only posted timings for 2 cores, not the single-thread test I need for benchmarks.
Old 2017-04-10, 09:14   #743
ET_
Banned
 
 
"Luigi"
Aug 2002
Team Italia

1001011001011₂ Posts
Default

Quote:
Originally Posted by James Heinrich
Almost useful to me, except you only posted timings for 2 cores, not the single-thread test I need for benchmarks.
I used the option FFT timings benchmark.

Here are the results for the option Throughput benchmark...

Code:
[Mon Apr 10 10:53:20 2017]
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Celeron(R) CPU  N2840  @ 2.16GHz
CPU speed: 2557.77 MHz, 2 cores
CPU features: Prefetchw, SSE, SSE2, SSE4
L1 cache size: 24 KB
L2 cache size: 1 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 128
Machine topology as determined by hwloc library:
 Machine#0 (total=2170248KB, Backend=Windows, hwlocVersion=1.11.6, ProcessName=prime95.exe)
  NUMANode#0 (local=2170248KB, total=2170248KB)
    Package#0 (CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=55, CPUModel="Intel(R) Celeron(R) CPU  N2840  @ 2.16GHz", CPUStepping=8)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=24KB, linesize=64, ways=6, Inclusive=0)
          Core (cpuset: 0x00000001)
            PU#0 (cpuset: 0x00000001)
        L1d (size=24KB, linesize=64, ways=6, Inclusive=0)
          Core (cpuset: 0x00000002)
            PU#1 (cpuset: 0x00000002)
Prime95 64-bit version 29.1, RdtscTiming=1
Timings for 1024K FFT length (2 cpus, 1 worker): 23.38 ms.  Throughput: 42.76 iter/sec.
Timings for 1024K FFT length (2 cpus, 2 workers): 46.76, 46.03 ms.  Throughput: 43.11 iter/sec.
Timings for 1280K FFT length (2 cpus, 1 worker): 30.89 ms.  Throughput: 32.37 iter/sec.
Timings for 1280K FFT length (2 cpus, 2 workers): 61.83, 60.54 ms.  Throughput: 32.69 iter/sec.
Timings for 1536K FFT length (2 cpus, 1 worker): 37.43 ms.  Throughput: 26.72 iter/sec.
Timings for 1536K FFT length (2 cpus, 2 workers): 76.86, 74.73 ms.  Throughput: 26.39 iter/sec.
Timings for 1792K FFT length (2 cpus, 1 worker): 48.25 ms.  Throughput: 20.73 iter/sec.
Timings for 1792K FFT length (2 cpus, 2 workers): 97.16, 91.82 ms.  Throughput: 21.18 iter/sec.
Timings for 2048K FFT length (2 cpus, 1 worker): 51.60 ms.  Throughput: 19.38 iter/sec.
Timings for 2048K FFT length (2 cpus, 2 workers): 103.15, 99.23 ms.  Throughput: 19.77 iter/sec.
Timings for 2560K FFT length (2 cpus, 1 worker): 64.17 ms.  Throughput: 15.58 iter/sec.
Timings for 2560K FFT length (2 cpus, 2 workers): 128.12, 124.46 ms.  Throughput: 15.84 iter/sec.
Timings for 3072K FFT length (2 cpus, 1 worker): 80.92 ms.  Throughput: 12.36 iter/sec.
Timings for 3072K FFT length (2 cpus, 2 workers): 217.55, 216.71 ms.  Throughput:  9.21 iter/sec.
Timings for 3584K FFT length (2 cpus, 1 worker): 114.14 ms.  Throughput:  8.76 iter/sec.
Timings for 3584K FFT length (2 cpus, 2 workers): 322.22, 260.20 ms.  Throughput:  6.95 iter/sec.
[Mon Apr 10 10:58:35 2017]
Timings for 4096K FFT length (2 cpus, 1 worker): 152.85 ms.  Throughput:  6.54 iter/sec.
Timings for 4096K FFT length (2 cpus, 2 workers): 343.11, 248.39 ms.  Throughput:  6.94 iter/sec.
Timings for 5120K FFT length (2 cpus, 1 worker): 209.21 ms.  Throughput:  4.78 iter/sec.
Timings for 5120K FFT length (2 cpus, 2 workers): 474.79, 399.13 ms.  Throughput:  4.61 iter/sec.
Timings for 6144K FFT length (2 cpus, 1 worker): 240.27 ms.  Throughput:  4.16 iter/sec.
Timings for 6144K FFT length (2 cpus, 2 workers): 694.67, 595.62 ms.  Throughput:  3.12 iter/sec.
Timings for 7168K FFT length (2 cpus, 1 worker): 805.39 ms.  Throughput:  1.24 iter/sec.
Timings for 7168K FFT length (2 cpus, 2 workers): 1108.76, 926.34 ms.  Throughput:  1.98 iter/sec.
Timings for 8192K FFT length (2 cpus, 1 worker): 1045.18 ms.  Throughput:  0.96 iter/sec.
Timings for 8192K FFT length (2 cpus, 2 workers): 661.82, 562.99 ms.  Throughput:  3.29 iter/sec.
And here are the results for the option trial factoring benchmark (repeated because of a strange value in the 76-bit section: I suspect a bad timing interaction between the factoring threads and the thread responsible for writing data to disk; the same thing happened in the previous throughput benchmark while the timestamp was being written).

Code:
[Mon Apr 10 11:05:55 2017]
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Celeron(R) CPU  N2840  @ 2.16GHz
CPU speed: 2558.10 MHz, 2 cores
CPU features: Prefetchw, SSE, SSE2, SSE4
L1 cache size: 24 KB
L2 cache size: 1 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 128
Machine topology as determined by hwloc library:
 Machine#0 (total=2170248KB, Backend=Windows, hwlocVersion=1.11.6, ProcessName=prime95.exe)
  NUMANode#0 (local=2170248KB, total=2170248KB)
    Package#0 (CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=55, CPUModel="Intel(R) Celeron(R) CPU  N2840  @ 2.16GHz", CPUStepping=8)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=24KB, linesize=64, ways=6, Inclusive=0)
          Core (cpuset: 0x00000001)
            PU#0 (cpuset: 0x00000001)
        L1d (size=24KB, linesize=64, ways=6, Inclusive=0)
          Core (cpuset: 0x00000002)
            PU#1 (cpuset: 0x00000002)
Prime95 64-bit version 29.1, RdtscTiming=1
Best time for 61 bit trial factors: 7.731 ms.
Best time for 62 bit trial factors: 20.862 ms.
Best time for 63 bit trial factors: 15.234 ms.
Best time for 64 bit trial factors: 17.497 ms.
Best time for 65 bit trial factors: 19.764 ms.
Best time for 66 bit trial factors: 19.450 ms.
Best time for 67 bit trial factors: 54.953 ms.
Best time for 75 bit trial factors: 78.660 ms.
Best time for 76 bit trial factors: 1.207 ms.
Best time for 77 bit trial factors: 22.409 ms.
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Celeron(R) CPU  N2840  @ 2.16GHz
CPU speed: 2557.81 MHz, 2 cores
CPU features: Prefetchw, SSE, SSE2, SSE4
L1 cache size: 24 KB
L2 cache size: 1 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 128
Machine topology as determined by hwloc library:
 Machine#0 (total=2170248KB, Backend=Windows, hwlocVersion=1.11.6, ProcessName=prime95.exe)
  NUMANode#0 (local=2170248KB, total=2170248KB)
    Package#0 (CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=55, CPUModel="Intel(R) Celeron(R) CPU  N2840  @ 2.16GHz", CPUStepping=8)
      L2 (size=1024KB, linesize=64, ways=16, Inclusive=0)
        L1d (size=24KB, linesize=64, ways=6, Inclusive=0)
          Core (cpuset: 0x00000001)
            PU#0 (cpuset: 0x00000001)
        L1d (size=24KB, linesize=64, ways=6, Inclusive=0)
          Core (cpuset: 0x00000002)
            PU#1 (cpuset: 0x00000002)
Prime95 64-bit version 29.1, RdtscTiming=1
Best time for 61 bit trial factors: 7.615 ms.
Best time for 62 bit trial factors: 7.840 ms.
Best time for 63 bit trial factors: 11.009 ms.
Best time for 64 bit trial factors: 13.912 ms.
Best time for 65 bit trial factors: 17.365 ms.
Best time for 66 bit trial factors: 18.360 ms.
Best time for 67 bit trial factors: 18.486 ms.
Best time for 75 bit trial factors: 46.705 ms.
Best time for 76 bit trial factors: 18.084 ms.
Best time for 77 bit trial factors: 19.320 ms.
Let me know if you need any more data

Luigi

Last fiddled with by ET_ on 2017-04-10 at 09:18
Old 2017-04-11, 08:21   #744
db597
 
 
Jan 2003

313₈ Posts
Default Ryzen 1700 benchmark results

I posted the below results from my Ryzen 1700 (non-X) in the AMD Zen speculation thread earlier. Just thought I'd consolidate the results together with all the other benchmarks in this thread and also add a bit more detail on the setup.

CPU: AMD Ryzen 1700 (non-X)
Frequency: 3.32GHz @ 1.031V (stock rating 3GHz / Turbo 3.7GHz)
Heatsink: AMD Wraith Spire
Memory: Corsair 8GBx2 @ 2933MHz CAS16 (single rank)
Motherboard: Asus X370-Pro
BIOS: 0604 (AGESA 1.0.0.4a)
Operating system: Windows 10 x64 Creators Update
Prime95 version: 29.1 Build 15

Code:
AMD Ryzen 7 1700 Eight-Core Processor
CPU speed: 3318.72 MHz, 8 hyperthreaded cores
CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 32 KB
L2 cache size: 512 KB, L3 cache size: 16 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 64
L2 TLBS: 1536
Prime95 64-bit version 29.1, RdtscTiming=1

I rearranged the benchmark results below for a bit easier reading / comparison:

Timings for 1024K FFT length (1 cpu, 1 worker): 7.83 ms. Throughput: 127.69 iter/sec.
Timings for 1280K FFT length (1 cpu, 1 worker): 9.88 ms. Throughput: 101.17 iter/sec.
Timings for 1536K FFT length (1 cpu, 1 worker): 11.97 ms. Throughput: 83.57 iter/sec.
Timings for 1792K FFT length (1 cpu, 1 worker): 14.58 ms. Throughput: 68.60 iter/sec.
Timings for 2048K FFT length (1 cpu, 1 worker): 16.05 ms. Throughput: 62.29 iter/sec.
Timings for 2560K FFT length (1 cpu, 1 worker): 20.60 ms. Throughput: 48.55 iter/sec.
Timings for 3072K FFT length (1 cpu, 1 worker): 24.87 ms. Throughput: 40.20 iter/sec.
Timings for 3584K FFT length (1 cpu, 1 worker): 29.90 ms. Throughput: 33.44 iter/sec.
Timings for 4096K FFT length (1 cpu, 1 worker): 34.18 ms. Throughput: 29.26 iter/sec.
Timings for 5120K FFT length (1 cpu, 1 worker): 42.60 ms. Throughput: 23.48 iter/sec.
Timings for 6144K FFT length (1 cpu, 1 worker): 50.67 ms. Throughput: 19.74 iter/sec.
Timings for 7168K FFT length (1 cpu, 1 worker): 60.12 ms. Throughput: 16.63 iter/sec.
Timings for 8192K FFT length (1 cpu, 1 worker): 68.76 ms. Throughput: 14.54 iter/sec.

Timings for 1024K FFT length (8 cpus, 1 worker): 1.13 ms. Throughput: 886.42 iter/sec.
Timings for 1280K FFT length (8 cpus, 1 worker): 1.42 ms. Throughput: 704.55 iter/sec.
Timings for 1536K FFT length (8 cpus, 1 worker): 1.71 ms. Throughput: 584.87 iter/sec.
Timings for 1792K FFT length (8 cpus, 1 worker): 2.10 ms. Throughput: 475.44 iter/sec.
Timings for 2048K FFT length (8 cpus, 1 worker): 2.39 ms. Throughput: 418.60 iter/sec.
Timings for 2560K FFT length (8 cpus, 1 worker): 3.96 ms. Throughput: 252.38 iter/sec.
Timings for 3072K FFT length (8 cpus, 1 worker): 4.97 ms. Throughput: 201.08 iter/sec.
Timings for 3584K FFT length (8 cpus, 1 worker): 5.97 ms. Throughput: 167.51 iter/sec.
Timings for 4096K FFT length (8 cpus, 1 worker): 6.92 ms. Throughput: 144.58 iter/sec.
Timings for 5120K FFT length (8 cpus, 1 worker): 7.32 ms. Throughput: 136.59 iter/sec.
Timings for 6144K FFT length (8 cpus, 1 worker): 9.37 ms. Throughput: 106.71 iter/sec.
Timings for 7168K FFT length (8 cpus, 1 worker): 10.96 ms. Throughput: 91.21 iter/sec.
Timings for 8192K FFT length (8 cpus, 1 worker): 12.69 ms. Throughput: 78.83 iter/sec.

Timings for 1024K FFT length (8 cpus, 8 workers): 11.30, 11.41, 11.28, 11.22, 11.18, 11.18, 11.21, 11.20 ms. Throughput: 711.26 iter/sec.
Timings for 1280K FFT length (8 cpus, 8 workers): 14.15, 14.51, 14.13, 14.15, 14.03, 14.05, 14.13, 14.16 ms. Throughput: 564.84 iter/sec.
Timings for 1536K FFT length (8 cpus, 8 workers): 16.81, 17.45, 16.96, 17.00, 16.84, 16.82, 16.91, 16.82 ms. Throughput: 472.01 iter/sec.
Timings for 1792K FFT length (8 cpus, 8 workers): 20.85, 21.81, 20.92, 21.12, 20.68, 20.92, 21.25, 20.77 ms. Throughput: 380.31 iter/sec.
Timings for 2048K FFT length (8 cpus, 8 workers): 22.60, 23.32, 22.76, 22.78, 22.54, 22.61, 22.61, 22.54 ms. Throughput: 352.17 iter/sec.
Timings for 2560K FFT length (8 cpus, 8 workers): 33.53, 34.97, 33.76, 34.34, 34.01, 33.93, 34.26, 33.98 ms. Throughput: 234.66 iter/sec.
Timings for 3072K FFT length (8 cpus, 8 workers): 41.23, 42.38, 41.51, 40.71, 40.84, 40.78, 40.87, 41.04 ms. Throughput: 194.34 iter/sec.
Timings for 3584K FFT length (8 cpus, 8 workers): 48.09, 49.43, 47.96, 48.77, 47.89, 47.32, 47.90, 47.23 ms. Throughput: 166.45 iter/sec.
Timings for 4096K FFT length (8 cpus, 8 workers): 56.27, 57.15, 55.09, 55.39, 55.64, 54.99, 54.88, 54.69 ms. Throughput: 144.14 iter/sec.
Timings for 5120K FFT length (8 cpus, 8 workers): 58.15, 60.30, 58.03, 57.82, 57.55, 57.00, 58.24, 57.01 ms. Throughput: 137.94 iter/sec.
Timings for 6144K FFT length (8 cpus, 8 workers): 70.59, 72.77, 71.30, 71.76, 70.77, 70.67, 70.83, 70.63 ms. Throughput: 112.43 iter/sec.
Timings for 7168K FFT length (8 cpus, 8 workers): 87.46, 87.18, 83.29, 83.81, 82.80, 83.61, 83.66, 83.11 ms. Throughput: 94.87 iter/sec.
Timings for 8192K FFT length (8 cpus, 8 workers): 99.83, 99.12, 96.13, 97.41, 96.20, 96.03, 96.76, 96.01 ms. Throughput: 82.33 iter/sec.
The 8192K FFT performance looks incredible on this version of Prime95, especially when all 8 cores are thrown at it. Would be good if someone can post results from a similarly priced Intel i7 7700K on Prime95 v29.1 Build 15 for comparison (I expect the i7 is a lot faster per core, but at the end of the day having double the cores may make it a rather close competition).
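
As a rough cross-check of the throughput column, the sketch below assumes the reported figure is the sum of each worker's iterations per second (1000 divided by its time in ms). That assumption is inferred from the numbers above, not taken from the Prime95 documentation, and the function name is made up for illustration. Small differences from the reported values are expected because the printed timings are rounded to two decimals.

```python
def throughput(timings_ms):
    """Total iter/sec when each worker needs the given ms per iteration."""
    return sum(1000.0 / t for t in timings_ms)

# Timings copied from the 8 cpus / 8 workers lines above
workers_1024k = [11.30, 11.41, 11.28, 11.22, 11.18, 11.18, 11.21, 11.20]
workers_8192k = [99.83, 99.12, 96.13, 97.41, 96.20, 96.03, 96.76, 96.01]

# Timings copied from the 8 cpus / 1 worker lines above
single_1024k = [1.13]
single_8192k = [12.69]

for label, one, many in [
    ("1024K", single_1024k, workers_1024k),
    ("8192K", single_8192k, workers_8192k),
]:
    print(f"{label}: 1 worker {throughput(one):.2f} iter/sec, "
          f"8 workers {throughput(many):.2f} iter/sec")
```

On these numbers a single worker across all 8 cores gives the higher total at 1024K, while 8 separate workers are slightly ahead at 8192K.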

Last fiddled with by db597 on 2017-04-11 at 08:25
Old 2017-04-12, 14:09   #745
LaurV
Romulan Interpreter
 
 
Jun 2011
Thailand

10010010010111₂ Posts
Default

Well, not exactly the same price range, but as a point of comparison: i7-6950X @ 3.00GHz (yes, underclocked; I'm temporarily having problems with cooling, as April is the Thai summer, the hottest period of the year, ~45°C outside), with a single worker running on 8 cores (out of 10), at the requested FFT size, Prime95 64-bit version 28.10:

<snip>
Timing FFTs using 8 threads on 8 physical CPUs.
<snip>
Best time for 8192K FFT length: 7.136 ms., avg: 7.291 ms.
<snip>

Last fiddled with by LaurV on 2017-04-12 at 14:10
Old 2017-04-12, 17:28   #746
db597
 
db597's Avatar
 
Jan 2003

7·29 Posts
Default

@LaurV... thanks for the comparison benchmark.

So for the case of both systems running on 8 physical cores, it's 7.136ms for the i7-6950X @ 3.0GHz vs 12.69ms for the Ryzen 1700 @ 3.3GHz. Looks like Intel wins big in terms of IPC.
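
To put a number on the per-clock gap, here is a small sketch that converts each 8192K time into clock cycles per iteration. It assumes both chips actually ran at the stated clocks for the whole benchmark, and it is only a per-clock throughput comparison on this workload: the two results come from different Prime95 versions (28.10 vs 29.1) and different memory setups, so it is not a strict IPC measurement. The function name is illustrative.

```python
def megacycles_per_iter(ms_per_iter, clock_ghz):
    """Clock cycles per iteration, in millions.

    ms * GHz = (1e-3 s) * (1e9 cycles/s) = 1e6 cycles.
    """
    return ms_per_iter * clock_ghz

# 8192K FFT, 8 physical cores, one worker (figures from the posts above)
i7_6950x = megacycles_per_iter(7.136, 3.0)
ryzen_1700 = megacycles_per_iter(12.69, 3.3)

# How many times more cycles the Ryzen spends per iteration
ratio = ryzen_1700 / i7_6950x

print(f"i7-6950X:   {i7_6950x:.1f} Mcycles/iter")
print(f"Ryzen 1700: {ryzen_1700:.1f} Mcycles/iter")
print(f"Ryzen needs {ratio:.2f}x the cycles per iteration")
```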

It would still be interesting to see the results from an i7-7700K (half the cores, but higher IPC and higher clock speed)... to compare at a similar cost level (a Ryzen 1700 system still being a bit cheaper than a comparable i7-7700K system).
Old 2017-04-13, 03:19   #747
kladner
 
 
"Kieren"
Jul 2011
In My Own Galaxy!

2×3×1,693 Posts
Default

I don't understand this reading of the CPU speed. It was, and is, running at 4.20GHz.
RAM is at 3200MHz.
Code:
[Wed Apr 12 22:03:33 2017]
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
CPU speed: 4008.14 MHz, 4 hyperthreaded cores
CPU features: Prefetchw, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 32 KB
L2 cache size: 256 KB, L3 cache size: 8 MB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
TLBS: 64
Machine topology as determined by hwloc library:
 Machine#0 (total=12649168KB, Backend=Windows, hwlocVersion=1.11.6, ProcessName=prime95.exe)
  NUMANode#0 (local=12649168KB, total=12649168KB)
    Package#0 (CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=94, CPUModel="Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz", CPUStepping=3)
      L3 (size=8192KB, linesize=64, ways=16, Inclusive=1)
        L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
          L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
            Core (cpuset: 0x00000003)
              PU#0 (cpuset: 0x00000001)
              PU#1 (cpuset: 0x00000002)
        L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
          L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
            Core (cpuset: 0x0000000c)
              PU#2 (cpuset: 0x00000004)
              PU#3 (cpuset: 0x00000008)
        L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
          L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
            Core (cpuset: 0x00000030)
              PU#4 (cpuset: 0x00000010)
              PU#5 (cpuset: 0x00000020)
        L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
          L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
            Core (cpuset: 0x000000c0)
              PU#6 (cpuset: 0x00000040)
              PU#7 (cpuset: 0x00000080)
Prime95 64-bit version 29.1, RdtscTiming=1
Timings for 1024K FFT length (1 cpu, 1 worker):  3.18 ms.  Throughput: 314.28 iter/sec.
Timings for 1024K FFT length (2 cpus, 1 worker):  1.67 ms.  Throughput: 599.56 iter/sec.
Timings for 1024K FFT length (3 cpus, 1 worker):  1.13 ms.  Throughput: 888.71 iter/sec.
Timings for 1024K FFT length (4 cpus, 1 worker):  0.86 ms.  Throughput: 1161.54 iter/sec.
Timings for 1280K FFT length (1 cpu, 1 worker):  4.04 ms.  Throughput: 247.48 iter/sec.
Timings for 1280K FFT length (2 cpus, 1 worker):  2.09 ms.  Throughput: 478.34 iter/sec.
Timings for 1280K FFT length (3 cpus, 1 worker):  1.44 ms.  Throughput: 695.49 iter/sec.
Timings for 1280K FFT length (4 cpus, 1 worker):  1.11 ms.  Throughput: 900.27 iter/sec.
Timings for 1536K FFT length (1 cpu, 1 worker):  4.89 ms.  Throughput: 204.35 iter/sec.
Timings for 1536K FFT length (2 cpus, 1 worker):  2.54 ms.  Throughput: 394.47 iter/sec.
Timings for 1536K FFT length (3 cpus, 1 worker):  1.73 ms.  Throughput: 579.18 iter/sec.
Timings for 1536K FFT length (4 cpus, 1 worker):  1.38 ms.  Throughput: 724.80 iter/sec.
Timings for 1792K FFT length (1 cpu, 1 worker):  6.14 ms.  Throughput: 162.89 iter/sec.
Timings for 1792K FFT length (2 cpus, 1 worker):  3.24 ms.  Throughput: 308.59 iter/sec.
Timings for 1792K FFT length (3 cpus, 1 worker):  2.17 ms.  Throughput: 461.04 iter/sec.
Timings for 1792K FFT length (4 cpus, 1 worker):  1.70 ms.  Throughput: 588.90 iter/sec.
Timings for 2048K FFT length (1 cpu, 1 worker):  6.52 ms.  Throughput: 153.46 iter/sec.
Timings for 2048K FFT length (2 cpus, 1 worker):  3.41 ms.  Throughput: 292.96 iter/sec.
Timings for 2048K FFT length (3 cpus, 1 worker):  2.36 ms.  Throughput: 423.56 iter/sec.
Timings for 2048K FFT length (4 cpus, 1 worker):  1.94 ms.  Throughput: 515.17 iter/sec.
Timings for 2560K FFT length (1 cpu, 1 worker):  8.59 ms.  Throughput: 116.35 iter/sec.
Timings for 2560K FFT length (2 cpus, 1 worker):  4.50 ms.  Throughput: 222.19 iter/sec.
Timings for 2560K FFT length (3 cpus, 1 worker):  3.05 ms.  Throughput: 327.92 iter/sec.
Timings for 2560K FFT length (4 cpus, 1 worker):  2.45 ms.  Throughput: 408.69 iter/sec.
Timings for 3072K FFT length (1 cpu, 1 worker): 10.24 ms.  Throughput: 97.65 iter/sec.
Timings for 3072K FFT length (2 cpus, 1 worker):  5.27 ms.  Throughput: 189.81 iter/sec.
Timings for 3072K FFT length (3 cpus, 1 worker):  3.62 ms.  Throughput: 276.07 iter/sec.
Timings for 3072K FFT length (4 cpus, 1 worker):  2.95 ms.  Throughput: 339.20 iter/sec.
[Wed Apr 12 22:08:44 2017]
Timings for 3584K FFT length (1 cpu, 1 worker): 12.36 ms.  Throughput: 80.90 iter/sec.
Timings for 3584K FFT length (2 cpus, 1 worker):  6.34 ms.  Throughput: 157.62 iter/sec.
Timings for 3584K FFT length (3 cpus, 1 worker):  4.33 ms.  Throughput: 230.80 iter/sec.
Timings for 3584K FFT length (4 cpus, 1 worker):  3.53 ms.  Throughput: 283.48 iter/sec.
Timings for 4096K FFT length (1 cpu, 1 worker): 14.18 ms.  Throughput: 70.50 iter/sec.
Timings for 4096K FFT length (2 cpus, 1 worker):  7.33 ms.  Throughput: 136.44 iter/sec.
Timings for 4096K FFT length (3 cpus, 1 worker):  5.01 ms.  Throughput: 199.63 iter/sec.
Timings for 4096K FFT length (4 cpus, 1 worker):  4.07 ms.  Throughput: 245.91 iter/sec.
Old 2017-04-13, 03:36   #748
James Heinrich
 
 
"James Heinrich"
May 2004
ex-Northern Ontario

3,319 Posts
Default

Quote:
Originally Posted by kladner
I don't understand this read on CPU speed. It was, and is running at 4.20GHz.
According to Intel, Processor Base Frequency is 4.0GHz, Max Turbo Frequency is 4.2GHz.
I would guess that Prime95 reads the processor frequency on startup before starting the actual benchmark, and turbo doesn't kick in until the CPU is under load. You could use your favourite monitoring utility (e.g. CPU-Z) to monitor CPU frequency in realtime and see how it changes as you start/run the benchmark.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.