mersenneforum.org  

Old 2020-03-25, 16:59   #793
kladner
 
 
"Kieren"
Jul 2011
In My Own Galaxy!

3·7²·67 Posts

Quote:
Originally Posted by Viliam Furik View Post
Could you do it on a range of FFTs (e.g. 2048 to 8192), and without testing all implementations?
It will take a little while, but under lockdown I have time on my hands.
EDIT: Here ya go.
Attached Files
File Type: txt CPU clock bench 02.txt (9.6 KB, 34 views)

Last fiddled with by kladner on 2020-03-25 at 17:49
Old 2020-03-25, 21:18   #794
Viliam Furik
 
Jul 2018
Martin, Slovakia

2×13 Posts

Quote:
Originally Posted by kladner View Post
It will take a little while, but under lockdown I have time on my hands.
EDIT: Here ya go.
Thank you. I've looked at the results, and it seems like there is little advantage from RAM speed with those speeds and that CPU. For some reason the 4000 comes out fastest at some FFT lengths, but the margin is small, so it may just be measurement error (background tasks, for example).

Is it possible that when the RAM is clocked faster than the CPU, the extra speed is not used the way it would be with a faster CPU? I ask because I have 3200 MHz RAM now and want to know whether it is worth buying 4000 MHz or 4400 MHz RAM, given that my CPU (Ryzen 9 3900X) is manually overclocked to 4.1 GHz.
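
For a rough sense of scale, the theoretical peak bandwidth of each kit can be estimated. This is only a sketch: it assumes dual-channel DDR4 with a 64-bit (8-byte) bus per channel and treats the advertised "MHz" rating as mega-transfers per second. None of that is stated in the thread, sustained bandwidth is lower in practice, and other parts of the platform may be the real limit, so read the numbers as upper bounds.

```python
# Theoretical peak DDR4 bandwidth (upper bound only).
# Assumptions (not from the thread): dual channel, 64-bit bus per channel,
# and the advertised "MHz" rating is really mega-transfers per second.
BYTES_PER_TRANSFER = 8   # 64-bit channel
CHANNELS = 2             # dual-channel desktop platform


def peak_bandwidth_gbs(mega_transfers: int) -> float:
    """Return the theoretical peak bandwidth in GB/s (10^9 bytes/s)."""
    return mega_transfers * 1e6 * BYTES_PER_TRANSFER * CHANNELS / 1e9


for rating in (3200, 4000, 4400):
    print(f"DDR4-{rating}: {peak_bandwidth_gbs(rating):.1f} GB/s")
```

Whether Prime95 can actually use the extra headroom is a separate question that only a benchmark on the machine itself can answer.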
Old 2020-03-25, 22:02   #795
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

2×11×311 Posts

Quote:
Originally Posted by Viliam Furik View Post
Thank you. I've looked at the results, and it seems like there is little advantage from RAM speed with those speeds and that CPU.
Note the massive L3 cache on that chip. It is not surprising that RAM speed is not a factor.
Old 2020-03-26, 01:51   #796
axn
 
 
Jun 2003

3·1,531 Posts

Quote:
Originally Posted by Prime95 View Post
Note the massive L3 cache on that chip. It is not surprising that RAM speed is not a factor.
L3 is 8MB on a 6700K, which is mediocre.
Old 2020-03-26, 02:25   #797
VBCurtis
 
 
"Curtis"
Feb 2005
Riverside, CA

53×79 Posts

I think he was referring to the Ryzen?
Edit: 70MB L3, if my search result is to be believed.

Last fiddled with by VBCurtis on 2020-03-26 at 02:26
Old 2020-03-26, 03:04   #798
Prime95
P90 years forever!
 
 
Aug 2002
Yeehaw, FL

2×11×311 Posts

Quote:
Originally Posted by axn View Post
L3 is 8MB on a 6700K, which is mediocre.
Sorry, confused the benchmark with JCoveiro's Ryzen.
Old 2020-03-26, 03:40   #799
axn
 
 
Jun 2003

1000111110001₂ Posts

Quote:
Originally Posted by Prime95 View Post
Sorry, confused the benchmark with JCoveiro's Ryzen.
Ok, makes sense. That still leaves the question of why mem speed is not affecting performance. I suppose, with just 4 cores, all of the tested RAM speeds are sufficient to feed the CPU.

Quote:
Originally Posted by VBCurtis View Post
I think he was referring to the Ryzen?
Edit: 70MB L3, if my search result is to be believed.
Ryzen L3 sizes are all powers of 2, but since the L3 is a victim cache, AMD's specs list L2+L3 as a single "cache" number. So if you see 70MB for a processor, that is 64MB of L3 + 6MB (12×512KB) of L2.
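
The arithmetic behind that combined figure is easy to check. The sketch below just adds the numbers from this post; the helper name is made up for illustration, and the second call uses the 16-core 3950X cache sizes that appear in a benchmark log later in the thread.

```python
# Check the "70MB" figure: 64 MB of L3 plus 12 cores x 512 KB of L2.
# The 12-core count and 512 KB per-core L2 come from the post above.
KB_PER_MB = 1024


def combined_cache_mb(l3_mb: int, cores: int, l2_kb_per_core: int) -> float:
    """Return L2 + L3 in MB, summed the way the post describes."""
    return l3_mb + cores * l2_kb_per_core / KB_PER_MB


print(combined_cache_mb(64, 12, 512))   # 12-core part from the post
print(combined_cache_mb(64, 16, 512))   # 16-core 3950X: 4x16 MB L3, 16x512 KB L2
```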
Old 2020-03-26, 16:15   #800
kladner
 
 
"Kieren"
Jul 2011
In My Own Galaxy!

9849₁₀ Posts

I have to note that the RAM speed was the same in all these runs. Only the CPU clock was different. I took the very similar results as an indication that this system is memory-bound.
EDIT: Also, these tests were not run under strict lab-like conditions. I did shut down obvious cycle-stealing apps like performance monitors and browsers. In line with normal operation on the machine I deliberately left mfaktc running on the GPU as part of the environment. Allowance has to be made for margin-of-error.
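
One way to see why a memory-bound result is plausible is to compare the FFT data size with the 8MB L3 mentioned earlier in the thread. This is a back-of-the-envelope sketch, not Prime95's actual memory layout: it assumes one 8-byte double per FFT element and ignores padding, twiddle-factor tables, and scratch space.

```python
# Rough FFT working-set estimate.
# Assumption (not stated in the thread): one 8-byte double per FFT element,
# ignoring padding, twiddle-factor tables, and other scratch data.
BYTES_PER_ELEMENT = 8


def fft_footprint_mb(fft_len_k: int) -> float:
    """Approximate data size in MB for an FFT length given in K (1024) elements."""
    return fft_len_k * 1024 * BYTES_PER_ELEMENT / (1024 * 1024)


L3_MB = 8  # 6700K L3 size quoted earlier in the thread
for fft_len_k in (2048, 4096, 8192):
    size = fft_footprint_mb(fft_len_k)
    print(f"{fft_len_k}K FFT: ~{size:.0f} MB, {size / L3_MB:.0f}x the L3")
```

Under those assumptions even the smallest FFT in the requested range is larger than the cache, so most of each iteration's data would have to come from RAM.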

Last fiddled with by kladner on 2020-03-26 at 16:21
Old 2020-03-30, 04:12   #801
Rodrigo
 
 
Jun 2010
Pennsylvania

1110011011₂ Posts

I didn't see any benchmarks here for the i5-7500, so here goes:

Code:
[Sun Mar 29 19:08:40 2020]
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
CPU speed: 3371.20 MHz, 4 cores
CPU features: Prefetchw, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 4x32 KB, L2 cache size: 4x256 KB, L3 cache size: 6 MB
L1 cache line size: 64 bytes, L2 cache line size: 64 bytes
Machine topology as determined by hwloc library:
 Machine#0 (total=28406668KB, Backend=Windows, hwlocVersion=2.0.4, ProcessName=prime95.exe)
  Package (total=28406668KB, CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=158, CPUModel="Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz", CPUStepping=9)
    L3 (size=6144KB, linesize=64, ways=12, Inclusive=1)
      L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000001)
            PU#0 (cpuset: 0x00000001)
      L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000002)
            PU#1 (cpuset: 0x00000002)
      L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000004)
            PU#2 (cpuset: 0x00000004)
      L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000008)
            PU#3 (cpuset: 0x00000008)
Prime95 64-bit version 29.8, RdtscTiming=1
Timings for 2048K FFT length (4 cores, 1 worker):  2.53 ms.  Throughput: 395.74 iter/sec.
Timings for 2048K FFT length (4 cores, 4 workers): 10.69, 10.59, 10.39, 10.16 ms.  Throughput: 382.59 iter/sec.
Timings for 2304K FFT length (4 cores, 1 worker):  2.89 ms.  Throughput: 346.50 iter/sec.
Timings for 2304K FFT length (4 cores, 4 workers): 11.58, 11.32, 11.11, 11.06 ms.  Throughput: 355.15 iter/sec.
Timings for 2400K FFT length (4 cores, 1 worker):  3.00 ms.  Throughput: 333.51 iter/sec.
Timings for 2400K FFT length (4 cores, 4 workers): 12.26, 11.89, 11.78, 11.95 ms.  Throughput: 334.20 iter/sec.
Timings for 2560K FFT length (4 cores, 1 worker):  3.08 ms.  Throughput: 324.68 iter/sec.
Timings for 2560K FFT length (4 cores, 4 workers): 13.10, 12.72, 12.66, 12.57 ms.  Throughput: 313.49 iter/sec.
Timings for 2688K FFT length (4 cores, 1 worker):  3.39 ms.  Throughput: 295.12 iter/sec.
Timings for 2688K FFT length (4 cores, 4 workers): 13.30, 13.26, 13.18, 13.13 ms.  Throughput: 302.63 iter/sec.
Timings for 2880K FFT length (4 cores, 1 worker):  3.57 ms.  Throughput: 280.17 iter/sec.
Timings for 2880K FFT length (4 cores, 4 workers): 15.30, 14.57, 14.32, 14.28 ms.  Throughput: 273.86 iter/sec.
Timings for 3072K FFT length (4 cores, 1 worker):  3.74 ms.  Throughput: 267.17 iter/sec.
Timings for 3072K FFT length (4 cores, 4 workers): 15.53, 15.41, 15.03, 14.97 ms.  Throughput: 262.61 iter/sec.
Timings for 3200K FFT length (4 cores, 1 worker):  4.25 ms.  Throughput: 235.31 iter/sec.
Timings for 3200K FFT length (4 cores, 4 workers): 16.82, 16.34, 16.11, 16.10 ms.  Throughput: 244.87 iter/sec.
Timings for 3360K FFT length (4 cores, 1 worker):  4.45 ms.  Throughput: 224.61 iter/sec.
Timings for 3360K FFT length (4 cores, 4 workers): 17.95, 17.58, 17.29, 17.20 ms.  Throughput: 228.55 iter/sec.
[Sun Mar 29 19:13:44 2020]
Timings for 3456K FFT length (4 cores, 1 worker):  4.34 ms.  Throughput: 230.16 iter/sec.
Timings for 3456K FFT length (4 cores, 4 workers): 17.76, 17.24, 17.24, 17.07 ms.  Throughput: 230.91 iter/sec.
Timings for 3584K FFT length (4 cores, 1 worker):  4.70 ms.  Throughput: 212.98 iter/sec.
Timings for 3584K FFT length (4 cores, 4 workers): 18.46, 17.98, 17.77, 17.95 ms.  Throughput: 221.75 iter/sec.
Timings for 3840K FFT length (4 cores, 1 worker):  4.86 ms.  Throughput: 205.63 iter/sec.
Timings for 3840K FFT length (4 cores, 4 workers): 20.57, 20.09, 19.75, 19.80 ms.  Throughput: 199.52 iter/sec.
Timings for 4096K FFT length (4 cores, 1 worker):  5.18 ms.  Throughput: 192.93 iter/sec.
Timings for 4096K FFT length (4 cores, 4 workers): 21.19, 21.17, 20.45, 20.43 ms.  Throughput: 192.28 iter/sec.
Timings for 4480K FFT length (4 cores, 1 worker):  5.90 ms.  Throughput: 169.36 iter/sec.
Timings for 4480K FFT length (4 cores, 4 workers): 23.51, 22.80, 22.65, 22.76 ms.  Throughput: 174.50 iter/sec.
Timings for 4608K FFT length (4 cores, 1 worker):  5.78 ms.  Throughput: 173.03 iter/sec.
Timings for 4608K FFT length (4 cores, 4 workers): 24.61, 24.00, 23.96, 23.51 ms.  Throughput: 166.57 iter/sec.
Timings for 4800K FFT length (4 cores, 1 worker):  6.16 ms.  Throughput: 162.33 iter/sec.
Timings for 4800K FFT length (4 cores, 4 workers): 24.56, 24.78, 24.18, 24.02 ms.  Throughput: 164.06 iter/sec.
Timings for 5120K FFT length (4 cores, 1 worker):  6.81 ms.  Throughput: 146.90 iter/sec.
Timings for 5120K FFT length (4 cores, 4 workers): 27.33, 26.39, 26.15, 25.96 ms.  Throughput: 151.23 iter/sec.
Timings for 5376K FFT length (4 cores, 1 worker):  7.06 ms.  Throughput: 141.55 iter/sec.
Timings for 5376K FFT length (4 cores, 4 workers): 29.35, 28.62, 28.16, 28.20 ms.  Throughput: 139.97 iter/sec.
Timings for 5760K FFT length (4 cores, 1 worker):  7.67 ms.  Throughput: 130.29 iter/sec.
[Sun Mar 29 19:18:56 2020]
Timings for 5760K FFT length (4 cores, 4 workers): 31.20, 30.60, 30.13, 30.14 ms.  Throughput: 131.10 iter/sec.
Timings for 6144K FFT length (4 cores, 1 worker):  8.59 ms.  Throughput: 116.37 iter/sec.
Timings for 6144K FFT length (4 cores, 4 workers): 33.18, 33.27, 32.70, 32.32 ms.  Throughput: 121.72 iter/sec.
Timings for 6400K FFT length (4 cores, 1 worker):  8.38 ms.  Throughput: 119.28 iter/sec.
Timings for 6400K FFT length (4 cores, 4 workers): 34.66, 34.26, 33.99, 33.72 ms.  Throughput: 117.11 iter/sec.
Timings for 6720K FFT length (4 cores, 1 worker):  9.01 ms.  Throughput: 110.97 iter/sec.
Timings for 6720K FFT length (4 cores, 4 workers): 36.54, 36.32, 35.39, 35.57 ms.  Throughput: 111.27 iter/sec.
Timings for 6912K FFT length (4 cores, 1 worker):  9.47 ms.  Throughput: 105.60 iter/sec.
Timings for 6912K FFT length (4 cores, 4 workers): 37.08, 36.96, 36.90, 36.56 ms.  Throughput: 108.48 iter/sec.
Timings for 7168K FFT length (4 cores, 1 worker):  9.49 ms.  Throughput: 105.35 iter/sec.
Timings for 7168K FFT length (4 cores, 4 workers): 39.29, 39.24, 37.98, 38.03 ms.  Throughput: 103.57 iter/sec.
Timings for 7680K FFT length (4 cores, 1 worker): 10.01 ms.  Throughput: 99.90 iter/sec.
Timings for 7680K FFT length (4 cores, 4 workers): 42.06, 40.07, 39.64, 39.48 ms.  Throughput: 99.29 iter/sec.
Timings for 8064K FFT length (4 cores, 1 worker): 11.01 ms.  Throughput: 90.86 iter/sec.
Timings for 8064K FFT length (4 cores, 4 workers): 44.00, 44.86, 42.86, 42.76 ms.  Throughput: 91.73 iter/sec.
Timings for 8192K FFT length (4 cores, 1 worker): 11.02 ms.  Throughput: 90.78 iter/sec.
Timings for 8192K FFT length (4 cores, 4 workers): 45.76, 46.28, 43.97, 44.50 ms.  Throughput: 88.68 iter/sec.
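
For anyone comparing logs like this one, the throughput column appears to be the sum of each worker's iterations per second (1000 divided by its per-iteration time in ms). The sketch below checks that reading against two lines copied from the log above. The regular expression and function names are illustrative only, not anything Prime95 provides.

```python
import re

# Two lines copied from the log above.
LOG = """\
Timings for 2048K FFT length (4 cores, 1 worker):  2.53 ms.  Throughput: 395.74 iter/sec.
Timings for 2048K FFT length (4 cores, 4 workers): 10.69, 10.59, 10.39, 10.16 ms.  Throughput: 382.59 iter/sec.
"""

# Hypothetical pattern written for this log format only.
LINE_RE = re.compile(
    r"Timings for (\d+)K FFT length \((\d+) cores?, (\d+) workers?\):\s*"
    r"([\d., ]+) ms\.\s+Throughput: ([\d.]+) iter/sec\."
)


def parse(log: str):
    """Yield (fft_len_k, workers, per_worker_ms, reported_throughput)."""
    for match in LINE_RE.finditer(log):
        fft_k, _cores, workers, times, reported = match.groups()
        ms = [float(t) for t in times.split(",")]
        yield int(fft_k), int(workers), ms, float(reported)


def throughput(per_worker_ms):
    """Sum of each worker's iterations per second."""
    return sum(1000.0 / t for t in per_worker_ms)


rows = list(parse(LOG))
for fft_k, workers, ms, reported in rows:
    print(f"{fft_k}K, {workers} worker(s): "
          f"computed {throughput(ms):.2f}, reported {reported:.2f}")
```

The computed and reported values should differ only by the rounding of the printed timings, which makes it easy to compare 1-worker and 4-worker runs on the same scale.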
Old 2020-05-01, 01:34   #802
tjddbwls
 
Apr 2020

1 Posts

This is from my first computer build in 10 years.

Corsair Carbide 275R
AMD Ryzen 9 3950X (stock, PBO at auto)
ASRock X570 Taichi
Noctua NH-D15
2x8GB Corsair Vengeance LPX 3600
PNY NVIDIA Quadro K1200
512GB Samsung 970 Pro NVMe SSD

Do my results seem... a little slow?

Throughput-Test:
Code:
AMD Ryzen 9 3950X 16-Core Processor            
CPU speed: 4274.04 MHz, 16 hyperthreaded cores
CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 16x32 KB, L2 cache size: 16x512 KB, L3 cache size: 4x16 MB
L1 cache line size: 64 bytes, L2 cache line size: 64 bytes
Machine topology as determined by hwloc library:
 Machine#0 (total=14009328KB, Backend=Windows, hwlocVersion=2.0.4, ProcessName=prime95.exe)
  Package (total=14009328KB, CPUVendor=AuthenticAMD, CPUFamilyNumber=23, CPUModelNumber=113, CPUModel="AMD Ryzen 9 3950X 16-Core Processor            ", CPUStepping=0)
    L3 (size=16384KB, linesize=64, ways=16, Inclusive=0)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000003)
            PU#0 (cpuset: 0x00000001)
            PU#1 (cpuset: 0x00000002)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x0000000c)
            PU#2 (cpuset: 0x00000004)
            PU#3 (cpuset: 0x00000008)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000030)
            PU#4 (cpuset: 0x00000010)
            PU#5 (cpuset: 0x00000020)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x000000c0)
            PU#6 (cpuset: 0x00000040)
            PU#7 (cpuset: 0x00000080)
    L3 (size=16384KB, linesize=64, ways=16, Inclusive=0)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000300)
            PU#8 (cpuset: 0x00000100)
            PU#9 (cpuset: 0x00000200)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000c00)
            PU#10 (cpuset: 0x00000400)
            PU#11 (cpuset: 0x00000800)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00003000)
            PU#12 (cpuset: 0x00001000)
            PU#13 (cpuset: 0x00002000)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x0000c000)
            PU#14 (cpuset: 0x00004000)
            PU#15 (cpuset: 0x00008000)
    L3 (size=16384KB, linesize=64, ways=16, Inclusive=0)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00030000)
            PU#16 (cpuset: 0x00010000)
            PU#17 (cpuset: 0x00020000)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x000c0000)
            PU#18 (cpuset: 0x00040000)
            PU#19 (cpuset: 0x00080000)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00300000)
            PU#20 (cpuset: 0x00100000)
            PU#21 (cpuset: 0x00200000)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00c00000)
            PU#22 (cpuset: 0x00400000)
            PU#23 (cpuset: 0x00800000)
    L3 (size=16384KB, linesize=64, ways=16, Inclusive=0)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x03000000)
            PU#24 (cpuset: 0x01000000)
            PU#25 (cpuset: 0x02000000)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x0c000000)
            PU#26 (cpuset: 0x04000000)
            PU#27 (cpuset: 0x08000000)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x30000000)
            PU#28 (cpuset: 0x10000000)
            PU#29 (cpuset: 0x20000000)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0xc0000000)
            PU#30 (cpuset: 0x40000000)
            PU#31 (cpuset: 0x80000000)
Prime95 64-bit version 29.8, RdtscTiming=1
Timings for 2048K FFT length (16 cores, 1 worker):  1.29 ms.  Throughput: 777.85 iter/sec.
Timings for 2048K FFT length (16 cores, 2 workers):  1.23,  1.23 ms.  Throughput: 1621.36 iter/sec.
Timings for 2240K FFT length (16 cores, 1 worker):  1.18 ms.  Throughput: 847.91 iter/sec.
Timings for 2240K FFT length (16 cores, 2 workers):  1.53,  1.54 ms.  Throughput: 1302.56 iter/sec.
Timings for 2304K FFT length (16 cores, 1 worker):  1.21 ms.  Throughput: 828.81 iter/sec.
Timings for 2304K FFT length (16 cores, 2 workers):  1.64,  1.64 ms.  Throughput: 1222.46 iter/sec.
Timings for 2400K FFT length (16 cores, 1 worker):  1.25 ms.  Throughput: 800.44 iter/sec.
Timings for 2400K FFT length (16 cores, 2 workers):  1.74,  1.66 ms.  Throughput: 1177.02 iter/sec.
Timings for 2560K FFT length (16 cores, 1 worker):  1.33 ms.  Throughput: 752.11 iter/sec.
Timings for 2560K FFT length (16 cores, 2 workers):  1.87,  1.87 ms.  Throughput: 1071.72 iter/sec.
Timings for 2688K FFT length (16 cores, 1 worker):  1.39 ms.  Throughput: 718.46 iter/sec.
Timings for 2688K FFT length (16 cores, 2 workers):  2.03,  2.01 ms.  Throughput: 989.08 iter/sec.
Timings for 2800K FFT length (16 cores, 1 worker):  1.59 ms.  Throughput: 628.76 iter/sec.
Timings for 2800K FFT length (16 cores, 2 workers):  2.24,  2.59 ms.  Throughput: 832.53 iter/sec.
Timings for 2880K FFT length (16 cores, 1 worker):  1.49 ms.  Throughput: 672.83 iter/sec.
Timings for 2880K FFT length (16 cores, 2 workers):  2.58,  2.60 ms.  Throughput: 773.22 iter/sec.
Timings for 3072K FFT length (16 cores, 1 worker):  1.47 ms.  Throughput: 682.13 iter/sec.
Timings for 3072K FFT length (16 cores, 2 workers):  2.49,  2.48 ms.  Throughput: 804.90 iter/sec.
Timings for 3200K FFT length (16 cores, 1 worker):  1.67 ms.  Throughput: 599.22 iter/sec.
Timings for 3200K FFT length (16 cores, 2 workers):  2.97,  2.90 ms.  Throughput: 681.67 iter/sec.
Timings for 3360K FFT length (16 cores, 1 worker):  1.71 ms.  Throughput: 583.36 iter/sec.
Timings for 3360K FFT length (16 cores, 2 workers):  3.62,  3.43 ms.  Throughput: 568.04 iter/sec.
Timings for 3584K FFT length (16 cores, 1 worker):  1.70 ms.  Throughput: 588.05 iter/sec.
Timings for 3584K FFT length (16 cores, 2 workers):  3.98,  3.78 ms.  Throughput: 515.97 iter/sec.
Timings for 3840K FFT length (16 cores, 1 worker):  1.96 ms.  Throughput: 509.35 iter/sec.
Timings for 3840K FFT length (16 cores, 2 workers):  5.25,  4.95 ms.  Throughput: 392.56 iter/sec.
Timings for 4096K FFT length (16 cores, 1 worker):  1.96 ms.  Throughput: 511.07 iter/sec.
Timings for 4096K FFT length (16 cores, 2 workers):  6.40,  6.07 ms.  Throughput: 320.83 iter/sec.
Timings for 4480K FFT length (16 cores, 1 worker):  2.03 ms.  Throughput: 491.95 iter/sec.
Timings for 4480K FFT length (16 cores, 2 workers):  8.61,  8.39 ms.  Throughput: 235.38 iter/sec.
Timings for 4608K FFT length (16 cores, 1 worker):  2.40 ms.  Throughput: 417.09 iter/sec.
Timings for 4608K FFT length (16 cores, 2 workers):  8.90,  8.42 ms.  Throughput: 231.20 iter/sec.
Timings for 4800K FFT length (16 cores, 1 worker):  2.52 ms.  Throughput: 397.12 iter/sec.
Timings for 4800K FFT length (16 cores, 2 workers):  9.87,  9.45 ms.  Throughput: 207.23 iter/sec.
Timings for 5120K FFT length (16 cores, 1 worker):  2.78 ms.  Throughput: 360.14 iter/sec.
Timings for 5120K FFT length (16 cores, 2 workers): 11.35, 10.97 ms.  Throughput: 179.23 iter/sec.
Timings for 5376K FFT length (16 cores, 1 worker):  3.09 ms.  Throughput: 323.57 iter/sec.
Timings for 5376K FFT length (16 cores, 2 workers): 12.87, 12.43 ms.  Throughput: 158.21 iter/sec.
Timings for 5600K FFT length (16 cores, 1 worker):  2.84 ms.  Throughput: 351.59 iter/sec.
Timings for 5600K FFT length (16 cores, 2 workers): 14.04, 13.75 ms.  Throughput: 143.92 iter/sec.
Timings for 5760K FFT length (16 cores, 1 worker):  3.13 ms.  Throughput: 319.21 iter/sec.
Timings for 5760K FFT length (16 cores, 2 workers): 15.20, 14.78 ms.  Throughput: 133.49 iter/sec.
Timings for 6144K FFT length (16 cores, 1 worker):  3.73 ms.  Throughput: 268.08 iter/sec.
Timings for 6144K FFT length (16 cores, 2 workers): 15.93, 15.54 ms.  Throughput: 127.15 iter/sec.
Timings for 6400K FFT length (16 cores, 1 worker):  3.93 ms.  Throughput: 254.14 iter/sec.
Timings for 6400K FFT length (16 cores, 2 workers): 17.56, 17.01 ms.  Throughput: 115.74 iter/sec.
Timings for 6720K FFT length (16 cores, 1 worker):  4.36 ms.  Throughput: 229.36 iter/sec.
Timings for 6720K FFT length (16 cores, 2 workers): 18.74, 18.20 ms.  Throughput: 108.29 iter/sec.
Timings for 7168K FFT length (16 cores, 1 worker):  5.18 ms.  Throughput: 193.11 iter/sec.
Timings for 7168K FFT length (16 cores, 2 workers): 20.08, 19.65 ms.  Throughput: 100.69 iter/sec.
Timings for 7680K FFT length (16 cores, 1 worker):  6.66 ms.  Throughput: 150.18 iter/sec.
Timings for 7680K FFT length (16 cores, 2 workers): 22.89, 22.23 ms.  Throughput: 88.68 iter/sec.
Timings for 8000K FFT length (16 cores, 1 worker):  6.70 ms.  Throughput: 149.21 iter/sec.
Timings for 8000K FFT length (16 cores, 2 workers): 23.89, 23.05 ms.  Throughput: 85.26 iter/sec.
Timings for 8064K FFT length (16 cores, 1 worker):  6.98 ms.  Throughput: 143.28 iter/sec.
Timings for 8064K FFT length (16 cores, 2 workers): 24.07, 23.54 ms.  Throughput: 84.02 iter/sec.
Timings for 8192K FFT length (16 cores, 1 worker):  7.22 ms.  Throughput: 138.45 iter/sec.
Timings for 8192K FFT length (16 cores, 2 workers): 23.60, 23.62 ms.  Throughput: 84.72 iter/sec.
FFT-Timings:
Code:
AMD Ryzen 9 3950X 16-Core Processor            
CPU speed: 4297.50 MHz, 16 hyperthreaded cores
Prime95 64-bit version 29.8, RdtscTiming=1
Timing FFTs using 16 threads on 16 cores.
Best time for 2048K FFT length: 1.152 ms., avg: 1.176 ms.
Best time for 2240K FFT length: 1.193 ms., avg: 1.223 ms.
Best time for 2304K FFT length: 1.196 ms., avg: 1.215 ms.
Best time for 2400K FFT length: 1.274 ms., avg: 1.288 ms.
Best time for 2560K FFT length: 1.357 ms., avg: 1.389 ms.
Best time for 2688K FFT length: 1.392 ms., avg: 1.409 ms.
Best time for 2800K FFT length: 1.578 ms., avg: 1.600 ms.
Best time for 2880K FFT length: 1.488 ms., avg: 1.510 ms.
Best time for 3072K FFT length: 1.466 ms., avg: 1.480 ms.
Best time for 3200K FFT length: 1.640 ms., avg: 1.664 ms.
Best time for 3360K FFT length: 1.720 ms., avg: 1.737 ms.
Best time for 3584K FFT length: 1.704 ms., avg: 1.734 ms.
Best time for 3840K FFT length: 1.913 ms., avg: 1.942 ms.
Best time for 4096K FFT length: 1.915 ms., avg: 1.959 ms.
Best time for 4480K FFT length: 1.954 ms., avg: 1.992 ms.
Best time for 4608K FFT length: 2.290 ms., avg: 2.330 ms.
Best time for 4800K FFT length: 2.382 ms., avg: 2.479 ms.
Best time for 5120K FFT length: 2.639 ms., avg: 2.734 ms.
Best time for 5376K FFT length: 2.856 ms., avg: 3.057 ms.
Best time for 5600K FFT length: 2.747 ms., avg: 2.885 ms.
Best time for 5760K FFT length: 2.965 ms., avg: 3.059 ms.
Best time for 6144K FFT length: 3.477 ms., avg: 3.746 ms.
Best time for 6400K FFT length: 3.589 ms., avg: 3.834 ms.
Best time for 6720K FFT length: 4.012 ms., avg: 4.177 ms.
Best time for 7168K FFT length: 4.820 ms., avg: 5.011 ms.
Best time for 7680K FFT length: 6.110 ms., avg: 6.286 ms.
Best time for 8000K FFT length: 6.301 ms., avg: 6.461 ms.
Best time for 8064K FFT length: 6.630 ms., avg: 6.760 ms.
Best time for 8192K FFT length: 6.941 ms., avg: 7.102 ms.
TF benchmark:
Code:
AMD Ryzen 9 3950X 16-Core Processor            
CPU speed: 4278.18 MHz, 16 hyperthreaded cores
Prime95 64-bit version 29.8, RdtscTiming=1
Best time for 61 bit trial factors: 0.714 ms.
Best time for 62 bit trial factors: 0.730 ms.
Best time for 63 bit trial factors: 0.717 ms.
Best time for 64 bit trial factors: 0.719 ms.
Best time for 65 bit trial factors: 0.711 ms.
Best time for 66 bit trial factors: 0.698 ms.
Best time for 67 bit trial factors: 0.695 ms.
Best time for 75 bit trial factors: 0.692 ms.
Best time for 76 bit trial factors: 0.697 ms.
Best time for 77 bit trial factors: 0.697 ms.
Old 2020-05-01, 22:38   #803
Mark Rose
 
 
"/X\(‘-‘)/X\"
Jan 2013
Ͳօɾօղէօ

2×7×199 Posts

Quote:
Originally Posted by tjddbwls View Post
Do my results seem... a little slow?
From about 3360K upward your 1 worker throughput beats your 2 worker throughput, and by 4096K it is far ahead. I'd run only a single worker after seeing your results.

The higher core count Ryzen and Threadripper parts are very memory bandwidth starved for Prime95 (same with Intel parts). You might get a little more performance with dual rank memory, or four sticks of memory.

I'd be curious to see how 1 worker with 6, 7, and 8 cores performs. The sweet spot for the available memory bandwidth is probably around 7 or 8 cores.
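
To make the 1 worker versus 2 workers comparison concrete, here is a small sketch using a handful of rows copied from the 3950X throughput test above. The function name is made up for illustration. Note that in that log 2 workers actually lead at the smaller FFT lengths, and 1 worker only takes over further up.

```python
# (fft_len_k, one_worker_iter_per_sec, two_worker_iter_per_sec),
# a subset of rows copied from the 3950X throughput log above.
ROWS = [
    (2048, 777.85, 1621.36),
    (3200, 599.22, 681.67),
    (3360, 583.36, 568.04),
    (4096, 511.07, 320.83),
    (8192, 138.45, 84.72),
]


def first_crossover(rows):
    """Return the first FFT length where 1 worker beats 2 workers, or None."""
    for fft_k, one, two in rows:
        if one > two:
            return fft_k
    return None


for fft_k, one, two in ROWS:
    print(f"{fft_k}K: 1 worker / 2 workers = {one / two:.2f}")
print("1 worker pulls ahead from:", first_crossover(ROWS), "K")
```

Which setting is better therefore depends on the FFT length of the work actually assigned, so it is worth checking the rows nearest to that length.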

All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.