mersenneforum.org  

2020-08-28, 16:05   #815
kruoli
"Oliver"
Sep 2017
Porta Westfalica, DE

467 Posts

With 30.3b3 (and also with version 29), I get really bad multi-core scaling on an Intel i5-9500:

Code:
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Core(TM) i5-9500 CPU @ 3.00GHz
CPU speed: 4073.68 MHz, 6 cores
CPU features: Prefetchw, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 6x32 KB, L2 cache size: 6x256 KB, L3 cache size: 9 MB
L1 cache line size: 64 bytes, L2 cache line size: 64 bytes
Machine topology as determined by hwloc library:
 Machine#0 (total=3138356KB, Backend=Windows, hwlocVersion=2.2.0, ProcessName=prime95.exe)
  Package (total=3138356KB, CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=158, CPUModel="Intel(R) Core(TM) i5-9500 CPU @ 3.00GHz", CPUStepping=10)
    L3 (size=9216KB, linesize=64, ways=12, Inclusive=1)
      L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000001)
            PU#0 (cpuset: 0x00000001)
      L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000002)
            PU#1 (cpuset: 0x00000002)
      L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000004)
            PU#2 (cpuset: 0x00000004)
      L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000008)
            PU#3 (cpuset: 0x00000008)
      L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000010)
            PU#4 (cpuset: 0x00000010)
      L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000020)
            PU#5 (cpuset: 0x00000020)
Prime95 64-bit version 30.3, RdtscTiming=1
Timings for 3072K FFT length (1 core, 1 worker): 10.92 ms.  Throughput: 91.61 iter/sec.
Timings for 3072K FFT length (2 cores, 1 worker):  7.80 ms.  Throughput: 128.14 iter/sec.
Timings for 3072K FFT length (3 cores, 1 worker):  7.61 ms.  Throughput: 131.42 iter/sec.
Timings for 3072K FFT length (6 cores, 1 worker):  7.98 ms.  Throughput: 125.37 iter/sec.
[snip of the prelude data again...]
Timings for 6144K FFT length (1 core, 1 worker): 22.32 ms.  Throughput: 44.80 iter/sec.
Timings for 6144K FFT length (2 cores, 1 worker): 15.69 ms.  Throughput: 63.74 iter/sec.
Timings for 6144K FFT length (3 cores, 1 worker): 15.86 ms.  Throughput: 63.04 iter/sec.
Timings for 6144K FFT length (6 cores, 1 worker): 17.18 ms.  Throughput: 58.20 iter/sec.
How can it be that it does not gain anything from more than two cores? Temperatures are okay, stress testing for more than 24 hours was flawless.
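To put numbers on the scaling, here is a minimal Python sketch that computes speedup and per-core efficiency from the 3072K timings above. The helper name `scaling` is only illustrative; it is not part of Prime95.

```python
# Throughput (iter/sec) for the 3072K FFT with 1 worker,
# copied from the i5-9500 benchmark above; keys are core counts.
throughput_3072k = {1: 91.61, 2: 128.14, 3: 131.42, 6: 125.37}


def scaling(throughput_by_cores):
    """Return {cores: (speedup, efficiency)} relative to the 1-core run."""
    base = throughput_by_cores[1]
    return {
        cores: (tp / base, tp / base / cores)
        for cores, tp in throughput_by_cores.items()
    }


for cores, (speedup, efficiency) in scaling(throughput_3072k).items():
    print(f"{cores} cores: {speedup:.2f}x speedup, {efficiency:.0%} per-core efficiency")
```

Two cores give roughly a 1.4x speedup, and adding more cores beyond that does not raise throughput in these runs, which would be consistent with something other than the cores being the limit.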

Last fiddled with by kruoli on 2020-08-28 at 16:06 Reason: Missing letter.
2020-08-28, 16:41   #816
Prime95
P90 years forever!
Aug 2002
Yeehaw, FL

11×673 Posts

Quote:
Originally Posted by kruoli View Post
How can it be that it does not gain anything from more than two cores? Temperatures are okay, stress testing for more than 24 hours was flawless.
Just one RAM slot filled?
2020-08-29, 08:49   #817
kruoli
"Oliver"
Sep 2017
Porta Westfalica, DE

111010011₂ Posts

Yes. So is it a severe memory bottleneck? It's a work machine we are preparing for delivery, and we always run some tests beforehand. I'll bring that up at work, since our software (which will run on that machine) is also vectorization-aware and will be limited by that too, I guess (though of course much less than the highly optimized gwnum code).

Last fiddled with by kruoli on 2020-08-29 at 08:50 Reason: Clarified intentions.
2020-09-10, 17:40   #818
jwnutter
 
"Joe"
Oct 2019
United States

2²·19 Posts

Quote:
Originally Posted by Viliam Furik View Post
I've recently run a few throughput benchmarks on the 3900X.
I attached graphs of the results.
Viliam, how did you go about parsing results.bench? I'm trying to do this in Excel on about 4,500 records, but I assume there has to be an easier way, as I'm not finding a single clean delimiter. Does P95 provide a parsed benchmark output file that I'm overlooking?
2020-09-11, 18:45   #819
Viliam Furik
 
"Viliam Furík"
Jul 2018
Martin, Slovakia

2·13·17 Posts

I have done it the hard way... I have copied them manually into the spreadsheet, one by one.

If you want, you can send me the rows by email and I will write some quick Python code to put them in usable .csv format, and then I will send you back a spreadsheet.
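For anyone else facing the same task, a rough sketch of what such a script could look like follows. It assumes the "Timings for ..." line format seen in the benchmarks posted in this thread; the optional " hyperthreaded" suffix after the core count is an assumption about other runs, and the names `parse_bench` and `to_csv` are made up for this example.

```python
import csv
import io
import re

# One "Timings for ..." line, as it appears in the benchmarks in this thread.
# The optional " hyperthreaded" suffix is an assumption about other runs.
LINE_RE = re.compile(
    r"Timings for (?P<fft>\d+)K FFT length "
    r"\((?P<cores>\d+) cores?(?P<ht> hyperthreaded)?, (?P<workers>\d+) workers?\):"
    r"\s*(?P<times>[\d.,\s]+?) ms\.\s+Throughput: (?P<tp>[\d.]+) iter/sec"
)

FIELDS = ["fft_k", "cores", "hyperthreaded", "workers", "mean_ms",
          "throughput_iter_per_sec"]


def parse_bench(text):
    """Extract one dict per timing line; all other lines are skipped."""
    rows = []
    for m in LINE_RE.finditer(text):
        # Per-worker timings are comma-separated; float() ignores the padding.
        times = [float(t) for t in m.group("times").split(",")]
        rows.append({
            "fft_k": int(m.group("fft")),
            "cores": int(m.group("cores")),
            "hyperthreaded": m.group("ht") is not None,
            "workers": int(m.group("workers")),
            "mean_ms": sum(times) / len(times),
            "throughput_iter_per_sec": float(m.group("tp")),
        })
    return rows


def to_csv(rows):
    """Render the parsed rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


SAMPLE = (
    "Timings for 256K FFT length (8 cores, 1 worker):  0.28 ms.  Throughput: 3533.70 iter/sec.\n"
    "Timings for 256K FFT length (8 cores, 2 workers):  0.33,  0.33 ms.  Throughput: 6127.31 iter/sec.\n"
)
print(to_csv(parse_bench(SAMPLE)))
```

To run it on a real file, read results.bench into a string and pass it to `parse_bench`; timestamp lines and the hardware prelude are ignored because they do not match the pattern.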
2020-09-11, 23:07   #820
jwnutter
 
"Joe"
Oct 2019
United States

2²×19 Posts

Quote:
Originally Posted by Viliam Furik View Post
I have done it the hard way... I have copied them manually into the spreadsheet, one by one.

If you want, you can send me the rows by email and I will write some quick Python code to put them in usable .csv format, and then I will send you back a spreadsheet.
Thanks for the offer, Viliam, but I've already completed the task the hard way.

Well, not exactly: I found enough unique delimiters to parse the data in stages. All I need to do now is load it into a statistical analysis tool and see what it tells me. That said, based on a cursory review, I'm fairly confident that 1 worker using all 8 cores is optimal for this i9-9900KF.

Thanks again!
2020-09-12, 18:19   #821
jwnutter
 
"Joe"
Oct 2019
United States

1001100₂ Posts

Well, assuming I'm looking at these benchmark results through an appropriate lens, it appears that 1 worker is (on average) a ~24% improvement over the next best alternative of 2 workers and HT carries a ~13% throughput penalty for this i9-9900KF.
[Attached image: Untitled.png]
2020-09-12, 23:56   #822
Mark Rose
"/X\(‘-‘)/X\"
Jan 2013

2929₁₀ Posts

Quote:
Originally Posted by jwnutter View Post
Well, assuming I'm looking at these benchmark results through an appropriate lens, it appears that 1 worker is (on average) a ~24% improvement over the next best alternative of 2 workers and HT carries a ~13% throughput penalty for this i9-9900KF.
Sounds reasonable.
2020-10-13, 11:04   #823
M344587487
"Composite as Heck"
Oct 2017

1425₈ Posts
AMD Ryzen 4700U (8C8T Zen 2 mobile), 2x16GB 3200 DDR4 SO-DIMM CL22

Code:
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
AMD Ryzen 7 4700U with Radeon Graphics         
CPU speed: 4192.05 MHz, 8 cores
CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 8x32 KB, L2 cache size: 8x512 KB, L3 cache size: 2x4 MB
L1 cache line size: 64 bytes, L2 cache line size: 64 bytes
Machine topology as determined by hwloc library:
 Machine#0 (total=32354192KB, DMIProductName="MINIPC PN50", DMIProductVersion=0409, DMIBoardVendor="ASUSTeK COMPUTER INC.", DMIBoardName=PN50, DMIBoardVersion="To be filled by O.E.M.", DMIBoardAssetTag="Default string", DMIChassisVendor="Default string", DMIChassisType=35, DMIChassisVersion="Default string", DMIChassisAssetTag="Default string", DMIBIOSVendor="ASUSTeK COMPUTER INC.", DMIBIOSVersion=0409, DMIBIOSDate=06/30/2020, DMISysVendor="ASUSTeK COMPUTER INC.", Backend=Linux, LinuxCgroup=/, OSName=Linux, OSRelease=5.4.0-48-generic, OSVersion="#52-Ubuntu SMP Thu Sep 10 10:58:49 UTC 2020", HostName=pn50, Architecture=x86_64, hwlocVersion=2.0.4, ProcessName=mprime)
  Package#0 (total=32354192KB, CPUVendor=AuthenticAMD, CPUFamilyNumber=23, CPUModelNumber=96, CPUModel="AMD Ryzen 7 4700U with Radeon Graphics         ", CPUStepping=1)
    L3 (size=4096KB, linesize=64, ways=16, Inclusive=0)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core#0 (cpuset: 0x00000001)
            PU#0 (cpuset: 0x00000001)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core#1 (cpuset: 0x00000002)
            PU#1 (cpuset: 0x00000002)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core#2 (cpuset: 0x00000004)
            PU#2 (cpuset: 0x00000004)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core#3 (cpuset: 0x00000008)
            PU#3 (cpuset: 0x00000008)
    L3 (size=4096KB, linesize=64, ways=16, Inclusive=0)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core#4 (cpuset: 0x00000010)
            PU#4 (cpuset: 0x00000010)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core#5 (cpuset: 0x00000020)
            PU#5 (cpuset: 0x00000020)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core#6 (cpuset: 0x00000040)
            PU#6 (cpuset: 0x00000040)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core#7 (cpuset: 0x00000080)
            PU#7 (cpuset: 0x00000080)
Prime95 64-bit version 30.3, RdtscTiming=1
Timings for 256K FFT length (8 cores, 1 worker):  0.28 ms.  Throughput: 3533.70 iter/sec.
Timings for 256K FFT length (8 cores, 2 workers):  0.33,  0.33 ms.  Throughput: 6127.31 iter/sec.
Timings for 256K FFT length (8 cores, 8 workers):  2.22,  2.23,  2.24,  2.24,  2.25,  2.25,  2.24,  2.27 ms.  Throughput: 3564.11 iter/sec.
Timings for 280K FFT length (8 cores, 1 worker):  0.36 ms.  Throughput: 2755.54 iter/sec.
Timings for 280K FFT length (8 cores, 2 workers):  0.47,  0.47 ms.  Throughput: 4254.04 iter/sec.
Timings for 280K FFT length (8 cores, 8 workers):  2.67,  2.65,  2.67,  2.67,  2.69,  2.68,  2.72,  2.71 ms.  Throughput: 2982.84 iter/sec.
Timings for 288K FFT length (8 cores, 1 worker):  0.36 ms.  Throughput: 2790.67 iter/sec.
Timings for 288K FFT length (8 cores, 2 workers):  0.49,  0.49 ms.  Throughput: 4106.36 iter/sec.
Timings for 288K FFT length (8 cores, 8 workers):  2.65,  2.65,  2.65,  2.64,  2.66,  2.67,  2.66,  2.69 ms.  Throughput: 3009.30 iter/sec.
Timings for 320K FFT length (8 cores, 1 worker):  0.37 ms.  Throughput: 2708.26 iter/sec.
Timings for 320K FFT length (8 cores, 2 workers):  0.54,  0.55 ms.  Throughput: 3670.88 iter/sec.
Timings for 320K FFT length (8 cores, 8 workers):  3.27,  3.27,  3.27,  3.28,  3.26,  3.29,  3.28,  3.30 ms.  Throughput: 2440.51 iter/sec.
Timings for 336K FFT length (8 cores, 1 worker):  0.34 ms.  Throughput: 2924.26 iter/sec.
Timings for 336K FFT length (8 cores, 2 workers):  0.60,  0.59 ms.  Throughput: 3360.83 iter/sec.
Timings for 336K FFT length (8 cores, 8 workers):  3.41,  3.43,  3.48,  3.41,  3.38,  3.42,  3.40,  3.39 ms.  Throughput: 2342.98 iter/sec.
Timings for 384K FFT length (8 cores, 1 worker):  0.37 ms.  Throughput: 2668.51 iter/sec.
[Tue Oct 13 10:37:12 2020]
Timings for 384K FFT length (8 cores, 2 workers):  0.66,  0.66 ms.  Throughput: 3016.13 iter/sec.
Timings for 384K FFT length (8 cores, 8 workers):  4.08,  4.10,  4.09,  4.08,  4.06,  4.07,  4.09,  4.07 ms.  Throughput: 1960.66 iter/sec.
Timings for 400K FFT length (8 cores, 1 worker):  0.40 ms.  Throughput: 2498.94 iter/sec.
Timings for 400K FFT length (8 cores, 2 workers):  0.72,  0.71 ms.  Throughput: 2793.35 iter/sec.
Timings for 400K FFT length (8 cores, 8 workers):  4.28,  4.31,  4.30,  4.29,  4.32,  4.35,  4.32,  4.35 ms.  Throughput: 1853.76 iter/sec.
Timings for 448K FFT length (8 cores, 1 worker):  0.45 ms.  Throughput: 2246.91 iter/sec.
Timings for 448K FFT length (8 cores, 2 workers):  0.81,  0.80 ms.  Throughput: 2486.94 iter/sec.
Timings for 448K FFT length (8 cores, 8 workers):  4.60,  4.62,  4.61,  4.59,  4.60,  4.60,  4.61,  4.63 ms.  Throughput: 1735.87 iter/sec.
Timings for 480K FFT length (8 cores, 1 worker):  0.47 ms.  Throughput: 2118.65 iter/sec.
Timings for 480K FFT length (8 cores, 2 workers):  0.91,  0.89 ms.  Throughput: 2213.58 iter/sec.
Timings for 480K FFT length (8 cores, 8 workers):  5.26,  5.27,  5.31,  5.29,  5.28,  5.28,  5.29,  5.34 ms.  Throughput: 1512.20 iter/sec.
Timings for 512K FFT length (8 cores, 1 worker):  0.50 ms.  Throughput: 2001.00 iter/sec.
Timings for 512K FFT length (8 cores, 2 workers):  0.97,  0.97 ms.  Throughput: 2065.31 iter/sec.
Timings for 512K FFT length (8 cores, 8 workers):  5.76,  5.75,  5.82,  5.78,  5.75,  5.78,  5.76,  5.76 ms.  Throughput: 1386.66 iter/sec.
Timings for 560K FFT length (8 cores, 1 worker):  0.57 ms.  Throughput: 1749.02 iter/sec.
Timings for 560K FFT length (8 cores, 2 workers):  1.14,  1.10 ms.  Throughput: 1783.52 iter/sec.
Timings for 560K FFT length (8 cores, 8 workers):  6.25,  6.25,  6.23,  6.23,  6.28,  6.28,  6.27,  6.29 ms.  Throughput: 1278.05 iter/sec.
Timings for 640K FFT length (8 cores, 1 worker):  0.63 ms.  Throughput: 1584.40 iter/sec.
Timings for 640K FFT length (8 cores, 2 workers):  1.31,  1.26 ms.  Throughput: 1555.01 iter/sec.
Timings for 640K FFT length (8 cores, 8 workers):  7.30,  7.26,  7.23,  7.28,  7.30,  7.29,  7.27,  7.29 ms.  Throughput: 1099.39 iter/sec.
[Tue Oct 13 10:42:22 2020]
Timings for 672K FFT length (8 cores, 1 worker):  0.69 ms.  Throughput: 1454.69 iter/sec.
Timings for 672K FFT length (8 cores, 2 workers):  1.39,  1.37 ms.  Throughput: 1446.29 iter/sec.
Timings for 672K FFT length (8 cores, 8 workers):  7.57,  7.54,  7.58,  7.55,  7.58,  7.54,  7.56,  7.53 ms.  Throughput: 1058.79 iter/sec.
Timings for 768K FFT length (8 cores, 1 worker):  0.76 ms.  Throughput: 1321.58 iter/sec.
Timings for 768K FFT length (8 cores, 2 workers):  1.69,  1.66 ms.  Throughput: 1193.21 iter/sec.
Timings for 768K FFT length (8 cores, 8 workers):  8.81,  8.75,  8.76,  8.77,  8.72,  8.74,  8.75,  8.76 ms.  Throughput: 913.47 iter/sec.
Timings for 800K FFT length (8 cores, 1 worker):  0.83 ms.  Throughput: 1199.87 iter/sec.
Timings for 800K FFT length (8 cores, 2 workers):  1.92,  1.85 ms.  Throughput: 1059.68 iter/sec.
Timings for 800K FFT length (8 cores, 8 workers):  9.12,  9.22,  9.16,  9.21,  9.23,  9.24,  9.25,  9.21 ms.  Throughput: 869.01 iter/sec.
Timings for 896K FFT length (8 cores, 1 worker):  0.93 ms.  Throughput: 1073.83 iter/sec.
Timings for 896K FFT length (8 cores, 2 workers):  2.15,  2.11 ms.  Throughput: 937.83 iter/sec.
Timings for 896K FFT length (8 cores, 8 workers): 10.23, 10.19, 10.28, 10.28, 10.25, 10.25, 10.26, 10.25 ms.  Throughput: 780.58 iter/sec.
Timings for 960K FFT length (8 cores, 1 worker):  1.02 ms.  Throughput: 983.24 iter/sec.
Timings for 960K FFT length (8 cores, 2 workers):  2.36,  2.32 ms.  Throughput: 854.15 iter/sec.
Timings for 960K FFT length (8 cores, 8 workers): 11.04, 11.02, 11.06, 11.11, 11.07, 11.08, 11.00, 11.10 ms.  Throughput: 723.38 iter/sec.
Timings for 1024K FFT length (8 cores, 1 worker):  1.07 ms.  Throughput: 935.44 iter/sec.
Timings for 1024K FFT length (8 cores, 2 workers):  2.54,  2.55 ms.  Throughput: 786.55 iter/sec.
Timings for 1024K FFT length (8 cores, 8 workers): 11.79, 11.77, 11.80, 11.75, 11.77, 11.75, 11.76, 11.75 ms.  Throughput: 679.81 iter/sec.
Timings for 1120K FFT length (8 cores, 1 worker):  1.24 ms.  Throughput: 807.39 iter/sec.
[Tue Oct 13 10:47:23 2020]
Timings for 1120K FFT length (8 cores, 2 workers):  2.92,  2.96 ms.  Throughput: 680.08 iter/sec.
Timings for 1120K FFT length (8 cores, 8 workers): 12.88, 12.86, 12.91, 12.78, 12.87, 12.88, 12.95, 12.95 ms.  Throughput: 620.92 iter/sec.
Timings for 1152K FFT length (8 cores, 1 worker):  1.26 ms.  Throughput: 793.57 iter/sec.
Timings for 1152K FFT length (8 cores, 2 workers):  2.94,  2.90 ms.  Throughput: 684.48 iter/sec.
Timings for 1152K FFT length (8 cores, 8 workers): 12.44, 12.52, 12.51, 12.61, 12.51, 12.54, 12.52, 12.51 ms.  Throughput: 638.95 iter/sec.
Timings for 1280K FFT length (8 cores, 1 worker):  1.43 ms.  Throughput: 697.43 iter/sec.
Timings for 1280K FFT length (8 cores, 2 workers):  3.50,  3.44 ms.  Throughput: 575.93 iter/sec.
Timings for 1280K FFT length (8 cores, 8 workers): 14.75, 14.74, 14.53, 14.86, 14.69, 14.69, 14.72, 14.72 ms.  Throughput: 543.80 iter/sec.
Timings for 1344K FFT length (8 cores, 1 worker):  1.56 ms.  Throughput: 642.43 iter/sec.
Timings for 1344K FFT length (8 cores, 2 workers):  3.53,  3.48 ms.  Throughput: 570.74 iter/sec.
Timings for 1344K FFT length (8 cores, 8 workers): 14.58, 14.59, 14.50, 14.73, 14.61, 14.62, 14.63, 14.61 ms.  Throughput: 547.64 iter/sec.
Timings for 1440K FFT length (8 cores, 1 worker):  1.78 ms.  Throughput: 563.11 iter/sec.
Timings for 1440K FFT length (8 cores, 2 workers):  3.77,  3.81 ms.  Throughput: 527.34 iter/sec.
Timings for 1440K FFT length (8 cores, 8 workers): 15.64, 15.81, 15.76, 15.76, 15.82, 15.83, 15.88, 15.87 ms.  Throughput: 506.46 iter/sec.
Timings for 1536K FFT length (8 cores, 1 worker):  1.87 ms.  Throughput: 535.48 iter/sec.
Timings for 1536K FFT length (8 cores, 2 workers):  4.60,  4.27 ms.  Throughput: 451.66 iter/sec.
Timings for 1536K FFT length (8 cores, 8 workers): 17.72, 17.73, 17.70, 17.80, 17.52, 17.81, 17.90, 17.89 ms.  Throughput: 450.50 iter/sec.
Timings for 1600K FFT length (8 cores, 1 worker):  2.01 ms.  Throughput: 497.74 iter/sec.
Timings for 1600K FFT length (8 cores, 2 workers):  4.30,  4.29 ms.  Throughput: 465.49 iter/sec.
[Tue Oct 13 10:52:36 2020]
Timings for 1600K FFT length (8 cores, 8 workers): 17.54, 17.58, 17.60, 17.63, 17.58, 17.60, 17.71, 17.62 ms.  Throughput: 454.31 iter/sec.
Timings for 1680K FFT length (8 cores, 1 worker):  2.16 ms.  Throughput: 462.86 iter/sec.
Timings for 1680K FFT length (8 cores, 2 workers):  4.53,  4.49 ms.  Throughput: 443.50 iter/sec.
Timings for 1680K FFT length (8 cores, 8 workers): 18.25, 18.26, 18.38, 18.16, 18.37, 18.35, 18.40, 18.38 ms.  Throughput: 436.66 iter/sec.
Timings for 1792K FFT length (8 cores, 1 worker):  2.30 ms.  Throughput: 434.65 iter/sec.
Timings for 1792K FFT length (8 cores, 2 workers):  5.15,  5.21 ms.  Throughput: 385.80 iter/sec.
Timings for 1792K FFT length (8 cores, 8 workers): 20.88, 20.79, 20.84, 20.40, 20.85, 20.92, 20.87, 20.89 ms.  Throughput: 384.58 iter/sec.
Timings for 1920K FFT length (8 cores, 1 worker):  2.52 ms.  Throughput: 397.57 iter/sec.
Timings for 1920K FFT length (8 cores, 2 workers):  5.23,  5.20 ms.  Throughput: 383.32 iter/sec.
Timings for 1920K FFT length (8 cores, 8 workers): 20.85, 21.01, 21.05, 21.01, 21.17, 21.12, 21.19, 21.16 ms.  Throughput: 379.67 iter/sec.
Timings for 2048K FFT length (8 cores, 1 worker):  2.70 ms.  Throughput: 370.88 iter/sec.
Timings for 2048K FFT length (8 cores, 2 workers):  6.07,  6.12 ms.  Throughput: 328.14 iter/sec.
Timings for 2048K FFT length (8 cores, 8 workers): 24.00, 23.63, 24.26, 24.26, 23.77, 23.76, 23.82, 23.78 ms.  Throughput: 334.59 iter/sec.
Timings for 2240K FFT length (8 cores, 1 worker):  3.02 ms.  Throughput: 331.60 iter/sec.
Timings for 2240K FFT length (8 cores, 2 workers):  6.64,  6.50 ms.  Throughput: 304.50 iter/sec.
Timings for 2240K FFT length (8 cores, 8 workers): 25.38, 25.95, 26.09, 26.13, 26.05, 25.83, 25.98, 26.08 ms.  Throughput: 308.48 iter/sec.
Timings for 2304K FFT length (8 cores, 1 worker):  3.09 ms.  Throughput: 323.45 iter/sec.
Timings for 2304K FFT length (8 cores, 2 workers):  6.64,  6.69 ms.  Throughput: 300.22 iter/sec.
[Tue Oct 13 10:57:42 2020]
Timings for 2304K FFT length (8 cores, 8 workers): 26.60, 26.87, 26.81, 26.19, 26.76, 26.60, 26.66, 26.69 ms.  Throughput: 300.23 iter/sec.
Timings for 2400K FFT length (8 cores, 1 worker):  3.27 ms.  Throughput: 306.07 iter/sec.
Timings for 2400K FFT length (8 cores, 2 workers):  7.16,  7.02 ms.  Throughput: 282.13 iter/sec.
Timings for 2400K FFT length (8 cores, 8 workers): 27.87, 27.23, 28.03, 27.85, 27.90, 27.87, 28.05, 27.87 ms.  Throughput: 287.46 iter/sec.
Timings for 2560K FFT length (8 cores, 1 worker):  3.53 ms.  Throughput: 283.39 iter/sec.
Timings for 2560K FFT length (8 cores, 2 workers):  7.52,  7.56 ms.  Throughput: 265.31 iter/sec.
Timings for 2560K FFT length (8 cores, 8 workers): 29.71, 29.77, 29.87, 29.82, 29.93, 29.13, 30.08, 29.91 ms.  Throughput: 268.68 iter/sec.
Timings for 2688K FFT length (8 cores, 1 worker):  3.69 ms.  Throughput: 271.16 iter/sec.
Timings for 2688K FFT length (8 cores, 2 workers):  7.87,  7.87 ms.  Throughput: 254.06 iter/sec.
Timings for 2688K FFT length (8 cores, 8 workers): 30.88, 31.02, 31.37, 31.27, 30.56, 31.16, 31.31, 31.47 ms.  Throughput: 257.01 iter/sec.
Timings for 2800K FFT length (8 cores, 1 worker):  3.92 ms.  Throughput: 255.33 iter/sec.
Timings for 2800K FFT length (8 cores, 2 workers):  8.23,  8.26 ms.  Throughput: 242.54 iter/sec.
Timings for 2800K FFT length (8 cores, 8 workers): 32.62, 32.70, 32.85, 32.71, 32.83, 32.63, 32.68, 32.66 ms.  Throughput: 244.57 iter/sec.
Timings for 2880K FFT length (8 cores, 1 worker):  4.00 ms.  Throughput: 249.91 iter/sec.
Timings for 2880K FFT length (8 cores, 2 workers):  8.44,  8.55 ms.  Throughput: 235.39 iter/sec.
Timings for 2880K FFT length (8 cores, 8 workers): 33.18, 33.29, 33.54, 33.73, 33.21, 33.68, 33.45, 32.83 ms.  Throughput: 239.79 iter/sec.
Timings for 3072K FFT length (8 cores, 1 worker):  4.29 ms.  Throughput: 233.08 iter/sec.
Timings for 3072K FFT length (8 cores, 2 workers):  8.60,  8.59 ms.  Throughput: 232.59 iter/sec.
[Tue Oct 13 11:03:02 2020]
Timings for 3072K FFT length (8 cores, 8 workers): 33.67, 33.54, 33.80, 33.69, 33.78, 33.80, 34.00, 33.62 ms.  Throughput: 237.13 iter/sec.
Timings for 3200K FFT length (8 cores, 1 worker):  4.85 ms.  Throughput: 206.34 iter/sec.
Timings for 3200K FFT length (8 cores, 2 workers):  9.46,  9.65 ms.  Throughput: 209.36 iter/sec.
Timings for 3200K FFT length (8 cores, 8 workers): 37.27, 37.24, 37.33, 37.51, 36.61, 37.36, 37.30, 37.47 ms.  Throughput: 214.71 iter/sec.
Timings for 3360K FFT length (8 cores, 1 worker):  4.99 ms.  Throughput: 200.49 iter/sec.
Timings for 3360K FFT length (8 cores, 2 workers):  9.84,  9.93 ms.  Throughput: 202.32 iter/sec.
Timings for 3360K FFT length (8 cores, 8 workers): 39.16, 39.03, 38.83, 38.79, 38.88, 38.41, 39.11, 39.21 ms.  Throughput: 205.52 iter/sec.
Timings for 3584K FFT length (8 cores, 1 worker):  5.13 ms.  Throughput: 194.77 iter/sec.
Timings for 3584K FFT length (8 cores, 2 workers): 10.02, 10.15 ms.  Throughput: 198.32 iter/sec.
Timings for 3584K FFT length (8 cores, 8 workers): 39.22, 39.28, 39.23, 39.36, 39.25, 39.45, 39.83, 39.55 ms.  Throughput: 203.07 iter/sec.
Timings for 3840K FFT length (8 cores, 1 worker):  5.79 ms.  Throughput: 172.80 iter/sec.
Timings for 3840K FFT length (8 cores, 2 workers): 11.36, 11.39 ms.  Throughput: 175.81 iter/sec.
Timings for 3840K FFT length (8 cores, 8 workers): 44.70, 44.64, 44.84, 44.26, 44.86, 44.75, 45.08, 45.26 ms.  Throughput: 178.58 iter/sec.
Timings for 4096K FFT length (8 cores, 1 worker):  5.89 ms.  Throughput: 169.72 iter/sec.
Timings for 4096K FFT length (8 cores, 2 workers): 11.55, 11.51 ms.  Throughput: 173.47 iter/sec.
Timings for 4096K FFT length (8 cores, 8 workers): 45.31, 45.32, 45.35, 45.12, 45.70, 44.74, 45.87, 45.11 ms.  Throughput: 176.55 iter/sec.
Timings for 4480K FFT length (8 cores, 1 worker):  6.41 ms.  Throughput: 156.11 iter/sec.
[Tue Oct 13 11:08:14 2020]
Timings for 4480K FFT length (8 cores, 2 workers): 12.50, 12.61 ms.  Throughput: 159.32 iter/sec.
Timings for 4480K FFT length (8 cores, 8 workers): 49.02, 48.99, 49.54, 49.16, 49.13, 49.25, 49.42, 49.22 ms.  Throughput: 162.55 iter/sec.
Timings for 4608K FFT length (8 cores, 1 worker):  7.03 ms.  Throughput: 142.16 iter/sec.
Timings for 4608K FFT length (8 cores, 2 workers): 13.60, 13.92 ms.  Throughput: 145.39 iter/sec.
Timings for 4608K FFT length (8 cores, 8 workers): 54.05, 54.26, 52.41, 54.15, 55.14, 53.56, 53.51, 53.45 ms.  Throughput: 148.68 iter/sec.
Timings for 4800K FFT length (8 cores, 1 worker):  7.32 ms.  Throughput: 136.56 iter/sec.
Timings for 4800K FFT length (8 cores, 2 workers): 14.23, 14.41 ms.  Throughput: 139.64 iter/sec.
Timings for 4800K FFT length (8 cores, 8 workers): 55.67, 55.70, 55.76, 55.88, 56.73, 55.78, 56.59, 56.64 ms.  Throughput: 142.63 iter/sec.
Timings for 5120K FFT length (8 cores, 1 worker):  7.81 ms.  Throughput: 128.05 iter/sec.
Timings for 5120K FFT length (8 cores, 2 workers): 15.22, 15.22 ms.  Throughput: 131.37 iter/sec.
Timings for 5120K FFT length (8 cores, 8 workers): 59.81, 59.89, 59.87, 60.01, 60.25, 60.04, 60.36, 59.09 ms.  Throughput: 133.53 iter/sec.
Timings for 5376K FFT length (8 cores, 1 worker):  8.28 ms.  Throughput: 120.81 iter/sec.
Timings for 5376K FFT length (8 cores, 2 workers): 16.03, 16.20 ms.  Throughput: 124.12 iter/sec.
Timings for 5376K FFT length (8 cores, 8 workers): 63.61, 63.50, 63.48, 61.76, 62.52, 62.93, 62.86, 63.18 ms.  Throughput: 127.03 iter/sec.
Timings for 5600K FFT length (8 cores, 1 worker):  8.08 ms.  Throughput: 123.79 iter/sec.
Timings for 5600K FFT length (8 cores, 2 workers): 15.86, 15.87 ms.  Throughput: 126.07 iter/sec.
[Tue Oct 13 11:13:41 2020]
Timings for 5600K FFT length (8 cores, 8 workers): 62.63, 62.67, 62.63, 62.37, 61.66, 63.12, 62.95, 62.81 ms.  Throughput: 127.79 iter/sec.
Timings for 5760K FFT length (8 cores, 1 worker):  8.40 ms.  Throughput: 119.02 iter/sec.
Timings for 5760K FFT length (8 cores, 2 workers): 16.36, 16.52 ms.  Throughput: 121.67 iter/sec.
Timings for 5760K FFT length (8 cores, 8 workers): 64.24, 64.54, 64.04, 64.72, 64.57, 65.03, 64.24, 64.28 ms.  Throughput: 124.12 iter/sec.
Timings for 6144K FFT length (8 cores, 1 worker):  9.43 ms.  Throughput: 106.03 iter/sec.
Timings for 6144K FFT length (8 cores, 2 workers): 18.55, 18.53 ms.  Throughput: 107.89 iter/sec.
Timings for 6144K FFT length (8 cores, 8 workers): 73.07, 73.05, 73.01, 73.06, 72.11, 72.04, 71.84, 72.15 ms.  Throughput: 110.29 iter/sec.
Timings for 6400K FFT length (8 cores, 1 worker):  9.81 ms.  Throughput: 101.95 iter/sec.
Timings for 6400K FFT length (8 cores, 2 workers): 19.00, 19.16 ms.  Throughput: 104.84 iter/sec.
Timings for 6400K FFT length (8 cores, 8 workers): 75.80, 75.86, 75.96, 75.79, 75.36, 75.14, 75.22, 74.55 ms.  Throughput: 106.02 iter/sec.
Timings for 6720K FFT length (8 cores, 1 worker):  9.65 ms.  Throughput: 103.65 iter/sec.
Timings for 6720K FFT length (8 cores, 2 workers): 18.92, 19.16 ms.  Throughput: 105.06 iter/sec.
Timings for 6720K FFT length (8 cores, 8 workers): 74.57, 74.36, 74.63, 74.80, 74.85, 74.95, 74.95, 75.59 ms.  Throughput: 106.90 iter/sec.
Timings for 7168K FFT length (8 cores, 1 worker): 11.00 ms.  Throughput: 90.90 iter/sec.
Timings for 7168K FFT length (8 cores, 2 workers): 21.69, 21.38 ms.  Throughput: 92.88 iter/sec.
[Tue Oct 13 11:19:02 2020]
Timings for 7168K FFT length (8 cores, 8 workers): 84.37, 82.63, 83.34, 84.50, 84.59, 84.71, 84.97, 85.40 ms.  Throughput: 94.89 iter/sec.
Timings for 7680K FFT length (8 cores, 1 worker): 11.24 ms.  Throughput: 89.00 iter/sec.
Timings for 7680K FFT length (8 cores, 2 workers): 21.81, 22.15 ms.  Throughput: 90.99 iter/sec.
Timings for 7680K FFT length (8 cores, 8 workers): 86.31, 86.58, 86.36, 86.67, 86.89, 86.47, 86.63, 86.82 ms.  Throughput: 92.39 iter/sec.
Timings for 8000K FFT length (8 cores, 1 worker): 12.33 ms.  Throughput: 81.11 iter/sec.
Timings for 8000K FFT length (8 cores, 2 workers): 23.89, 23.94 ms.  Throughput: 83.63 iter/sec.
Timings for 8000K FFT length (8 cores, 8 workers): 94.09, 92.40, 94.26, 94.46, 95.22, 94.32, 95.04, 95.17 ms.  Throughput: 84.78 iter/sec.
Timings for 8064K FFT length (8 cores, 1 worker): 12.39 ms.  Throughput: 80.70 iter/sec.
Timings for 8064K FFT length (8 cores, 2 workers): 24.31, 24.35 ms.  Throughput: 82.19 iter/sec.
Timings for 8064K FFT length (8 cores, 8 workers): 94.36, 95.06, 94.72, 95.16, 95.87, 95.30, 95.06, 96.13 ms.  Throughput: 84.03 iter/sec.
Timings for 8192K FFT length (8 cores, 1 worker): 12.70 ms.  Throughput: 78.73 iter/sec.
Timings for 8192K FFT length (8 cores, 2 workers): 24.62, 24.57 ms.  Throughput: 81.30 iter/sec.
Timings for 8192K FFT length (8 cores, 8 workers): 96.59, 96.15, 96.33, 97.07, 96.82, 97.03, 97.26, 97.43 ms.  Throughput: 82.62 iter/sec.
This is from a Zen 2 mobile "NUC", a single die instead of the usual desktop Ryzen MCM. I included the small FFT lengths because these chips only have 2x4MiB of L3 cache (desktop equivalents have 2x16MiB), so the inflection points are lower.
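To make the inflection points easier to spot, a small Python sketch along these lines picks the worker count with the highest throughput at each FFT length. It uses only a handful of rows copied from the benchmark above; the function name `best_workers` is made up for this example.

```python
# (FFT length in K, workers) -> throughput in iter/sec,
# a subset copied from the 4700U benchmark above.
THROUGHPUT = {
    (256, 1): 3533.70, (256, 2): 6127.31, (256, 8): 3564.11,
    (560, 1): 1749.02, (560, 2): 1783.52, (560, 8): 1278.05,
    (640, 1): 1584.40, (640, 2): 1555.01, (640, 8): 1099.39,
    (2880, 1): 249.91, (2880, 2): 235.39, (2880, 8): 239.79,
    (3072, 1): 233.08, (3072, 2): 232.59, (3072, 8): 237.13,
}


def best_workers(throughput):
    """Return {fft_k: worker count with the highest throughput}."""
    best = {}
    for (fft_k, workers), tp in throughput.items():
        # Keep the (workers, throughput) pair with the largest throughput.
        if fft_k not in best or tp > best[fft_k][1]:
            best[fft_k] = (workers, tp)
    return {fft_k: workers for fft_k, (workers, _) in sorted(best.items())}


for fft_k, workers in best_workers(THROUGHPUT).items():
    print(f"{fft_k}K FFT: best with {workers} worker(s)")
```

On this subset, 2 workers win up to 560K, 1 worker wins from 640K, and 8 workers edge ahead at 3072K, though the margins at the large sizes are only a few percent.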
2020-11-22, 08:48   #824
NookieN
Aug 2002

2×29 Posts

Finally getting around to setting up my 10900X system. RAM is 3200C16. The benchmark below is at stock settings. I played around with various overclocks and, as expected, they make no difference in throughput (but a lot in temperature!); the stock speed with AVX512 (apparently 3.4GHz) easily saturates memory bandwidth.

Code:
Intel(R) Core(TM) i9-10900X CPU @ 3.70GHz
CPU speed: 4288.93 MHz, 10 hyperthreaded cores
CPU features: Prefetchw, SSE, SSE2, SSE4, AVX, AVX2, FMA, AVX512F
L1 cache size: 10x32 KB, L2 cache size: 10x1 MB, L3 cache size: 19712 KB
L1 cache line size: 64 bytes, L2 cache line size: 64 bytes
Prime95 64-bit version 30.3, RdtscTiming=1
Timings for 2048K FFT length (10 cores, 1 worker):  0.62 ms.  Throughput: 1621.33 iter/sec.
Timings for 2048K FFT length (10 cores, 2 workers):  1.45,  1.34 ms.  Throughput: 1433.85 iter/sec.
Timings for 2048K FFT length (10 cores, 10 workers): 11.78, 11.69, 11.66, 10.87, 11.68, 11.67, 11.67, 11.65, 11.69, 11.67 ms.  Throughput: 862.35 iter/sec.
Timings for 2100K FFT length (10 cores, 1 worker):  0.68 ms.  Throughput: 1477.05 iter/sec.
Timings for 2100K FFT length (10 cores, 2 workers):  1.48,  1.48 ms.  Throughput: 1350.56 iter/sec.
Timings for 2100K FFT length (10 cores, 10 workers): 11.30, 11.30, 11.30, 11.29, 11.30, 11.30, 11.30, 11.29, 11.30, 11.30 ms.  Throughput: 885.20 iter/sec.
Timings for 2160K FFT length (10 cores, 1 worker):  0.72 ms.  Throughput: 1396.96 iter/sec.
Timings for 2160K FFT length (10 cores, 2 workers):  1.62,  1.62 ms.  Throughput: 1234.74 iter/sec.
Timings for 2160K FFT length (10 cores, 10 workers): 11.60, 11.62, 11.62, 11.61, 11.62, 11.60, 11.62, 11.61, 11.61, 11.61 ms.  Throughput: 860.97 iter/sec.
Timings for 2240K FFT length (10 cores, 1 worker):  0.74 ms.  Throughput: 1347.42 iter/sec.
Timings for 2240K FFT length (10 cores, 2 workers):  1.76,  1.66 ms.  Throughput: 1170.89 iter/sec.
Timings for 2240K FFT length (10 cores, 10 workers): 12.91, 12.89, 12.95, 12.03, 12.92, 12.94, 12.91, 12.91, 12.92, 12.94 ms.  Throughput: 779.61 iter/sec.
Timings for 2304K FFT length (10 cores, 1 worker):  0.75 ms.  Throughput: 1340.50 iter/sec.
Timings for 2304K FFT length (10 cores, 2 workers):  1.90,  1.73 ms.  Throughput: 1105.43 iter/sec.
Timings for 2304K FFT length (10 cores, 10 workers): 13.29, 13.30, 13.40, 12.43, 13.31, 13.28, 13.29, 13.29, 13.34, 13.31 ms.  Throughput: 756.59 iter/sec.
Timings for 2400K FFT length (10 cores, 1 worker):  0.81 ms.  Throughput: 1230.45 iter/sec.
Timings for 2400K FFT length (10 cores, 2 workers):  1.89,  1.92 ms.  Throughput: 1048.94 iter/sec.
Timings for 2400K FFT length (10 cores, 10 workers): 13.32, 13.32, 13.31, 13.33, 13.30, 13.32, 13.31, 13.31, 13.32, 13.32 ms.  Throughput: 751.10 iter/sec.
Timings for 2520K FFT length (10 cores, 1 worker):  0.84 ms.  Throughput: 1196.41 iter/sec.
[Sun Nov 22 07:53:39 2020]
Timings for 2520K FFT length (10 cores, 2 workers):  2.03,  2.03 ms.  Throughput: 986.85 iter/sec.
Timings for 2520K FFT length (10 cores, 10 workers): 13.93, 13.94, 13.94, 13.94, 13.82, 13.94, 13.94, 13.93, 13.94, 13.94 ms.  Throughput: 718.09 iter/sec.
Timings for 2560K FFT length (10 cores, 1 worker):  0.85 ms.  Throughput: 1177.57 iter/sec.
Timings for 2560K FFT length (10 cores, 2 workers):  2.16,  2.16 ms.  Throughput: 924.33 iter/sec.
Timings for 2560K FFT length (10 cores, 10 workers): 14.32, 14.32, 14.32, 14.31, 14.32, 14.32, 14.31, 14.31, 14.31, 14.32 ms.  Throughput: 698.63 iter/sec.
Timings for 2592K FFT length (10 cores, 1 worker):  0.84 ms.  Throughput: 1193.89 iter/sec.
Timings for 2592K FFT length (10 cores, 2 workers):  2.18,  2.18 ms.  Throughput: 919.39 iter/sec.
Timings for 2592K FFT length (10 cores, 10 workers): 14.32, 14.12, 14.32, 14.34, 14.34, 14.34, 14.21, 14.23, 14.32, 14.34 ms.  Throughput: 699.92 iter/sec.
Timings for 2688K FFT length (10 cores, 1 worker):  0.86 ms.  Throughput: 1157.89 iter/sec.
Timings for 2688K FFT length (10 cores, 2 workers):  2.32,  2.32 ms.  Throughput: 863.79 iter/sec.
Timings for 2688K FFT length (10 cores, 10 workers): 15.28, 15.16, 15.10, 15.31, 15.30, 15.21, 15.17, 14.99, 15.19, 15.31 ms.  Throughput: 657.81 iter/sec.
Timings for 2880K FFT length (10 cores, 1 worker):  0.92 ms.  Throughput: 1089.76 iter/sec.
Timings for 2880K FFT length (10 cores, 2 workers):  2.63,  2.48 ms.  Throughput: 783.82 iter/sec.
Timings for 2880K FFT length (10 cores, 10 workers): 15.91, 16.01, 16.00, 15.59, 15.98, 15.93, 15.93, 15.90, 16.04, 15.95 ms.  Throughput: 628.01 iter/sec.
Timings for 2940K FFT length (10 cores, 1 worker):  0.98 ms.  Throughput: 1024.34 iter/sec.
Timings for 2940K FFT length (10 cores, 2 workers):  2.68,  2.68 ms.  Throughput: 745.01 iter/sec.
Timings for 2940K FFT length (10 cores, 10 workers): 15.86, 16.03, 15.84, 15.98, 15.94, 15.86, 15.86, 15.74, 15.94, 15.90 ms.  Throughput: 629.22 iter/sec.
Timings for 3000K FFT length (10 cores, 1 worker):  1.05 ms.  Throughput: 953.88 iter/sec.
Timings for 3000K FFT length (10 cores, 2 workers):  2.69,  2.69 ms.  Throughput: 743.37 iter/sec.
Timings for 3000K FFT length (10 cores, 10 workers): 16.84, 16.84, 16.84, 16.82, 16.84, 16.78, 16.78, 16.73, 16.84, 16.84 ms.  Throughput: 594.71 iter/sec.
[Sun Nov 22 07:58:45 2020]
Timings for 3072K FFT length (10 cores, 1 worker):  0.93 ms.  Throughput: 1075.16 iter/sec.
Timings for 3072K FFT length (10 cores, 2 workers):  2.90,  2.64 ms.  Throughput: 724.15 iter/sec.
Timings for 3072K FFT length (10 cores, 10 workers): 17.38, 17.45, 16.63, 17.42, 17.40, 17.40, 17.48, 17.38, 17.40, 17.44 ms.  Throughput: 576.91 iter/sec.
Timings for 3136K FFT length (10 cores, 1 worker):  1.07 ms.  Throughput: 938.49 iter/sec.
Timings for 3136K FFT length (10 cores, 2 workers):  3.14,  2.91 ms.  Throughput: 661.78 iter/sec.
Timings for 3136K FFT length (10 cores, 10 workers): 18.50, 18.50, 18.52, 18.53, 18.50, 18.54, 18.50, 17.18, 18.53, 18.60 ms.  Throughput: 544.06 iter/sec.
Timings for 3200K FFT length (10 cores, 1 worker):  1.15 ms.  Throughput: 867.95 iter/sec.
Timings for 3200K FFT length (10 cores, 2 workers):  3.01,  3.01 ms.  Throughput: 664.84 iter/sec.
Timings for 3200K FFT length (10 cores, 10 workers): 17.83, 17.97, 17.95, 17.80, 17.95, 17.95, 17.93, 17.80, 17.97, 17.97 ms.  Throughput: 558.36 iter/sec.
Timings for 3360K FFT length (10 cores, 1 worker):  1.12 ms.  Throughput: 893.80 iter/sec.
Timings for 3360K FFT length (10 cores, 2 workers):  3.30,  3.10 ms.  Throughput: 624.92 iter/sec.
Timings for 3360K FFT length (10 cores, 10 workers): 19.33, 19.42, 18.62, 19.35, 19.35, 19.33, 19.50, 19.33, 19.41, 19.39 ms.  Throughput: 518.15 iter/sec.
Timings for 3456K FFT length (10 cores, 1 worker):  1.19 ms.  Throughput: 842.58 iter/sec.
Timings for 3456K FFT length (10 cores, 2 workers):  3.41,  3.25 ms.  Throughput: 600.34 iter/sec.
Timings for 3456K FFT length (10 cores, 10 workers): 19.91, 19.94, 19.86, 19.25, 20.12, 19.86, 19.86, 19.94, 19.88, 19.89 ms.  Throughput: 503.81 iter/sec.
Timings for 3600K FFT length (10 cores, 1 worker):  1.30 ms.  Throughput: 770.75 iter/sec.
Timings for 3600K FFT length (10 cores, 2 workers):  3.50,  3.50 ms.  Throughput: 571.12 iter/sec.
Timings for 3600K FFT length (10 cores, 10 workers): 20.37, 20.37, 20.37, 20.34, 20.37, 20.37, 20.26, 20.26, 20.37, 20.37 ms.  Throughput: 491.45 iter/sec.
Timings for 3840K FFT length (10 cores, 1 worker):  1.41 ms.  Throughput: 709.48 iter/sec.
Timings for 3840K FFT length (10 cores, 2 workers):  3.83,  3.82 ms.  Throughput: 522.78 iter/sec.
[Sun Nov 22 08:03:53 2020]
Timings for 3840K FFT length (10 cores, 10 workers): 21.31, 21.52, 21.38, 21.38, 21.47, 21.37, 21.34, 21.25, 21.41, 21.47 ms.  Throughput: 467.54 iter/sec.
Timings for 3920K FFT length (10 cores, 1 worker):  1.46 ms.  Throughput: 686.86 iter/sec.
Timings for 3920K FFT length (10 cores, 2 workers):  4.04,  4.04 ms.  Throughput: 494.65 iter/sec.
Timings for 3920K FFT length (10 cores, 10 workers): 22.72, 22.57, 22.56, 22.49, 22.55, 22.57, 22.53, 22.49, 22.59, 22.54 ms.  Throughput: 443.24 iter/sec.
Timings for 4032K FFT length (10 cores, 1 worker):  1.47 ms.  Throughput: 679.19 iter/sec.
Timings for 4032K FFT length (10 cores, 2 workers):  4.09,  4.09 ms.  Throughput: 489.04 iter/sec.
Timings for 4032K FFT length (10 cores, 10 workers): 24.21, 24.37, 24.06, 23.69, 24.33, 24.09, 24.00, 23.90, 24.13, 24.33 ms.  Throughput: 414.76 iter/sec.
Timings for 4200K FFT length (10 cores, 1 worker):  1.58 ms.  Throughput: 632.69 iter/sec.
Timings for 4200K FFT length (10 cores, 2 workers):  4.26,  4.22 ms.  Throughput: 471.50 iter/sec.
Timings for 4200K FFT length (10 cores, 10 workers): 23.78, 23.70, 23.65, 23.59, 23.78, 23.60, 23.67, 23.67, 23.70, 23.78 ms.  Throughput: 422.08 iter/sec.
Timings for 4320K FFT length (10 cores, 1 worker):  1.69 ms.  Throughput: 590.92 iter/sec.
Timings for 4320K FFT length (10 cores, 2 workers):  4.42,  4.42 ms.  Throughput: 452.10 iter/sec.
Timings for 4320K FFT length (10 cores, 10 workers): 24.40, 24.39, 24.39, 24.27, 24.39, 24.40, 24.20, 24.32, 24.40, 24.40 ms.  Throughput: 410.59 iter/sec.
Timings for 4480K FFT length (10 cores, 1 worker):  1.81 ms.  Throughput: 553.65 iter/sec.
Timings for 4480K FFT length (10 cores, 2 workers):  4.68,  4.68 ms.  Throughput: 427.63 iter/sec.
Timings for 4480K FFT length (10 cores, 10 workers): 25.67, 25.67, 25.67, 25.50, 25.67, 25.67, 25.67, 25.56, 25.54, 25.67 ms.  Throughput: 390.21 iter/sec.
Timings for 4608K FFT length (10 cores, 1 worker):  1.91 ms.  Throughput: 522.41 iter/sec.
Timings for 4608K FFT length (10 cores, 2 workers):  4.75,  4.95 ms.  Throughput: 412.36 iter/sec.
Timings for 4608K FFT length (10 cores, 10 workers): 26.71, 26.79, 26.71, 26.77, 26.77, 26.77, 26.87, 25.83, 27.17, 26.78 ms.  Throughput: 374.38 iter/sec.
Timings for 4704K FFT length (10 cores, 1 worker):  1.96 ms.  Throughput: 509.37 iter/sec.
[Sun Nov 22 08:09:02 2020]
Timings for 4704K FFT length (10 cores, 2 workers):  4.92,  5.11 ms.  Throughput: 399.20 iter/sec.
Timings for 4704K FFT length (10 cores, 10 workers): 27.50, 27.50, 27.50, 26.91, 27.61, 27.50, 27.50, 27.55, 27.55, 27.55 ms.  Throughput: 364.11 iter/sec.
Timings for 4800K FFT length (10 cores, 1 worker):  2.24 ms.  Throughput: 446.78 iter/sec.
Timings for 4800K FFT length (10 cores, 2 workers):  5.45,  5.51 ms.  Throughput: 365.01 iter/sec.
Timings for 4800K FFT length (10 cores, 10 workers): 28.59, 27.35, 27.29, 27.39, 27.80, 27.60, 27.80, 27.35, 27.53, 28.05 ms.  Throughput: 361.40 iter/sec.
Timings for 5040K FFT length (10 cores, 1 worker):  2.10 ms.  Throughput: 476.41 iter/sec.
Timings for 5040K FFT length (10 cores, 2 workers):  5.42,  5.42 ms.  Throughput: 368.93 iter/sec.
Timings for 5040K FFT length (10 cores, 10 workers): 30.27, 30.09, 30.15, 29.77, 30.15, 29.98, 30.10, 29.96, 30.15, 30.27 ms.  Throughput: 332.35 iter/sec.
Timings for 5120K FFT length (10 cores, 1 worker):  2.19 ms.  Throughput: 456.28 iter/sec.
Timings for 5120K FFT length (10 cores, 2 workers):  5.71,  5.71 ms.  Throughput: 350.21 iter/sec.
Timings for 5120K FFT length (10 cores, 10 workers): 31.26, 30.96, 30.79, 30.68, 31.26, 30.87, 30.86, 30.86, 30.95, 31.26 ms.  Throughput: 322.87 iter/sec.
Timings for 5184K FFT length (10 cores, 1 worker):  2.24 ms.  Throughput: 446.51 iter/sec.
Timings for 5184K FFT length (10 cores, 2 workers):  5.70,  5.69 ms.  Throughput: 351.17 iter/sec.
Timings for 5184K FFT length (10 cores, 10 workers): 30.54, 30.53, 30.66, 30.38, 30.60, 30.67, 30.66, 30.41, 30.66, 30.67 ms.  Throughput: 327.02 iter/sec.
Timings for 5376K FFT length (10 cores, 1 worker):  2.39 ms.  Throughput: 417.94 iter/sec.
Timings for 5376K FFT length (10 cores, 2 workers):  5.91,  5.91 ms.  Throughput: 338.27 iter/sec.
Timings for 5376K FFT length (10 cores, 10 workers): 32.31, 32.52, 32.18, 31.93, 32.45, 32.23, 32.17, 32.11, 32.37, 32.47 ms.  Throughput: 309.86 iter/sec.
Timings for 5760K FFT length (10 cores, 1 worker):  2.90 ms.  Throughput: 344.81 iter/sec.
Timings for 5760K FFT length (10 cores, 2 workers):  6.85,  6.85 ms.  Throughput: 291.83 iter/sec.
Timings for 5760K FFT length (10 cores, 10 workers): 33.58, 33.37, 33.29, 32.72, 33.02, 32.89, 33.00, 32.95, 33.18, 33.25 ms.  Throughput: 301.91 iter/sec.
[Sun Nov 22 08:14:12 2020]
Timings for 6048K FFT length (10 cores, 1 worker):  2.84 ms.  Throughput: 351.93 iter/sec.
Timings for 6048K FFT length (10 cores, 2 workers):  6.77,  6.75 ms.  Throughput: 296.01 iter/sec.
Timings for 6048K FFT length (10 cores, 10 workers): 36.29, 36.30, 36.19, 36.49, 35.93, 36.21, 36.17, 36.02, 36.21, 36.37 ms.  Throughput: 276.11 iter/sec.
Timings for 6144K FFT length (10 cores, 1 worker):  2.95 ms.  Throughput: 339.04 iter/sec.
Timings for 6144K FFT length (10 cores, 2 workers):  7.05,  7.04 ms.  Throughput: 283.89 iter/sec.
Timings for 6144K FFT length (10 cores, 10 workers): 37.96, 37.94, 37.72, 37.39, 38.06, 37.91, 37.93, 37.67, 37.97, 38.25 ms.  Throughput: 264.01 iter/sec.
Timings for 6272K FFT length (10 cores, 1 worker):  3.00 ms.  Throughput: 333.52 iter/sec.
Timings for 6272K FFT length (10 cores, 2 workers):  7.15,  7.15 ms.  Throughput: 279.82 iter/sec.
Timings for 6272K FFT length (10 cores, 10 workers): 38.65, 38.95, 38.35, 38.06, 38.84, 38.48, 38.35, 38.27, 38.64, 38.85 ms.  Throughput: 259.46 iter/sec.
Timings for 6400K FFT length (10 cores, 1 worker):  3.18 ms.  Throughput: 314.58 iter/sec.
Timings for 6400K FFT length (10 cores, 2 workers):  7.34,  7.23 ms.  Throughput: 274.62 iter/sec.
Timings for 6400K FFT length (10 cores, 10 workers): 38.34, 38.45, 38.28, 37.93, 38.48, 38.55, 38.40, 38.32, 38.46, 38.38 ms.  Throughput: 260.70 iter/sec.
Timings for 6720K FFT length (10 cores, 1 worker):  3.36 ms.  Throughput: 298.00 iter/sec.
Timings for 6720K FFT length (10 cores, 2 workers):  7.70,  7.70 ms.  Throughput: 259.66 iter/sec.
Timings for 6720K FFT length (10 cores, 10 workers): 40.31, 40.22, 40.14, 39.95, 40.23, 40.18, 40.16, 40.03, 40.26, 40.45 ms.  Throughput: 248.81 iter/sec.
Timings for 7056K FFT length (10 cores, 1 worker):  3.54 ms.  Throughput: 282.21 iter/sec.
Timings for 7056K FFT length (10 cores, 2 workers):  8.07,  8.07 ms.  Throughput: 247.80 iter/sec.
Timings for 7056K FFT length (10 cores, 10 workers): 42.08, 42.20, 42.05, 41.65, 41.89, 42.04, 42.09, 41.86, 42.15, 42.55 ms.  Throughput: 237.79 iter/sec.
Timings for 7168K FFT length (10 cores, 1 worker):  3.64 ms.  Throughput: 274.72 iter/sec.
Timings for 7168K FFT length (10 cores, 2 workers):  8.38,  8.39 ms.  Throughput: 238.60 iter/sec.
[Sun Nov 22 08:19:26 2020]
Timings for 7168K FFT length (10 cores, 10 workers): 44.38, 44.51, 44.26, 43.75, 44.67, 44.25, 44.24, 44.04, 44.34, 44.78 ms.  Throughput: 225.62 iter/sec.
Timings for 7200K FFT length (10 cores, 1 worker):  3.60 ms.  Throughput: 277.47 iter/sec.
Timings for 7200K FFT length (10 cores, 2 workers):  8.31,  8.30 ms.  Throughput: 240.86 iter/sec.
Timings for 7200K FFT length (10 cores, 10 workers): 44.51, 44.91, 44.36, 43.83, 44.64, 44.38, 44.26, 44.19, 44.42, 44.78 ms.  Throughput: 225.09 iter/sec.
Timings for 7680K FFT length (10 cores, 1 worker):  3.96 ms.  Throughput: 252.74 iter/sec.
Timings for 7680K FFT length (10 cores, 2 workers):  8.94,  8.94 ms.  Throughput: 223.78 iter/sec.
Timings for 7680K FFT length (10 cores, 10 workers): 46.83, 46.98, 46.59, 45.89, 46.98, 46.65, 46.54, 46.39, 46.73, 46.99 ms.  Throughput: 214.34 iter/sec.
Timings for 8064K FFT length (10 cores, 1 worker):  4.37 ms.  Throughput: 228.87 iter/sec.
Timings for 8064K FFT length (10 cores, 2 workers):  9.34,  9.55 ms.  Throughput: 211.75 iter/sec.
Timings for 8064K FFT length (10 cores, 10 workers): 48.23, 48.30, 48.23, 48.34, 48.45, 48.14, 48.38, 48.04, 48.32, 48.64 ms.  Throughput: 207.01 iter/sec.
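
For anyone who wants to compare these runs without reading every line, here is a minimal Python sketch that parses the "Timings for ..." lines and reports which worker count gave the highest throughput at each FFT length. The regular expression, the function names, and the assumption that throughput is roughly the sum of 1000/ms over all workers are mine, not taken from Prime95; the reported figures differ slightly because the printed timings are rounded.

```python
import re

# Pattern for one benchmark line as printed in the log above (an assumption
# based on the visible format, not on Prime95's source code).
LINE_RE = re.compile(
    r"Timings for (\d+)K FFT length \((\d+) cores?, (\d+) workers?\):\s*"
    r"([\d.,\s]+?) ms\.\s+Throughput: ([\d.]+) iter/sec\."
)


def parse_line(line):
    """Return a dict for a benchmark line, or None for timestamps etc."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    fft_k, cores, workers, timings, throughput = m.groups()
    return {
        "fft_k": int(fft_k),
        "cores": int(cores),
        "workers": int(workers),
        "ms": [float(t) for t in timings.split(",")],
        "throughput": float(throughput),
    }


def throughput_from_timings(ms):
    """Approximate total iter/sec: each worker contributes 1000 / ms."""
    return sum(1000.0 / t for t in ms)


def best_worker_count(lines):
    """Map each FFT length (in K) to the worker count with the best throughput."""
    best = {}
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue
        cur = best.get(rec["fft_k"])
        if cur is None or rec["throughput"] > cur["throughput"]:
            best[rec["fft_k"]] = rec
    return {k: v["workers"] for k, v in best.items()}


# A few lines copied from the log above.
SAMPLE = [
    "Timings for 2048K FFT length (10 cores, 1 worker):  0.62 ms.  Throughput: 1621.33 iter/sec.",
    "Timings for 2048K FFT length (10 cores, 2 workers):  1.45,  1.34 ms.  Throughput: 1433.85 iter/sec.",
    "[Sun Nov 22 08:19:26 2020]",
    "Timings for 8064K FFT length (10 cores, 1 worker):  4.37 ms.  Throughput: 228.87 iter/sec.",
    "Timings for 8064K FFT length (10 cores, 2 workers):  9.34,  9.55 ms.  Throughput: 211.75 iter/sec.",
]

if __name__ == "__main__":
    for fft_k, workers in sorted(best_worker_count(SAMPLE).items()):
        print(f"{fft_k}K: best with {workers} worker(s)")
```

Fed the full log, the same function should show whether a single worker using all 10 cores stays ahead of the 2-worker and 10-worker setups across the whole FFT range, or only at the smaller sizes.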
Old 2020-11-23, 19:11   #825
henryzz
Just call me Henry
 
"David"
Sep 2007
Cambridge (GMT/BST)

3·1,951 Posts
Default

Are you able to saturate memory bandwidth without AVX-512? If so, how does power consumption compare at the lowest frequency that still maxes out bandwidth?

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.