2020-08-28, 16:05  #815
"Oliver"
Sep 2017
Porta Westfalica, DE
467 Posts 
With 30.3b3 (but also with version 29), I get really bad multi-core scaling on an Intel i5-9500:
Code:
Compare your results to other computers at http://www.mersenne.org/report_benchmarks
Intel(R) Core(TM) i5-9500 CPU @ 3.00GHz
CPU speed: 4073.68 MHz, 6 cores
CPU features: Prefetchw, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 6x32 KB, L2 cache size: 6x256 KB, L3 cache size: 9 MB
L1 cache line size: 64 bytes, L2 cache line size: 64 bytes
Machine topology as determined by hwloc library:
Machine#0 (total=3138356KB, Backend=Windows, hwlocVersion=2.2.0, ProcessName=prime95.exe)
  Package (total=3138356KB, CPUVendor=GenuineIntel, CPUFamilyNumber=6, CPUModelNumber=158, CPUModel="Intel(R) Core(TM) i5-9500 CPU @ 3.00GHz", CPUStepping=10)
    L3 (size=9216KB, linesize=64, ways=12, Inclusive=1)
      L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000001)
            PU#0 (cpuset: 0x00000001)
      L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000002)
            PU#1 (cpuset: 0x00000002)
      L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000004)
            PU#2 (cpuset: 0x00000004)
      L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000008)
            PU#3 (cpuset: 0x00000008)
      L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000010)
            PU#4 (cpuset: 0x00000010)
      L2 (size=256KB, linesize=64, ways=4, Inclusive=0)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000020)
            PU#5 (cpuset: 0x00000020)
Prime95 64-bit version 30.3, RdtscTiming=1
Timings for 3072K FFT length (1 core, 1 worker): 10.92 ms. Throughput: 91.61 iter/sec.
Timings for 3072K FFT length (2 cores, 1 worker): 7.80 ms. Throughput: 128.14 iter/sec.
Timings for 3072K FFT length (3 cores, 1 worker): 7.61 ms. Throughput: 131.42 iter/sec.
Timings for 3072K FFT length (6 cores, 1 worker): 7.98 ms. Throughput: 125.37 iter/sec.
[snip of the prelude data again...]
Timings for 6144K FFT length (1 core, 1 worker): 22.32 ms. Throughput: 44.80 iter/sec.
Timings for 6144K FFT length (2 cores, 1 worker): 15.69 ms. Throughput: 63.74 iter/sec.
Timings for 6144K FFT length (3 cores, 1 worker): 15.86 ms. Throughput: 63.04 iter/sec.
Timings for 6144K FFT length (6 cores, 1 worker): 17.18 ms. Throughput: 58.20 iter/sec.
Last fiddled with by kruoli on 2020-08-28 at 16:06 Reason: Missing letter.
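To put a number on how poorly this scales, here is a minimal Python sketch that turns the 3072K throughput figures above into speedup and per-core efficiency relative to the 1-core run. The function name `scaling_table` is made up for illustration; only the four throughput values come from the benchmark.

```python
# Throughput (iter/sec) for the 3072K FFT, copied from the benchmark above.
# Keys are the number of cores used by the single worker.
throughput_3072k = {1: 91.61, 2: 128.14, 3: 131.42, 6: 125.37}


def scaling_table(throughput_by_cores):
    """Return (cores, speedup, efficiency) rows relative to the 1-core result.

    speedup    = throughput(n cores) / throughput(1 core)
    efficiency = speedup / n   (1.0 would be perfect linear scaling)
    """
    base = throughput_by_cores[1]
    rows = []
    for cores in sorted(throughput_by_cores):
        speedup = throughput_by_cores[cores] / base
        rows.append((cores, speedup, speedup / cores))
    return rows


for cores, speedup, efficiency in scaling_table(throughput_3072k):
    print(f"{cores} cores: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

On these numbers the speedup stays at roughly 1.4x from 2 cores onward, so adding cores beyond the second buys essentially nothing at this FFT length.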
2020-08-28, 16:41  #816
P90 years forever!
Aug 2002
Yeehaw, FL
11×673 Posts 

2020-08-29, 08:49  #817
"Oliver"
Sep 2017
Porta Westfalica, DE
111010011_{2} Posts 
Yes. So it is a severe memory bottleneck? It's a work machine we are preparing for delivery. Beforehand, we always run some tests. I'll bring that up at work, since our software (which will run on that machine) is also vectorization-aware and will be limited by that, too, I guess (though of course much less than the highly optimized gwnum code).
Last fiddled with by kruoli on 2020-08-29 at 08:50 Reason: Clarified intentions.
2020-09-10, 17:40  #818
"Joe"
Oct 2019
United States
2^{2}·19 Posts 
Viliam, how did you go about parsing results.bench? I'm trying to do this in Excel on about 4,500 records, but I assume there has to be an easier way, as I'm not finding a single clean delimiter. Does P95 provide a parsed benchmark output file that I'm overlooking?

2020-09-11, 18:45  #819
"Viliam Furík"
Jul 2018
Martin, Slovakia
2·13·17 Posts 
I did it the hard way... I copied them manually into the spreadsheet, one by one.
If you want, you can send me the rows by email and I will write some quick Python code to put them into a usable .csv format, and then I will send you back a spreadsheet.
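For anyone else facing the same task, here is a rough Python sketch of such a converter. It is not Viliam's code, just one way to do it: instead of hunting for a delimiter, it matches each "Timings for ..." record with a regular expression, so it does not matter whether the records are on separate lines or run together. The pattern is based only on the lines posted in this thread; the optional "hyperthreaded" wording and the column names are assumptions.

```python
import csv
import io
import re

# One benchmark record, e.g.
#   Timings for 256K FFT length (8 cores, 2 workers): 0.33, 0.33 ms. Throughput: 6127.31 iter/sec.
# The " hyperthreaded" part is an assumption about how HT runs are labelled.
TIMING_RE = re.compile(
    r"Timings for (?P<fft_k>\d+)K FFT length "
    r"\((?P<cores>\d+) cores?(?P<ht> hyperthreaded)?, (?P<workers>\d+) workers?\):\s+"
    r"(?P<timings>[\d.]+(?:,\s*[\d.]+)*)\s+ms\.\s+"
    r"Throughput:\s+(?P<throughput>[\d.]+) iter/sec\."
)

FIELDS = ["fft_k", "cores", "hyperthreaded", "workers", "mean_ms", "throughput"]


def parse_benchmarks(text):
    """Extract every benchmark record found anywhere in `text`."""
    rows = []
    for m in TIMING_RE.finditer(text):
        # Multi-worker runs list one timing per worker; keep their mean.
        timings = [float(t) for t in m["timings"].split(",")]
        rows.append({
            "fft_k": int(m["fft_k"]),
            "cores": int(m["cores"]),
            "hyperthreaded": m["ht"] is not None,
            "workers": int(m["workers"]),
            "mean_ms": round(sum(timings) / len(timings), 3),
            "throughput": float(m["throughput"]),
        })
    return rows


def to_csv(rows):
    """Render the parsed rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Two records copied from benchmarks posted in this thread.
SAMPLE = (
    "Timings for 3072K FFT length (1 core, 1 worker): 10.92 ms. Throughput: 91.61 iter/sec. "
    "Timings for 256K FFT length (8 cores, 2 workers): 0.33, 0.33 ms. Throughput: 6127.31 iter/sec."
)
print(to_csv(parse_benchmarks(SAMPLE)), end="")
```

To run it on real data, replace `SAMPLE` with the contents of the benchmark file and write the result of `to_csv` to a .csv file that a spreadsheet can open.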
2020-09-11, 23:07  #820
"Joe"
Oct 2019
United States
2^{2}×19 Posts 
Quote:
Well, not exactly. I found enough unique delimiters to parse the data in stages. All I need to do now is load it into a statistical analysis tool and see what it tells me. That said, based on a cursory review, I'm fairly confident that one 8-core worker is optimal for this i9-9900KF. Thanks again!

2020-09-12, 18:19  #821
"Joe"
Oct 2019
United States
1001100_{2} Posts 
Well, assuming I'm looking at these benchmark results through an appropriate lens, it appears that 1 worker is (on average) a ~24% improvement over the next best alternative of 2 workers, and HT carries a ~13% throughput penalty for this i9-9900KF.

2020-09-12, 23:56  #822
"/X\(‘‘)/X\"
Jan 2013
2929_{10} Posts 
Sounds reasonable.

2020-10-13, 11:04  #823
"Composite as Heck"
Oct 2017
1425_{8} Posts 
AMD Ryzen 4700u (8C8T Zen 2 mobile), 2x16GB 3200 DDR4 SODIMM CL22
Code:
Compare your results to other computers at http://www.mersenne.org/report_benchmarks AMD Ryzen 7 4700U with Radeon Graphics CPU speed: 4192.05 MHz, 8 cores CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA L1 cache size: 8x32 KB, L2 cache size: 8x512 KB, L3 cache size: 2x4 MB L1 cache line size: 64 bytes, L2 cache line size: 64 bytes Machine topology as determined by hwloc library: Machine#0 (total=32354192KB, DMIProductName="MINIPC PN50", DMIProductVersion=0409, DMIBoardVendor="ASUSTeK COMPUTER INC.", DMIBoardName=PN50, DMIBoardVersion="To be filled by O.E.M.", DMIBoardAssetTag="Default string", DMIChassisVendor="Default string", DMIChassisType=35, DMIChassisVersion="Default string", DMIChassisAssetTag="Default string", DMIBIOSVendor="ASUSTeK COMPUTER INC.", DMIBIOSVersion=0409, DMIBIOSDate=06/30/2020, DMISysVendor="ASUSTeK COMPUTER INC.", Backend=Linux, LinuxCgroup=/, OSName=Linux, OSRelease=5.4.048generic, OSVersion="#52Ubuntu SMP Thu Sep 10 10:58:49 UTC 2020", HostName=pn50, Architecture=x86_64, hwlocVersion=2.0.4, ProcessName=mprime) Package#0 (total=32354192KB, CPUVendor=AuthenticAMD, CPUFamilyNumber=23, CPUModelNumber=96, CPUModel="AMD Ryzen 7 4700U with Radeon Graphics ", CPUStepping=1) L3 (size=4096KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core#0 (cpuset: 0x00000001) PU#0 (cpuset: 0x00000001) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core#1 (cpuset: 0x00000002) PU#1 (cpuset: 0x00000002) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core#2 (cpuset: 0x00000004) PU#2 (cpuset: 0x00000004) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core#3 (cpuset: 0x00000008) PU#3 (cpuset: 0x00000008) L3 (size=4096KB, linesize=64, ways=16, Inclusive=0) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) 
L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core#4 (cpuset: 0x00000010) PU#4 (cpuset: 0x00000010) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core#5 (cpuset: 0x00000020) PU#5 (cpuset: 0x00000020) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core#6 (cpuset: 0x00000040) PU#6 (cpuset: 0x00000040) L2 (size=512KB, linesize=64, ways=8, Inclusive=1) L1d (size=32KB, linesize=64, ways=8, Inclusive=0) Core#7 (cpuset: 0x00000080) PU#7 (cpuset: 0x00000080) Prime95 64bit version 30.3, RdtscTiming=1 Timings for 256K FFT length (8 cores, 1 worker): 0.28 ms. Throughput: 3533.70 iter/sec. Timings for 256K FFT length (8 cores, 2 workers): 0.33, 0.33 ms. Throughput: 6127.31 iter/sec. Timings for 256K FFT length (8 cores, 8 workers): 2.22, 2.23, 2.24, 2.24, 2.25, 2.25, 2.24, 2.27 ms. Throughput: 3564.11 iter/sec. Timings for 280K FFT length (8 cores, 1 worker): 0.36 ms. Throughput: 2755.54 iter/sec. Timings for 280K FFT length (8 cores, 2 workers): 0.47, 0.47 ms. Throughput: 4254.04 iter/sec. Timings for 280K FFT length (8 cores, 8 workers): 2.67, 2.65, 2.67, 2.67, 2.69, 2.68, 2.72, 2.71 ms. Throughput: 2982.84 iter/sec. Timings for 288K FFT length (8 cores, 1 worker): 0.36 ms. Throughput: 2790.67 iter/sec. Timings for 288K FFT length (8 cores, 2 workers): 0.49, 0.49 ms. Throughput: 4106.36 iter/sec. Timings for 288K FFT length (8 cores, 8 workers): 2.65, 2.65, 2.65, 2.64, 2.66, 2.67, 2.66, 2.69 ms. Throughput: 3009.30 iter/sec. Timings for 320K FFT length (8 cores, 1 worker): 0.37 ms. Throughput: 2708.26 iter/sec. Timings for 320K FFT length (8 cores, 2 workers): 0.54, 0.55 ms. Throughput: 3670.88 iter/sec. Timings for 320K FFT length (8 cores, 8 workers): 3.27, 3.27, 3.27, 3.28, 3.26, 3.29, 3.28, 3.30 ms. Throughput: 2440.51 iter/sec. Timings for 336K FFT length (8 cores, 1 worker): 0.34 ms. Throughput: 2924.26 iter/sec. 
Timings for 336K FFT length (8 cores, 2 workers): 0.60, 0.59 ms. Throughput: 3360.83 iter/sec. Timings for 336K FFT length (8 cores, 8 workers): 3.41, 3.43, 3.48, 3.41, 3.38, 3.42, 3.40, 3.39 ms. Throughput: 2342.98 iter/sec. Timings for 384K FFT length (8 cores, 1 worker): 0.37 ms. Throughput: 2668.51 iter/sec. [Tue Oct 13 10:37:12 2020] Timings for 384K FFT length (8 cores, 2 workers): 0.66, 0.66 ms. Throughput: 3016.13 iter/sec. Timings for 384K FFT length (8 cores, 8 workers): 4.08, 4.10, 4.09, 4.08, 4.06, 4.07, 4.09, 4.07 ms. Throughput: 1960.66 iter/sec. Timings for 400K FFT length (8 cores, 1 worker): 0.40 ms. Throughput: 2498.94 iter/sec. Timings for 400K FFT length (8 cores, 2 workers): 0.72, 0.71 ms. Throughput: 2793.35 iter/sec. Timings for 400K FFT length (8 cores, 8 workers): 4.28, 4.31, 4.30, 4.29, 4.32, 4.35, 4.32, 4.35 ms. Throughput: 1853.76 iter/sec. Timings for 448K FFT length (8 cores, 1 worker): 0.45 ms. Throughput: 2246.91 iter/sec. Timings for 448K FFT length (8 cores, 2 workers): 0.81, 0.80 ms. Throughput: 2486.94 iter/sec. Timings for 448K FFT length (8 cores, 8 workers): 4.60, 4.62, 4.61, 4.59, 4.60, 4.60, 4.61, 4.63 ms. Throughput: 1735.87 iter/sec. Timings for 480K FFT length (8 cores, 1 worker): 0.47 ms. Throughput: 2118.65 iter/sec. Timings for 480K FFT length (8 cores, 2 workers): 0.91, 0.89 ms. Throughput: 2213.58 iter/sec. Timings for 480K FFT length (8 cores, 8 workers): 5.26, 5.27, 5.31, 5.29, 5.28, 5.28, 5.29, 5.34 ms. Throughput: 1512.20 iter/sec. Timings for 512K FFT length (8 cores, 1 worker): 0.50 ms. Throughput: 2001.00 iter/sec. Timings for 512K FFT length (8 cores, 2 workers): 0.97, 0.97 ms. Throughput: 2065.31 iter/sec. Timings for 512K FFT length (8 cores, 8 workers): 5.76, 5.75, 5.82, 5.78, 5.75, 5.78, 5.76, 5.76 ms. Throughput: 1386.66 iter/sec. Timings for 560K FFT length (8 cores, 1 worker): 0.57 ms. Throughput: 1749.02 iter/sec. Timings for 560K FFT length (8 cores, 2 workers): 1.14, 1.10 ms. 
Throughput: 1783.52 iter/sec. Timings for 560K FFT length (8 cores, 8 workers): 6.25, 6.25, 6.23, 6.23, 6.28, 6.28, 6.27, 6.29 ms. Throughput: 1278.05 iter/sec. Timings for 640K FFT length (8 cores, 1 worker): 0.63 ms. Throughput: 1584.40 iter/sec. Timings for 640K FFT length (8 cores, 2 workers): 1.31, 1.26 ms. Throughput: 1555.01 iter/sec. Timings for 640K FFT length (8 cores, 8 workers): 7.30, 7.26, 7.23, 7.28, 7.30, 7.29, 7.27, 7.29 ms. Throughput: 1099.39 iter/sec. [Tue Oct 13 10:42:22 2020] Timings for 672K FFT length (8 cores, 1 worker): 0.69 ms. Throughput: 1454.69 iter/sec. Timings for 672K FFT length (8 cores, 2 workers): 1.39, 1.37 ms. Throughput: 1446.29 iter/sec. Timings for 672K FFT length (8 cores, 8 workers): 7.57, 7.54, 7.58, 7.55, 7.58, 7.54, 7.56, 7.53 ms. Throughput: 1058.79 iter/sec. Timings for 768K FFT length (8 cores, 1 worker): 0.76 ms. Throughput: 1321.58 iter/sec. Timings for 768K FFT length (8 cores, 2 workers): 1.69, 1.66 ms. Throughput: 1193.21 iter/sec. Timings for 768K FFT length (8 cores, 8 workers): 8.81, 8.75, 8.76, 8.77, 8.72, 8.74, 8.75, 8.76 ms. Throughput: 913.47 iter/sec. Timings for 800K FFT length (8 cores, 1 worker): 0.83 ms. Throughput: 1199.87 iter/sec. Timings for 800K FFT length (8 cores, 2 workers): 1.92, 1.85 ms. Throughput: 1059.68 iter/sec. Timings for 800K FFT length (8 cores, 8 workers): 9.12, 9.22, 9.16, 9.21, 9.23, 9.24, 9.25, 9.21 ms. Throughput: 869.01 iter/sec. Timings for 896K FFT length (8 cores, 1 worker): 0.93 ms. Throughput: 1073.83 iter/sec. Timings for 896K FFT length (8 cores, 2 workers): 2.15, 2.11 ms. Throughput: 937.83 iter/sec. Timings for 896K FFT length (8 cores, 8 workers): 10.23, 10.19, 10.28, 10.28, 10.25, 10.25, 10.26, 10.25 ms. Throughput: 780.58 iter/sec. Timings for 960K FFT length (8 cores, 1 worker): 1.02 ms. Throughput: 983.24 iter/sec. Timings for 960K FFT length (8 cores, 2 workers): 2.36, 2.32 ms. Throughput: 854.15 iter/sec. 
Timings for 960K FFT length (8 cores, 8 workers): 11.04, 11.02, 11.06, 11.11, 11.07, 11.08, 11.00, 11.10 ms. Throughput: 723.38 iter/sec. Timings for 1024K FFT length (8 cores, 1 worker): 1.07 ms. Throughput: 935.44 iter/sec. Timings for 1024K FFT length (8 cores, 2 workers): 2.54, 2.55 ms. Throughput: 786.55 iter/sec. Timings for 1024K FFT length (8 cores, 8 workers): 11.79, 11.77, 11.80, 11.75, 11.77, 11.75, 11.76, 11.75 ms. Throughput: 679.81 iter/sec. Timings for 1120K FFT length (8 cores, 1 worker): 1.24 ms. Throughput: 807.39 iter/sec. [Tue Oct 13 10:47:23 2020] Timings for 1120K FFT length (8 cores, 2 workers): 2.92, 2.96 ms. Throughput: 680.08 iter/sec. Timings for 1120K FFT length (8 cores, 8 workers): 12.88, 12.86, 12.91, 12.78, 12.87, 12.88, 12.95, 12.95 ms. Throughput: 620.92 iter/sec. Timings for 1152K FFT length (8 cores, 1 worker): 1.26 ms. Throughput: 793.57 iter/sec. Timings for 1152K FFT length (8 cores, 2 workers): 2.94, 2.90 ms. Throughput: 684.48 iter/sec. Timings for 1152K FFT length (8 cores, 8 workers): 12.44, 12.52, 12.51, 12.61, 12.51, 12.54, 12.52, 12.51 ms. Throughput: 638.95 iter/sec. Timings for 1280K FFT length (8 cores, 1 worker): 1.43 ms. Throughput: 697.43 iter/sec. Timings for 1280K FFT length (8 cores, 2 workers): 3.50, 3.44 ms. Throughput: 575.93 iter/sec. Timings for 1280K FFT length (8 cores, 8 workers): 14.75, 14.74, 14.53, 14.86, 14.69, 14.69, 14.72, 14.72 ms. Throughput: 543.80 iter/sec. Timings for 1344K FFT length (8 cores, 1 worker): 1.56 ms. Throughput: 642.43 iter/sec. Timings for 1344K FFT length (8 cores, 2 workers): 3.53, 3.48 ms. Throughput: 570.74 iter/sec. Timings for 1344K FFT length (8 cores, 8 workers): 14.58, 14.59, 14.50, 14.73, 14.61, 14.62, 14.63, 14.61 ms. Throughput: 547.64 iter/sec. Timings for 1440K FFT length (8 cores, 1 worker): 1.78 ms. Throughput: 563.11 iter/sec. Timings for 1440K FFT length (8 cores, 2 workers): 3.77, 3.81 ms. Throughput: 527.34 iter/sec. 
Timings for 1440K FFT length (8 cores, 8 workers): 15.64, 15.81, 15.76, 15.76, 15.82, 15.83, 15.88, 15.87 ms. Throughput: 506.46 iter/sec. Timings for 1536K FFT length (8 cores, 1 worker): 1.87 ms. Throughput: 535.48 iter/sec. Timings for 1536K FFT length (8 cores, 2 workers): 4.60, 4.27 ms. Throughput: 451.66 iter/sec. Timings for 1536K FFT length (8 cores, 8 workers): 17.72, 17.73, 17.70, 17.80, 17.52, 17.81, 17.90, 17.89 ms. Throughput: 450.50 iter/sec. Timings for 1600K FFT length (8 cores, 1 worker): 2.01 ms. Throughput: 497.74 iter/sec. Timings for 1600K FFT length (8 cores, 2 workers): 4.30, 4.29 ms. Throughput: 465.49 iter/sec. [Tue Oct 13 10:52:36 2020] Timings for 1600K FFT length (8 cores, 8 workers): 17.54, 17.58, 17.60, 17.63, 17.58, 17.60, 17.71, 17.62 ms. Throughput: 454.31 iter/sec. Timings for 1680K FFT length (8 cores, 1 worker): 2.16 ms. Throughput: 462.86 iter/sec. Timings for 1680K FFT length (8 cores, 2 workers): 4.53, 4.49 ms. Throughput: 443.50 iter/sec. Timings for 1680K FFT length (8 cores, 8 workers): 18.25, 18.26, 18.38, 18.16, 18.37, 18.35, 18.40, 18.38 ms. Throughput: 436.66 iter/sec. Timings for 1792K FFT length (8 cores, 1 worker): 2.30 ms. Throughput: 434.65 iter/sec. Timings for 1792K FFT length (8 cores, 2 workers): 5.15, 5.21 ms. Throughput: 385.80 iter/sec. Timings for 1792K FFT length (8 cores, 8 workers): 20.88, 20.79, 20.84, 20.40, 20.85, 20.92, 20.87, 20.89 ms. Throughput: 384.58 iter/sec. Timings for 1920K FFT length (8 cores, 1 worker): 2.52 ms. Throughput: 397.57 iter/sec. Timings for 1920K FFT length (8 cores, 2 workers): 5.23, 5.20 ms. Throughput: 383.32 iter/sec. Timings for 1920K FFT length (8 cores, 8 workers): 20.85, 21.01, 21.05, 21.01, 21.17, 21.12, 21.19, 21.16 ms. Throughput: 379.67 iter/sec. Timings for 2048K FFT length (8 cores, 1 worker): 2.70 ms. Throughput: 370.88 iter/sec. Timings for 2048K FFT length (8 cores, 2 workers): 6.07, 6.12 ms. Throughput: 328.14 iter/sec. 
Timings for 2048K FFT length (8 cores, 8 workers): 24.00, 23.63, 24.26, 24.26, 23.77, 23.76, 23.82, 23.78 ms. Throughput: 334.59 iter/sec. Timings for 2240K FFT length (8 cores, 1 worker): 3.02 ms. Throughput: 331.60 iter/sec. Timings for 2240K FFT length (8 cores, 2 workers): 6.64, 6.50 ms. Throughput: 304.50 iter/sec. Timings for 2240K FFT length (8 cores, 8 workers): 25.38, 25.95, 26.09, 26.13, 26.05, 25.83, 25.98, 26.08 ms. Throughput: 308.48 iter/sec. Timings for 2304K FFT length (8 cores, 1 worker): 3.09 ms. Throughput: 323.45 iter/sec. Timings for 2304K FFT length (8 cores, 2 workers): 6.64, 6.69 ms. Throughput: 300.22 iter/sec. [Tue Oct 13 10:57:42 2020] Timings for 2304K FFT length (8 cores, 8 workers): 26.60, 26.87, 26.81, 26.19, 26.76, 26.60, 26.66, 26.69 ms. Throughput: 300.23 iter/sec. Timings for 2400K FFT length (8 cores, 1 worker): 3.27 ms. Throughput: 306.07 iter/sec. Timings for 2400K FFT length (8 cores, 2 workers): 7.16, 7.02 ms. Throughput: 282.13 iter/sec. Timings for 2400K FFT length (8 cores, 8 workers): 27.87, 27.23, 28.03, 27.85, 27.90, 27.87, 28.05, 27.87 ms. Throughput: 287.46 iter/sec. Timings for 2560K FFT length (8 cores, 1 worker): 3.53 ms. Throughput: 283.39 iter/sec. Timings for 2560K FFT length (8 cores, 2 workers): 7.52, 7.56 ms. Throughput: 265.31 iter/sec. Timings for 2560K FFT length (8 cores, 8 workers): 29.71, 29.77, 29.87, 29.82, 29.93, 29.13, 30.08, 29.91 ms. Throughput: 268.68 iter/sec. Timings for 2688K FFT length (8 cores, 1 worker): 3.69 ms. Throughput: 271.16 iter/sec. Timings for 2688K FFT length (8 cores, 2 workers): 7.87, 7.87 ms. Throughput: 254.06 iter/sec. Timings for 2688K FFT length (8 cores, 8 workers): 30.88, 31.02, 31.37, 31.27, 30.56, 31.16, 31.31, 31.47 ms. Throughput: 257.01 iter/sec. Timings for 2800K FFT length (8 cores, 1 worker): 3.92 ms. Throughput: 255.33 iter/sec. Timings for 2800K FFT length (8 cores, 2 workers): 8.23, 8.26 ms. Throughput: 242.54 iter/sec. 
Timings for 2800K FFT length (8 cores, 8 workers): 32.62, 32.70, 32.85, 32.71, 32.83, 32.63, 32.68, 32.66 ms. Throughput: 244.57 iter/sec. Timings for 2880K FFT length (8 cores, 1 worker): 4.00 ms. Throughput: 249.91 iter/sec. Timings for 2880K FFT length (8 cores, 2 workers): 8.44, 8.55 ms. Throughput: 235.39 iter/sec. Timings for 2880K FFT length (8 cores, 8 workers): 33.18, 33.29, 33.54, 33.73, 33.21, 33.68, 33.45, 32.83 ms. Throughput: 239.79 iter/sec. Timings for 3072K FFT length (8 cores, 1 worker): 4.29 ms. Throughput: 233.08 iter/sec. Timings for 3072K FFT length (8 cores, 2 workers): 8.60, 8.59 ms. Throughput: 232.59 iter/sec. [Tue Oct 13 11:03:02 2020] Timings for 3072K FFT length (8 cores, 8 workers): 33.67, 33.54, 33.80, 33.69, 33.78, 33.80, 34.00, 33.62 ms. Throughput: 237.13 iter/sec. Timings for 3200K FFT length (8 cores, 1 worker): 4.85 ms. Throughput: 206.34 iter/sec. Timings for 3200K FFT length (8 cores, 2 workers): 9.46, 9.65 ms. Throughput: 209.36 iter/sec. Timings for 3200K FFT length (8 cores, 8 workers): 37.27, 37.24, 37.33, 37.51, 36.61, 37.36, 37.30, 37.47 ms. Throughput: 214.71 iter/sec. Timings for 3360K FFT length (8 cores, 1 worker): 4.99 ms. Throughput: 200.49 iter/sec. Timings for 3360K FFT length (8 cores, 2 workers): 9.84, 9.93 ms. Throughput: 202.32 iter/sec. Timings for 3360K FFT length (8 cores, 8 workers): 39.16, 39.03, 38.83, 38.79, 38.88, 38.41, 39.11, 39.21 ms. Throughput: 205.52 iter/sec. Timings for 3584K FFT length (8 cores, 1 worker): 5.13 ms. Throughput: 194.77 iter/sec. Timings for 3584K FFT length (8 cores, 2 workers): 10.02, 10.15 ms. Throughput: 198.32 iter/sec. Timings for 3584K FFT length (8 cores, 8 workers): 39.22, 39.28, 39.23, 39.36, 39.25, 39.45, 39.83, 39.55 ms. Throughput: 203.07 iter/sec. Timings for 3840K FFT length (8 cores, 1 worker): 5.79 ms. Throughput: 172.80 iter/sec. Timings for 3840K FFT length (8 cores, 2 workers): 11.36, 11.39 ms. Throughput: 175.81 iter/sec. 
Timings for 3840K FFT length (8 cores, 8 workers): 44.70, 44.64, 44.84, 44.26, 44.86, 44.75, 45.08, 45.26 ms. Throughput: 178.58 iter/sec. Timings for 4096K FFT length (8 cores, 1 worker): 5.89 ms. Throughput: 169.72 iter/sec. Timings for 4096K FFT length (8 cores, 2 workers): 11.55, 11.51 ms. Throughput: 173.47 iter/sec. Timings for 4096K FFT length (8 cores, 8 workers): 45.31, 45.32, 45.35, 45.12, 45.70, 44.74, 45.87, 45.11 ms. Throughput: 176.55 iter/sec. Timings for 4480K FFT length (8 cores, 1 worker): 6.41 ms. Throughput: 156.11 iter/sec. [Tue Oct 13 11:08:14 2020] Timings for 4480K FFT length (8 cores, 2 workers): 12.50, 12.61 ms. Throughput: 159.32 iter/sec. Timings for 4480K FFT length (8 cores, 8 workers): 49.02, 48.99, 49.54, 49.16, 49.13, 49.25, 49.42, 49.22 ms. Throughput: 162.55 iter/sec. Timings for 4608K FFT length (8 cores, 1 worker): 7.03 ms. Throughput: 142.16 iter/sec. Timings for 4608K FFT length (8 cores, 2 workers): 13.60, 13.92 ms. Throughput: 145.39 iter/sec. Timings for 4608K FFT length (8 cores, 8 workers): 54.05, 54.26, 52.41, 54.15, 55.14, 53.56, 53.51, 53.45 ms. Throughput: 148.68 iter/sec. Timings for 4800K FFT length (8 cores, 1 worker): 7.32 ms. Throughput: 136.56 iter/sec. Timings for 4800K FFT length (8 cores, 2 workers): 14.23, 14.41 ms. Throughput: 139.64 iter/sec. Timings for 4800K FFT length (8 cores, 8 workers): 55.67, 55.70, 55.76, 55.88, 56.73, 55.78, 56.59, 56.64 ms. Throughput: 142.63 iter/sec. Timings for 5120K FFT length (8 cores, 1 worker): 7.81 ms. Throughput: 128.05 iter/sec. Timings for 5120K FFT length (8 cores, 2 workers): 15.22, 15.22 ms. Throughput: 131.37 iter/sec. Timings for 5120K FFT length (8 cores, 8 workers): 59.81, 59.89, 59.87, 60.01, 60.25, 60.04, 60.36, 59.09 ms. Throughput: 133.53 iter/sec. Timings for 5376K FFT length (8 cores, 1 worker): 8.28 ms. Throughput: 120.81 iter/sec. Timings for 5376K FFT length (8 cores, 2 workers): 16.03, 16.20 ms. Throughput: 124.12 iter/sec. 
Timings for 5376K FFT length (8 cores, 8 workers): 63.61, 63.50, 63.48, 61.76, 62.52, 62.93, 62.86, 63.18 ms. Throughput: 127.03 iter/sec. Timings for 5600K FFT length (8 cores, 1 worker): 8.08 ms. Throughput: 123.79 iter/sec. Timings for 5600K FFT length (8 cores, 2 workers): 15.86, 15.87 ms. Throughput: 126.07 iter/sec. [Tue Oct 13 11:13:41 2020] Timings for 5600K FFT length (8 cores, 8 workers): 62.63, 62.67, 62.63, 62.37, 61.66, 63.12, 62.95, 62.81 ms. Throughput: 127.79 iter/sec. Timings for 5760K FFT length (8 cores, 1 worker): 8.40 ms. Throughput: 119.02 iter/sec. Timings for 5760K FFT length (8 cores, 2 workers): 16.36, 16.52 ms. Throughput: 121.67 iter/sec. Timings for 5760K FFT length (8 cores, 8 workers): 64.24, 64.54, 64.04, 64.72, 64.57, 65.03, 64.24, 64.28 ms. Throughput: 124.12 iter/sec. Timings for 6144K FFT length (8 cores, 1 worker): 9.43 ms. Throughput: 106.03 iter/sec. Timings for 6144K FFT length (8 cores, 2 workers): 18.55, 18.53 ms. Throughput: 107.89 iter/sec. Timings for 6144K FFT length (8 cores, 8 workers): 73.07, 73.05, 73.01, 73.06, 72.11, 72.04, 71.84, 72.15 ms. Throughput: 110.29 iter/sec. Timings for 6400K FFT length (8 cores, 1 worker): 9.81 ms. Throughput: 101.95 iter/sec. Timings for 6400K FFT length (8 cores, 2 workers): 19.00, 19.16 ms. Throughput: 104.84 iter/sec. Timings for 6400K FFT length (8 cores, 8 workers): 75.80, 75.86, 75.96, 75.79, 75.36, 75.14, 75.22, 74.55 ms. Throughput: 106.02 iter/sec. Timings for 6720K FFT length (8 cores, 1 worker): 9.65 ms. Throughput: 103.65 iter/sec. Timings for 6720K FFT length (8 cores, 2 workers): 18.92, 19.16 ms. Throughput: 105.06 iter/sec. Timings for 6720K FFT length (8 cores, 8 workers): 74.57, 74.36, 74.63, 74.80, 74.85, 74.95, 74.95, 75.59 ms. Throughput: 106.90 iter/sec. Timings for 7168K FFT length (8 cores, 1 worker): 11.00 ms. Throughput: 90.90 iter/sec. Timings for 7168K FFT length (8 cores, 2 workers): 21.69, 21.38 ms. Throughput: 92.88 iter/sec. 
[Tue Oct 13 11:19:02 2020] Timings for 7168K FFT length (8 cores, 8 workers): 84.37, 82.63, 83.34, 84.50, 84.59, 84.71, 84.97, 85.40 ms. Throughput: 94.89 iter/sec. Timings for 7680K FFT length (8 cores, 1 worker): 11.24 ms. Throughput: 89.00 iter/sec. Timings for 7680K FFT length (8 cores, 2 workers): 21.81, 22.15 ms. Throughput: 90.99 iter/sec. Timings for 7680K FFT length (8 cores, 8 workers): 86.31, 86.58, 86.36, 86.67, 86.89, 86.47, 86.63, 86.82 ms. Throughput: 92.39 iter/sec. Timings for 8000K FFT length (8 cores, 1 worker): 12.33 ms. Throughput: 81.11 iter/sec. Timings for 8000K FFT length (8 cores, 2 workers): 23.89, 23.94 ms. Throughput: 83.63 iter/sec. Timings for 8000K FFT length (8 cores, 8 workers): 94.09, 92.40, 94.26, 94.46, 95.22, 94.32, 95.04, 95.17 ms. Throughput: 84.78 iter/sec. Timings for 8064K FFT length (8 cores, 1 worker): 12.39 ms. Throughput: 80.70 iter/sec. Timings for 8064K FFT length (8 cores, 2 workers): 24.31, 24.35 ms. Throughput: 82.19 iter/sec. Timings for 8064K FFT length (8 cores, 8 workers): 94.36, 95.06, 94.72, 95.16, 95.87, 95.30, 95.06, 96.13 ms. Throughput: 84.03 iter/sec. Timings for 8192K FFT length (8 cores, 1 worker): 12.70 ms. Throughput: 78.73 iter/sec. Timings for 8192K FFT length (8 cores, 2 workers): 24.62, 24.57 ms. Throughput: 81.30 iter/sec. Timings for 8192K FFT length (8 cores, 8 workers): 96.59, 96.15, 96.33, 97.07, 96.82, 97.03, 97.26, 97.43 ms. Throughput: 82.62 iter/sec. 
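To read a dump like this at a glance, a small Python sketch can pick the worker count with the highest throughput at each FFT length and show how far ahead it is of the runner-up. The figures below are copied from four of the FFT lengths in the 4700U benchmark above; the helper name `best_configuration` is made up for illustration.

```python
# Throughput (iter/sec) by worker count, copied from the 4700U benchmark
# above for four of the FFT lengths (keys are the FFT length in K).
throughput = {
    256: {1: 3533.70, 2: 6127.31, 8: 3564.11},
    1024: {1: 935.44, 2: 786.55, 8: 679.81},
    3072: {1: 233.08, 2: 232.59, 8: 237.13},
    8192: {1: 78.73, 2: 81.30, 8: 82.62},
}


def best_configuration(by_workers):
    """Return (best_workers, margin).

    margin is the fractional throughput lead of the best worker count
    over the second-best one, e.g. 0.10 means 10% more iter/sec.
    """
    ranked = sorted(by_workers.items(), key=lambda item: item[1], reverse=True)
    (best_workers, best), (_, runner_up) = ranked[0], ranked[1]
    return best_workers, best / runner_up - 1.0


for fft_k, by_workers in throughput.items():
    workers, margin = best_configuration(by_workers)
    print(f"{fft_k}K: {workers} worker(s) best, {margin:.1%} ahead of the runner-up")
```

On this sample, 2 workers win clearly at 256K, 1 worker wins at 1024K, and at 3072K and 8192K all three settings land within a few percent of each other, with 8 workers narrowly ahead.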
2020-11-22, 08:48  #824
Aug 2002
2×29 Posts 
Finally getting around to setting up my 10900X system. RAM is 3200C16. Benchmark below is stock. I played around with various OCs and, as expected, they don't make any difference in throughput (but a lot in temperature!): the stock speed with AVX-512 (apparently 3.4 GHz) easily saturates memory bandwidth.
Code:
Intel(R) Core(TM) i910900X CPU @ 3.70GHz CPU speed: 4288.93 MHz, 10 hyperthreaded cores CPU features: Prefetchw, SSE, SSE2, SSE4, AVX, AVX2, FMA, AVX512F L1 cache size: 10x32 KB, L2 cache size: 10x1 MB, L3 cache size: 19712 KB L1 cache line size: 64 bytes, L2 cache line size: 64 bytes Prime95 64bit version 30.3, RdtscTiming=1 Timings for 2048K FFT length (10 cores, 1 worker): 0.62 ms. Throughput: 1621.33 iter/sec. Timings for 2048K FFT length (10 cores, 2 workers): 1.45, 1.34 ms. Throughput: 1433.85 iter/sec. Timings for 2048K FFT length (10 cores, 10 workers): 11.78, 11.69, 11.66, 10.87, 11.68, 11.67, 11.67, 11.65, 11.69, 11.67 ms. Throughput: 862.35 iter/sec. Timings for 2100K FFT length (10 cores, 1 worker): 0.68 ms. Throughput: 1477.05 iter/sec. Timings for 2100K FFT length (10 cores, 2 workers): 1.48, 1.48 ms. Throughput: 1350.56 iter/sec. Timings for 2100K FFT length (10 cores, 10 workers): 11.30, 11.30, 11.30, 11.29, 11.30, 11.30, 11.30, 11.29, 11.30, 11.30 ms. Throughput: 885.20 iter/sec. Timings for 2160K FFT length (10 cores, 1 worker): 0.72 ms. Throughput: 1396.96 iter/sec. Timings for 2160K FFT length (10 cores, 2 workers): 1.62, 1.62 ms. Throughput: 1234.74 iter/sec. Timings for 2160K FFT length (10 cores, 10 workers): 11.60, 11.62, 11.62, 11.61, 11.62, 11.60, 11.62, 11.61, 11.61, 11.61 ms. Throughput: 860.97 iter/sec. Timings for 2240K FFT length (10 cores, 1 worker): 0.74 ms. Throughput: 1347.42 iter/sec. Timings for 2240K FFT length (10 cores, 2 workers): 1.76, 1.66 ms. Throughput: 1170.89 iter/sec. Timings for 2240K FFT length (10 cores, 10 workers): 12.91, 12.89, 12.95, 12.03, 12.92, 12.94, 12.91, 12.91, 12.92, 12.94 ms. Throughput: 779.61 iter/sec. Timings for 2304K FFT length (10 cores, 1 worker): 0.75 ms. Throughput: 1340.50 iter/sec. Timings for 2304K FFT length (10 cores, 2 workers): 1.90, 1.73 ms. Throughput: 1105.43 iter/sec. 
Timings for 2304K FFT length (10 cores, 10 workers): 13.29, 13.30, 13.40, 12.43, 13.31, 13.28, 13.29, 13.29, 13.34, 13.31 ms. Throughput: 756.59 iter/sec. Timings for 2400K FFT length (10 cores, 1 worker): 0.81 ms. Throughput: 1230.45 iter/sec. Timings for 2400K FFT length (10 cores, 2 workers): 1.89, 1.92 ms. Throughput: 1048.94 iter/sec. Timings for 2400K FFT length (10 cores, 10 workers): 13.32, 13.32, 13.31, 13.33, 13.30, 13.32, 13.31, 13.31, 13.32, 13.32 ms. Throughput: 751.10 iter/sec. Timings for 2520K FFT length (10 cores, 1 worker): 0.84 ms. Throughput: 1196.41 iter/sec. [Sun Nov 22 07:53:39 2020] Timings for 2520K FFT length (10 cores, 2 workers): 2.03, 2.03 ms. Throughput: 986.85 iter/sec. Timings for 2520K FFT length (10 cores, 10 workers): 13.93, 13.94, 13.94, 13.94, 13.82, 13.94, 13.94, 13.93, 13.94, 13.94 ms. Throughput: 718.09 iter/sec. Timings for 2560K FFT length (10 cores, 1 worker): 0.85 ms. Throughput: 1177.57 iter/sec. Timings for 2560K FFT length (10 cores, 2 workers): 2.16, 2.16 ms. Throughput: 924.33 iter/sec. Timings for 2560K FFT length (10 cores, 10 workers): 14.32, 14.32, 14.32, 14.31, 14.32, 14.32, 14.31, 14.31, 14.31, 14.32 ms. Throughput: 698.63 iter/sec. Timings for 2592K FFT length (10 cores, 1 worker): 0.84 ms. Throughput: 1193.89 iter/sec. Timings for 2592K FFT length (10 cores, 2 workers): 2.18, 2.18 ms. Throughput: 919.39 iter/sec. Timings for 2592K FFT length (10 cores, 10 workers): 14.32, 14.12, 14.32, 14.34, 14.34, 14.34, 14.21, 14.23, 14.32, 14.34 ms. Throughput: 699.92 iter/sec. Timings for 2688K FFT length (10 cores, 1 worker): 0.86 ms. Throughput: 1157.89 iter/sec. Timings for 2688K FFT length (10 cores, 2 workers): 2.32, 2.32 ms. Throughput: 863.79 iter/sec. Timings for 2688K FFT length (10 cores, 10 workers): 15.28, 15.16, 15.10, 15.31, 15.30, 15.21, 15.17, 14.99, 15.19, 15.31 ms. Throughput: 657.81 iter/sec. Timings for 2880K FFT length (10 cores, 1 worker): 0.92 ms. Throughput: 1089.76 iter/sec. 
Timings for 2880K FFT length (10 cores, 2 workers): 2.63, 2.48 ms. Throughput: 783.82 iter/sec. Timings for 2880K FFT length (10 cores, 10 workers): 15.91, 16.01, 16.00, 15.59, 15.98, 15.93, 15.93, 15.90, 16.04, 15.95 ms. Throughput: 628.01 iter/sec. Timings for 2940K FFT length (10 cores, 1 worker): 0.98 ms. Throughput: 1024.34 iter/sec. Timings for 2940K FFT length (10 cores, 2 workers): 2.68, 2.68 ms. Throughput: 745.01 iter/sec. Timings for 2940K FFT length (10 cores, 10 workers): 15.86, 16.03, 15.84, 15.98, 15.94, 15.86, 15.86, 15.74, 15.94, 15.90 ms. Throughput: 629.22 iter/sec. Timings for 3000K FFT length (10 cores, 1 worker): 1.05 ms. Throughput: 953.88 iter/sec. Timings for 3000K FFT length (10 cores, 2 workers): 2.69, 2.69 ms. Throughput: 743.37 iter/sec. Timings for 3000K FFT length (10 cores, 10 workers): 16.84, 16.84, 16.84, 16.82, 16.84, 16.78, 16.78, 16.73, 16.84, 16.84 ms. Throughput: 594.71 iter/sec. [Sun Nov 22 07:58:45 2020] Timings for 3072K FFT length (10 cores, 1 worker): 0.93 ms. Throughput: 1075.16 iter/sec. Timings for 3072K FFT length (10 cores, 2 workers): 2.90, 2.64 ms. Throughput: 724.15 iter/sec. Timings for 3072K FFT length (10 cores, 10 workers): 17.38, 17.45, 16.63, 17.42, 17.40, 17.40, 17.48, 17.38, 17.40, 17.44 ms. Throughput: 576.91 iter/sec. Timings for 3136K FFT length (10 cores, 1 worker): 1.07 ms. Throughput: 938.49 iter/sec. Timings for 3136K FFT length (10 cores, 2 workers): 3.14, 2.91 ms. Throughput: 661.78 iter/sec. Timings for 3136K FFT length (10 cores, 10 workers): 18.50, 18.50, 18.52, 18.53, 18.50, 18.54, 18.50, 17.18, 18.53, 18.60 ms. Throughput: 544.06 iter/sec. Timings for 3200K FFT length (10 cores, 1 worker): 1.15 ms. Throughput: 867.95 iter/sec. Timings for 3200K FFT length (10 cores, 2 workers): 3.01, 3.01 ms. Throughput: 664.84 iter/sec. Timings for 3200K FFT length (10 cores, 10 workers): 17.83, 17.97, 17.95, 17.80, 17.95, 17.95, 17.93, 17.80, 17.97, 17.97 ms. Throughput: 558.36 iter/sec. 
Timings for 3360K FFT length (10 cores, 1 worker): 1.12 ms. Throughput: 893.80 iter/sec.
Timings for 3360K FFT length (10 cores, 2 workers): 3.30, 3.10 ms. Throughput: 624.92 iter/sec.
Timings for 3360K FFT length (10 cores, 10 workers): 19.33, 19.42, 18.62, 19.35, 19.35, 19.33, 19.50, 19.33, 19.41, 19.39 ms. Throughput: 518.15 iter/sec.
Timings for 3456K FFT length (10 cores, 1 worker): 1.19 ms. Throughput: 842.58 iter/sec.
Timings for 3456K FFT length (10 cores, 2 workers): 3.41, 3.25 ms. Throughput: 600.34 iter/sec.
Timings for 3456K FFT length (10 cores, 10 workers): 19.91, 19.94, 19.86, 19.25, 20.12, 19.86, 19.86, 19.94, 19.88, 19.89 ms. Throughput: 503.81 iter/sec.
Timings for 3600K FFT length (10 cores, 1 worker): 1.30 ms. Throughput: 770.75 iter/sec.
Timings for 3600K FFT length (10 cores, 2 workers): 3.50, 3.50 ms. Throughput: 571.12 iter/sec.
Timings for 3600K FFT length (10 cores, 10 workers): 20.37, 20.37, 20.37, 20.34, 20.37, 20.37, 20.26, 20.26, 20.37, 20.37 ms. Throughput: 491.45 iter/sec.
Timings for 3840K FFT length (10 cores, 1 worker): 1.41 ms. Throughput: 709.48 iter/sec.
Timings for 3840K FFT length (10 cores, 2 workers): 3.83, 3.82 ms. Throughput: 522.78 iter/sec.
[Sun Nov 22 08:03:53 2020]
Timings for 3840K FFT length (10 cores, 10 workers): 21.31, 21.52, 21.38, 21.38, 21.47, 21.37, 21.34, 21.25, 21.41, 21.47 ms. Throughput: 467.54 iter/sec.
Timings for 3920K FFT length (10 cores, 1 worker): 1.46 ms. Throughput: 686.86 iter/sec.
Timings for 3920K FFT length (10 cores, 2 workers): 4.04, 4.04 ms. Throughput: 494.65 iter/sec.
Timings for 3920K FFT length (10 cores, 10 workers): 22.72, 22.57, 22.56, 22.49, 22.55, 22.57, 22.53, 22.49, 22.59, 22.54 ms. Throughput: 443.24 iter/sec.
Timings for 4032K FFT length (10 cores, 1 worker): 1.47 ms. Throughput: 679.19 iter/sec.
Timings for 4032K FFT length (10 cores, 2 workers): 4.09, 4.09 ms. Throughput: 489.04 iter/sec.
Timings for 4032K FFT length (10 cores, 10 workers): 24.21, 24.37, 24.06, 23.69, 24.33, 24.09, 24.00, 23.90, 24.13, 24.33 ms. Throughput: 414.76 iter/sec.
Timings for 4200K FFT length (10 cores, 1 worker): 1.58 ms. Throughput: 632.69 iter/sec.
Timings for 4200K FFT length (10 cores, 2 workers): 4.26, 4.22 ms. Throughput: 471.50 iter/sec.
Timings for 4200K FFT length (10 cores, 10 workers): 23.78, 23.70, 23.65, 23.59, 23.78, 23.60, 23.67, 23.67, 23.70, 23.78 ms. Throughput: 422.08 iter/sec.
Timings for 4320K FFT length (10 cores, 1 worker): 1.69 ms. Throughput: 590.92 iter/sec.
Timings for 4320K FFT length (10 cores, 2 workers): 4.42, 4.42 ms. Throughput: 452.10 iter/sec.
Timings for 4320K FFT length (10 cores, 10 workers): 24.40, 24.39, 24.39, 24.27, 24.39, 24.40, 24.20, 24.32, 24.40, 24.40 ms. Throughput: 410.59 iter/sec.
Timings for 4480K FFT length (10 cores, 1 worker): 1.81 ms. Throughput: 553.65 iter/sec.
Timings for 4480K FFT length (10 cores, 2 workers): 4.68, 4.68 ms. Throughput: 427.63 iter/sec.
Timings for 4480K FFT length (10 cores, 10 workers): 25.67, 25.67, 25.67, 25.50, 25.67, 25.67, 25.67, 25.56, 25.54, 25.67 ms. Throughput: 390.21 iter/sec.
Timings for 4608K FFT length (10 cores, 1 worker): 1.91 ms. Throughput: 522.41 iter/sec.
Timings for 4608K FFT length (10 cores, 2 workers): 4.75, 4.95 ms. Throughput: 412.36 iter/sec.
Timings for 4608K FFT length (10 cores, 10 workers): 26.71, 26.79, 26.71, 26.77, 26.77, 26.77, 26.87, 25.83, 27.17, 26.78 ms. Throughput: 374.38 iter/sec.
Timings for 4704K FFT length (10 cores, 1 worker): 1.96 ms. Throughput: 509.37 iter/sec.
[Sun Nov 22 08:09:02 2020]
Timings for 4704K FFT length (10 cores, 2 workers): 4.92, 5.11 ms. Throughput: 399.20 iter/sec.
Timings for 4704K FFT length (10 cores, 10 workers): 27.50, 27.50, 27.50, 26.91, 27.61, 27.50, 27.50, 27.55, 27.55, 27.55 ms. Throughput: 364.11 iter/sec.
Timings for 4800K FFT length (10 cores, 1 worker): 2.24 ms. Throughput: 446.78 iter/sec.
Timings for 4800K FFT length (10 cores, 2 workers): 5.45, 5.51 ms. Throughput: 365.01 iter/sec.
Timings for 4800K FFT length (10 cores, 10 workers): 28.59, 27.35, 27.29, 27.39, 27.80, 27.60, 27.80, 27.35, 27.53, 28.05 ms. Throughput: 361.40 iter/sec.
Timings for 5040K FFT length (10 cores, 1 worker): 2.10 ms. Throughput: 476.41 iter/sec.
Timings for 5040K FFT length (10 cores, 2 workers): 5.42, 5.42 ms. Throughput: 368.93 iter/sec.
Timings for 5040K FFT length (10 cores, 10 workers): 30.27, 30.09, 30.15, 29.77, 30.15, 29.98, 30.10, 29.96, 30.15, 30.27 ms. Throughput: 332.35 iter/sec.
Timings for 5120K FFT length (10 cores, 1 worker): 2.19 ms. Throughput: 456.28 iter/sec.
Timings for 5120K FFT length (10 cores, 2 workers): 5.71, 5.71 ms. Throughput: 350.21 iter/sec.
Timings for 5120K FFT length (10 cores, 10 workers): 31.26, 30.96, 30.79, 30.68, 31.26, 30.87, 30.86, 30.86, 30.95, 31.26 ms. Throughput: 322.87 iter/sec.
Timings for 5184K FFT length (10 cores, 1 worker): 2.24 ms. Throughput: 446.51 iter/sec.
Timings for 5184K FFT length (10 cores, 2 workers): 5.70, 5.69 ms. Throughput: 351.17 iter/sec.
Timings for 5184K FFT length (10 cores, 10 workers): 30.54, 30.53, 30.66, 30.38, 30.60, 30.67, 30.66, 30.41, 30.66, 30.67 ms. Throughput: 327.02 iter/sec.
Timings for 5376K FFT length (10 cores, 1 worker): 2.39 ms. Throughput: 417.94 iter/sec.
Timings for 5376K FFT length (10 cores, 2 workers): 5.91, 5.91 ms. Throughput: 338.27 iter/sec.
Timings for 5376K FFT length (10 cores, 10 workers): 32.31, 32.52, 32.18, 31.93, 32.45, 32.23, 32.17, 32.11, 32.37, 32.47 ms. Throughput: 309.86 iter/sec.
Timings for 5760K FFT length (10 cores, 1 worker): 2.90 ms. Throughput: 344.81 iter/sec.
Timings for 5760K FFT length (10 cores, 2 workers): 6.85, 6.85 ms. Throughput: 291.83 iter/sec.
Timings for 5760K FFT length (10 cores, 10 workers): 33.58, 33.37, 33.29, 32.72, 33.02, 32.89, 33.00, 32.95, 33.18, 33.25 ms. Throughput: 301.91 iter/sec.
[Sun Nov 22 08:14:12 2020]
Timings for 6048K FFT length (10 cores, 1 worker): 2.84 ms. Throughput: 351.93 iter/sec.
Timings for 6048K FFT length (10 cores, 2 workers): 6.77, 6.75 ms. Throughput: 296.01 iter/sec.
Timings for 6048K FFT length (10 cores, 10 workers): 36.29, 36.30, 36.19, 36.49, 35.93, 36.21, 36.17, 36.02, 36.21, 36.37 ms. Throughput: 276.11 iter/sec.
Timings for 6144K FFT length (10 cores, 1 worker): 2.95 ms. Throughput: 339.04 iter/sec.
Timings for 6144K FFT length (10 cores, 2 workers): 7.05, 7.04 ms. Throughput: 283.89 iter/sec.
Timings for 6144K FFT length (10 cores, 10 workers): 37.96, 37.94, 37.72, 37.39, 38.06, 37.91, 37.93, 37.67, 37.97, 38.25 ms. Throughput: 264.01 iter/sec.
Timings for 6272K FFT length (10 cores, 1 worker): 3.00 ms. Throughput: 333.52 iter/sec.
Timings for 6272K FFT length (10 cores, 2 workers): 7.15, 7.15 ms. Throughput: 279.82 iter/sec.
Timings for 6272K FFT length (10 cores, 10 workers): 38.65, 38.95, 38.35, 38.06, 38.84, 38.48, 38.35, 38.27, 38.64, 38.85 ms. Throughput: 259.46 iter/sec.
Timings for 6400K FFT length (10 cores, 1 worker): 3.18 ms. Throughput: 314.58 iter/sec.
Timings for 6400K FFT length (10 cores, 2 workers): 7.34, 7.23 ms. Throughput: 274.62 iter/sec.
Timings for 6400K FFT length (10 cores, 10 workers): 38.34, 38.45, 38.28, 37.93, 38.48, 38.55, 38.40, 38.32, 38.46, 38.38 ms. Throughput: 260.70 iter/sec.
Timings for 6720K FFT length (10 cores, 1 worker): 3.36 ms. Throughput: 298.00 iter/sec.
Timings for 6720K FFT length (10 cores, 2 workers): 7.70, 7.70 ms. Throughput: 259.66 iter/sec.
Timings for 6720K FFT length (10 cores, 10 workers): 40.31, 40.22, 40.14, 39.95, 40.23, 40.18, 40.16, 40.03, 40.26, 40.45 ms. Throughput: 248.81 iter/sec.
Timings for 7056K FFT length (10 cores, 1 worker): 3.54 ms. Throughput: 282.21 iter/sec.
Timings for 7056K FFT length (10 cores, 2 workers): 8.07, 8.07 ms. Throughput: 247.80 iter/sec.
Timings for 7056K FFT length (10 cores, 10 workers): 42.08, 42.20, 42.05, 41.65, 41.89, 42.04, 42.09, 41.86, 42.15, 42.55 ms. Throughput: 237.79 iter/sec.
Timings for 7168K FFT length (10 cores, 1 worker): 3.64 ms. Throughput: 274.72 iter/sec.
Timings for 7168K FFT length (10 cores, 2 workers): 8.38, 8.39 ms. Throughput: 238.60 iter/sec.
[Sun Nov 22 08:19:26 2020]
Timings for 7168K FFT length (10 cores, 10 workers): 44.38, 44.51, 44.26, 43.75, 44.67, 44.25, 44.24, 44.04, 44.34, 44.78 ms. Throughput: 225.62 iter/sec.
Timings for 7200K FFT length (10 cores, 1 worker): 3.60 ms. Throughput: 277.47 iter/sec.
Timings for 7200K FFT length (10 cores, 2 workers): 8.31, 8.30 ms. Throughput: 240.86 iter/sec.
Timings for 7200K FFT length (10 cores, 10 workers): 44.51, 44.91, 44.36, 43.83, 44.64, 44.38, 44.26, 44.19, 44.42, 44.78 ms. Throughput: 225.09 iter/sec.
Timings for 7680K FFT length (10 cores, 1 worker): 3.96 ms. Throughput: 252.74 iter/sec.
Timings for 7680K FFT length (10 cores, 2 workers): 8.94, 8.94 ms. Throughput: 223.78 iter/sec.
Timings for 7680K FFT length (10 cores, 10 workers): 46.83, 46.98, 46.59, 45.89, 46.98, 46.65, 46.54, 46.39, 46.73, 46.99 ms. Throughput: 214.34 iter/sec.
Timings for 8064K FFT length (10 cores, 1 worker): 4.37 ms. Throughput: 228.87 iter/sec.
Timings for 8064K FFT length (10 cores, 2 workers): 9.34, 9.55 ms. Throughput: 211.75 iter/sec.
Timings for 8064K FFT length (10 cores, 10 workers): 48.23, 48.30, 48.23, 48.34, 48.45, 48.14, 48.38, 48.04, 48.32, 48.64 ms. Throughput: 207.01 iter/sec.
2020-11-23, 19:11  #825 
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
3·1,951 Posts 
Are you able to saturate memory bandwidth without AVX-512? If so, how does power consumption compare at the lowest frequency that maxes out bandwidth?
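
For anyone who wants to look at the scaling in the benchmark output above more systematically, here is a rough Python sketch. It assumes the one-entry-per-line "Timings for ..." format that Prime95 writes; the helper names `parse_timings` and `scaling_ratio` are made up for this example, and the sample lines are copied from the 2400K and 8064K results posted above. It recomputes the throughput as the sum of 1000 / (ms per iteration) over all workers and compares the 10-worker throughput with the 1-worker throughput.

```python
import re

# Four entries copied verbatim from the benchmark output posted above.
SAMPLE = """\
Timings for 2400K FFT length (10 cores, 1 worker): 0.81 ms. Throughput: 1230.45 iter/sec.
Timings for 2400K FFT length (10 cores, 10 workers): 13.32, 13.32, 13.31, 13.33, 13.30, 13.32, 13.31, 13.31, 13.32, 13.32 ms. Throughput: 751.10 iter/sec.
Timings for 8064K FFT length (10 cores, 1 worker): 4.37 ms. Throughput: 228.87 iter/sec.
Timings for 8064K FFT length (10 cores, 10 workers): 48.23, 48.30, 48.23, 48.34, 48.45, 48.14, 48.38, 48.04, 48.32, 48.64 ms. Throughput: 207.01 iter/sec.
"""

# Assumed line layout: FFT size in K, core count, worker count,
# comma-separated per-worker timings in ms, reported throughput.
LINE_RE = re.compile(
    r"Timings for (\d+)K FFT length \((\d+) cores?, (\d+) workers?\): "
    r"([\d., ]+) ms\. Throughput: ([\d.]+) iter/sec\."
)


def parse_timings(text):
    """Return one dict per 'Timings for ...' entry found in text."""
    rows = []
    for m in LINE_RE.finditer(text):
        times_ms = [float(t) for t in m.group(4).split(",")]
        rows.append({
            "fft_k": int(m.group(1)),
            "cores": int(m.group(2)),
            "workers": int(m.group(3)),
            "times_ms": times_ms,
            "reported": float(m.group(5)),
            # Each worker finishes 1000 / t iterations per second.
            "recomputed": sum(1000.0 / t for t in times_ms),
        })
    return rows


def scaling_ratio(rows, fft_k, workers):
    """Reported throughput with `workers` workers divided by the 1-worker value."""
    by_workers = {r["workers"]: r["reported"] for r in rows if r["fft_k"] == fft_k}
    return by_workers[workers] / by_workers[1]


if __name__ == "__main__":
    rows = parse_timings(SAMPLE)
    for r in rows:
        print(f'{r["fft_k"]}K, {r["workers"]} worker(s): '
              f'reported {r["reported"]:.2f}, recomputed {r["recomputed"]:.2f} iter/sec')
    for fft_k in (2400, 8064):
        print(f"{fft_k}K: 10 workers / 1 worker = {scaling_ratio(rows, fft_k, 10):.2f}")
```

The recomputed values differ slightly from the reported ones because the printed timings are rounded to 0.01 ms. On these four lines the 10-worker to 1-worker ratio is about 0.61 at 2400K and about 0.90 at 8064K; that by itself does not show where the memory bandwidth limit is, it only makes the trend in the posted numbers easier to see.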

