mersenneforum.org  

Old 2021-01-28, 15:39   #848
MisterBitcoin
 
"Nuri, the dragon :P"
Jul 2016
Good old Germany

1101000100₂ Posts

Quote:
Originally Posted by Viliam Furik
And what is the CPU? I guess it's Intel 10900 something.

Yes, it's an i9-10900 @ 2.8 GHz.
Old 2021-05-20, 11:42   #849
fivemack
(loop (#_fork))
 
Feb 2006
Cambridge, England

3·19·113 Posts

Asus X99-A motherboard, second-hand E5-2673v4, running 20 threads at one per core

Code:
[Work thread May 20 12:44] Resuming primality test of M54082397 using FMA3 FFT length 2880K, Pass1=384, Pass2=7680, clm=2, 20 threads
[Work thread May 20 12:44] Iteration: 2779000 / 54082397 [5.13%], ms/iter:  1.223, ETA: 17:26:02
I suspect everything fits in the 50MB L3 on this chip, and so the iteration time is extremely stable; the clock speed is 2.4GHz on each core.

If I switch to 10 threads

Code:
[Work thread May 20 12:40] Iteration: 2825000 / 54082397 [5.22%], ms/iter:  2.131, ETA: 30:20:29
but I guess two 10-thread workers would need to hit RAM and be significantly slower.
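As a rough check on the numbers above, here is a small Python sketch that recomputes the ETA from the iteration count and ms/iter, and compares one 20-thread worker against the best case for two 10-thread workers. The function names are made up for illustration, and the two-worker figure assumes no slowdown at all from sharing the cache, which is exactly the assumption in doubt.

```python
# Figures copied from the two log lines quoted above.
EXPONENT = 54082397  # total iterations for M54082397


def eta_seconds(done: int, ms_per_iter: float, total: int = EXPONENT) -> float:
    """Remaining wall-clock seconds, assuming the iteration time stays constant."""
    return (total - done) * ms_per_iter / 1000.0


def hms(seconds: float) -> str:
    """Format seconds as H:MM:SS, truncating fractional seconds."""
    s = int(seconds)
    return f"{s // 3600}:{s % 3600 // 60:02d}:{s % 60:02d}"


def iters_per_sec(ms_per_iter: float, workers: int = 1) -> float:
    """Aggregate throughput if every worker ran at the given iteration time."""
    return workers * 1000.0 / ms_per_iter


print(hms(eta_seconds(2825000, 2.131)))  # 10 threads
print(hms(eta_seconds(2779000, 1.223)))  # 20 threads
print(round(iters_per_sec(1.223), 1))             # one 20-thread worker
print(round(iters_per_sec(2.131, workers=2), 1))  # two 10-thread workers, best case
```

The 10-thread ETA reproduces the logged 30:20:29; the 20-thread one comes out a few seconds short of the logged 17:26:02, which is within the rounding of the displayed ms/iter. On paper two 10-thread workers would beat one 20-thread worker, so the single worker only wins if the pair really does spill out of L3.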

I'll run mprime on this machine for a few days, until the extra memory required to run 40 gnfs-lasieve4I16e jobs at once arrives.

For comparison, the i7-5820K on the same board managed 3.29 ms/iter for p=56778599.

Last fiddled with by fivemack on 2021-05-20 at 12:38
Old 2021-05-20, 12:38   #850
Viliam Furik
 
"Viliam Furík"
Jul 2018
Martin, Slovakia

2·11·31 Posts

Huh, that's a weird one.

What exponent is running on the second worker? Assuming both use the 2880K FFT size, they should both fit into 50 MiB with a little room to spare, as 2880K * 8 bytes gives about 22.5 MiB of L3 cache needed per test.
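The arithmetic can be sketched in Python as below. It assumes 8 bytes (one double) per FFT element and counts only the FFT data itself; the real working set is likely somewhat larger, so treat the result as a lower bound. The function names are hypothetical.

```python
BYTES_PER_WORD = 8  # assumption: one double-precision value per FFT element


def fft_footprint_mib(fft_len_k: int) -> float:
    """Size of the FFT data in MiB for an FFT length given in K (1K = 1024 elements)."""
    return fft_len_k * 1024 * BYTES_PER_WORD / 2**20


def fits_in_l3(fft_len_k: int, workers: int, l3_mib: float) -> bool:
    """True if the FFT data of all workers together fits in the given L3 size."""
    return workers * fft_footprint_mib(fft_len_k) <= l3_mib


print(fft_footprint_mib(2880))   # 22.5
print(fits_in_l3(2880, 2, 50))   # True (45 MiB of FFT data against 50 MiB of L3)
```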
Old 2021-08-02, 00:42   #851
Viliam Furik
 
"Viliam Furík"
Jul 2018
Martin, Slovakia

2·11·31 Posts
3990X benchmarks

I've asked paulunderwood to run a few throughput benchmarks on the 3990X. Many thanks for that.
I attached graphs of the results.

The first one is a normalized throughput benchmark for FFT sizes from 64K to 19200K with 1, 2, 4, 8, 16, 32, and 64 workers, each configuration using all 64 cores. Normalization used the following formula: [normperf = throughput * ln(FFT size) * FFT size / clock speed (4.1 GHz)]. The performance drops show where the cache runs out.
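For anyone wanting to apply the same normalization to their own benchmark results, a minimal Python sketch follows. It assumes the FFT size is entered in K, as it is labelled in the benchmark output; the post does not say which unit was used, and a different unit would shift the values without changing the idea. The function name and default are illustrative only.

```python
import math

CLOCK_GHZ = 4.1  # clock speed quoted in the formula above


def normperf(throughput: float, fft_size_k: float, clock_ghz: float = CLOCK_GHZ) -> float:
    """throughput * ln(FFT size) * FFT size / clock speed.

    Assumption: fft_size_k is the FFT size in K (e.g. 2880 for 2880K).
    """
    return throughput * math.log(fft_size_k) * fft_size_k / clock_ghz
```

The n * ln(n) factor roughly tracks the work in one FFT of size n, so a flat normperf curve means throughput is falling only as fast as the work grows, and a drop points at something else, such as running out of cache.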

The second graph is a zoomed-in part of the first one so that the differences can be seen better.

The attached archive contains the values used, as well as the original graphs, in an Excel spreadsheet.


Manual analysis of the data showed some weird behaviour. I don't have enough time right now, so I'll describe it in another post.
Attached Thumbnails
Ryzen 9 3990X.png
Ryzen 9 3990X - closer.png
Attached Files
FFT benchmark results - 3990X.7z (51.0 KB)
Old 2021-08-24, 05:08   #852
DrobinsonPE
 
Aug 2020

60₁₆ Posts
Ryzen 5 5600G

Prime 95, AMD Ryzen 5 5600G, ASROCK B450-HDV R4.0, DDR4-3600 RAM

Code:
AMD Ryzen 5 5600G with Radeon Graphics         
CPU speed: 4444.53 MHz, 6 hyperthreaded cores
CPU features: 3DNow! Prefetch, SSE, SSE2, SSE4, AVX, AVX2, FMA
L1 cache size: 6x32 KB, L2 cache size: 6x512 KB, L3 cache size: 16 MB
L1 cache line size: 64 bytes, L2 cache line size: 64 bytes
Machine topology as determined by hwloc library:
 Machine#0 (total=27844592KB, Backend=Windows, hwlocVersion=2.2.0, ProcessName=prime95.exe)
  Package (total=27844592KB, CPUVendor=AuthenticAMD, CPUFamilyNumber=25, CPUModelNumber=80, CPUModel="AMD Ryzen 5 5600G with Radeon Graphics         ", CPUStepping=0)
    L3 (size=16384KB, linesize=64, ways=16, Inclusive=0)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000003)
            PU#0 (cpuset: 0x00000001)
            PU#1 (cpuset: 0x00000002)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x0000000c)
            PU#2 (cpuset: 0x00000004)
            PU#3 (cpuset: 0x00000008)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000030)
            PU#4 (cpuset: 0x00000010)
            PU#5 (cpuset: 0x00000020)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x000000c0)
            PU#6 (cpuset: 0x00000040)
            PU#7 (cpuset: 0x00000080)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000300)
            PU#8 (cpuset: 0x00000100)
            PU#9 (cpuset: 0x00000200)
      L2 (size=512KB, linesize=64, ways=8, Inclusive=1)
        L1d (size=32KB, linesize=64, ways=8, Inclusive=0)
          Core (cpuset: 0x00000c00)
            PU#10 (cpuset: 0x00000400)
            PU#11 (cpuset: 0x00000800)
Prime95 64-bit version 30.4, RdtscTiming=1
Timings for 2048K FFT length (6 cores, 1 worker):  1.31 ms.  Throughput: 763.12 iter/sec.
Timings for 2048K FFT length (6 cores, 6 workers): 11.15, 11.17, 10.50, 11.05, 11.31, 11.22 ms.  Throughput: 542.38 iter/sec.
Timings for 2048K FFT length (6 cores hyperthreaded, 1 worker):  1.48 ms.  Throughput: 677.58 iter/sec.
Timings for 2048K FFT length (6 cores hyperthreaded, 6 workers): 12.02, 11.94, 11.82, 11.51, 11.66, 11.71 ms.  Throughput: 509.50 iter/sec.
Timings for 2240K FFT length (6 cores, 1 worker):  1.43 ms.  Throughput: 698.86 iter/sec.
Timings for 2240K FFT length (6 cores, 6 workers): 11.29, 11.92, 11.59, 11.24, 11.71, 11.56 ms.  Throughput: 519.66 iter/sec.
Timings for 2240K FFT length (6 cores hyperthreaded, 1 worker):  1.49 ms.  Throughput: 673.14 iter/sec.
Timings for 2240K FFT length (6 cores hyperthreaded, 6 workers): 12.19, 11.72, 12.09, 11.94, 11.65, 11.98 ms.  Throughput: 503.15 iter/sec.
Timings for 2304K FFT length (6 cores, 1 worker):  1.42 ms.  Throughput: 702.11 iter/sec.
Timings for 2304K FFT length (6 cores, 6 workers): 12.02, 11.55, 12.00, 11.53, 12.26, 12.46 ms.  Throughput: 501.65 iter/sec.
Timings for 2304K FFT length (6 cores hyperthreaded, 1 worker):  1.52 ms.  Throughput: 658.29 iter/sec.
Timings for 2304K FFT length (6 cores hyperthreaded, 6 workers): 12.10, 12.07, 12.56, 12.47, 12.27, 12.22 ms.  Throughput: 488.67 iter/sec.
Timings for 2400K FFT length (6 cores, 1 worker):  1.51 ms.  Throughput: 661.42 iter/sec.
Timings for 2400K FFT length (6 cores, 6 workers): 12.34, 12.71, 12.85, 12.56, 12.81, 12.35 ms.  Throughput: 476.18 iter/sec.
Timings for 2400K FFT length (6 cores hyperthreaded, 1 worker):  1.59 ms.  Throughput: 629.89 iter/sec.
Timings for 2400K FFT length (6 cores hyperthreaded, 6 workers): 12.85, 13.01, 13.53, 13.14, 13.41, 13.25 ms.  Throughput: 454.72 iter/sec.
Timings for 2560K FFT length (6 cores, 1 worker):  1.66 ms.  Throughput: 601.00 iter/sec.
Timings for 2560K FFT length (6 cores, 6 workers): 13.44, 13.38, 13.31, 13.32, 13.57, 13.74 ms.  Throughput: 445.89 iter/sec.
Timings for 2560K FFT length (6 cores hyperthreaded, 1 worker):  1.73 ms.  Throughput: 578.11 iter/sec.
[Mon Aug 23 21:52:00 2021]
Timings for 2560K FFT length (6 cores hyperthreaded, 6 workers): 14.46, 14.40, 14.50, 13.37, 14.24, 14.01 ms.  Throughput: 424.01 iter/sec.
Timings for 2688K FFT length (6 cores, 1 worker):  1.73 ms.  Throughput: 578.75 iter/sec.
Timings for 2688K FFT length (6 cores, 6 workers): 13.99, 13.85, 14.42, 14.61, 14.03, 14.17 ms.  Throughput: 423.30 iter/sec.
Timings for 2688K FFT length (6 cores hyperthreaded, 1 worker):  1.84 ms.  Throughput: 543.78 iter/sec.
Timings for 2688K FFT length (6 cores hyperthreaded, 6 workers): 15.04, 14.89, 14.60, 15.01, 14.59, 15.08 ms.  Throughput: 403.61 iter/sec.
Timings for 2800K FFT length (6 cores, 1 worker):  1.87 ms.  Throughput: 533.69 iter/sec.
Timings for 2800K FFT length (6 cores, 6 workers): 14.41, 14.90, 14.65, 15.05, 14.83, 14.85 ms.  Throughput: 405.97 iter/sec.
Timings for 2800K FFT length (6 cores hyperthreaded, 1 worker):  1.95 ms.  Throughput: 511.62 iter/sec.
Timings for 2800K FFT length (6 cores hyperthreaded, 6 workers): 16.17, 15.92, 15.98, 15.66, 15.66, 15.93 ms.  Throughput: 377.74 iter/sec.
Timings for 2880K FFT length (6 cores, 1 worker):  1.85 ms.  Throughput: 539.11 iter/sec.
Timings for 2880K FFT length (6 cores, 6 workers): 15.85, 15.73, 15.21, 15.21, 15.20, 15.18 ms.  Throughput: 389.80 iter/sec.
Timings for 2880K FFT length (6 cores hyperthreaded, 1 worker):  1.94 ms.  Throughput: 515.51 iter/sec.
Timings for 2880K FFT length (6 cores hyperthreaded, 6 workers): 15.85, 15.52, 15.52, 15.78, 15.69, 15.71 ms.  Throughput: 382.69 iter/sec.
Timings for 3072K FFT length (6 cores, 1 worker):  2.00 ms.  Throughput: 500.84 iter/sec.
Timings for 3072K FFT length (6 cores, 6 workers): 16.22, 16.29, 16.01, 15.98, 16.20, 16.18 ms.  Throughput: 371.63 iter/sec.
Timings for 3072K FFT length (6 cores hyperthreaded, 1 worker):  2.22 ms.  Throughput: 450.79 iter/sec.
Timings for 3072K FFT length (6 cores hyperthreaded, 6 workers): 17.62, 17.48, 17.50, 17.10, 17.51, 17.22 ms.  Throughput: 344.75 iter/sec.
Timings for 3200K FFT length (6 cores, 1 worker):  2.20 ms.  Throughput: 454.48 iter/sec.
Timings for 3200K FFT length (6 cores, 6 workers): 17.05, 17.11, 17.28, 16.76, 17.01, 17.04 ms.  Throughput: 352.08 iter/sec.
Timings for 3200K FFT length (6 cores hyperthreaded, 1 worker):  2.32 ms.  Throughput: 430.38 iter/sec.
[Mon Aug 23 21:57:05 2021]
Timings for 3200K FFT length (6 cores hyperthreaded, 6 workers): 19.20, 17.96, 17.96, 18.10, 18.00, 17.81 ms.  Throughput: 330.42 iter/sec.
Timings for 3360K FFT length (6 cores, 1 worker):  2.28 ms.  Throughput: 438.92 iter/sec.
Timings for 3360K FFT length (6 cores, 6 workers): 18.17, 17.86, 17.68, 18.28, 18.19, 18.24 ms.  Throughput: 332.09 iter/sec.
Timings for 3360K FFT length (6 cores hyperthreaded, 1 worker):  2.36 ms.  Throughput: 423.52 iter/sec.
Timings for 3360K FFT length (6 cores hyperthreaded, 6 workers): 18.26, 18.68, 18.58, 18.30, 18.15, 18.30 ms.  Throughput: 326.49 iter/sec.
Timings for 3584K FFT length (6 cores, 1 worker):  2.44 ms.  Throughput: 409.60 iter/sec.
Timings for 3584K FFT length (6 cores, 6 workers): 19.43, 18.97, 18.90, 19.13, 19.14, 19.31 ms.  Throughput: 313.43 iter/sec.
Timings for 3584K FFT length (6 cores hyperthreaded, 1 worker):  2.81 ms.  Throughput: 356.19 iter/sec.
Timings for 3584K FFT length (6 cores hyperthreaded, 6 workers): 20.50, 20.57, 20.37, 20.14, 20.47, 20.26 ms.  Throughput: 294.36 iter/sec.
Timings for 3840K FFT length (6 cores, 1 worker):  2.66 ms.  Throughput: 376.43 iter/sec.
Timings for 3840K FFT length (6 cores, 6 workers): 20.82, 20.53, 20.78, 20.76, 20.90, 20.85 ms.  Throughput: 288.82 iter/sec.
Timings for 3840K FFT length (6 cores hyperthreaded, 1 worker):  2.84 ms.  Throughput: 352.22 iter/sec.
Timings for 3840K FFT length (6 cores hyperthreaded, 6 workers): 22.68, 21.80, 22.58, 21.21, 21.09, 21.21 ms.  Throughput: 275.98 iter/sec.
Timings for 4096K FFT length (6 cores, 1 worker):  2.93 ms.  Throughput: 341.25 iter/sec.
Timings for 4096K FFT length (6 cores, 6 workers): 22.61, 21.73, 22.45, 22.35, 22.43, 22.46 ms.  Throughput: 268.65 iter/sec.
Timings for 4096K FFT length (6 cores hyperthreaded, 1 worker):  3.55 ms.  Throughput: 281.73 iter/sec.
Timings for 4096K FFT length (6 cores hyperthreaded, 6 workers): 23.33, 23.70, 23.33, 22.95, 23.20, 23.31 ms.  Throughput: 257.49 iter/sec.
Timings for 4480K FFT length (6 cores, 1 worker):  3.61 ms.  Throughput: 276.66 iter/sec.
Timings for 4480K FFT length (6 cores, 6 workers): 25.35, 25.18, 24.92, 25.08, 24.92, 25.39 ms.  Throughput: 238.69 iter/sec.
Timings for 4480K FFT length (6 cores hyperthreaded, 1 worker):  4.15 ms.  Throughput: 241.04 iter/sec.
[Mon Aug 23 22:02:12 2021]
Timings for 4480K FFT length (6 cores hyperthreaded, 6 workers): 26.03, 26.18, 26.24, 25.69, 26.08, 25.82 ms.  Throughput: 230.72 iter/sec.
Timings for 4608K FFT length (6 cores, 1 worker):  3.53 ms.  Throughput: 283.65 iter/sec.
Timings for 4608K FFT length (6 cores, 6 workers): 25.29, 25.33, 24.98, 25.59, 25.18, 25.16 ms.  Throughput: 237.58 iter/sec.
Timings for 4608K FFT length (6 cores hyperthreaded, 1 worker):  3.89 ms.  Throughput: 256.97 iter/sec.
Timings for 4608K FFT length (6 cores hyperthreaded, 6 workers): 25.95, 26.39, 25.98, 27.83, 26.03, 25.72 ms.  Throughput: 228.16 iter/sec.
Timings for 4800K FFT length (6 cores, 1 worker):  3.78 ms.  Throughput: 264.85 iter/sec.
Timings for 4800K FFT length (6 cores, 6 workers): 26.51, 26.10, 26.19, 26.08, 26.10, 27.47 ms.  Throughput: 227.27 iter/sec.
Timings for 4800K FFT length (6 cores hyperthreaded, 1 worker):  3.86 ms.  Throughput: 258.77 iter/sec.
Timings for 4800K FFT length (6 cores hyperthreaded, 6 workers): 28.58, 27.07, 26.88, 26.78, 27.33, 27.24 ms.  Throughput: 219.77 iter/sec.
Timings for 5120K FFT length (6 cores, 1 worker):  3.87 ms.  Throughput: 258.24 iter/sec.
Timings for 5120K FFT length (6 cores, 6 workers): 28.08, 28.98, 27.85, 28.00, 27.91, 27.79 ms.  Throughput: 213.54 iter/sec.
Timings for 5120K FFT length (6 cores hyperthreaded, 1 worker):  4.26 ms.  Throughput: 234.53 iter/sec.
Timings for 5120K FFT length (6 cores hyperthreaded, 6 workers): 30.07, 29.25, 28.45, 29.06, 29.91, 28.71 ms.  Throughput: 205.28 iter/sec.
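To compare configurations without reading every line by eye, the "Timings for" lines can be parsed with a short Python sketch like the one below. The regular expression is written against the line format shown in this output only and is an assumption about nothing else; the function names are hypothetical. Throughput is recomputed as the sum of 1000 / ms over the workers, which lands close to the reported figure but not exactly on it, since the displayed timings are rounded.

```python
import re

# Matches the "Timings for ..." lines exactly as they appear in the output above.
LINE_RE = re.compile(
    r"Timings for (\d+)K FFT length \((\d+) cores( hyperthreaded)?, (\d+) workers?\):"
    r"\s+([\d., ]+) ms\.\s+Throughput: ([\d.]+) iter/sec\."
)


def parse_line(line: str):
    """Return a dict of the fields in one benchmark line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "fft_k": int(m.group(1)),
        "cores": int(m.group(2)),
        "hyperthreaded": m.group(3) is not None,
        "workers": int(m.group(4)),
        "timings_ms": [float(t) for t in m.group(5).split(",")],
        "throughput": float(m.group(6)),
    }


def throughput_from_timings(timings_ms) -> float:
    """Aggregate iterations per second implied by the per-worker timings."""
    return sum(1000.0 / t for t in timings_ms)


sample = ("Timings for 2048K FFT length (6 cores, 6 workers): 11.15, 11.17, 10.50, "
          "11.05, 11.31, 11.22 ms.  Throughput: 542.38 iter/sec.")
row = parse_line(sample)
print(row["throughput"], round(throughput_from_timings(row["timings_ms"]), 2))
```

In the results above, one 6-core worker gives higher throughput than six single-core workers at every FFT length listed, and the hyperthreaded runs are slower than the non-hyperthreaded ones.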

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.