mersenneforum.org  

2021-11-17, 13:05   #67
tdulcet
 
"Teal Dulcet"
Jun 2018

5×11 Posts

Quote:
Originally Posted by drkirkby
Why would 3 workers give the most throughput on a dual-socket computer?
I ran the throughput benchmark on a c5.metal instance and got different results. Specifically, two workers gave the highest throughput at the larger FFT lengths. Here is the fastest worker count for each supported FFT length that is benchmarked by default:
  • 6 workers: 2048K, 2100K, 2160K, 2240K, 2304K, 2400K
  • 4 workers: 2520K, 2560K, 2592K, 2688K, 2880K, 2940K, 3000K, 3072K, 3136K, 3200K, 3360K, 3456K, 3600K, 3840K, 3920K, 4200K, 4320K, 4480K, 4800K
  • 3 workers: 4032K
  • 2 workers: 4608K, 4704K, 5040K, 5120K, 5184K, 5376K, 5760K, 6048K, 6144K, 6272K, 6400K, 6720K, 7056K, 7168K, 7200K, 7680K, 8064K
Here are the actual results for one of the FFT lengths used for wavefront first-time tests:
Code:
Timings for 6144K FFT length (48 cores, 1 worker): 1.35 ms. Throughput: 740.08 iter/sec.
Timings for 6144K FFT length (48 cores, 2 workers): 1.16, 1.19 ms. Throughput: 1697.08 iter/sec.
Timings for 6144K FFT length (48 cores, 3 workers): 3.04, 3.07, 1.23 ms. Throughput: 1470.23 iter/sec.
Timings for 6144K FFT length (48 cores, 4 workers): 3.05, 3.02, 3.02, 3.00 ms. Throughput: 1322.79 iter/sec.
Timings for 6144K FFT length (48 cores, 6 workers): 5.47, 5.47, 5.48, 5.39, 5.37, 5.39 ms. Throughput: 1105.26 iter/sec.
Timings for 6144K FFT length (48 cores, 8 workers): 7.56, 7.54, 7.56, 7.55, 7.41, 7.50, 7.46, 7.44 ms. Throughput: 1066.38 iter/sec.
Timings for 6144K FFT length (48 cores, 12 workers): 11.56, 11.61, 11.62, 12.32, 11.57, 11.51, 11.54, 11.55, 11.29, 11.40, 11.43, 11.25 ms. Throughput: 1039.05 iter/sec.
Timings for 6144K FFT length (48 cores, 16 workers): 20.99, 20.72, 20.95, 20.82, 20.78, 21.04, 20.89, 20.78, 14.67, 13.45, 14.54, 14.61, 13.46, 13.71, 13.94, 14.91 ms. Throughput: 949.13 iter/sec.
Timings for 6144K FFT length (48 cores, 24 workers): 57.30, 56.56, 56.51, 56.29, 56.69, 56.99, 56.67, 56.94, 56.71, 56.65, 56.85, 56.70, 26.03, 30.28, 25.78, 27.11, 29.24, 29.75, 27.03, 29.71, 27.35, 28.07, 30.19, 28.15 ms. Throughput: 637.96 iter/sec.
Timings for 6144K FFT length (48 cores, 48 workers): 130.05, 132.14, 128.50, 128.65, 129.51, 129.92, 128.45, 129.71, 128.78, 129.41, 130.18, 128.95, 130.09, 129.10, 130.14, 129.61, 128.04, 130.51, 129.25, 129.42, 129.92, 130.49, 129.53, 131.25, 86.04, 102.15, 87.65, 103.38, 74.75, 91.32, 91.88, 76.62, 75.89, 103.66, 101.44, 101.42, 95.30, 93.57, 79.79, 102.96, 72.71, 95.29, 98.47, 87.29, 100.54, 87.55, 94.32, 102.50 ms. Throughput: 449.43 iter/sec.
MPrime by default wanted to use 12 workers, but 2 workers is significantly faster: 1697.08 vs. 1039.05 iter/sec at the 6144K FFT length, or about 63% more throughput.
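For anyone who wants to pick the best worker count from this kind of output automatically, here is a minimal Python sketch. It assumes the line format shown in the results above, and the names `parse_benchmark` and `best_workers` are hypothetical helpers, not part of MPrime. It also re-estimates throughput as the sum of 1000/ms over the workers, which is my assumption about how the figure is derived; it lands close to the reported value but not exactly on it, since the printed timings are rounded.

```python
import re

# Two lines copied verbatim from the benchmark output above.
SAMPLE = """\
Timings for 6144K FFT length (48 cores, 2 workers): 1.16, 1.19 ms. Throughput: 1697.08 iter/sec.
Timings for 6144K FFT length (48 cores, 12 workers): 11.56, 11.61, 11.62, 12.32, 11.57, 11.51, 11.54, 11.55, 11.29, 11.40, 11.43, 11.25 ms. Throughput: 1039.05 iter/sec.
"""

# Pattern for one benchmark line (assumed format, taken from the output above).
LINE_RE = re.compile(
    r"Timings for (\d+)K FFT length \((\d+) cores?, (\d+) workers?\): "
    r"([\d., ]+) ms\.\s+Throughput: ([\d.]+) iter/sec\."
)


def parse_benchmark(text):
    """Return one dict per benchmark line found in text."""
    rows = []
    for m in LINE_RE.finditer(text):
        timings = [float(t) for t in m.group(4).split(",")]
        rows.append({
            "fft_k": int(m.group(1)),
            "cores": int(m.group(2)),
            "workers": int(m.group(3)),
            "timings_ms": timings,
            "reported": float(m.group(5)),
            # Assumption: each worker contributes 1000/ms iterations per second.
            "estimated": sum(1000.0 / t for t in timings),
        })
    return rows


def best_workers(rows):
    """Map each FFT length (in K) to the worker count with the highest reported throughput."""
    best = {}
    for row in rows:
        current = best.get(row["fft_k"])
        if current is None or row["reported"] > current["reported"]:
            best[row["fft_k"]] = row
    return {fft_k: row["workers"] for fft_k, row in best.items()}


rows = parse_benchmark(SAMPLE)
for row in rows:
    print(f'{row["fft_k"]}K, {row["workers"]} workers: '
          f'reported {row["reported"]:.2f}, estimated {row["estimated"]:.2f} iter/sec')
print(best_workers(rows))
```

Feeding the full benchmark output to `parse_benchmark` instead of `SAMPLE` should give one best worker count per FFT length, as in the list above.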
2021-11-17, 15:06   #68
drkirkby
 
"David Kirkby"
Jan 2021
Althorne, Essex, UK

2^6×7 Posts

Quote:
Originally Posted by tdulcet
I ran the throughput benchmark on a c5.metal instance and got different results.
Are you using the same CPUs as I had - 2 x Intel Xeon Platinum 8275CL CPU @ 3.00GHz? The c5.metal does not specify the CPU type.

Amazon AWS seems to use some odd CPUs, which makes them difficult to reuse in other machines. I had a couple of high-spec CPUs (26-core, 2.6 GHz, I think), but they would not run in my Dell 7920. Apparently they had been used by Amazon for AWS, but were not supported by many motherboards.
2021-11-17, 16:39   #69
tdulcet
 
"Teal Dulcet"
Jun 2018

5·11 Posts

Quote:
Originally Posted by drkirkby
Are you using the same CPUs as I had - 2 x Intel Xeon Platinum 8275CL CPU @ 3.00GHz? The c5.metal does not specify the CPU type.
Yes, the same CPU:
Code:
$ wget https://raw.github.com/tdulcet/Linux-System-Information/master/info.sh -qO - | bash -s

Linux Distribution: Ubuntu 20.04.3 LTS
Linux Kernel: 5.11.0-1020-aws
Computer Model: Amazon EC2 c5.metal 1.0
Processor (CPU): Intel(R) Xeon(R) Platinum 8275CL CPU @ 3.00GHz
CPU Cores/Threads: 48/96
Architecture: x86_64 (64-bit)
Total memory (RAM): 193053 MiB (189GiB) (202431 MB (203GB))
Total swap space: 0 MiB (0 MB)
Disk space: nvme0n1: 512000 MiB (500GiB) (536870 MB (537GB))
I was not sure how many workers would be used, so I ended up getting much more disk space than I now need for the PRP proof files, even at proof power 10...
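As a rough sanity check on the disk sizing, here is a back-of-the-envelope Python sketch. It rests on an assumption I am labeling loudly: that a PRP proof of power N keeps about 2^N intermediate residues on disk, each roughly exponent/8 bytes. The exponent below is a hypothetical placeholder, and `proof_temp_bytes` and `total_gib` are made-up helper names, so treat the result as an estimate and not as what MPrime actually allocates.

```python
def proof_temp_bytes(exponent, power):
    """Estimated temporary proof space for one worker, in bytes.

    Assumption: 2**power residues are stored, each about exponent/8 bytes.
    """
    residue_bytes = (exponent + 7) // 8  # bits rounded up to whole bytes
    return (2 ** power) * residue_bytes


def total_gib(exponent, power, workers):
    """Estimated temporary proof space for all workers, in GiB."""
    return workers * proof_temp_bytes(exponent, power) / 2 ** 30


EXPONENT = 110_000_000  # hypothetical exponent, for illustration only
DISK_GIB = 500          # disk size from the system information above

for workers in (2, 12):
    need = total_gib(EXPONENT, 10, workers)
    print(f"{workers} workers at proof power 10: about {need:.1f} GiB of {DISK_GIB} GiB")
```

Under these assumptions, even the 12 workers that MPrime wanted by default would fit on the 500 GiB volume, and 2 workers need only a small fraction of it.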