mersenneforum.org How-to guide for running LL tests on the Amazon EC2 cloud

2021-11-17, 13:05   #67
tdulcet

"Teal Dulcet"
Jun 2018

5×11 Posts

Quote:
 Originally Posted by drkirkby Why would 3 workers give the most throughput on a dual-socket computer?
I ran the throughput benchmark on a c5.metal instance and got different results. Specifically, two workers were faster at the higher FFT lengths. Here are the fastest numbers of workers for each supported FFT length benchmarked by default:
• 6 workers: 2048K, 2100K, 2160K, 2240K, 2304K, 2400K
• 4 workers: 2520K, 2560K, 2592K, 2688K, 2880K, 2940K, 3000K, 3072K, 3136K, 3200K, 3360K, 3456K, 3600K, 3840K, 3920K, 4200K, 4320K, 4480K, 4800K
• 3 workers: 4032K
• 2 workers: 4608K, 4704K, 5040K, 5120K, 5184K, 5376K, 5760K, 6048K, 6144K, 6272K, 6400K, 6720K, 7056K, 7168K, 7200K, 7680K, 8064K
Here are the actual results for one of the FFT lengths used for wavefront first time tests:
Code:
Timings for 6144K FFT length (48 cores, 1 worker): 1.35 ms. Throughput: 740.08 iter/sec.
Timings for 6144K FFT length (48 cores, 2 workers): 1.16, 1.19 ms. Throughput: 1697.08 iter/sec.
Timings for 6144K FFT length (48 cores, 3 workers): 3.04, 3.07, 1.23 ms. Throughput: 1470.23 iter/sec.
Timings for 6144K FFT length (48 cores, 4 workers): 3.05, 3.02, 3.02, 3.00 ms. Throughput: 1322.79 iter/sec.
Timings for 6144K FFT length (48 cores, 6 workers): 5.47, 5.47, 5.48, 5.39, 5.37, 5.39 ms. Throughput: 1105.26 iter/sec.
Timings for 6144K FFT length (48 cores, 8 workers): 7.56, 7.54, 7.56, 7.55, 7.41, 7.50, 7.46, 7.44 ms. Throughput: 1066.38 iter/sec.
Timings for 6144K FFT length (48 cores, 12 workers): 11.56, 11.61, 11.62, 12.32, 11.57, 11.51, 11.54, 11.55, 11.29, 11.40, 11.43, 11.25 ms. Throughput: 1039.05 iter/sec.
Timings for 6144K FFT length (48 cores, 16 workers): 20.99, 20.72, 20.95, 20.82, 20.78, 21.04, 20.89, 20.78, 14.67, 13.45, 14.54, 14.61, 13.46, 13.71, 13.94, 14.91 ms. Throughput: 949.13 iter/sec.
Timings for 6144K FFT length (48 cores, 24 workers): 57.30, 56.56, 56.51, 56.29, 56.69, 56.99, 56.67, 56.94, 56.71, 56.65, 56.85, 56.70, 26.03, 30.28, 25.78, 27.11, 29.24, 29.75, 27.03, 29.71, 27.35, 28.07, 30.19, 28.15 ms. Throughput: 637.96 iter/sec.
Timings for 6144K FFT length (48 cores, 48 workers): 130.05, 132.14, 128.50, 128.65, 129.51, 129.92, 128.45, 129.71, 128.78, 129.41, 130.18, 128.95, 130.09, 129.10, 130.14, 129.61, 128.04, 130.51, 129.25, 129.42, 129.92, 130.49, 129.53, 131.25, 86.04, 102.15, 87.65, 103.38, 74.75, 91.32, 91.88, 76.62, 75.89, 103.66, 101.44, 101.42, 95.30, 93.57, 79.79, 102.96, 72.71, 95.29, 98.47, 87.29, 100.54, 87.55, 94.32, 102.50 ms. Throughput: 449.43 iter/sec.
MPrime by default wanted to use 12 workers, but 2 workers is significantly faster: 1697.08 vs. 1039.05 iter/sec at the 6144K FFT length, about 63% more throughput.
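For anyone who wants to repeat this comparison on their own benchmark output, here is a minimal Python sketch. It assumes the lines have exactly the format shown above; the names parse_benchmark, best_workers and recomputed_throughput are made up for this example and are not part of MPrime. It reads the reported throughput, picks the best worker count per FFT length, and also recomputes throughput as the sum of 1000/ms over the workers, which only approximately matches the reported figure because the printed timings are rounded.

```python
import re

# Three lines copied from the benchmark output above; paste the full output to compare all configurations.
BENCHMARK = """\
Timings for 6144K FFT length (48 cores, 1 worker): 1.35 ms. Throughput: 740.08 iter/sec.
Timings for 6144K FFT length (48 cores, 2 workers): 1.16, 1.19 ms. Throughput: 1697.08 iter/sec.
Timings for 6144K FFT length (48 cores, 12 workers): 11.56, 11.61, 11.62, 12.32, 11.57, 11.51, 11.54, 11.55, 11.29, 11.40, 11.43, 11.25 ms. Throughput: 1039.05 iter/sec.
"""

# Assumed line format, taken from the output quoted in this thread.
LINE_RE = re.compile(
    r"Timings for (\d+)K FFT length \((\d+) cores?, (\d+) workers?\):\s*"
    r"([\d., ]+?) ms\.\s*Throughput: ([\d.]+) iter/sec\."
)


def parse_benchmark(text):
    """Return one dict per benchmark line that matches the assumed format."""
    rows = []
    for match in LINE_RE.finditer(text):
        fft_k, cores, workers, timings, throughput = match.groups()
        rows.append({
            "fft_k": int(fft_k),
            "cores": int(cores),
            "workers": int(workers),
            "timings_ms": [float(t) for t in timings.split(",")],
            "throughput": float(throughput),
        })
    return rows


def recomputed_throughput(row):
    """Sum of per-worker iterations per second, from the rounded ms timings."""
    return sum(1000.0 / ms for ms in row["timings_ms"])


def best_workers(rows):
    """Map each FFT length (in K) to (workers, reported throughput) of the fastest run."""
    best = {}
    for row in rows:
        current = best.get(row["fft_k"])
        if current is None or row["throughput"] > current[1]:
            best[row["fft_k"]] = (row["workers"], row["throughput"])
    return best


if __name__ == "__main__":
    rows = parse_benchmark(BENCHMARK)
    for fft_k, (workers, throughput) in sorted(best_workers(rows).items()):
        print(f"{fft_k}K: {workers} workers, {throughput:.2f} iter/sec")
```

The reported throughput is used for ranking because it comes from the unrounded timings; the recomputed value is only a sanity check.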

2021-11-17, 15:06   #68
drkirkby

"David Kirkby"
Jan 2021
Althorne, Essex, UK

2⁶×7 Posts

Quote:
 Originally Posted by tdulcet I ran the throughput benchmark on a c5.metal instance and got different results.
Are you using the same CPUs as I had - 2 x Intel Xeon Platinum 8275CL CPU @ 3.00GHz? The c5.metal instance description does not specify the CPU type.

Amazon AWS seems to use some odd CPUs, which makes them difficult to reuse in other machines. I had a couple of high-spec CPUs (26-core, 2.6 GHz, I think), but they would not run in my Dell 7920. Apparently they had been used by Amazon for AWS, but were not supported by many motherboards.

2021-11-17, 16:39   #69
tdulcet

"Teal Dulcet"
Jun 2018

5·11 Posts

Quote:
 Originally Posted by drkirkby Are you using the same CPUs as I had - 2 x Intel Xeon Platinum 8275CL CPU @ 3.00GHz? The c5.metal does not specify the CPU type.
Yes, the same CPU:
Code:
$ wget https://raw.github.com/tdulcet/Linux-System-Information/master/info.sh -qO - | bash -s

Linux Distribution: Ubuntu 20.04.3 LTS
Linux Kernel: 5.11.0-1020-aws
Computer Model: Amazon EC2 c5.metal 1.0
Processor (CPU): Intel(R) Xeon(R) Platinum 8275CL CPU @ 3.00GHz
CPU Cores/Threads: 48/96
Architecture: x86_64 (64-bit)
Total memory (RAM): 193053 MiB (189GiB) (202431 MB (203GB))
Total swap space: 0 MiB (0 MB)
Disk space: nvme0n1: 512000 MiB (500GiB) (536870 MB (537GB))
I was not sure how many workers would be used, so I ended up getting much more disk space than I now need for the PRP proof files, even at proof power 10...
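As a rough way to size the disk before launching an instance, here is a small Python sketch. It rests on an assumption that is not stated in this thread: that each worker's temporary proof storage is about 2^power residues of roughly exponent/8 bytes each. The function names and the example exponent are hypothetical, so treat the result as an order-of-magnitude estimate rather than MPrime's actual accounting.

```python
def proof_temp_bytes(exponent: int, power: int) -> int:
    """Estimated temporary disk space for one worker's PRP proof.

    Assumption (not from the thread): 2**power residues are stored,
    each taking about ceil(exponent / 8) bytes.
    """
    residue_bytes = (exponent + 7) // 8
    return (2 ** power) * residue_bytes


def total_proof_temp_gib(exponent: int, power: int, workers: int) -> float:
    """Same estimate for several concurrent workers, in GiB."""
    return workers * proof_temp_bytes(exponent, power) / 2 ** 30


if __name__ == "__main__":
    exponent = 110_000_000  # hypothetical wavefront-sized exponent, for illustration only
    for workers in (2, 12):
        gib = total_proof_temp_gib(exponent, power=10, workers=workers)
        print(f"{workers} workers at proof power 10: about {gib:.0f} GiB")
```

Under this assumption the requirement grows linearly with the number of workers, which is why planning for 12 workers leaves so much unused space when only 2 are run.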
