mersenneforum.org  

2022-06-27, 02:14   #1
timbit
Mar 2009
2²·5 Posts
How to get mprime to run self bench?

Hi,
I have a fresh install of mprime on a Linux x64 (Ubuntu 22.04) machine. It is testing an exponent (ECM) and running really slowly. On a computer with almost identical hardware running Windows 10, ECM on almost the same exponent runs about 2 times faster.

I thought mprime would run a self bench (autobench) after a day or two of running? I suspect the Linux mprime is using a non-optimized FFT, or the FFT size is too big.

I have explicitly set AutoBench=1 in prime.txt on the mprime machine, but I still haven't seen the program trigger the autobench.

Where is the optimized FFT data stored? Maybe I can copy the FFT knowledge from one machine to the other.
2022-06-27, 03:09   #2
MattcAnderson
"Matthew Anderson"
Dec 2010
Oregon, USA
3×17×23 Posts

Welcome to mersenneforum.org!
2022-06-27, 03:45   #3
Uncwilly
6809 > 6502
"""""""""""""""""""
Aug 2003
101×103 Posts
3×3,581 Posts

The memory on the slower machine, are all the modules same speed and same manufacturer? And how are the banks filled? Improper memory set-up can slow down a machine dramatically.
2022-06-27, 04:46   #4
timbit
Mar 2009
2²×5 Posts

Quote:
Originally Posted by Uncwilly
The memory on the slower machine, are all the modules same speed and same manufacturer? And how are the banks filled? Improper memory set-up can slow down a machine dramatically.
Actually the slower machine has DDR4-2400, faster has DDR4-2133.

Again, slower machine is 2 times slower. No idea why. I'm trying to get slower machine to run an autobench, but does not do so for whatever reason. Faster one does.
2022-06-27, 15:28   #5
timbit
Mar 2009
2²·5 Posts

Quote:
Originally Posted by timbit
Actually the slower machine has DDR4-2400, faster has DDR4-2133.

Again, slower machine is 2 times slower. No idea why. I'm trying to get slower machine to run an autobench, but does not do so for whatever reason. Faster one does.
Faster machine = Win10
4 sticks DDR4-2133 ECC RDIMM, quad channel
Intel Xeon E5-1607 v3 @ 3.1 GHz
Does ECM with 4 threads in the ~M1000000 range (B1=1000000): stage 1 = 1450 sec, stage 2 = 850 sec, total = ~2300 sec

Slower machine = Ubuntu 22.04
4 sticks DDR4-2400 ECC RDIMM, quad channel
Intel Xeon E5-2680 v4 @ 2.9 GHz
Does ECM with 6 threads in the ~M1000000 range (B1=1000000): stage 1 = 3400 sec, stage 2 = 1100 sec, total = ~4500 sec

Faster RAM, more threads, and 2 times slower? Again, I'm trying to trigger an autobench so the program can choose the best FFT algorithm. Surely the Linux box can do a curve faster than 4500 sec.
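
For reference, the "2 times slower" figure can be broken down per stage. The sketch below only redoes the arithmetic on the timings reported in this post; the variable and function names are illustrative and not part of mprime.

```python
# Stage timings in seconds, copied from the post above.
fast = {"stage1": 1450, "stage2": 850}    # Win10, Xeon E5-1607 v3
slow = {"stage1": 3400, "stage2": 1100}   # Ubuntu, Xeon E5-2680 v4


def curve_seconds(stages):
    """Total wall-clock seconds for one ECM curve (stage 1 + stage 2)."""
    return stages["stage1"] + stages["stage2"]


def slowdown(slow_seconds, fast_seconds):
    """How many times longer the slow machine takes, rounded to 2 places."""
    return round(slow_seconds / fast_seconds, 2)


print(curve_seconds(fast))                                # 2300
print(curve_seconds(slow))                                # 4500
print(slowdown(slow["stage1"], fast["stage1"]))           # 2.34
print(slowdown(slow["stage2"], fast["stage2"]))           # 1.29
print(slowdown(curve_seconds(slow), curve_seconds(fast))) # 1.96
```

On these numbers, most of the gap comes from stage 1 (about 2.34 times slower), while stage 2 is only about 1.29 times slower.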
2022-06-27, 15:56   #6
axn
Jun 2003
12435₈ Posts

It would be more helpful if you post the screen outputs from both systems.

BTW, according to ark, E5-2680 v4 is a 14-core system, so you should be able to run 14 threads at the same time.
2022-06-27, 16:09   #7
timbit
Mar 2009
2²·5 Posts

Quote:
Originally Posted by axn
It would be more helpful if you post the screen outputs from both systems.

BTW, according to ark, E5-2680 v4 is a 14-core system, so you should be able to run 14 threads at the same time.
I am well aware the E5-2680 v4 is a 14-core system. Due to the slowness of any Mersenne work I assign to it, I only use 1 worker (8 threads).

Is 8 threads the max for any worker? I am setting 1 worker, 14 cores, and the most I see for any worker is 8 threads. I would expect 14 unless that is some upper limit?

I don't know what you are expecting to see from the output logs. They look like the usual ECM logs, except it is taking a very long time to finish.

How does mprime select the fastest FFT implementation? Does manually starting a benchmark help?

Last fiddled with by timbit on 2022-06-27 at 16:19 Reason: I can obtain logs later on. Both machines are in different locations.
2022-06-27, 16:21   #8
axn
Jun 2003
5405₁₀ Posts

The output will show the FFT selected, the worker configuration, affinities, etc. Let us understand the problem first before attempting a solution.

The ECM work you're mentioning (M1000000) is very small and does not multithread. You'll most likely get the best throughput by running 14 workers.
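
A rough way to see why many single-threaded workers can win is to compare curves per day. The sketch below uses only the 4500 sec/curve figure reported earlier in the thread. It assumes workers run independently and do not slow each other down (for example through memory bandwidth contention), which may not hold in practice. The function names are illustrative and not part of mprime.

```python
SECONDS_PER_DAY = 86_400


def curves_per_day(workers, seconds_per_curve):
    """Curves finished per day by `workers` independent workers.

    Assumption: workers do not slow each other down.
    """
    return workers * SECONDS_PER_DAY / seconds_per_curve


def break_even_seconds(workers, multithreaded_seconds_per_curve):
    """Slowest per-curve time at which `workers` single-threaded workers
    still match one multithreaded worker's throughput."""
    return workers * multithreaded_seconds_per_curve


# One multithreaded worker at the reported 4500 sec per curve.
print(curves_per_day(1, 4500))       # 19.2

# 14 single-threaded workers match that as long as each curve takes
# no longer than this many seconds.
print(break_even_seconds(14, 4500))  # 63000
```

Under that assumption, 14 one-core workers come out ahead whenever a single-threaded curve takes less than 63000 sec. The actual single-threaded curve time is not given in the thread and would need to be measured.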
2022-06-27, 17:06   #9
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2²·29·59 Posts

Quote:
Originally Posted by timbit
Is 8 threads the max for any worker?
No. I've benchmarked up to 68 cores/worker on a Xeon Phi 7250.
The number of cores (threads) prime95 supports is 512 or 1024: https://mersenneforum.org/showpost.p...&postcount=202

Yes, you can manually trigger a benchmark, and optionally specify what range of FFT sizes is benchmarked, what list or range of core counts per worker, whether HT is tried or not, etc. Start with only a few FFT sizes so each experiment runs quickly. See also https://www.mersenneforum.org/showpo...4&postcount=11 and its attachments.

Last fiddled with by kriesel on 2022-06-27 at 17:07
2022-06-28, 05:14   #10
timbit
Mar 2009
2²·5 Posts

OK, I've deleted the results.bench.txt and gwnum.txt files.
I've manually run the throughput tests on the same size FFT.
Now it's running the ECM again. I'll let this go for a day or two and see if the autobench runs again.
Regardless of the results in results.bench.txt or gwnum.txt, it never seems to select the FFT with the most throughput. Odd.

Also I still cannot get more than 8 threads on a worker.

Last fiddled with by timbit on 2022-06-28 at 05:17 Reason: I forgot something
2022-06-28, 05:48   #11
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2²×29×59 Posts

As axn wrote, the number of useful cores per worker is a function of FFT size, which is a function of exponent.
I don't run 1M ECM; I run DC and first-time wavefront PRP primality tests, and P-1 up to 1G: big exponents, big FFTs, higher core counts.
Your 1M exponent at ~20 bits/word gives a ~50K FFT size. As axn wrote, that only needs/uses one core; it is not multithreaded.
Run lots of workers, one core each. The downside is that this multiplies the demand for main memory.
How many GB of RAM do you have installed per system?
How much memory did you allow prime95 to use for P-1, P+1, and ECM stage 2 (the daytime and nighttime settings), increased from the very low default?
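
The FFT size estimate above is a one-line division. The sketch below reproduces it, taking the ~20 bits/word figure from this post as a rough rule of thumb. The function name is illustrative and not a prime95 API, and the FFT length the program actually selects is whatever it reports in its own output.

```python
def estimate_fft_words(exponent, bits_per_word=20.0):
    """Rough FFT length in words: exponent bits spread at ~20 bits per word.

    Assumption: 20 bits/word is only the approximate figure quoted in the
    post; the real value varies with FFT size.
    """
    return exponent / bits_per_word


words = estimate_fft_words(1_000_000)
print(words)         # 50000.0
print(words / 1024)  # 48.828125, i.e. roughly 50K if K is read as 1024
```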

Last fiddled with by kriesel on 2022-06-28 at 05:58
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
