2022-06-28, 16:18   #13
timbit
 
Mar 2009

2^2×5 Posts

Hi, thanks for all the replies.

My slow system (Intel Xeon E5-2680 v4, 14 cores, 28 threads) has 32 GB of DDR4-2400 ECC RDIMM RAM (4 * 8GB) running in quad channel. In local.txt I have:

Memory=28672 during 7:30-23:30 else 28672

So essentially 28GB RAM available to mprime.

I've deleted results.bench.txt and gwnum.txt. I then invoked mprime -m and ran the throughput self-benchmark with 1 worker, 4 cores, at the 48K FFT size. Yes, I see the replies saying "it doesn't multithread well with small FFTs" and "use many workers, 1 core", and I will lean towards that from now on. But for this benchmark I am using 1 worker, 4 cores.

I've attached snippets of results.bench.txt, and gwnum.txt.

Now according to the results.bench.txt, the fastest is:

FFTlen=48K, Type=3, Arch=4, Pass1=768, Pass2=64, clm=1 (4 cores, 1 worker): 0.21 ms. Throughput: 4848.21 iter/sec.

Pass1=768, Pass2=64, clm=1. Right?

However, when I invoke mprime -d on my ECM 999181 I see:

[Work thread Jun 27 22:30] Using FMA3 FFT length 48K, Pass1=768, Pass2=64, clm=2, 4 threads

This isn't the fastest FFT selection according to the self-benchmark results. Is this normal? Also, I still cannot see any autobenchmark in my work. Can anyone explain what triggers the autobench? I see other posts complaining about it and asking how to prevent it; I want to do the opposite and invoke it!
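
For anyone who wants to check this kind of mismatch without eyeballing the files, here is a rough Python sketch. It assumes every benchmark line looks like the results.bench.txt line quoted above and that the work-thread line looks like the mprime -d line quoted above; the function names (parse_bench, fastest, matches_best) are my own, not anything from mprime, and the script is not an mprime feature.

```python
import re

# Assumed benchmark line format, copied from the results.bench.txt line in this post.
BENCH_RE = re.compile(
    r"FFTlen=(?P<fftlen>\d+K), Type=\d+, Arch=\d+, "
    r"Pass1=(?P<pass1>\d+), Pass2=(?P<pass2>\d+), clm=(?P<clm>\d+) "
    r"\((?P<cores>\d+) cores?, (?P<workers>\d+) workers?\): "
    r"[\d.]+ ms\.\s+Throughput: (?P<throughput>[\d.]+) iter/sec"
)

# Assumed work-thread line format, copied from the mprime -d output in this post.
USING_RE = re.compile(
    r"FFT length (?P<fftlen>\d+K), Pass1=(?P<pass1>\d+), "
    r"Pass2=(?P<pass2>\d+), clm=(?P<clm>\d+)"
)


def parse_bench(lines):
    """Return one dict per benchmark line that matches the assumed format."""
    results = []
    for line in lines:
        m = BENCH_RE.search(line)
        if m:
            entry = m.groupdict()
            entry["throughput"] = float(entry["throughput"])
            results.append(entry)
    return results


def fastest(results, fftlen):
    """Highest-throughput entry for one FFT length, or None if there is none.

    Note: this does not separate different core/worker counts; filter first
    if the benchmark file mixes several of them.
    """
    candidates = [r for r in results if r["fftlen"] == fftlen]
    return max(candidates, key=lambda r: r["throughput"]) if candidates else None


def config(entry):
    """The (Pass1, Pass2, clm) triple that identifies an FFT implementation."""
    return (entry["pass1"], entry["pass2"], entry["clm"])


def matches_best(bench_lines, using_line):
    """Compare the implementation mprime reports using against the benchmark's best."""
    m = USING_RE.search(using_line)
    if m is None:
        raise ValueError("work-thread line not in the assumed format")
    chosen = m.groupdict()
    best = fastest(parse_bench(bench_lines), chosen["fftlen"])
    if best is None:
        raise ValueError("no benchmark entries for FFT length " + chosen["fftlen"])
    return config(chosen) == config(best), best


if __name__ == "__main__":
    # The two lines quoted in this post; a real run would read the whole files.
    bench_lines = [
        "FFTlen=48K, Type=3, Arch=4, Pass1=768, Pass2=64, clm=1 "
        "(4 cores, 1 worker): 0.21 ms. Throughput: 4848.21 iter/sec."
    ]
    using_line = (
        "[Work thread Jun 27 22:30] Using FMA3 FFT length 48K, "
        "Pass1=768, Pass2=64, clm=2, 4 threads"
    )
    same, best = matches_best(bench_lines, using_line)
    print(same, config(best))
```

Fed the two lines quoted in this post, it reports a mismatch: the benchmark's best is clm=1, while the work thread is using clm=2.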

The reason I am asking about the autobench is that one of my other machines was also running slowly for a day or two on a new exponent; then the autobench kicked in, and it was much, much faster after that. Obviously a different FFT implementation was chosen, which made throughput higher.

In the attached log_snippet.txt, ECM curve 50 phase 1 is taking almost 3000 seconds. On my other machine, with DDR4-2133, phase 1 takes < 2000 seconds. That machine has been shut down for the summer, unfortunately, so I cannot access it again until the fall.


Also, when I run htop, there are no other processes taking up CPU cycles (mprime is using only 4 of the 14 cores anyway).
Attached Files
File Type: txt gwnum.txt (681 Bytes, 46 views)
File Type: txt results.bench.txt (10.2 KB, 43 views)
File Type: txt log_snippet.txt (2.4 KB, 50 views)

Last fiddled with by timbit on 2022-06-28 at 16:22 Reason: formatting