2022-06-29, 19:57   #19
timbit
 
Mar 2009

2²×5 Posts

Quote:
Originally Posted by kriesel
Found in prime95's undoc.txt (emphasis mine):
Code:
Most FFT sizes have several implementations.  The program uses throughput benchmark
data to select the fastest FFT implementation.  The program assumes all CPU cores
will be used and all workers will be running FFTs.  This can be overridden
in gwnum.txt:
    BenchCores=x
    BenchWorkers=y
If the program is selecting what would be fastest with all cores busy, as its documentation states, and your expectation is it would select what would be fastest with a few of the 14 cores busy, that may account for some discrepancies between expectation and observed operation.
I'd be interested to see what George's take is on the clm=2 vs. 1 performance and selection.
Hi, yeah, I saw that too. I didn't think much of it because the autobench would know your current configuration (and in the manual throughput test, the user enters the number of cores and workers).
I'll stop mprime, save the entire directory, and try the new version in a fresh directory (it's a beta, and I'll keep that in mind).
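
For reference, here is a sketch of what the override quoted from undoc.txt might look like in gwnum.txt. The values are assumptions chosen only for illustration (one worker using 4 of the 14 cores); they are not taken from an actual configuration, and whether they change the clm selection discussed above is untested.
Code:
    BenchCores=4
    BenchWorkers=1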
