#23
Undefined ("The unspeakable one")
Jun 2006, My evil lair
2²×1,553 Posts
Then the whole benchmark thing is useless. If you can't get reliable figures for real production usage, then using benchmarks from a different runtime-environment configuration is not going to help you.

I think bgbeuning had it correct: use actual live data to select your run parameters.
#24
P90 years forever!
Aug 2002, Yeehaw, FL
2²·7·269 Posts
I think I have my general game plan in mind. Thanks to all for the lively discussion.
First off, I'll only include the best 2 to 4 implementations for each FFT size, so that even if benchmarking fails to find the best implementation we won't lose too much performance.

My plan is to store in local.txt several (10?) throughput benchmarks for each FFT size. Prime95 will schedule a job to run at random times. This job scans worktodo.txt to see which FFT sizes we will need now or in the near future, then runs 20-second benchmarks for each FFT implementation where we don't already have sufficient benchmark data.

Prime95 will remember the CPU brand string from CPUID to detect local.txt being moved to a new machine. Also, bench data will be dated and deleted after, say, 6 months. This also limits the damage from copying local.txt to new machines. The theory is that if we do several benchmarks, hopefully a substantial number will be done while interference from other apps is minimal or non-impactful.

The next help I'll need is throughput benchmarks from your machines, so that I can figure out the best 2 to 4 implementations to include.
#25
"D"
Sep 2015
5·7 Posts
I respectfully request a user-selectable option to manually run all needed tuning jobs at once, or on demand, and be done with it. I assume I can provide an FFT range set, and I know my own system's loads.
#26
Einyen
Dec 2003, Denmark
7·11·41 Posts
#27
P90 years forever!
Aug 2002, Yeehaw, FL
2²×7×269 Posts
I'll need 1 and 4 (all cores), but you'll have to wait for me to make a special prime95 version that includes many more FFT implementations. Probably need all FFTs above 64K. I'll post instructions when the time comes.
Last fiddled with by Prime95 on 2016-05-01 at 16:00
#28
Jan 2013
44₁₆ Posts
Couldn't the benchmarks just be labeled as a "stress test/benchmark" that you run after the person clicks on Join GIMPS? I mean, they've already clicked to join and logged in. Then the burn-in crowd won't be bothered by it.

Alternatively, have a version of the burn-in stuff that builds a giant database of FFT performance.
#29
Jul 2005
2·7·13 Posts
Quote:

Do we know that the best implementation really depends on CPUID at all? If so, all GIMPS users could share and re-use their benchmarks somehow via GIMPS; the benchmark would only need to run if there is no cached benchmark available via GIMPS.

But I have doubts about this. I suspect it matters more what else is running on the machine, how the CPU cache is used by other processes, and how the OS kernel's process scheduler is configured. An idealistic benchmark on an idle machine may not tell us the best FFT implementation for when the system is running normally.

So I would not run extra benchmarks on the client. I would just run and measure the current real workload with different implementations from time to time. Maybe once per day, switch through the 4 implementations one after the other and continue with the best one. This way you would never need to waste CPU time (except that you are not using the best implementation for a few minutes per day).

Last fiddled with by rudi_m on 2016-05-03 at 10:23
#30
Jun 2003
2²×3³×47 Posts
Quote:

What this thread is trying to achieve is to find out which FFT implementation is best for which hardware configuration(s). What George thought was an improvement to some FFTs (as measured on his Haswell / Skylake) turned out to be a regression on some other machines.

So: scientifically establish which is the best FFT implementation for as many h/w configurations as possible, and use that data to select the optimal one at runtime by benchmarking a limited subset of FFT implementations.

Last fiddled with by axn on 2016-05-03 at 11:17
#31
Jul 2005
B6₁₆ Posts
Quote:

Quote:

BTW, benchmarking the real workload could also be used to find the optimal CPU affinity automatically. For example, I have one setup with two LL threads and two factoring threads. I get the most throughput when running LL on CPUs 0 and 2, and factoring on CPUs 1 and 3. It should be possible to find this out automatically by tentatively moving the threads between CPUs.
#32
Jun 2003
2²·3³·47 Posts
Quote:

I think you are misunderstanding the CPUID thing. That is to make sure that the locally recorded benchmark data can be discarded if it is copied to another machine.
#33
Jul 2005
2·7·13 Posts
Quote:

In theory it could even happen that two workers should use different implementations to get the most combined throughput (e.g. one worker uses more CPU cache, the other less).

Quote:

George wrote: "The theory is that if we do several benchmarks, hopefully a substantial number will be done while interference from other apps is minimal or non-impactful." But I think:

1. We should not use absolute benchmark times from particular dates and situations in the past; they are in general not comparable.
2. I _want_ the benchmark running _while_ the usual interference from other apps (or other mprime workers!) is present, rather than filtering for ideal, synthetic benchmarks.