mersenneforum.org  

Old 2016-04-28, 03:32   #12
CRGreathouse
 
Aug 2006

3×1,993 Posts

Quote:
Originally Posted by TObject View Post
Can we figure out what makes certain algorithms run faster on certain architectures, so that no benchmarking is necessary?
No doubt, more information would help make better-informed guesses. But to do that we'd want to write benchmarking software and have lots of people with different hardware setups run it, right? So we end up in the same place.
Old 2016-04-28, 10:18   #13
ATH
Einyen
 
Dec 2003
Denmark

7×11×41 Posts

Maybe at the start of an LL test you can run a 20-second benchmark on 2 or 3 possible FFTs based on the exponent.

Even if the computer is in use, the relative difference in speed between FFTs should be the same, unless the user starts or stops anything during those 40-60 seconds?
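
A minimal sketch of that idea in Python, for illustration only: time each candidate for a fixed wall-clock budget and keep the one with the highest iteration rate. The names `time_candidate` and `pick_fastest` are hypothetical, and the candidates are ordinary Python callables standing in for FFT implementations; nothing here reflects how Prime95 actually selects an FFT.

```python
import time


def time_candidate(func, budget_seconds):
    """Call func repeatedly until budget_seconds of wall time have elapsed.

    Returns the observed rate in iterations per second.
    """
    start = time.perf_counter()
    iterations = 0
    while True:
        func()
        iterations += 1
        elapsed = time.perf_counter() - start
        if elapsed >= budget_seconds:
            return iterations / elapsed


def pick_fastest(candidates, budget_seconds=20.0):
    """Benchmark each candidate in turn and return (best_name, rates).

    candidates is a dict mapping a name to a zero-argument callable
    (a stand-in for one FFT implementation; an assumption of this sketch).
    """
    rates = {
        name: time_candidate(func, budget_seconds)
        for name, func in candidates.items()
    }
    best_name = max(rates, key=rates.get)
    return best_name, rates
```

With three candidates and the default budget this takes about 60 seconds in total, which matches the 40-60 second window mentioned above. The candidates run one after another, so a load change between them would still skew the comparison.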
Old 2016-04-28, 10:59   #14
0PolarBearsHere
 
Oct 2015

2·7·19 Posts

Quote:
Originally Posted by Prime95 View Post
1) The benchmark really wants to run when nothing else is going on. Even so, the OS may fire up some process that skews a run. So, I need a good algorithm that tracks multiple runs and throws out outlier data.
2) My first thought was to have a menu choice to run the throughput benchmark. I suspect it will not get run on many machines. Alternatively, I could run the benchmark when prime95 is launched or at a late night hour or both until we have enough runs we are confident in our throughput data. Ideas? How many runs before we are confident in the results?
3) Any ideas on how best to detect a significant change (CPU, memory, whatever?) and discard the accumulated data?
1) and 3) could be partially worked around by having P95 generate CPU load itself, and then having the benchmark results track size, CPU usage, and performance.

2) If you're only looking at 20-second benchmarks, then there should be no issue doing them not only when P95 starts, but also each time a worker process is unpaused/resumed. Ultimately, even pausing every hour for a 20-second benchmark will add less than 5 hours to a one-month run time. And this would only be until enough runs have been done anyway, after which, if successful, the improved results could make up for a lot of that.
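
On point 1 of the quoted post (tracking multiple runs and throwing out outliers), one common approach is a median-based filter. The sketch below is only an illustration, not anything Prime95 does: the function name `robust_throughput` is hypothetical, and the 3.5 cutoff on the modified z-score is a conventional default, assumed here rather than tuned.

```python
import statistics


def robust_throughput(samples, threshold=3.5):
    """Drop outlier benchmark samples and return (estimate, kept).

    Uses the modified z-score, 0.6745 * |x - median| / MAD, where MAD is
    the median absolute deviation. Samples scoring above threshold are
    discarded; the estimate is the median of the samples that remain.
    """
    if not samples:
        raise ValueError("need at least one sample")
    med = statistics.median(samples)
    mad = statistics.median([abs(x - med) for x in samples])
    if mad == 0:
        # More than half the samples are identical, so there is no spread
        # to scale by; keep everything and report the median.
        return med, list(samples)
    kept = [x for x in samples if 0.6745 * abs(x - med) / mad <= threshold]
    return statistics.median(kept), kept


# Example: five consistent runs and one run skewed by a background process.
estimate, kept = robust_throughput([100.0, 101.0, 99.0, 100.5, 60.0, 100.2])
print(estimate, kept)
```

In the example the 60.0 sample is rejected and the other five are kept. As a check on the overhead figure in point 2: a 30-day month has 720 hours, and 720 pauses of 20 seconds each come to 14,400 seconds, or 4 hours.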
Old 2016-04-29, 05:18   #15
Madpoo
Serpentine Vermin Jar
 
Jul 2014

7×11×43 Posts

Quote:
Originally Posted by Prime95 View Post
I was going to use 20 second benchmarks running on all cores to determine a machine's throughput for each possible FFT implementation.
I may be crazy for saying so (and feel free), but what about bumping the priority up from "idle" when doing that initial benchmark? Maybe not quite "normal", but something in the 5-6 range, just a notch or two down from normal priority? If it's only 20 seconds... I mean, I could live with that, but maybe not everyone would be on board. Prime95 running in idle-priority is kind of a big selling feature, so even a higher-than-idle priority for a short duration could be a bad move.

As for when to run the benchmark, my opinion is that it should run at the beginning of each LL test to determine the best FFT method/size. If things change on the system mid-run, well... it's not the end of the world, and it might make a better choice when it starts the next one.

If the benchmark runs at idle priority and happens to run when the system is busy, then it probably doesn't matter one way or the other if it picks a slightly less than efficient method/FFT size... the system is apparently used for other things and won't be running Prime95 at full speed anyway, so a percent decrease in throughput in a "pure" benchmark is probably far less of a loss in real world use.
Old 2016-04-29, 06:02   #16
retina
Undefined
 
"The unspeakable one"
Jun 2006
My evil lair

2²·1,553 Posts

Quote:
Originally Posted by Madpoo View Post
I may be crazy for saying so (and feel free), but what about bumping the priority up from "idle" when doing that initial benchmark? Maybe not quite "normal", but something in the 5-6 range, just a notch or two down from normal priority? If it's only 20 seconds... I mean, I could live with that, but maybe not everyone would be on board.
You are crazy for the following reason. If during normal usage it runs at lowest priority, then it is pointless to get false readings by doing benchmarks at a different priority.

My car can do 300mph on a test track with a professional driver[1], so that makes it the best car for me to drive around town in a traffic jam when I am sleepy.

[1] Not really, of course. It can really only reach 299mph.
Old 2016-04-29, 16:14   #17
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

3³×19² Posts

Quote:
Originally Posted by retina View Post
You are crazy for the following reason. If during normal usage it runs at lowest priority, then it is pointless to get false readings by doing benchmarks at a different priority.
I disagree. (Not that Aaron isn't crazy; that's an orthogonal question...)

If the very short benchmarks are dependent on "wall time" then bumping up the priority might better help determine which FFT parameters are optimal. Although only a 20 second temporal sample seems too small to me.

This statement assumes that optimal performance of the FFT will be the same under load as not, and it is simply the benchmarking which might be impacted. This seems like a reasonable Ass-u-me'tion....
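
To make the "wall time" point concrete, here is a rough Python sketch that records both wall-clock time and process CPU time for one run. The helper names `timed_run` and `cpu_share` are hypothetical. The reading is an assumption of the sketch: for single-threaded work, a CPU share well below 1.0 suggests the run was preempted by other processes, so its wall-time figure is suspect.

```python
import time


def timed_run(func):
    """Run func once and return (wall_seconds, cpu_seconds).

    Wall time comes from time.perf_counter(); CPU time comes from
    time.process_time(), which counts only time this process spent
    executing on a CPU.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return wall_seconds, cpu_seconds


def cpu_share(wall_seconds, cpu_seconds):
    """Fraction of the wall-clock interval spent executing this process."""
    if wall_seconds <= 0:
        return 0.0
    return cpu_seconds / wall_seconds
```

A benchmark could record the share next to each timing and discard runs where it falls below some cutoff. Note that `time.process_time()` sums CPU time across all threads of the process, so a multi-threaded benchmark would need the share compared against the thread count instead of 1.0.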
Old 2016-04-29, 20:25   #18
lycorn
 
"GIMFS"
Sep 2002
Oeiras, Portugal

3×491 Posts

Quote:
Originally Posted by retina View Post
If during normal usage it runs at lowest priority, then it is pointless to get false readings by doing benchmarks at a different priority.
This vaguely reminds me of the VW emissions scandal...
Old 2016-04-30, 05:30   #19
Madpoo
Serpentine Vermin Jar
 
Jul 2014

7·11·43 Posts

Quote:
Originally Posted by chalsall View Post
I disagree. (Not that Aaron isn't crazy; that's an orthogonal question...)

If the very short benchmarks are dependent on "wall time" then bumping up the priority might better help determine which FFT parameters are optimal. Although only a 20 second temporal sample seems too small to me.

This statement assumes that optimal performance of the FFT will be the same under load as not, and it is simply the benchmarking which might be impacted. This seems like a reasonable Ass-u-me'tion....
Yeah, you get it. The point being, if you're doing A/B benchmarking, you really want to ensure that everything else is essentially equal, and it's only your code branches that differ.

If you leave it at idle priority and you're spiking the CPU with some other app while benchmark A is going, then it'll give you false readings if you stop whatever it was while benchmark B is going.

You want to compare apples to apples and bumping the priority up from idle during the benchmark is a good way of doing that, I just don't know if that's the right/appropriate/best way to do it.
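
Priority aside, a different technique for keeping an A/B comparison fair (not something proposed in this thread, just a common benchmarking practice) is to interleave short runs of A and B instead of running all of A and then all of B. A burst of outside CPU load then tends to hit both sides. The sketch below assumes two zero-argument Python callables as stand-ins for the two code paths; `interleaved_ab` is a hypothetical name.

```python
import statistics
import time


def time_once(func):
    """Return the wall-clock seconds taken by a single call to func."""
    start = time.perf_counter()
    func()
    return time.perf_counter() - start


def interleaved_ab(func_a, func_b, rounds=10):
    """Time A and B in alternating order and return their median times.

    The order within each round flips every round, so neither side
    always runs first. Medians are used so that a few disturbed
    rounds do not dominate the result.
    """
    times_a, times_b = [], []
    for round_index in range(rounds):
        order = [("a", func_a), ("b", func_b)]
        if round_index % 2 == 1:
            order.reverse()
        for label, func in order:
            elapsed = time_once(func)
            if label == "a":
                times_a.append(elapsed)
            else:
                times_b.append(elapsed)
    return statistics.median(times_a), statistics.median(times_b)
```

This does not remove interference. It only spreads it more evenly between the two sides, and it still fails if the load pattern happens to line up with the alternation.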
Old 2016-04-30, 11:34   #20
ATH
Einyen
 
Dec 2003
Denmark

7·11·41 Posts

Quote:
Originally Posted by Madpoo View Post
If you leave it at idle priority and you're spiking the CPU with some other app while benchmark A is going, then it'll give you false readings if you stop whatever it was while benchmark B is going.
But even if there is other cpu usage the same FFT should still be fastest as in an idle benchmark. Except if the cpu usage varies a lot during those 20 sec?
Old 2016-04-30, 13:17   #21
axn
 
Jun 2003

2²·3³·47 Posts

Quote:
Originally Posted by ATH View Post
Except if the cpu usage varies a lot during those 20 sec?
Exactly. Once you allow uncontrollable variables, everything's a crapshoot.

Last fiddled with by axn on 2016-04-30 at 13:18
Old 2016-04-30, 17:39   #22
Madpoo
Serpentine Vermin Jar
 
Jul 2014

6357₈ Posts

Quote:
Originally Posted by ATH View Post
But even if there is other cpu usage the same FFT should still be fastest as in an idle benchmark. Except if the cpu usage varies a lot during those 20 sec?
That's exactly the problem... the CPU usage can vary wildly.

Well, at least in my experience... that's the whole reason I decided to set up the affinity scramble stuff in the first place, because the routine that tries to autodetect which CPU threads are pairs of the same core kept failing and doing weird things. And that was just from server activity within a span of a few seconds. All it takes is for something like IIS to have a spike in CPU for a split second while compiling something, and things are thrown off.

If there was a way to guarantee a quiet period when nothing else was running while these benchmarks happened, that'd be great, but you know that'll never happen. Even something like moving the mouse around will throw things off a bit (old interrupt-driven hardware drivers, sure, but still...)

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.