![]() |
|
|
#1 |
|
"Mihai Preda"
Apr 2015
3×457 Posts |
I want to share my recent findings about the distribution of roundoff error in PRP iterations in gpuowl, which may apply more generally.
The setup: The PRP iteration consists of a floating point convolution. After each iteration, a conversion is done from double (DP FP) to integer. This conversion produces the correct result as long as the roundoff error is smaller than 0.5. A higher bits-per-word results in higher roundoff error, and this is the limiting factor in how large an exponent can be tackled by a given FFT size. gpuowl can now record the max-roundoff error in every iteration, and also compute some statistics over a number of iterations such as: the mean of the iteration-max-roundoff, and the max of the iteration-max-roundoff. (and other software, such as mprime, also can compute similar statistics of the max-roundoff error). The question: Let's say that I have access to a iteration-max-roundoff sample of size N. I want to know how likely it is that the max-roundoff over the whole test (i.e. K iterations, K much larger than N) would reach some limit threshold (such as 0.5). As an example let's say that, for an exponent around 100M, I run N=40000 iterations, and I observe a mean max-roundoff of 0.2487 and a max max-roundoff of 0.382, can I infer whether this setup is good to run the exponent for the remaining 100M iterations? What if the mean is 0.2794 and the max is 0.407, is this safe for 100M iterations? And, do I have enough information above to answer the questions? What if I threw in the variance of the max-roundoff sample, would that help -- how? To be able to evaluate how worrisome (or not) are those numbers, it would be useful to know more about the underlying distribution of the max-roundoff. Any guess as to what the distribution might be? Note: there is a Take-aways section at the end with a non-technical summary. Turns out, the relevant distribution here is Gumbel https://en.wikipedia.org/wiki/Gumbel_distribution which describes the distribution of the max ("extreme value") of a sample of some distribution (not any distribution generates a Gumbel for extreme value). I did validate empirically, and the iteration-max-roundoff values we see are excellently fitted by a Gumbel distribution. So the task now is to derive the parameters of the Gumbel distribution given our sample of N iterations. Gumbel has two parameters, u "location" and beta "scale". It also has: mean = u + beta * gamma (gamma is the Euler-Mascheroni constant) variance = pi^2 / 6 * beta^2. Given the relative large size of our sample (N > 10000), we can reverse-approximate the distribution params this way: beta ~= sqrt(variance * 6 / pi^2) ~= SD * C, (where SD = "standard deviation" (sqrt(variance), because N is large) and C = sqrt(6)/pi ) u ~= mean - beta * gamma At this point we have the distribution, and using its CDF we can compute the answers to the initial questions. Given a point 'x' for which we want to compute the CDF, it is helpful to map it to a "standard" z: z(x, u, beta) = (x - u) / beta, then CDF(x) = e^(-e^(-z)), In z(), we can substitute u=mean - beta*gamma, and beta = SD*C, and we get: z(x,mean,SD) = (x - mean)/(SD*C) + gamma Long story short, it turns out that if we want a probability "p" of max-roundoff to be under a limit "limit" over a run of k iteration (in the example k == 100M), then we need z(limit) > log(k) - log(-log(p)) (log being the natural logarithm) the computed z value can afterwards be expressed in terms of "number of standard deviations from mean": (limit - mean) / SD = (z - gamma) * C As an example, 16 standard-deviations from mean bounds max-roundoff over 100M iterations with about 93% probability. Take-aways: 1. the max-roundoff-per-iteration is governed by a distribution "Gumbel" that is quite different from the Normal Distribution. Gumbel has a much fatter right-tail -- meaning that the max-roundoff sees a lot of variation and is spread over a much larger region (relative to what one might naively expect based on the standard deviation of the sample). 2. this makes the "max" statistic not very useful in characterizing the roundoff error -- because of its high jumpiness. More useful are the mean and the variance (or SD) of the max-roundoff sample. 3. "rule of thumb": for a 100M exponent, we can take 16 standard-deviations from mean as an upper bound on the roundoff error likely to be reached during the test. A halving or doubling of the number of iterations corresponds (roughly) to a half-a-SD step. Last fiddled with by preda on 2020-04-14 at 11:35 |
|
|
|
|
|
#2 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
752610 Posts |
Always cool to discover some new practical math.
Can you summarize the results for gpuowl on a 5M or 5.5M FFT? For example, if we target a 1 in 10000 chance of a fatal roundoff error in a 100M iteration run, what would be the target average roundoff error? IIRC correctly, prime95 targets a 0.243 average roundoff error in exponents this size. |
|
|
|
|
|
#3 | |
|
"Mihai Preda"
Apr 2015
3·457 Posts |
Quote:
But for your question, we'd need roughly mean <= 0.5 - 21.1 * SD for a 1 in 10000 chance of a 0.5 roundoff over the whole 100M test (note, a 0.5 roundoff has 1/2 chance of being "fatal"). The key number there is "21 standard deviations". for a 1 in 1000 chance, that distance becomes 19.3 standard deviations, and for 1 in 100 is 17.5 SDs. I worked those numbers this way: C=sqrt(6)/pi gamma=0.5772 p=0.9999 k=100000000 (log(k) - log(-log(p)) - gamma) * C When running gpuowl with "-use STATS" it displays both the mean and the SD of the observed roundoff. |
|
|
|
|
|
|
#4 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
752610 Posts |
Some quick tests with gpuowl, 5M FFT show an average roundoff error of 0.26 has 15 std.devs of safety, while an error of 0.25 has 17 std.devs of safety, 0.241 error has 18.5 std devs.
The above was done using a mere 30,000 iterations of data. What should gpuowl's default settings be? PRP might select looser defaults than LL and P-1. Perhaps a command line argument to let the user tune how aggressive they'd like to be. Plus, for unattended operations, if PRP does run into a repeatable error near the FFT limit it should switch to a higher FFT length to get past the troublesome spot. Right now, I think gpuowl tries 3 times and then aborts. One could lose valuable compute time before noticing that gpuowl has exited. |
|
|
|
|
|
#5 |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
123638 Posts |
I'm running some -use STATS in v6.11-259 on my suddenly problematic RX550 in 5M fft.
The performance hit over v6.11-257 without -use STATS is about 17%; system cpus are dual xeon E5645; OS Win7 X64. |
|
|
|
|
|
#6 | |
|
"Mihai Preda"
Apr 2015
3×457 Posts |
Quote:
limit-mean(n) = 0.5 / (1 + 0.059 * n) where "n" is the number of standard-deviations desired based on target probability of fatal roundoff. For 100M iterations, 1-in-10000: n=21.1, limit-mean=0.223 1-in-1000 : n=19.3, limit-mean=0.234 1-in-100 : n=17.5, limit-mean=0.246 1-in-10 : n=15.7, limit-mean=0.259 Also note that a limit-mean producing e.g. 1-in-100 for 100M, is approximately the same as 1-in-200 for 50M and 1-in-50 for 200M (because one can see 100M iterations as 2x 50M etc). Last fiddled with by preda on 2020-04-17 at 12:11 |
|
|
|
|
|
|
#7 | |
|
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
14F316 Posts |
Quote:
Your figures imply about 1 in 30 for 100Mdigit, 1 in 10 for ~999M. |
|
|
|
|
|
|
#8 | |
|
∂2ω=0
Sep 2002
República de California
103×113 Posts |
Quote:
For such cases, would it be possible to enable carry-step-with-ROE-checking for the inteval-retry, in order to confirm that the G-check error is accompanied by (= presumably due to) a danger-level fractional part? |
|
|
|
|
|
|
#9 | |
|
"Mihai Preda"
Apr 2015
137110 Posts |
Quote:
This is not a problem, but it shows that it is not possible to use just the average of the iteration-max-roundoff -- the standard-deviation (or something equivalent) is also needed. |
|
|
|
|
![]() |
Similar Threads
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Radix-8 roundoff / noise | Prime95 | Math | 8 | 2018-11-03 16:38 |
| Roundoff error | bcp19 | Software | 4 | 2013-02-14 21:23 |
| Prime95 roundoff errors | pjaj | Software | 18 | 2011-07-20 03:04 |
| Roundoff Error Penalty | nevarcds | Software | 5 | 2004-08-28 14:29 |
| Roundoff Error Message | Teseo77Madrid | Hardware | 21 | 2004-06-02 14:59 |