mersenneforum.org > Great Internet Mersenne Prime Search > Math
2020-04-14, 11:18   #1
preda ("Mihai Preda")

Analysis of roundoff error in LL/PRP iterations

I want to share my recent findings about the distribution of roundoff error in PRP iterations in gpuowl, which may apply more generally.

The setup:
The PRP iteration consists of a floating point convolution. After each iteration, a conversion is done from double (DP FP) to integer. This conversion produces the correct result as long as the roundoff error is smaller than 0.5. A higher bits-per-word results in higher roundoff error, and this is the limiting factor in how large an exponent can be tackled by a given FFT size.

gpuowl can now record the max-roundoff error in every iteration, and also compute statistics over a number of iterations, such as the mean of the iteration-max-roundoff and the max of the iteration-max-roundoff. (Other software, such as mprime, can compute similar statistics of the max-roundoff error.)

The question:
Let's say that I have access to an iteration-max-roundoff sample of size N. I want to know how likely it is that the max-roundoff over the whole test (i.e. K iterations, with K much larger than N) would reach some threshold (such as 0.5).

As an example, let's say that for an exponent around 100M I run N=40000 iterations and observe a mean max-roundoff of 0.2487 and a max max-roundoff of 0.382. Can I infer whether this setup is good to run the exponent for the remaining 100M iterations? What if the mean is 0.2794 and the max is 0.407 -- is that safe for 100M iterations?

And do I have enough information above to answer these questions? What if I threw in the variance of the max-roundoff sample -- would that help, and how?

To be able to evaluate how worrisome (or not) are those numbers, it would be useful to know more about the underlying distribution of the max-roundoff. Any guess as to what the distribution might be?

Note: there is a Take-aways section at the end with a non-technical summary.

It turns out that the relevant distribution here is Gumbel https://en.wikipedia.org/wiki/Gumbel_distribution
which describes the distribution of the maximum ("extreme value") of a large sample from some underlying distribution (not every distribution produces a Gumbel-distributed maximum; it arises for distributions with exponential-type tails).

I validated this empirically: the iteration-max-roundoff values we see are fitted excellently by a Gumbel distribution. So the task now is to derive the parameters of the Gumbel distribution from our sample of N iterations.

Gumbel has two parameters, u "location" and beta "scale". It also has:
mean = u + beta * gamma (gamma is the Euler-Mascheroni constant)
variance = pi^2 / 6 * beta^2.

Given the relatively large size of our sample (N > 10000), we can reverse-approximate the distribution parameters this way:

beta ~= sqrt(variance * 6 / pi^2) = SD * C
(where SD = standard deviation = sqrt(variance), which the sample estimates well because N is large,
and C = sqrt(6)/pi)

u ~= mean - beta * gamma
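As a quick sketch (not gpuowl code; the function name and synthetic data below are mine), the parameter recovery from a sample might look like this in Python:

```python
import math

GAMMA = 0.57721566490153286  # Euler-Mascheroni constant
C = math.sqrt(6) / math.pi

def gumbel_params(sample):
    """Method-of-moments estimate of the Gumbel location u and scale beta
    from a sample of iteration-max-roundoff values."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / n  # N is large, so no Bessel correction
    sd = math.sqrt(variance)
    beta = sd * C            # beta ~= sqrt(variance * 6 / pi^2)
    u = mean - beta * GAMMA  # u ~= mean - beta * gamma
    return u, beta
```

Feeding it synthetic Gumbel data, generated by inverting the CDF (x = u - beta*log(-log(U)) for uniform U), recovers the parameters to within sampling noise.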

At this point we have the distribution, and using its CDF we can compute the answers to the initial questions.

Given a point 'x' for which we want to compute the CDF, it is helpful to map it to a "standard" z:
z(x, u, beta) = (x - u) / beta, then CDF(x) = e^(-e^(-z)).

In z(), we can substitute u = mean - beta*gamma and beta = SD*C, and we get:
z(x, mean, SD) = (x - mean)/(SD*C) + gamma

Long story short, it turns out that if we want a probability "p" of the max-roundoff staying under a limit "limit" over a run of k iterations (in the example k == 100M), then we need
z(limit) > log(k) - log(-log(p)) (log being the natural logarithm)
(this follows from requiring CDF(limit)^k >= p, i.e. e^(-k*e^(-z)) >= p, and solving for z)

The computed z value can afterwards be expressed as a "number of standard deviations from the mean":

(limit - mean) / SD = (z - gamma) * C

As an example, a limit 16 standard deviations above the mean bounds the max-roundoff over 100M iterations with about 93% probability.
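That computation can be sketched as follows (my own helper name, assuming the Gumbel model above):

```python
import math

C = math.sqrt(6) / math.pi
GAMMA = 0.57721566490153286  # Euler-Mascheroni constant

def p_under_limit(n_sds, k):
    """Probability that the max roundoff over k iterations stays below a
    limit sitting n_sds standard deviations above the sample mean."""
    z = n_sds / C + GAMMA               # invert (limit - mean)/SD = (z - gamma)*C
    return math.exp(-k * math.exp(-z))  # CDF(limit)^k = e^(-k * e^(-z))
```

For n_sds=16 and k=10^8 this comes out around 0.93, matching the figure above.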


Take-aways:
1. the max-roundoff-per-iteration is governed by the Gumbel distribution, which is quite different from the normal distribution. Gumbel has a much fatter right tail -- meaning that the max-roundoff sees a lot of variation and is spread over a much larger region (relative to what one might naively expect based on the standard deviation of the sample).

2. this makes the "max" statistic not very useful for characterizing the roundoff error, because of its high jumpiness. More useful are the mean and the variance (or SD) of the max-roundoff sample.

3. "rule of thumb": for a 100M exponent, we can take 16 standard deviations above the mean as an upper bound on the roundoff error likely to be reached during the test. A halving or doubling of the number of iterations corresponds (roughly) to a step of half a standard deviation.

2020-04-14, 20:52   #2
Prime95

Always cool to discover some new practical math.

Can you summarize the results for gpuowl on a 5M or 5.5M FFT?
For example, if we target a 1 in 10000 chance of a fatal roundoff error in a 100M iteration run, what would be the target average roundoff error?

IIRC, prime95 targets a 0.243 average roundoff error for exponents this size.
2020-04-14, 23:24   #3
preda

Quote:
Originally Posted by Prime95
Always cool to discover some new practical math.

Can you summarize the results for gpuowl on a 5M or 5.5M FFT?
For example, if we target a 1 in 10000 chance of a fatal roundoff error in a 100M iteration run, what would be the target average roundoff error?

IIRC correctly, prime95 targets a 0.243 average roundoff error in exponents this size.
To work that out, we'd also need the standard deviation (of a sample of the roundoff error). I suspect there is a relation linking the standard deviation and the mean of the roundoff error, but I don't know that relation yet -- I'm looking for empirical hints as to what it might be. This relation (linking SD and the mean) may also depend on particulars of the implementation -- e.g. it could be that changing the FFT configuration affects the mean and the SD in different ways, in which case there is no relation between mean and SD that "always holds".

But for your question, we'd need roughly
mean <= 0.5 - 21.1 * SD
for a 1 in 10000 chance of a 0.5 roundoff over the whole 100M test (note: a 0.5 roundoff has a 1/2 chance of being "fatal"). The key number there is "21 standard deviations".

For a 1 in 1000 chance, that distance becomes 19.3 standard deviations, and for 1 in 100 it is 17.5 SDs.

I worked out those numbers this way:
C=sqrt(6)/pi
gamma=0.5772
p=0.9999
k=100000000
(log(k) - log(-log(p)) - gamma) * C
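In runnable form (a hypothetical helper; the name is mine):

```python
import math

def sds_needed(p, k):
    """Standard deviations above the mean needed so that the max roundoff
    over k iterations stays under the limit with probability p."""
    C = math.sqrt(6) / math.pi
    gamma = 0.5772156649  # Euler-Mascheroni constant
    return (math.log(k) - math.log(-math.log(p)) - gamma) * C
```

sds_needed(0.9999, 10**8) gives about 21.1, and p=0.999 and p=0.99 give the 19.3 and 17.5 quoted above.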


When running gpuowl with "-use STATS" it displays both the mean and the SD of the observed roundoff.
2020-04-15, 02:35   #4
Prime95

Some quick tests with gpuowl at 5M FFT show that an average roundoff error of 0.26 has 15 std. devs of safety, an error of 0.25 has 17 std. devs, and an error of 0.241 has 18.5 std. devs.

The above was done using a mere 30,000 iterations of data.

What should gpuowl's default settings be? PRP might select looser defaults than LL and P-1. Perhaps add a command line argument to let the user tune how aggressive they'd like to be.

Plus, for unattended operations, if PRP does run into a repeatable error near the FFT limit it should switch to a higher FFT length to get past the troublesome spot. Right now, I think gpuowl tries 3 times and then aborts. One could lose valuable compute time before noticing that gpuowl has exited.
2020-04-17, 01:06   #5
kriesel

I'm running some -use STATS tests in v6.11-259 on my suddenly problematic RX550 at 5M FFT.
The performance hit over v6.11-257 without -use STATS is about 17%; the system CPUs are dual Xeon E5645, OS Win7 x64.
2020-04-17, 12:02   #6
preda

Quote:
Originally Posted by Prime95
Some quick tests with gpuowl, 5M FFT show an average roundoff error of 0.26 has 15 std.devs of safety, while an error of 0.25 has 17 std.devs of safety, 0.241 error has 18.5 std devs.
If we approximate the ratio SD/mean of the iteration-max-roundoff as roughly 0.059, then we get the formula

limit-mean(n) = 0.5 / (1 + 0.059 * n)

(the largest mean for which mean + n * SD stays at or below 0.5)

where "n" is the number of standard-deviations desired based on target probability of fatal roundoff.
For 100M iterations,
1-in-10000: n=21.1, limit-mean=0.223
1-in-1000 : n=19.3, limit-mean=0.234
1-in-100 : n=17.5, limit-mean=0.246
1-in-10 : n=15.7, limit-mean=0.259
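The table above can be reproduced with a small helper (the function is mine, not gpuowl's; it assumes the CV ~= 0.059 approximation):

```python
def limit_mean(n_sds, cv=0.059):
    """Largest mean max-roundoff such that mean + n_sds * SD stays at 0.5,
    assuming SD = cv * mean (cv = coefficient of variation)."""
    return 0.5 / (1 + cv * n_sds)
```

For example, limit_mean(21.1) comes out near 0.223, matching the 1-in-10000 row.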


Also note that a limit-mean producing e.g. 1-in-100 for 100M is approximately the same as 1-in-200 for 50M and 1-in-50 for 200M (because one can view 100M iterations as 2x 50M, etc.).

2020-04-17, 16:04   #7
kriesel

Quote:
Originally Posted by preda
If we approximate the ratio SD/mean of the iteration-max-roundoff as roughly 0.059
...
Also note that a limit-mean producing e.g. 1-in-100 for 100M is approximately the same as 1-in-200 for 50M and 1-in-50 for 200M (because one can view 100M iterations as 2x 50M, etc.).
I get about 0.0595 for a 94.7M run.
Your figures imply about 1 in 30 for a 100M-digit test, and 1 in 10 for exponents around 999M.
2020-04-17, 19:05   #8
ewmayer

Quote:
Originally Posted by Prime95
Plus, for unattended operations, if PRP does run into a repeatable error near the FFT limit it should switch to a higher FFT length to get past the troublesome spot. Right now, I think gpuowl tries 3 times and then aborts. One could lose valuable compute time before noticing that gpuowl has exited.
You're referring to a repeatable Gerbicz-check error, right? In which case repeatability is to be inferred via some short-length checksum on the G-check residue?

For such cases, would it be possible to enable carry-step-with-ROE-checking for the interval-retry, in order to confirm that the G-check error is accompanied by (= presumably due to) a danger-level fractional part?
2020-04-20, 05:23   #9
preda

Quote:
Originally Posted by preda
If we approximate the ratio SD/mean of the iteration-max-roundoff as roughly 0.059, then we get the formula

limit-mean(n) = 0.5 / (1 + 0.059 * n)
SD/mean is commonly called the "coefficient of variation", CV. Looking at more experimental data, it appears that the CV of the iteration-max-roundoff is not stable; it changes quite a lot -- I see values between 0.055 and 0.06 -- depending on exponent and FFT config.

This is not a problem, but it shows that it is not possible to use just the average of the iteration-max-roundoff -- the standard-deviation (or something equivalent) is also needed.