mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Math (https://www.mersenneforum.org/forumdisplay.php?f=8)
-   -   Statistics problem: max. runtime of n instances? (https://www.mersenneforum.org/showthread.php?t=3762)

Mystwalker 2005-02-28 02:21

Statistics problem: max. runtime of n instances?
 
Note: This question is unrelated to GIMPS and purely theoretical.

Assume a number "n" of instances of a certain process. The runtime is normal distributed with mean "m" and standard deviation "sd".

What is the expected max(runtime) for "n" instances?
Obviously, for n=1, max(runtime) = mean.

My intuition says: for n=1, max(runtime) = mean + sd

Maybe take this example:
mean: 10 sec
sd: 2 sec
n: 2,5 and 10

I guess the formula should be something like max(runtime) = mean + sqrt(n-1) * sd

Thanks in advance! :smile:

dave_dm 2005-02-28 10:37

I've had a crack at doing this symbolically (integrating x * pdf of max(X_1, X_2, ..., X_n)) but I don't seem to be getting anywhere.

Here's the numerical data for 1 <= n <= 20. Since

{ max of n normal(mu, sigma^2) } = mu + sigma * { max of n normal(0,1) },

we only need to work out the values for a standard normal(0,1) distribution.

[CODE]n : max
1 : 0
2 : 0.5641895835
3 : 0.8462843753
4 : 1.029375373
5 : 1.162964474
6 : 1.267206361
7 : 1.352178376
8 : 1.423600306
9 : 1.485013162
10: 1.538752731
11: 1.586436352
12: 1.62922764
13: 1.667990177
14: 1.703381554
15: 1.735913445
16: 1.765991393
17: 1.793941981
18: 1.820031879
19: 1.844481512
20: 1.86747506[/CODE]

It would be nicer to see a symbolic expression though!

Dave

leifbk 2005-02-28 11:00

A normal distribution has neither maximum nor minimum, it's open at both ends. But IIRC, about 96% of the distribution will occur in the interval m pluss/minus 2 sd units.

/Leif

dave_dm 2005-02-28 19:02

[QUOTE=leifbk]A normal distribution has neither maximum nor minimum, it's open at both ends. But IIRC, about 96% of the distribution will occur in the interval m pluss/minus 2 sd units.

/Leif[/QUOTE]

If X_1, ..., X_n are independent, normally-distributed random variables then Y = max(X_1, ..., X_n) is perfectly well-defined. Mystwalker's problem is to find the expectation of Y.

Your point about the normal distribution bell curve being heavily weighted in the middle does show that the guess "E(max(runtime) = mean + sqrt(n-1) * sd" cannot be correct - take for instance n=5.

Dave

Mystwalker 2005-03-01 04:59

Thanks for your help!
I've hoped that there is an easy formula. Oh well, maybe I figure it out when I put some more thoughts into it.
Priority for a solution has dropped, though, because I just submitted the paper in which a solution would have been nice. It was no big deal to leave it out. When I do a follow-up paper, it could be relevant again. Let's see...

maxal 2005-03-01 21:33

[QUOTE=dave_dm]If X_1, ..., X_n are independent, normally-distributed random variables then Y = max(X_1, ..., X_n) is perfectly well-defined. Mystwalker's problem is to find the expectation of Y.[/QUOTE]
This is a well-studied problem. You can find the answer and some references at [url]http://mathworld.wolfram.com/ExtremeValueDistribution.html[/url]

Mystwalker 2005-03-01 23:10

Thanks a lot, this seems to be exactly what I'm looking for! :smile:


All times are UTC. The time now is 15:06.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.