mersenneforum.org Prime k-value density rating and factorizations

2009-07-30, 09:02   #23
gd_barnes

May 2007
Kansas; USA

10011101101110₂ Posts

Quote:
 Originally Posted by henryzz how do you plan to get that much data into a spreadsheet?
Ah well, gee. I don't know. lol

Think about it. It's only 501 k's and 4-5 columns of data and a graph or two. Do you only have 256 KB of RAM?

2009-07-30, 10:14   #24
kar_bon

Mar 2006
Germany

5317₈ Posts

Quote:
 Originally Posted by henryzz how do you plan to get that much data into a spreadsheet?
i'll see what i can do. getting the data with k-value, # of primes and search range is something i can do instantly.
getting the primes in a special range is a bit trickier and means manual work.

i could make the range up to k=3000, higher k's are not really representative.

2009-07-30, 10:44   #25
Flatlander
I quite division it

"Chris"
Feb 2005
England

31·67 Posts

Quote:
 Originally Posted by gd_barnes All 501 k's up to k=1001 should be sufficient to get close to statistical significance within weight ranges if there is any significance to be found.
I've been itching for something like that for ages.

Quote:
 Originally Posted by gd_barnes .....IF the k-value has more than ~30 primes so as to have statistical significance...Gary
Is the '~30' a guess? That's not like you Gary.
Guesses should have at least two decimal places to sound convincing.

2009-07-30, 12:06   #26
gd_barnes

May 2007
Kansas; USA

2×7²×103 Posts

Quote:
 Originally Posted by Flatlander Now you're talking! I've been itching for something like that for ages. Is the '~30' a guess? That's not like you Gary. Guesses should have at least two decimal places to sound convincing.

LMAO. More like a SWAG! I stated that because in general it is true but it is possible to get significance with far fewer primes. It's just that the actual percentage deviation (not standard deviation) from the mean must be far greater to get the same significance. In other words, you might need triple the # of expected primes if the expectation is only 5 primes but you might only need 50% higher if the expectation is near 30 primes.

There are actually two sets of deviations needed. A number of standard deviations that a specific k has from the normal number of primes for its particular weight range -and- the number of standard deviations that the AVERAGE of a particular weight range has from the norm. It will take some thought to get it right. Looking at the rating percentage within weight ranges will be far easier to get some significance on.

As a general rough rule, where t = expected number of trials needed for the event to occur one time and where o = expected # of occurrences within all trials, 1 st dev. will equal:

sqrt (o) * (t-1) / t

The formula is slightly over-simplified and where o is very small, the bell-shaped curve becomes skewed with its highest point actually higher than the mean to make up for the fact that you cannot have less than zero occurrences of something. But as a general rule, as o becomes reasonably sized, the formula works quite well. The key is to have a large enough sample size.
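For anyone who wants to play with the rule of thumb, here is a minimal Python sketch. The function names are made up for illustration. It also prints the textbook binomial st. dev., sqrt(n*p*(1-p)) with n*p = o and p = 1/t, purely as a point of comparison; that comparison is an addition and not part of the formula above.

```python
import math

def rough_st_dev(o, t):
    """Rule of thumb from the post: sqrt(o) * (t - 1) / t."""
    return math.sqrt(o) * (t - 1) / t

def binomial_st_dev(o, t):
    """Textbook binomial st. dev. with n*p = o and p = 1/t (comparison only)."""
    return math.sqrt(o * (t - 1) / t)

# (o, t) pairs are illustrative: expected occurrences and trials per occurrence.
for o, t in [(5, 500), (30, 250)]:
    print(f"o={o}, t={t}: rough={rough_st_dev(o, t):.3f}, "
          f"binomial={binomial_st_dev(o, t):.3f}")
```

When t is large, (t-1)/t is close to 1 and the two versions differ only in the third decimal place, so the simplification costs very little.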

So if the rating percentage comes out to .002 and the expected # of primes for the k-value based on its weight range is 5, then you need 1/.002 = 500 "trials" to expect 1 prime. Therefore 1 st. dev. would be:

sqrt (5) * 499 / 500 = ~2.23.

Therefore to be at 3 st. devs., which would be almost significant if you were looking at 100 k's, you'd have to have at least 5 + 2.23*3 = 11.69 or 12 primes for the k in question.

For a k with an expectation of 30 primes, which would likely have a rating percentage of .004, 1 st. dev. would be:

sqrt (30) * 249 / 250 = ~5.455.

So you would need 30 + 5.455*3 = 46.37 or 47 primes to be at 3 st. devs.

As you can see, in the first example, it takes 2.4 times the mean to hit 3 st. devs. whereas in the second, it only takes 1.57 times the mean. That means the first example will have a far higher rating percentage than will the second example yet the # of st. devs. is the same.
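The two worked examples can be reproduced with a short Python sketch. It assumes the rough st. dev. rule sqrt(o) * (t-1) / t with t = 1 / rating percentage, as used above; the function name and parameters are hypothetical.

```python
import math

def primes_needed(expected, rating_pct, n_st_devs=3):
    """Smallest whole # of primes at or above `n_st_devs` st. devs. over the mean.

    expected   -- expected # of primes for the k (o in the rough rule)
    rating_pct -- rating percentage, so t = 1 / rating_pct trials per prime
    """
    t = 1 / rating_pct
    st_dev = math.sqrt(expected) * (t - 1) / t
    return math.ceil(expected + n_st_devs * st_dev)

for expected, rating_pct in [(5, 0.002), (30, 0.004)]:
    needed = primes_needed(expected, rating_pct)
    print(f"expected={expected}: need {needed} primes, "
          f"{needed / expected:.2f} times the mean")
```

Changing n_st_devs or the expectation shows how quickly the required multiple of the mean shrinks as the expected # of primes grows.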

The point of all of this gibberish is that a k with a specific high rating percentage but few primes will not have nearly the significance of a k with the same high rating percentage but with many primes. The latter will be a much higher st. dev. from the norm. The more the # of primes, the more difficult it is to acquire EITHER a HIGH or LOW rating percentage.

The best and most extreme example of all of this is k=2293, which has no primes. It would have a rating percentage of zero and a prime density rating (my original formula) of infinity, which would be worse than anything and apparently very significant until viewed in terms of st. devs. from the norm. For such a low-weight k, having zero primes is likely not statistically significant. That said, with its search range past n=4M now, its "true expectation" may now be nearing 5 primes. Assuming so, the first example above may be close to representative, which would put it slightly beyond negative 2 st. devs.

Now...if we test to n=20M without a prime, it's likely to be near 4 st. devs. and it would make one wonder what is causing it. But then you have to take into account degrees of freedom. In doing so, out of 2293 / 2 = ~1150 k's, we're bound to have a few over 3 st. devs., so 4 st. devs. may still only give an 85-90% confidence level that the k is abnormally barren as a result of some unforeseen situation, which I speculate to be an abnormally high # of large factors due to the high # of small factors.
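The degrees-of-freedom point can be roughed out in Python. This sketch assumes each k's deviation behaves like an independent standard normal variable, which is a simplification: it ignores the skew for small expectations mentioned earlier, so the output is an order-of-magnitude guide only. The function names are hypothetical.

```python
import math

def two_sided_tail(z):
    """P(|Z| > z) for a standard normal variable Z."""
    return math.erfc(z / math.sqrt(2))

def expected_outliers(z, n_ks):
    """Expected # of k's landing beyond z st. devs. by chance alone."""
    return n_ks * two_sided_tail(z)

n_ks = 1150  # roughly 2293 / 2 k's, as in the post
for z in (3, 4):
    print(f"beyond {z} st. devs.: about {expected_outliers(z, n_ks):.2f} "
          f"of {n_ks} k's expected by chance")
```

Under these assumptions a handful of k's sit beyond 3 st. devs. purely by chance, while one beyond 4 st. devs. is rare but not impossible.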

Gary
