View Single Post
2019-07-08, 01:50   #5
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

194C16 Posts
Probability estimates for trial factors in same bit level or class

The odds of finding a factor in a bit level of trial factoring are ln(bitlevel/(bitlevel-1)), or approx. 1/(bitlevel - 1/2) (better than 18 ppm at bitlevel 70, 15 ppm at bitlevel 76, 12 ppm at 86). These estimates are probabilities for completing the entire bit level, not quitting after the first class containing a factor.

What are the odds of finding more than one trial factor in a bit level?
per https://mersenneforum.org/showpost.p...postcount=1147
no factors: (bitlevel-1)/bitlevel
one factor: (bitlevel-1)/bitlevel * (ln(bitlevel/(bitlevel-1)))
Two factors: (bitlevel-1)/bitlevel * (ln(bitlevel/(bitlevel-1)))2 / 2
N factors: (bitlevel-1)/bitlevel * (ln(bitlevel/(bitlevel-1)))n / n!
The probability of finding n factors declines quickly with increasing n.

In the same class of the same bit level?

Two factors/bit-level case
This case appears likely to occur many thousands of times in p<109.
If two factors or more are found in the same bit level, it doesn't matter which class the first one occupies.
Assuming the class location of the next factor found is independent, there's a one in c chance it coincides with the class of the first factor, where c is the ACTIVE number of classes. For MORE_CLASSES, the 2x2x3x5x7x11 = 4620 classes are winnowed down to 960 by whole-class presieving for very small primes, so c=960; for LESS_CLASSES, the 420 classes are winnowed down to 96, so c=96.
LESS_CLASSES: 96/420 = 1/2 x 2/3 x 4/5 x 6/7 = 48 / 210 = ~.2286
MORE_CLASSES: 960/4620= 1/2 x 2/3 x 4/5 x 6/7 x 10/11 = 480 / 2310 =~ .2078 (9.1% smaller)
hypothetical EXTRA_CLASSES: 11520 / 60060 = 1/2 x 2/3 x 4/5 x 6/7 x 10/11 x 12/13 = 5760 / 30030 = ~.1918 (7.7% smaller)

The chance of two factors both in the same class is then (bitlevel-1)/bitlevel * (ln(bitlevel/(bitlevel-1)))2 / 2 / c.
For 75 bits, 74/75 * (ln(75/74))2/2/c = 8.89E-5/c; c=96 for less-classes so 9.26E-7, or c=960 for more-classes so 9.26E-8 (using Fortran style exponential notation).

If two factors are found in the same bit level, the chance of them landing in the same class is 1/c, and the chance of different classes 1-1/c. (1/c +(1-1/c) = 1, sum of probabilities = 1, check.)

Three factors/bit-level case
This case appears likely to occur hundreds of times in p<109.
If three factors are found in the same bit level, the chance of none of them coinciding in the same class is p0= (c-1)/c * (c-2)/c = (c2-3c+2)/c2 = 1- (3c-2)/c2
The chance of any two of them landing in the same class is the second one matches the first but the third does not, or the first two do not match but the third matches one of them,
p2 = 1/c (c-1)/c +(c-1)/c 2/c = (c-1)/c2 + (2c-2)/c2 = 3(c-1)/c2
while the chance of all 3 landing in the same class is 1/c2, for a combined chance of p3 = (3c-2)/c2.
(p0 +p2 +p3 = 1- (3c-2)/c2 + 3(c-1)/c2 + 1/c2 = 1, check.)

For less-classes the chance of 2 or more landing in the same class is ( 3 x 96 - 2 ) / 962 =~ .031033
while for more-classes it's (3 x 960 - 2 ) / 9602 =~ .0031228 chance of coincident classes.
The probability of 3 factors is (bitlevel-1)/bitlevel * (ln(bitlevel/(bitlevel-1)))3 / 3!
For bitlevel 75, p=74/75 * (ln(75/74))^3 / 3! = 3.978E-7, so per factoring attempt,
less-classes yields a chance of coincident factors by 3 found of 1.233E-8 while more-classes yields 1.241E-9.

The combinations of two or three factors per bit level coinciding in a class are then, for 75 bits,
Less-classes: 9.26E-7 + 1.233E-8 = 9.38E-7
More-classes: 9.26E-8 + 1.241E-9 = 9.38E-8
The odds of four or more are not zero, but they fall to the right of the rounding of the above figures.

Note, for three or more factors per bit level, these are not binomial distributions (since probabilities of matching are not constant with repeated trials)

Four factors/bit-level case
This case appears likely to occur few times in p < 109.
If four factors are found in the same bit level, the chance of each falling in a separate class is the chance of 3 successive misses. The chance of any two of them landing in the same class is, the second matches the first but the third and fourth do not, otherwise the third matches the first or second but the others do not, or otherwise the fourth matches one of the preceding 3 but the first three did not match each other.
The chance of 3 coinciding is 4-fold; 123, 124, 134, or 234; 1/c2 * (c-1)/c + (1/c)(c-1)/c * 1/c + (c-1)/c(1/c)2 + (c-1)/c(1/c)2= 4(c-1)/c3
The chance of all four coinciding is 1/c3.
no coincidences: p0= (c-1)/c (c-2)/c (c-3)/c = (c-1)(c-2)(c-3)/c3 = 1 - (6 c2 -11c +6)/c3)
coincidence of just 2: p2= 1/c (c-1)(c-2)/c2 + (c-1)(c-2)/c2 * 2 / c + 3/c *(c-1)(c-2)/c2 = 6(c-1)(c-2)/c3
coincidence of 3 in a class: p3= 4(c-1)/c3
coincidence of 4 in the same class: p4= 1/c3
The probability sums for the four-factors cases listed above don't add up to probability one, so something's not quite right yet. P0+p2+p3+p4 = 1 -3(c-1)/c3 instead of one.
There's also a slight possibility of a double coincidence of two, which could occur one of three ways.
A) First and second match, third misses first/second but third and fourth match. p22a=1/c * (c-1)/c * 1/c for this subcase.
B) First and third match, second and fourth match, first and second do not match. p22b=p22a.
C) First and fourth match, second and third match, first and second do not match. p22c=p22a.
Total, p22=3(c-1)/c3

In total, chances of 2 or more coinciding: 6(c-1)(c-2)/c3 + 4(c-1)/c3 + 1/c3 +3(c-1)/c3 = ( 6(c-1)(c-2) + 4(c-1) + 1 +3(c-1)) / c3
= ( 6c2 -18c +12 +7c -7 +1 ) / c3 = (6 c2 -11c +6) / c3

none: c=96, p0=0.938687; c=960, p= 0.9937619
2 coincide: c=96, p2=0.060560; c=960, p2= 0.0062305
3: c=96, p3=4.295e-4; c=960, p3= 4.34E-6
4: c=96, p4=1.13e-6; c=960, p4= 1.13E-9
dual-2: c=96, p22=0.000322; c=960, p22=3.25e-6

2 to 4: c=96, pn=0.061313; c=960, pn= 0.0062381
c=96, p0+pn= 1, check;
c=960, p0+pn= 1, check.

More than four factors/bit-level cases
These cases are very unlikely to occur in p<109 and so are not calculated here.
They also rapidly become more complex, as illustrated by the numbers of coincidence subcases possible versus number of factors found rises.
5: 2; 3; 4; 5; 22; 23
6: 2; 3; 4; 5; 6; 22; 23; 24; 33; 222
7: 2; 3; 4; 5; 6; 7; 22; 23; 24; 25; 33; 34; 222; 223
8: 2; 3; 4; 5; 6; 7; 8; 22; 23; 24; 25; 26; 33; 34; 35; 44; 222; 223; 224; 233; 2222
9: 2; 3; 4; 5; 6; 7; 8; 9; 22; 23; 24; 25; 26; 27; 33; 34; 35; 36; 44; 45; 222; 223; 224; 225; 233; 234; 333; 2222; 2223
10: 2; 3; 4; 5; 6; 7; 8; 9; 10; 22; 23; 24; 25; 26; 27; 28; 33; 34; 35; 36; 37; 44; 45; 46; 55; 222; 223; 224; 225; 226; 233; 234; 235; 244; 333; 334; 2222; 2223; 2224; 2233; 22222

Summary
Odds are estimated for a variety of bit levels and number of factors per level or class in the attachment. Odds are sufficient that in p<109, up to 2 factors per class seems likely to occur sometime (dozens to thousands of occurrences depending on factoring depth and frequency of use of less-classes), but 5 or more factors per class occurring sometime is unlikely; 3 or more is unlikely from 65 bits upward.

Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1
Attached Files
 tf odds.pdf (14.1 KB, 290 views) tf odds3.pdf (16.9 KB, 243 views)

Last fiddled with by kriesel on 2021-12-13 at 13:10