Quote:
Originally Posted by SethTro
I've have recently doubled my rate of finding records with two tricks.
I have a new algorithm that speeds up sieving by ~10,000x. This means deeper sieving meaning fewer PRPs. I have two additional tricks that also give a huge speed advantage.
For each 'm' I calculate prob(record gap) given the results of deep sieving, this forms a distribution (see attachment bottom row). I only test records from the top ~20% (why waste time on 'm' with low chance of finding a record).
Once I have an m with high prob(record gap) I find prev_prime and recalculate prob(record gap  prev prime), 90% of the time prob(record gap  prev prime) < prob(record gap) / 2 so I skip finding next_prime as the chance of record is lower than going to the next m. I get (.1 * prob(record) + .9 * .75 * prob(record)) / (.1 * 2 + .9 * 1) => 70% of the prob(record) in covered in 50% of the time, in practice it's even better because the faster prev_prime are more likely to skip so the denominator is more like (0.1 * (1.5 + 1) + .9 * 0.5).
These tricks mean that I'm finding ~54 records / day vs ~25 records / day (testing all m, both sides).

To explain this post slightly more. This is the debug info from a run over
N = [1,100,000] * 3001#/2190 note that
M = 100,000 but only ~27,000 values of m are coprime with d=2190.
The 9 graphics are
Top Row:
1. probability of gap = X
2. observed gaps from 15,000 tests and average gap (16207)
3. cdf of gap <= X
Middle Row:
1. Before the run I calculate the probability of gap > 12 merit. orange line is the sum of those probabilities if I perform the tests in order, green if I perform the tests best to work, blue is the observed count of gaps with merit > 12 (notice how nicely it matches!)
2. Histogram of count of m with prob(gap > 12 merit)
3. CDF of over those probabilities (e.g. 100%95% = 5% of m have more than a 0.0501 chance of having gap with merit greater than 12, 10% have greater than a 0.0474 chance.
Bottom Row (same as middle but with Prob(gap is record) instead of Prob(gap > 12 merit).
1. Note that 5 records were found (you can see them here:
https://github.com/primegaplistpro...ommit/88f092c1), this exceeds the expectation of finding ~2.2 records. I assume that's just some extra luck (my current run has found 34 and expects 40 from 65000 intervals).
2. Ditto middle row
3. My point in #100 was why test the 50% of records with probability of record < 0.0000951 when we could be testing the 5% of numbers with probability > 0.000148.