mersenneforum.org  

Go Back   mersenneforum.org > Prime Search Projects > Prime Gap Searches

Reply
 
Thread Tools
Old 2020-11-01, 11:36   #111
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Cambridge (GMT/BST)

2×2,909 Posts
Default

It strikes me that we are sieving the wrong dimension. In A*p#/d+x we currently sieve over x. Why don't we sieve over A? For each A we will count how many sieves over x it has survived. The As with the lowest score can then be looked at in the traditional way. Generally, we have a much larger range of A than the maximum gap size so the number of sieves should be reduced. This should mean deeper sieving.
henryzz is offline   Reply With Quote
Old 2020-11-03, 15:46   #112
robert44444uk
 
robert44444uk's Avatar
 
Jun 2003
Oxford, UK

22×13×37 Posts
Default

Quote:
Originally Posted by henryzz View Post
It strikes me that we are sieving the wrong dimension. In A*p#/d+x we currently sieve over x. Why don't we sieve over A? For each A we will count how many sieves over x it has survived. The As with the lowest score can then be looked at in the traditional way. Generally, we have a much larger range of A than the maximum gap size so the number of sieves should be reduced. This should mean deeper sieving.
Hi henryzz - generally speaking we do sieve over A in our deficient primorial searches - A is favoured when 5mod30 or 25mod30, but others also get a look in - so I tend to look at 1,5,7,11,13,17,19,23,25,and 29mod30. In terms of mod210, the best is achieved at 35mod210 and its partner 175mod210. This is quite useful when looking at very small gaps.
robert44444uk is online now   Reply With Quote
Old 2020-11-04, 11:45   #113
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Cambridge (GMT/BST)

10110101110102 Posts
Default

Quote:
Originally Posted by robert44444uk View Post
Hi henryzz - generally speaking we do sieve over A in our deficient primorial searches - A is favoured when 5mod30 or 25mod30, but others also get a look in - so I tend to look at 1,5,7,11,13,17,19,23,25,and 29mod30. In terms of mod210, the best is achieved at 35mod210 and its partner 175mod210. This is quite useful when looking at very small gaps.
I was more thinking about fairly deep sieving rather than what is effectively a wheel sieve. I have coded a crude sieve that can do things like sieve A=1 to 1M for x=-50k to 50k up to p=100k in about 2-3 hours. Using this to find gaps for 6199#/11# currently and am finding record gaps several times faster than before. It seems to reckon that I should be using d=7# instead of 11# as a lot of the As with the least remaining seem to have 11 as a factor. I included a table below of the probability(only based on the size of the primorial and sieve depth ignoring A) of the whole 100k range not being prime at every 5 percentiles. The probability seems to drop really fast after testing the best 25% of the dataset. Probably the best use of this is to search a range far bigger than you hope to test to find the best candidates. I am currently searching in descending order.

Code:
          0%           5%          10%          15%          20%          25%          30% 
1.599538e-10 4.355472e-08 3.659155e-07 4.189505e-07 1.694673e-06 2.834374e-06 1.046412e-05 
         35%          40%          45%          50%          55%          60%          65% 
1.260456e-05 1.339614e-05 1.423743e-05 1.597335e-05 6.638588e-05 1.672100e-04 1.875974e-04 
         70%          75%          80%          85%          90%          95%         100% 
6.385621e-04 9.265170e-04 9.913896e-04 1.035973e-03 1.078905e-03 1.138927e-03 1.744451e-03
There a few things that could improve this. Sieving deeper gives a better ordering of the candidates. Searching a larger range of candidates generally seems to find much better candidates. I suspect that given a large enough range of A you probably don't want to sieve more than 1 or 2 percent of the whole dataset as that will be the best bit.

Here are the results of a regression fit predicting the count remaining as the gcd of A and the divisor varies. 11 is the best by quite a bit and then 1, 7 and 77 are fairly close together. The fit is amazingly accurate.
Code:
Call:
lm(formula = count ~ . - 1, data = sparse[sparse$k > 0, ])

Residuals:
     Min       1Q   Median       3Q      Max 
-137.862  -17.733    0.148   17.603  143.204 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
gcddiv1    2.029e+03  5.882e-02   34491   <2e-16 ***
gcddiv2    3.314e+03  5.882e-02   56349   <2e-16 ***
gcddiv3    2.561e+03  8.318e-02   30785   <2e-16 ***
gcddiv5    2.166e+03  1.176e-01   18411   <2e-16 ***
gcddiv6    4.361e+03  8.318e-02   52425   <2e-16 ***
gcddiv7    2.032e+03  1.441e-01   14105   <2e-16 ***
gcddiv10   3.910e+03  1.176e-01   33236   <2e-16 ***
gcddiv11   1.963e+03  1.860e-01   10554   <2e-16 ***
gcddiv14   3.765e+03  1.441e-01   26135   <2e-16 ***
gcddiv15   3.153e+03  1.664e-01   18952   <2e-16 ***
gcddiv21   2.871e+03  2.038e-01   14089   <2e-16 ***
gcddiv22   3.571e+03  1.860e-01   19199   <2e-16 ***
gcddiv30   5.100e+03  1.664e-01   30658   <2e-16 ***
gcddiv33   2.693e+03  2.630e-01   10239   <2e-16 ***
gcddiv35   2.344e+03  2.882e-01    8135   <2e-16 ***
gcddiv42   5.042e+03  2.038e-01   24743   <2e-16 ***
gcddiv55   2.216e+03  3.720e-01    5957   <2e-16 ***
gcddiv66   4.798e+03  2.631e-01   18240   <2e-16 ***
gcddiv70   4.530e+03  2.882e-01   15721   <2e-16 ***
gcddiv77   2.048e+03  4.556e-01    4495   <2e-16 ***
gcddiv105  3.678e+03  4.075e-01    9025   <2e-16 ***
gcddiv110  4.285e+03  3.720e-01   11518   <2e-16 ***
gcddiv154  4.117e+03  4.556e-01    9035   <2e-16 ***
gcddiv165  3.420e+03  5.261e-01    6500   <2e-16 ***
gcddiv210  5.921e+03  4.075e-01   14529   <2e-16 ***
gcddiv231  3.096e+03  6.443e-01    4806   <2e-16 ***
gcddiv330  5.648e+03  5.260e-01   10738   <2e-16 ***
gcddiv385  2.485e+03  9.111e-01    2728   <2e-16 ***
gcddiv462  5.580e+03  6.443e-01    8662   <2e-16 ***
gcddiv770  5.004e+03  9.111e-01    5493   <2e-16 ***
gcddiv1155 4.044e+03  1.289e+00    3138   <2e-16 ***
gcddiv2310 6.571e+03  1.290e+00    5094   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 26.81 on 999967 degrees of freedom
Multiple R-squared:  0.9999,	Adjusted R-squared:  0.9999 
F-statistic: 4.568e+08 on 32 and 999967 DF,  p-value: < 2.2e-16
henryzz is offline   Reply With Quote
Old 2020-11-04, 12:03   #114
SethTro
 
SethTro's Avatar
 
"Seth"
Apr 2019

FB16 Posts
Default

Quote:
Originally Posted by henryzz View Post
I was more thinking about fairly deep sieving rather than what is effectively a wheel sieve. I have coded a crude sieve that can do things like sieve A=1 to 1M for x=-50k to 50k up to p=100k in about 2-3 hours.
More on this today. I'm going to releasing my code which can do this interval is a handful of seconds.

Last fiddled with by SethTro on 2020-11-04 at 12:04
SethTro is offline   Reply With Quote
Old 2020-12-07, 11:29   #115
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Cambridge (GMT/BST)

2·2,909 Posts
Default

Does a divisor not make sense when using a CRT offset?

I have minimized the remaining candidates ignoring the primes in the divisor, calculated the CRT offset and then sieved multiple ks over the same range of x for k*p#/d+CRT+x and got worse results than for just k*p#/d+x. Am I doing something wrong or does it just not make sense?

Is there a way of combining the two methods usefully?
henryzz is offline   Reply With Quote
Old 2020-12-07, 14:39   #116
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Cambridge (GMT/BST)

2×2,909 Posts
Default

Quote:
Originally Posted by henryzz View Post
Does a divisor not make sense when using a CRT offset?

I have minimized the remaining candidates ignoring the primes in the divisor, calculated the CRT offset and then sieved multiple ks over the same range of x for k*p#/d+CRT+x and got worse results than for just k*p#/d+x. Am I doing something wrong or does it just not make sense?

Is there a way of combining the two methods usefully?
The more I think about it the more a divisor doesn't make sense with a CRT offset as you are not trying for the trade off of more really close factors and less further away in the same way. Would love to be wrong though.
henryzz is offline   Reply With Quote
Old 2020-12-08, 02:35   #117
SethTro
 
SethTro's Avatar
 
"Seth"
Apr 2019

251 Posts
Default

I published Arxiv: "Combined Sieve Algorithm for Prime Gaps" today.

There's a write up on running the code in https://www.mersenneforum.org/showthread.php?t=26286
SethTro is offline   Reply With Quote
Old 2020-12-08, 17:39   #118
mart_r
 
mart_r's Avatar
 
Dec 2008
you know...around...

2×313 Posts
Default

Quote:
Originally Posted by SethTro View Post
I published Arxiv: "Combined Sieve Algorithm for Prime Gaps" today.

There's a write up on running the code in https://www.mersenneforum.org/showthread.php?t=26286

That's amazing! Excellent work!
(Now I only need to wrap my head around running the program... things like building GMP and compiling libraries is something I haven't actually done much yet )
mart_r is offline   Reply With Quote
Old 2020-12-10, 11:59   #119
robert44444uk
 
robert44444uk's Avatar
 
Jun 2003
Oxford, UK

22×13×37 Posts
Default

Quote:
Originally Posted by henryzz View Post
The more I think about it the more a divisor doesn't make sense with a CRT offset as you are not trying for the trade off of more really close factors and less further away in the same way. Would love to be wrong though.
Hi henryzz

I think you are right, over the long run, deficient primorial offsets are not as efficient as defiicent primorials around the centre point, but they still provide a pretty sparse set of non-smooth integers in a range - enough to be interesting.

Over a wider range of say 40 merit, I don't think any solution provides a good result although in theory, the offset will produce a better result than the centre based solution, but the number of non-smooth candidates to check is still such that a result is unlikely at that level. In fact to date, only one 40 merit has been found out of the infinite number of gaps of this size (!!)

But certainly, the deficient primorial offsets I am generating with ATH software produces at lot of 20 merit and I have 4 records in the last week at the 3k gap area
robert44444uk is online now   Reply With Quote
Old 2020-12-10, 13:04   #120
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Cambridge (GMT/BST)

2·2,909 Posts
Default

I can generate offsets a fair bit better than ATH's code. I have been meaning to backport my code into his.
My code follows the approach of varying the offset for each prime individually. Even one pass is pretty good and better than just trying random guesses and fiddling with the last few primes like ATH's code does.

I have implemented changing 2, 3 or 4 offsets at once. This can provide small improvements but is a lot slower(especially as the number of primes increases).
henryzz is offline   Reply With Quote
Old 2020-12-21, 11:06   #121
SethTro
 
SethTro's Avatar
 
"Seth"
Apr 2019

FB16 Posts
Default

I wanted to write down a little bit of analysis I've been doing recently on the computation of Probability of Record, P(record) given a partial sieve.

This uses a number of conventions from the combined sieve.
K = P#/d
N = m * K for many m
SL = sieve length (how far after and before N to sieve)

At a high level we can estimate the P(Record) by taking the double sum,
Code:
for X in {x | 0 < x <= SL, (x, K) = 1}
    for Y in {y | -SL <= y < 0, (y, K) = 1}
        if record[X+Y] > N:
            P_record += ProbFirstPrime(X) * ProbFirstPrime(Y)


 ProbPrimeCoprime = 1 / log(N) * Product(1 - 1/p, p | K)
 ProbFirstPrime(X) = (1 - ProbPrime)^(smaller coprimes) * ProbPrime
I call this the "direct" computation and we can use a number of couple of tricks to make it faster (store log_record instead of record, keep min_gap_for_record and skip Y with X+Y < min_gap_for_record, ...).

This doesn't make any use of the partial sieve but it's easy to replace {X | 0 < x < SL, (x, K) = 1} with the result of the partial sieve and ProbPrimeCoPrime with ProbPrimeAfterSieve = 1 / log(N) * Product(1 - 1/p, p <= P_limit).

The next part of the analysis is to mix and match using the partial sieve results up to SL and coprime results up to some larger limit (~30*log(N)).

X = partial_sieve + larger coprimes

This gives us four regions of X cross Y; partial_sieve * partial_sieve, partial * coprime, coprime * partial, coprime * coprime. I call these results "direct", "extended" (partial * coprime, coprime * partial) and "extended^2" (coprime * coprime).

Note that the results from "extended^2" is the same for all m so it doesn't help with differentiating P(record) per m which is useful for various reasons.

From previous work it seems optimal to use SL \(= \approx 10\log(N)\) but I was worried if P(record) could be accurately computed with SL that "small". It turns out a reasonable percentage of P(record) is calculated at this point but maybe SL \(= \approx 12\log(N)\) is more "optimal".

NOTE: If you inspect the graphs you see that the P(record) is overestimated by extended & extended^2 after a bit of work I've explained this to myself as primes just larger than P# expect to divide 1/prime numbers in X but that slightly over counts because of the weird way that P# aligns small primes. I don't have the right code to validate this statement but it's my operating assumption.
Attached Thumbnails
Click image for larger version

Name:	p_record_1511.png
Views:	19
Size:	30.1 KB
ID:	24001   Click image for larger version

Name:	p_record_1667.png
Views:	16
Size:	30.8 KB
ID:	24002  
SethTro is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Basic Number Theory 4: a first look at prime numbers Nick Number Theory Discussion Group 6 2016-10-14 19:38
Before you post your new theory about prime, remember firejuggler Math 0 2016-07-11 23:09
Mersene Prime and Number Theory Ricie Miscellaneous Math 24 2009-08-14 15:31
online tutoring in prime number theory jasong Math 3 2005-05-15 04:01
Prime Theory clowns789 Miscellaneous Math 5 2004-01-08 17:09

All times are UTC. The time now is 17:10.

Mon Mar 1 17:10:31 UTC 2021 up 88 days, 13:21, 0 users, load averages: 3.07, 2.90, 2.59

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.