 mersenneforum.org > Math Twin primes expected first k formula
 Register FAQ Search Today's Posts Mark Forums Read  2008-03-08, 07:45 #1 robert44444uk   Jun 2003 Oxford, UK 3·661 Posts Twin primes expected first k formula I am having problems trying to find an empirical formula to calculate the expected k value of the first twin prime for a given n in k*2^n+/-1. Can anyone help?   2008-03-08, 20:33   #2
m_f_h

Feb 2007

1101100002 Posts Quote:
 Originally Posted by robert44444uk I am having problems trying to find an empirical formula to calculate the expected k value of the first twin prime for a given n in k*2^n+/-1. Can anyone help?
seems at least highly nontrivial to me...
but I don't know if the specification of the problem is sufficiently precise...
what does "expected" mean here ?
there must be some averaging done - on what ? should the estimation just take into account the size of n? or some properties like numdiv(), ... ?   2008-03-09, 01:37 #3 wblipp   "William" May 2003 New Haven 23·103 Posts Do you mean empirical or heuristic? For a quick and dirty heuristic we can ignore the additional information given by k, and observe the probability that any two odd numbers x and x+2 are a twin pair is 4C/ln(x)^2, so the expected value of k is approximately 0.182 n2 For a more refined estimate we can argue analogously to the twin prime constant. The twin prime constant can be developed by the argument that the probability that either number is prime is 1/ln(x). But the probabilities aren't independent. For p>2, the probability each is not divisible by p is (p-1)/p, so the independent approximation would have the probability that neither is divisible by p as ((p-1)/p)2. But the real probability that neither is divisible by p is (p-2)/p. So the independent probability needs to increased by the factor p*(p-2)/(p-1)2 to compensate for the error in divisibility by p. Doing the same thing for all values of p gives the twin prime constant. p=2 must be argued separatelyThe presence of "k" in the fomula changes this argument to: the probability that either number is prime is 1/ln(x). But the probabilities aren't independent. For p>3, the probability each is not divisible by p is (p-1)/p, so the independent approximation would have the probability that neither is divisible by p as ((p-1)/p)2. But the real probability that neither is divisible by p depends on k. With probability 1/p, p divides k and hence divides neither number. With probability (p-1)/p p does not divide k. In this case there are p-1 values near x with one divisible p, so the probability neither of our pair is divisible by p is (p-3)/(p-1). Hence the probability neither of these is divisible by p is 1/p * 0 + (p-1)/p * (p-3)/(p-1) = (p-3)/p So the independent probability needs to increased by the factor p*(p-3)/(p-1)2 to compensate for the error in divisibility by p. Doing the same thing for all values of p gives an expression like the product expression for the twin prime constant, but with p-2 changed to p-3. p=2 and p=3 must be argued separately.So you could evaluate this adjusted twin constant. Or, if you really meant empirical rather than heurisitc, you could fit the known values of k to k(n) = x * n2.   2008-03-09, 07:20 #4 robert44444uk   Jun 2003 Oxford, UK 198310 Posts Thank you for your responses. Apologies for using empirical loosely. Actually I mean empirical as well as heuristic! http://www.mersenneforum.org/showthr...211#post128211 posts 33 and 41 have the factual first k for n from 1 to 2000. Using 0.182 n^2 has the advantage that it is simple and depends only on n, but it appears to give results that are too low. For example, at n=2000, the formula gives k=728000, but a moving average over the previous 50 n gives a k value in excess of 1 million for all n>1740. An important observation is that it is a polynomial relationship to k and not logarithmic relationship. The dependence on k only comes from the fact that we are looking for the first k. This k has the additional property that no k smaller than it provides a twin. So I would think some formula might exist that would have to evaluate the product of probabilities of a twin that every k up the chosen value has, and then deducing a multiplier constant to the resulting probability product, to match the average observed values. How one does this I do not know. In terms of probability, one empirical hunch I had is that A=ln(ln(k/n)) --> 2. 87% of A values for n from 1501 to 2000 fall in the range 1.5 to 2.1 most exceptions between 2 and 2.1 and the trend of A values looks to be increasing quite slowly - at 2000 A is approx 1.73 on a 50 moving average and 1.77 on a logarithmic trend line. Regarding variables, I am looking for a formula that depends only on n. My reason for pursuing this is to prove (at least to myself) that randomly targeting 333333 for example, is not the most efficient way to find a large twin. To begin with, 333333 might be a rogue n with no small k values that are twins. The A>1.5 observation, suggests that it is not worth testing small k, which has been done by the Twin Prime Search. It might be more sensible to start testing for a twin at close to the expected k and then expanding out (up and down) from there. Certainly testing more than one n seems sensible, as it is possible to apply portfolio theory to overcome the reliance on one n.   2008-03-17, 13:36 #5 robert44444uk   Jun 2003 Oxford, UK 3·661 Posts And here is another thing - An analysis of the first k for n from 1 to 3000 shows a preponderance for 0mod5 and 0mod7 and a lesser preponderance towards 0mod11..... I can understand that k should be 0mod3, (except for n=2), otherwise, one of k*2^n+/-1 will be 0mod3 if the other is prime k=xmod5: x count 0 1016 1 489 2 498 3 487 4 510 I suppose this is because k=5#.Y/2 where Y is some integer, guarantees that, if k*2^n-1 is prime, then k*2^n+1 is not 0mod3 whereas any other random k might produce this condition on average 50% of the time. Results for xmod7: 0 591 1 387 2 400 3 428 4 417 5 403 6 374 and xmod11: 0 335 1 275 2 270 3 258 4 289 5 270 6 262 7 225 8 256 9 296 10 264 Last fiddled with by robert44444uk on 2008-03-17 at 13:45   2008-03-24, 12:03 #6 robert44444uk   Jun 2003 Oxford, UK 3·661 Posts An analysis of the first 4000 k values, with found first twin k=y and n=x provides "best fit" equations, according to Microsoft Excel: y = 0.3049x^1.9514 with R2 = 0.6926 The power series seems to consistently underestimate the moving average (over 100n), which appears to increasing along its own curve which mimics the polynomial: y = 0.2566x^2 + 383.78x - 174906 (with R2 = 0.2964) rather well. (I am mystified why the power equation produces such a high R2 compared to the polynomial) Plotting the moving average (over previous 100 n) provides a best fit: y = 0.1498x^2.1119 with R2 = 0.9947 And average over 100n looking forward 50n and back 50n: y = 0.5442x^1.951 R2 = 0.9956 I am taking the actual data to much higher levels to see whether I can get a better feel for it. When the values are viewed graphically they appear like atoms bubbling off a substrate, especially when a relatively narrow bank of values is taken, such as, say n=3000 to n=4000 Last fiddled with by robert44444uk on 2008-03-24 at 12:04   2008-03-25, 11:37   #7
R.D. Silverman

Nov 2003

22×5×373 Posts Quote:
 Originally Posted by robert44444uk An analysis of the first 4000 k values, with found first twin k=y and n=x provides "best fit" equations, according to Microsoft Excel: y = 0.3049x^1.9514 with R2 = 0.6926 The power series seems to consistently underestimate the moving average (over 100n), which appears to increasing along its own curve which mimics the polynomial: y = 0.2566x^2 + 383.78x - 174906 (with R2 = 0.2964) rather well. (I am mystified why the power equation produces such a high R2 compared to the polynomial) Plotting the moving average (over previous 100 n) provides a best fit: y = 0.1498x^2.1119 with R2 = 0.9947 And average over 100n looking forward 50n and back 50n: y = 0.5442x^1.951 R2 = 0.9956 I am taking the actual data to much higher levels to see whether I can get a better feel for it. When the values are viewed graphically they appear like atoms bubbling off a substrate, especially when a relatively narrow bank of values is taken, such as, say n=3000 to n=4000

You are trying to apply "curve fitting" without any use of the
underlying mathematics. Anyone can fit a curve to data. Whether that
curve is *meaningful* depends whether it models the underlying process(es).

In this case, your curves have no meaning.

I suggest that you look closely at Hardy & Wright's "Introduction to
Number Theory" where they make conjectures about the distribution
of twin primes.

Theoretically, the number of twin primes of the form k*2^n +/- 1
less than N will be O(N/ [log(k*2^n+1)]^2) as either k varies or
as n varies while the other is held fixed. The implied constant in the
O() estimate can be computed (in theory) as a singular infinite product.
Look again at Hardy & Wright's derivation of the twin prime constant.   2008-03-25, 12:44 #8 robert44444uk   Jun 2003 Oxford, UK 3×661 Posts Bob, thank you for the guidance and direction, I am certain it is spot on. Will look it up for sure from my e-copy of H&W, although I must admit, I am skating at the very edge of what I can actually understand with that book. Large parts of it are unintelligible to the layman! I am slowly getting comfortable with pi(2) given the excellent and confidence building paper by DA Goldston "Are there infinitely many twin primes?" I will for sure come back when I understand the O and N in your formula Last fiddled with by robert44444uk on 2008-03-25 at 12:51   2008-03-25, 13:54   #9
R.D. Silverman

Nov 2003

22·5·373 Posts Quote:
 Originally Posted by robert44444uk Bob, thank you for the guidance and direction, I am certain it is spot on. Will look it up for sure from my e-copy of H&W, although I must admit, I am skating at the very edge of what I can actually understand with that book. Large parts of it are unintelligible to the layman!
Yes. I quite agree. Ready H&W requires a lot of mathematical maturity
and will be hard to read for a "layman".

What I say now is no reflection on you personally:

I argue that a "layman" trying to do foundation-less curve
fitting to a mathematical process that the "layman" does not understand
is a case of "trying to run before you walk". It is NOT something that
a "layman" can reasonably do, any more than a "layman" can do (say)
brain surgery.   2008-03-26, 04:32 #10 robert44444uk   Jun 2003 Oxford, UK 3×661 Posts Tried reading H&W last night, but maybe I should try in the mornings, and not with a glass of wine in my hand. In answer to Bob, Maybe that is why I posted this here, knowing that it was beyond my abilities! If I knew how to arrive at the answer, why would I post? That is why I really appreciate the responses to the post. The reason for curve fitting is to ensure, at least in my mind, that a formula of the kind A*n^2, which has basis in the underlying mathematics, would be appropriate. The fact that I am getting best fit curve fitting of the form a*n^1.95 and b*n^2.11 suggest that in fact, A*n^2 is probably the underlying formula, even if it does not give best fit, it would be close. wblipp has placed A=0.182 with an underlying theoretical basis involving the Twin Prime Constant, but the results suggest A is in fact larger than that. Curve fitting over many thousands of values will at least give an approximation to A that can be retrofitted into a theory, I would have thought. I have the confidence I think, in the next set of curve fitting to search for a value of A. I will do this by fitting a curve A*n^2 to moving average values that look forward 50 values and back 50 values. At least I will establish something to mull over. Last fiddled with by robert44444uk on 2008-03-26 at 04:34   2008-03-26, 07:23 #11 robert44444uk   Jun 2003 Oxford, UK 3×661 Posts Line of best fit of the form k=A*n^2 for the first 4621 n values, comparing the moving average of 50 values n-25 to n+25, appears to be A=0.368, which is close to twice wblipps A=0.182 Now it will be interesting to back into the theory Applying to n=333333, and using A=0.364, then the average expected first prime, on average, would be at k=40.4G. The TPS site is sieving to 100G which is quoted at providing 90% certainty. Last fiddled with by robert44444uk on 2008-03-26 at 07:59   Thread Tools Show Printable Version Email this Page Similar Threads Thread Thread Starter Forum Replies Last Post carpetpool Miscellaneous Math 3 2017-08-10 13:47 Batalov Computer Science & Computational Number Theory 5 2016-08-11 01:17 ewmayer Probability & Probabilistic Number Theory 6 2015-11-10 16:33 mart_r Math 2 2010-10-29 17:31 gd_barnes Riesel Prime Search 15 2010-10-14 22:00

All times are UTC. The time now is 12:44.

Tue Dec 7 12:44:57 UTC 2021 up 137 days, 7:13, 0 users, load averages: 2.43, 1.95, 1.74