 mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Conjectures 'R Us (https://www.mersenneforum.org/forumdisplay.php?f=81)
-   -   Odds of prime discussion (https://www.mersenneforum.org/showthread.php?t=18765)

 TheCount 2013-10-04 09:01

[QUOTE=gd_barnes;354538]On our "difficulty" stats, our programming guru, Mark (rogue), uses a sieve to P=1M, which is very clearly accurate enough for determing such a stat.[/QUOTE]
What range does rogue sieve for determining the "difficulty" stats? For Nash weight you sieve the range 100,001 to 110,000.

To determine the weight for a random number I can just sieve "k*n-1" for different k values (first 10,000 say) and find the average. srsieve does not support integer sequences of this type though.

I note R916 fell pretty quickly, and with a double hit.

 rogue 2013-10-04 11:00

[QUOTE=TheCount;355202]What range does rogue sieve for determining the "difficulty" stats? For Nash weight you sieve the range 100,001 to 110,000.

To determine the weight for a random number I can just sieve "k*n-1" for different k values (first 10,000 say) and find the average. srsieve does not support integer sequences of this type though.

I note R916 fell pretty quickly, and with a double hit.[/QUOTE]

IIRC, it sieves to 1e6, but also eliminates numbers with algebraic factorizations. The difficulty is a rough guess to the number of tests needed at 1e12. The reason it sieves to 1e6 is because many of the k are larger than 1e5. It isn't perfect, but hopefully more accurate.

 gd_barnes 2013-10-04 12:19

[QUOTE=rogue;355210]IIRC, it sieves to 1e6, but also eliminates numbers with algebraic factorizations. The difficulty is a rough guess to the number of tests needed at 1e12. The reason it sieves to 1e6 is because many of the k are larger than 1e5. It isn't perfect, but hopefully more accurate.[/QUOTE]

That's sieving P-depth. I think he wants to know what n-range that it sieves. In other words, does it sieve n=100001-110000 like for Nash weight? I'm not sure about that. Do you remember?

On the P-depth, it does sieve to P=1e6 and then uses a multiplier to estimate a sieve to P=1e12...and to me it's more than a "rough guess" on the number of tests at P=1e12. I've found it to be very accurate.

 rogue 2013-10-04 14:25

[QUOTE=gd_barnes;355215]That's sieving P-depth. I think he wants to know what n-range that it sieves. In other words, does it sieve n=100001-110000 like for Nash weight? I'm not sure about that. Do you remember?

On the P-depth, it does sieve to P=1e6 and then uses a multiplier to estimate a sieve to P=1e12...and to me it's more than a "rough guess" on the number of tests at P=1e12. I've found it to be very accurate.[/QUOTE]

Yes, that is correct, it uses that range for n. It's good to know that you have found it to be very accurate. Of course you would have let me know if it were wildly off. :smile:

 TheCount 2013-10-04 18:22

OK, so the "n per 25000 at 1e12" value in the Unproven Conjectures table is derived by calling:
> srsieve -n 100001 -N 110000 -P 1e6 "k*b^n+/-1"
For each k, multiplying the results by 1.25, and adding them all together.

The point is that P=1e6 gives a more accurate weight than P=511, so I will use that from now on.

 gd_barnes 2013-10-04 20:04

[QUOTE=TheCount;355239]OK, so the "n per 25000 at 1e12" value in the Unproven Conjectures table is derived by calling:
> srsieve -n 100001 -N 110000 -P 1e6 "k*b^n+/-1"
For each k, multiplying the results by 1.25, and adding them all together.

The point is that P=1e6 gives a more accurate weight than P=511, so I will use that from now on.[/QUOTE]

To clarify, he runs srsieve only one time for each base, which sieves all k's at once usually in just a few seconds. There is no multiple runs for all k's followed by adding them all together. Actually, srsieve gets run once only for bases that have changed since the last time the page was updated. It's very efficient.

 TheCount 2013-10-06 02:18

Part of the reason I like prime finding is the fun of learning some new maths and number theory.
In the Proth/Riesel probability equation I am working on the weight w was the Nash weight divided by 1751.542 giving a coefficient of 1.00 for the average k of base 2.
The trouble with that is it assumes base 2 has the same density of primes as a randomly chosen set of odd numbers of the same magnitude.
So I want to find weight of the a randomly chosen set of odd numbers of the same magnitude as k*b^n-1.

I tried by finding the weight of the term k*n-1.
I could just average sieve runs for the first 100,000 k's but sieve programs don't deal with the term k*n-1.
Instead I used Prime Number Theory which says that the chance of a random integer x being prime is about 1/log x
For x = k*n-1, chance is about 1/log (k*n-1)

To find the probability for a fixed k over a given range we integrate from n = A to n = B.
Integrating 1/log (k*n-1) [URL="http://integrals.wolfram.com/index.jsp?expr=1%2F+Log[kx-1]&random=false"]gives[/URL] li(k*n-1)/k, where li() is the logarithmic integral.
So the number of primes, or fraction thereof, expected in a given range n = A to B for k*n-1 is:
(li(k*B-1)-li(k*A-1))/k

There is a logarithmic integral calculator here:
[URL]http://keisan.casio.com/exec/system/1180573428[/URL]

Now I use this for R620, where b=620 and k=20.
The Unproven Conjectures table has the P=1e6 weights in the n per 25000 at 1e12 column if you divide by 1.25.
For R620 the P=1e6 weight is 1007/1.25=805.6
The weight of the term 20*n-1 is (li(k*B-1)-li(k*A-1))/k = (li(20*110000-1)-li(20*100001-1))/20 = 686.885
So w = 805.6/686.885 = 1.1728

This compares to my old version where the Nash weight was 1855, so w = 1855/1751.542 = 1.0591

So the probability increases from 11.42% to 12.64% using this new w.
This is still too low compared to 19.5%. I'll have to keep thinking on it. :stirpot:

 henryzz 2013-10-06 17:09

1 Attachment(s)
TheCount's calculations fail for b != 2 as base 2 is assumed in the calculation of 1751.

I have spent today delving though the paper linked in [URL]http://www.mersenneforum.org/showpost.php?p=4563&postcount=34[/URL] and have implemented the method it uses in the spreadsheet. It produces the same results as the original method.
I also extended both methods to support ranges of n based on integration. At some point in the future I might add ranges of k as well.

In the attached spreadsheet the first tab is the original method. The second is the new method and also includes a proth weight calculation. It wouldn't surprise me if either the new or old methods have regions where they give better/worse estimates than the other. If you find they are different for certain input then please let me know. I have been unable to find an example.
I left it set up for R620 using Gary's data sieved to 5T.

 TheCount 2013-10-13 09:08

Thanks for that post henryzz. I will read the paper you linked to see what a real Mathematician thinks.

I noted with the integrated probability equation that if you make the reasonable assumptions: ln k << B ln b, and ln k << A ln b, (where w is the weight, b is the base, k is the k, testing range from n=A to n=B)
and use some logarithmic identities that the equation: #Primes = w * (ln(ln k + B*ln b) - ln(ln k + A*ln b))/ln b
becomes #Primes = w * log[sub]b[/sub]B/A
Quote a bit simpler!

Also I used the term k*b*n-1 for the weight of a random set of odd numbers of the same magnitude and I got 18.23% for R620. Much closer to 19.5%.
I'll compare all the 1 k's for the old and new methods in your spreadsheet and see what I can find.

 henryzz 2013-10-14 14:12

In this kind of spreadsheet formula simplification of that form isn't necessary. I would prefer the accuracy myself.

 TheCount 2013-10-17 12:44

1 Attachment(s)
Well, I completed checking the probability formulas for the 1k's. What I found is that:
- henryzz new method is the same as the old. Same inputs and results.
- these methods require the range to be pre-sieved. Only 21% of the 1k's are sieved.
- my new method don't require pre-sieving and is 98.9% correlated on the Riesel side and 97.0% correlated on the Sierpinski side.
- I don't know which method is more accurate.

The weight (w) I used in the spreadsheet column O, attached, is:
w = k*b*P[sub]1e6[/sub] / (li(110,000*k*b)-li(100,001*k*b))

Where li(x) ∼ x/ln(x) + x/ln(x)² + 2x/ln(x)³ + 6x/ln(x)⁴ + 24x/ln(x)⁵ + ...

[ATTACH]10364[/ATTACH]

[I]In the early 1970s, following a counting session, the Count would laugh maniacally, "AH AH AH AH AH!"[/I]

All times are UTC. The time now is 08:22.