I hope to have something ready tomorrow.
I had considered merging the two sets of scripts (CRUS_tab.htm and CRUS_stats.htm), but quickly discovered that would be more difficult than anticipated, considering I wanted to reuse as much code as I could. It's easier to not reuse the code, although I still think that the scripts could be merged and thus have one cron job instead of two.
I've posted the following:
1) [URL="http://home.roadrunner.com/~mrodenkirch/crus-stats.htm"]crus-stats.htm[/URL], which has the statistics for the project from the original page.
2) [URL="http://home.roadrunner.com/~mrodenkirch/crus-top20.htm"]crus-top20.htm[/URL], which has a collection of Top 20 lists of unproven conjectures.
3) [URL="http://home.roadrunner.com/~mrodenkirch/crus-proven.htm"]crus-proven.htm[/URL], which has a complete list of all proven conjectures.
4) [URL="http://home.roadrunner.com/~mrodenkirch/crus-unproven.htm"]crus-unproven.htm[/URL], which has a complete list of all unproven conjectures with relative difficulty.
5) [URL="http://home.roadrunner.com/~mrodenkirch/vstats.zip"]vstats.zip[/URL], the gawk scripts used to build the HTML.

The tables on the top20, proven, and unproven pages have sortable columns, except for those columns which make no sense to sort. I don't intend to make any significant changes from this point. Hopefully any modifications that are necessary will be minor in nature.
What is the difficulty trying to measure?
I measured the testing time of doubling the search depth for S368 and S392. These bases have similar difficulty.

S392 would take ~121 CPU hours to go from 25K to 50K.
S368 would take ~248 CPU hours to go from 50K to 100K.

If your current formula:
int(tempDifficulty * (searchLimit / 100000) * (searchLimit / 100000))
were changed to:
int(tempDifficulty * (searchLimit / 100000))
then the proportions between the difficulties for the two listed bases would be about right for doubling the search depth. This might just be a coincidence. What do you intend it to mean?

It might be nice to have the weight of each base somewhere on the pages as well as the difficulty.
[QUOTE=henryzz;289403]What is the difficulty trying to measure?
I measured the testing time of doubling the search depth for S368 and S392. These bases have similar difficulty.

S392 would take ~121 CPU hours to go from 25K to 50K.
S368 would take ~248 CPU hours to go from 50K to 100K.

If your current formula:
int(tempDifficulty * (searchLimit / 100000) * (searchLimit / 100000))
were changed to:
int(tempDifficulty * (searchLimit / 100000))
then the proportions between the difficulties for the two listed bases would be about right for doubling the search depth. This might just be a coincidence. What do you intend it to mean? It might be nice to have the weight of each base somewhere on the pages as well as the difficulty.[/QUOTE]
You're looking at this incorrectly. Compare n=25K-50K for S392 to n=50K-75K for S368. You'll find that they should be reasonably close in testing times. (Actually S368 should take a bit longer if the avg. weight of the k's were the same. This has to do with divergence of the difficulty ratings as similar bases advance by a fixed n. See more below.)

The difficulty reflects how hard it is to advance all k's on a base by a fixed # of n, not by a multiplier of n. Actually, it's most accurate to say how hard it is to advance all k's on a base by an infinitely small amount, because the difficulty changes at different rates as two similarly difficult bases advance in n. But for our purposes, I'll just use an n=100 range.

As an example: if two hypothetical bases that are very close together, let's say bases 368 and 369, have the same difficulty rating, a different n-search depth (let's say n=50K vs. n=25K as in your example), a different # of k's, and all the k's have the same average weight on both bases, then it should take about the same amount of time to search one base from n=50000 to 50100 as it takes to search the other from n=25000 to 25100.

Now...as the bases advance in their search depth, the difficulty ratings will diverge from one another, albeit very slowly, if primes are not found.
The reason is that for the hypothetical bases, if you advance both by n=500K, then one base will be at n=550K with 3 k's remaining and the other will be at n=525K with 6 k's remaining. Clearly the latter base will have a much higher difficulty rating and take much longer to continue advancing.

I use formulas similar to Mark's to come up with some of the bases that I recommend. Back when I came up with them independently of Mark, I mentally went through the effects of advancing very different bases that were at similar difficulty ratings. Mark's the math major. I pretend to be. :smile: Perhaps he can shed more light or state a better way of looking at this.
The difficulty is a relative measure of time needed to do a range of fixed size for the conjecture. If you do a range of 25K for two different conjectures with the same difficulty, then it should take the same amount of time to test that range for each conjecture, presuming no primes are found.
If you take S392 from 25K to 75K, then the numbers will be much closer.

I have considered using the weight (of all remaining k) along with an estimated sieving depth to indicate how many n would need to be tested in each range of 10000. The difficulty is computed by sieving a range of 10000 n to a depth of 1e6. Presuming 10% of the n are removed for each 10x of sieve depth, it should be possible to do this.

Given S368 (difficulty 864), you can back-compute the weight by multiplying by 4 (864 = weight * (50000/100000)^2). 864*4 = 3456. 3456*(.9)^5 = 204 (presuming a sieve depth of 1e11). 204 is the expected number of tests in a range of 10000 n. Since you are doing 5x that size, 204*5 = 1020, thus 1020 is the expected number of tests that you need to do to take S368 from 50K to 100K.

If my presumption of 10% per 10x is anywhere near accurate, it would be interesting to see how far off my estimate is from reality.
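The "10% removed per 10x of sieve depth" rule of thumb above can be compared against the Mertens-style estimate, under which the fraction of candidates surviving a sieve deepened from depth p1 to p2 is roughly log(p1)/log(p2). A minimal Python sketch (function names and the candidate count are illustrative, not from the project's scripts):

```python
import math

def survivors_rule_of_thumb(count, decades):
    # Heuristic from the post: remove ~10% of the remaining
    # candidates for each 10x increase in sieve depth.
    return count * 0.9 ** decades

def survivors_mertens(count, p1, p2):
    # Mertens-style estimate: the fraction surviving a sieve
    # deepened from p1 to p2 is about log(p1) / log(p2).
    return count * math.log(p1) / math.log(p2)

# Deepening a sieve from 1e6 to 1e11 (5 decades of depth):
print(round(survivors_rule_of_thumb(1000, 5)))    # -> 590 of 1000 survive
print(round(survivors_mertens(1000, 1e6, 1e11)))  # -> 545 of 1000 survive
```

The two estimates land within about 8% of each other over this depth range, which is one way to sanity-check the 10%-per-decade presumption.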
R468, 5 k's remaining, from 25K to 50K, sieved to 2e11 (200e9): 2919 tests left (difficulty of 990).
R653, 6 k's remaining, from 25K to 50K, sieved to 2e11 (200e9): 2728 tests left (difficulty of 999).

I hope this will help with calibrating the difficulty.
[QUOTE=rogue;289412]The difficulty is a relative measure of time needed to do a range of fixed size for the conjecture. If you do a range of 25K for two different conjectures with the same difficulty, then it should take the same amount of time to test that range for each conjecture, presuming no primes are found.
If you take S392 from 25K to 75K, then the numbers will be much closer.

I have considered using the weight (of all remaining k) along with an estimated sieving depth to indicate how many n would need to be tested in each range of 10000. The difficulty is computed by sieving a range of 10000 n to a depth of 1e6. Presuming 10% of the n are removed for each 10x of sieve depth, it should be possible to do this.

Given S368 (difficulty 864), you can back-compute the weight by multiplying by 4 (864 = weight * (50000/100000)^2). 864*4 = 3456. 3456*(.9)^5 = 204 (presuming a sieve depth of 1e11). 204 is the expected number of tests in a range of 10000 n. Since you are doing 5x that size, 204*5 = 1020, thus 1020 is the expected number of tests that you need to do to take S368 from 50K to 100K.

If my presumption of 10% per 10x is anywhere near accurate, it would be interesting to see how far off my estimate is from reality.[/QUOTE]
I think I understand what the difficulty measure is doing now. S368 has 1350 tests remaining from 50K-100K at a sieve depth of 5e11.
[QUOTE=rogue;289412]
Given S368 (difficulty 864), you can back-compute the weight by multiplying by 4 (864 = weight * (50000/100000)^2). 864*4 = 3456. 3456*(.9)^5 = 204 (presuming a sieve depth of 1e11). 204 is the expected number of tests in a range of 10000 n. Since you are doing 5x that size, 204*5 = 1020, thus 1020 is the expected number of tests that you need to do to take S368 from 50K to 100K.[/QUOTE]
I realized I made a mistake in this post after using this formula on firejuggler's numbers. The searchLimit is the [b]decimal length[/b], not n.

Applying that to S368 one gets:
864 = weight * (128293/100000)^2
weight = 524

From sr2sieve, compute the number of n remaining after sieving as:
524 - 524*(1-log(1e6)/log(5e11)) = 524 - 524*(1-.51) = 524*.51 ~= 267

Since the size of the range is 50000 and 524 is computed based upon a range of size 10000:
267 * 5 = [b]1335[/b]
which is fairly close to the [b]1350[/b] actually remaining after sieving.

If I take firejuggler's values for R468:
990 = weight * (66757/100000)^2
weight = 2221
2221 - 2221*(1-log(1e6)/log(2e11)) = 2221 - 2221*(1-.53) = 2221*.53 = 1177
1177 * 2.5 = [b]2942.5[/b]
which is close to the actual number of tests, [b]2919[/b].

And then for R653:
999 = weight * (70373/100000)^2
weight = 2017
2017 - 2017*(1-log(1e6)/log(2e11)) = 2017 - 2017*(1-.53) = 2017*.53 = 1069
1069 * 2.5 = [b]2672.5[/b]
again, close to the actual number of tests, [b]2728[/b].
I added a new column to [URL="http://home.roadrunner.com/~mrodenkirch/crus-unproven.htm"]crus-unproven.htm[/URL].
The existing "Relative difficulty" column allows you to compare two conjectures to get an idea of how long it would take to test a range of 10000 n for one compared to the other. The new "n per 25000 at 1e12" column gives you an approximate number of n to test per 25000 if you sieve to 1e12. For the conjectures in this list, most need to be sieved to at least 1e11, and many of the ones with a higher search limit need to be sieved to 1e13 or higher. The actual number of tests will vary by +/- 10% of this value. As more conjectures are pushed to n=100K, I could bump that to 1e13 instead of 1e12, but I think that 1e12 is sufficient for now.

For example, compare S676 (diff 9424, n 653) to S257 (diff 19056, n 656). If you take a range of 25000 n for each conjecture, expect to do the same number of tests, but also expect it to take twice as long to test that range for S257 as it would for S676. Or you can look at it this way: compare R72 (diff 20808, n 471) to R187 (diff 19509, n 4724). If you take a range of 25000 n for each conjecture, expect each range to take the same amount of time even though you would do 10x as many tests for R187 as for R72.

I have not updated the vstats.zip file to reflect this change.
[QUOTE=rogue;289457]
For example, compare S676 (diff 9424, n 653) to S257 (diff 19056, n 656). If you take a range of 25000 n for each conjecture, expect to do the same number of tests, but also expect it to take twice as long to test that range for S257 as it would for S676.[/QUOTE]
Don't forget to state that S676 has been tested up to 150K and S257 up to 250K.

So, does the difficulty increase as n rises, and drop when a prime is found?
[QUOTE=firejuggler;289458]Don't forget to state that S676 has been tested up to 150K and S257 up to 250K.
So, does the difficulty increase as n rises, and drop when a prime is found?[/QUOTE]
My numbers were based upon the stats I generated last week. The values in the last two columns will increase until a prime is found. They will decrease when a prime is found, but then rise again as n increases and no further prime is found.