mersenneforum.org  

Old 2012-02-01, 13:49   #89
rogue
 
"Mark"
Apr 2003
Between here and the

1100011010000₂ Posts

Quote:
Originally Posted by gd_barnes View Post
On your original page, to differentiate between the two different kinds of "remaining k", please change the other table headings to show "# of remaining k".

Your latest page would need to be a separate page from the original page.
Updated.

Agreed. I just put it out there to get opinions on content and format.
Old 2012-02-01, 17:20   #90
rogue
 
"Mark"
Apr 2003
Between here and the

14320₈ Posts

It seems to never end, doesn't it.

I've posted the following:

1) crus-stats.htm, which has the statistics for the project from the original page.
2) crus-top20.htm, which has the Top 20 tables from the original page.
3) crus-lists.htm, which is similar to the Top 20 tables, but lists everything.
4) vstats.zip, which contains the gawk scripts used to build the html.

I split the stats and Top 20 data because it just makes more sense that way. The scripts have been modified so that both the Top 20 and "Lists" pages can be generated from the same script. In fact, the gawk script takes a parameter, so one could have a "Top 10", "Top 20" or "Top 100" if desired. Using the parameter "all" generates the "Lists" page. The "Top 20" and "Lists" pages have sortable columns. I've also changed some of the column headings so that they can now span multiple lines, which means that some columns won't take up a ridiculous amount of horizontal space. "remaining k" or "k remaining" can now have more meaningful names. Right now they are "Count of Remaining k" and "List of Remaining k". I'll change them to whatever Gary wants them to be.

Another thing to consider would be splitting the "Unproven" and "Proven" lists and putting them onto separate pages. The "Proven" page could then be the complete list of proven conjectures, not just the top 20.
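
For anyone curious how one script can serve both the "Top N" and "Lists" pages, here is a minimal sketch of the idea in Python. The real scripts are gawk and are not reproduced here; the function name `select_rows`, the row fields and the sample values are all hypothetical.

```python
def select_rows(rows, limit, key, reverse=False):
    """Sort rows by `key` and keep the first `limit`; the string "all" keeps every row."""
    ordered = sorted(rows, key=key, reverse=reverse)
    if limit == "all":
        return ordered
    return ordered[:int(limit)]


# Made-up sample data, only to exercise the function.
rows = [
    {"conjecture": "A", "remaining_k": 3},
    {"conjecture": "B", "remaining_k": 1},
    {"conjecture": "C", "remaining_k": 2},
]

top2 = select_rows(rows, "2", key=lambda r: r["remaining_k"])
everything = select_rows(rows, "all", key=lambda r: r["remaining_k"])
```

Passing "10", "20" or "100" changes only the slice, so the same table-building code can be reused for every page.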
Old 2012-02-01, 18:40   #91
MyDogBuster
 
May 2008
Wilmington, DE

2852₁₀ Posts

Good job, Mark. About the only thing missing is the lowest air fares between major cities of the world.

One thing. Would it be possible to identify which conjectures have a posted sieve file? Don't need to know anything else, just whether one exists or not. Not a 'must have' but nice.

Last fiddled with by MyDogBuster on 2012-02-01 at 19:05
Old 2012-02-01, 20:58   #92
gd_barnes
 
May 2007
Kansas; USA

101×103 Posts

Quote:
Originally Posted by rogue View Post
It seems to never end, doesn't it.

I've posted the following:

1) crus-stats.htm, which has the statistics for the project from the original page.
2) crus-top20.htm, which has the Top 20 tables from the original page.
3) crus-lists.htm, which is similar to the Top 20 tables, but lists everything.
4) vstats.zip, which contains the gawk scripts used to build the html.

I split the stats and Top 20 data because it just makes more sense that way. The scripts have been modified so that both the Top 20 and "Lists" pages can be generated from the same script. In fact, the gawk script takes a parameter, so one could have a "Top 10", "Top 20" or "Top 100" if desired. Using the parameter "all" generates the "Lists" page. The "Top 20" and "Lists" pages have sortable columns. I've also changed some of the column headings so that they can now span multiple lines, which means that some columns won't take up a ridiculous amount of horizontal space. "remaining k" or "k remaining" can now have more meaningful names. Right now they are "Count of Remaining k" and "List of Remaining k". I'll change them to whatever Gary wants them to be.

Another thing to consider would be splitting the "Unproven" and "Proven" lists and putting them onto separate pages. The "Proven" page could then be the complete list of proven conjectures, not just the top 20.
Pretty cool, but I think we're getting way ahead of ourselves and creating additional problems. It's reeking of scope creep big time right now, and the top 20 page is becoming overly complicated with too many tables. Can I ask that we stop the enhancements and concentrate on getting what we have correct, getting it implemented into production, and then considering incremental enhancements? Each time that changes are made, I have to do a detailed check to make sure what was previously correct was not affected.

Problems on the top-20 page:
1. The "Top 20 Conjectures Tested to at least n=10K" and the "Top 20 Conjectures Tested to at least n=100K" tables each appear twice, and the 2nd occurrence of each is sorted differently. Please delete the 2nd occurrence of the tables or somehow label them differently.

2. The "Top 20 Conjectures Tested to 25K <= n < 100K" is named incorrectly and is inconsistent with all of the other "tested to" tables. It is the only "tested to" table sorted ascending by # of k's remaining. My suggestion: Delete it. We're getting too many tables. If you don't delete it, the parameters in the name need to be changed and it needs to be renamed. It is intended for conjectures that are n>25K (not n>=25K) and the word "to" doesn't make grammatical sense in the title. It should probably be named something like "Top 20 conjectures for 25K < n < 100K by fewest # of k's remaining" or something like that.

I guess we're destined to go round and round about the "k remaining" thing. lol But there is definite inconsistency right now, so I'll be specific and orient it towards the way you appear to want them labeled. This is for the top 20 lists:

1. In the "Top 20 Unproven Conjectures by Decimal Length" table, it should be "count of k remaining" instead of "k remaining".

2. In the "Top 20 Conjectures Tested to at least n=10K" table, it should be "count of k remaining" instead of "list of remaining k".

3. In the "Top 20 Conjectures Tested to at least n=100K" table, it should be "count of k remaining" instead of "list of remaining k".

4. In the "Top 20 Conjectures Tested to 25K <= n < 100K" table (if that table is not deleted as per above suggestion), it should be "count of k remaining" instead of "list of remaining k".

5. In the "Top 20 Conjectures with 1k Remaining by Highest Weight", it should be "count of k remaining" instead of "list of remaining k".

But...like I've said a couple of times now, my preference is that they be exactly like the current pages, which show "k remaining" or "k's remaining" for a list of them and "number of k's remaining" where a count is required. IMHO, that is the easiest to understand. My problem with your new method is that showing "list of k's remaining" looks very weird when there is only one k remaining.

Regardless, the above ensures that every table showing a count of k's is labeled "count of k remaining" and every table listing specific k's is labeled "list of remaining k".

Please take your time and get it all correct, and please, no more enhancements. Thanks.


Gary
Old 2012-02-02, 00:22   #93
rogue
 
"Mark"
Apr 2003
Between here and the

18D0₁₆ Posts

I've made most of the changes you asked for, such as fixing the column names. The duplicate table names were due to an issue I created when merging the older script I posted with the one that has sortable tables. I do agree that there are too many tables. I have removed the table that you suggested I remove. Fortunately that is easily done by commenting out one line of code. The reason you see "List of Remaining k" for the 1k table is that I use the same subroutine for the 1k, 2k, and 3k tables. I could change it for the 1k table, but I think that is a fairly minor issue.

I respectfully disagree with wanting to keep things as close as possible to what they are today. What I'm trying to do is get the available data in forms that make it easier for users to mine data. I'm also trying to keep meaningful data together. I would like to believe that a majority of CRUS participants agree. In reality I will be using these tables to help me choose the next conjectures that I work on. IMO, I would have the following html:

1) A "Statistics & Progress" html.
2) A "List of Unproven Conjectures" html, which would be the first table from the "List" html. It would be optional to include the 1k conjectures with weight in a separate table.
3) A "List of Proven Conjectures" html, with a single table.
4) A "Top 20" html, which would only contain Top 20 tables of unproven conjectures and could be viewed as "Most Wanted" lists by various people.

Since tables on the middle pages would be sortable, I would have a default sort thus not needing additional tables. As for the fourth page, I would consider merging the two 2k tables and two 3k tables. Those lists tend to be small (notice that some conjectures are in both 2k tables and some are in both 3k tables).

Fortunately for you, you should only have to install the scripts one time, presuming no bugs are found once you do so. I don't envision this creating a lot of work for you outside of reviewing what I'm doing and giving your opinions on it.
Old 2012-02-03, 09:27   #94
gd_barnes
 
May 2007
Kansas; USA

28A3₁₆ Posts

The tables are a nice piece of work. As far as I can tell, there are no more problems.

I didn't say that I wanted them to be like they are today, only that I liked the way the "remaining k" issue is handled today and that we should get them correct before any more enhancements, i.e. additional tables, are made. I agree that all CRUS users will find them useful, including me.

I won't even be doing the installing. I'm leaving that up to Max as he knows what goes where on the server. I will decide where to put the links...the main place will probably be in the first post of the "Come join us" thread, where the current table links are. We also might want to put them on the CRUS home page.

Before we implement, does anyone else have any more comments or suggestions about the tables?

Last fiddled with by gd_barnes on 2012-02-03 at 09:41 Reason: grammar
Old 2012-02-03, 09:36   #95
gd_barnes
 
May 2007
Kansas; USA

101·103 Posts

I hadn't previously tested the sorts, which I must say are excellent! :-) Anyway, I found a minor issue that you may already be aware of. When sorting on the "list of remaining k" for the conjectures with 2k or 3k remaining, it does an alphanumeric sort, i.e. in the 2k list, first comes 10, 11, then comes 114, 134, then comes 12, 22, etc.

IMHO, a sort on k for the 2k and 3k lists is somewhat invalid. I might suggest just removing the sort for that particular column altogether. Alternatively to make it correct, you'd have to parse out the lowest k of the 2 k's or 3 k's remaining, which I'm guessing would be a big hassle.

This is not a deal breaker. If you want to leave it as is, that's fine. It's just my two cents at this point.

Edit: You might consider removing the sort on "Largest prime" in the proven conjectures table also. It also does an alphanumeric sort, which means very little in that sense.
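
To make the "parse out the lowest k" alternative concrete, here is a minimal Python sketch (the page itself sorts in javascript, so this is only an illustration). It assumes each cell holds comma-separated integers, and reads the example above as three cells: "10, 11", "114, 134" and "12, 22". The helper name `lowest_k` is hypothetical.

```python
def lowest_k(cell):
    """Return the smallest k in a comma-separated cell such as '114, 134'."""
    return min(int(part) for part in cell.split(","))


cells = ["12, 22", "114, 134", "10, 11"]

# Plain string comparison: "114, 134" lands before "12, 22".
alphanumeric = sorted(cells)

# Numeric comparison on the lowest k in each cell.
numeric = sorted(cells, key=lowest_k)
```

With the string sort the order is "10, 11", "114, 134", "12, 22"; with the numeric key it becomes "10, 11", "12, 22", "114, 134".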

Last fiddled with by gd_barnes on 2012-02-03 at 09:40 Reason: edit
Old 2012-02-03, 16:19   #96
rogue
 
"Mark"
Apr 2003
Between here and the

18D0₁₆ Posts

Quote:
Originally Posted by gd_barnes View Post
I hadn't previously tested the sorts, which I must say are excellent! :-) Anyway, I found a minor issue that you may already be aware of. When sorting on the "list of remaining k" for the conjectures with 2k or 3k remaining, it does an alphanumeric sort, i.e. in the 2k list, first comes 10, 11, then comes 114, 134, then comes 12, 22, etc.

IMHO, a sort on k for the 2k and 3k lists is somewhat invalid. I might suggest just removing the sort for that particular column altogether. Alternatively to make it correct, you'd have to parse out the lowest k of the 2 k's or 3 k's remaining, which I'm guessing would be a big hassle.

This is not a deal breaker. If you want to leave it as is, that's fine. It's just my two cents at this point.

Edit: You might consider removing the sort on "Largest prime" in the proven conjectures table also. It also does an alphanumeric sort, which means very little in that sense.
The javascript responsible for the sorting looks at the data in the column to determine how it is sorted. Because those columns have a comma, they are treated as alphanumeric. As for this and the largest prime column, I can remove the sort, but it would require some changes to the script WRT how I build the tables. Basically the column headers need class="sorttable_nosort" added to them, but the way I'm building the html for the headers is in a single routine shared by all functions. I'll probably do it at some point, but let's get everyone to agree on the pages and content of those pages first.
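
As a rough illustration of the change to the shared header routine, here is a Python sketch (the actual scripts are gawk, and this is not their code). The function name `build_header_row`, its `nosort` parameter and the column names are hypothetical; only the class name sorttable_nosort comes from the description above.

```python
from html import escape


def build_header_row(columns, nosort=()):
    """Build one <tr> of <th> cells, marking the columns named in `nosort` as unsortable."""
    cells = []
    for name in columns:
        attr = ' class="sorttable_nosort"' if name in nosort else ""
        cells.append(f"<th{attr}>{escape(name)}</th>")
    return "<tr>" + "".join(cells) + "</tr>"


row = build_header_row(
    ["Conjecture", "List of Remaining k"],
    nosort={"List of Remaining k"},
)
```

Because the unsortable columns are passed in by the caller, every table can keep using the same routine.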

I added something of possible interest called "Relative Difficulty", which you can see here. This has not been added to the main scripts. It's just something that might have potential. The "Relative Difficulty" column gives an idea as to how difficult it is to work on the conjecture given the remaining k and search limit. It's computed using both the weight of the k (up to whatever is listed on the main page) and the search limit (decimal length, not tested n). It weighs the decimal length so that doubling that length multiplies the difficulty by four. The big negative is that it doesn't have weights attached to the conjectures with more than 25 k. That can be done, but it would take a bit of work and I would lean towards storing pre-computed values for them, which wouldn't be hard, but would be a manual effort.
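
The exact formula is not given in this post, so the following Python sketch is only one form consistent with the description: it assumes the difficulty is the summed weight of the remaining k multiplied by the square of the decimal length. The function name and the sample weights are hypothetical.

```python
def relative_difficulty(weights, decimal_length):
    """Assumed form: summed weight of the remaining k times decimal length squared."""
    return sum(weights) * decimal_length ** 2


# Made-up weights for two remaining k.
weights = [100, 250]

at_length = relative_difficulty(weights, 50_000)
at_double_length = relative_difficulty(weights, 100_000)
```

Under this assumption, doubling the decimal length multiplies the result by four, which matches the behaviour described above.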
Old 2012-02-03, 21:05   #97
rogue
 
"Mark"
Apr 2003
Between here and the

1100011010000₂ Posts

I updated the html on the link I just posted. It now has most of the other "Relative Difficulties" filled in. I think that some might want to debate the rankings and wonder how S63 got to the top of the Sierpinski list. It is purely due to the number of k remaining. As one might guess, taking on that conjecture would be a huge amount of work even though n=25000.

Compare how S406 and S428 are both tested to nearly the same n, yet have vastly different difficulties. This is due to the relative weight of the remaining k for those conjectures.

Compare how S53 and S785 have similar relative difficulties, even though S53 is only tested to n=25000 and S785 to n=100000. S53 has 23 k while S785 has one. The reason for the similar values is that taking a small range of S53 should take about the same amount of time as a small range of S785. Now if someone were to take S53 to n=100000, there would most likely be more than 1 k remaining, thus the relative weight of S53 would increase significantly. If half of the k were removed, then the relative difficulty of S53 would grow by a factor of 8, computed as ((100K/25K) ^ 2) / 2. The exponent of two is the weight I attach to the search limit. The divisor of two is due to the approximate halving of the cumulative Proth weight of the remaining k.
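
The factor of 8 can be checked with a few lines of Python. This only restates the arithmetic from the paragraph above; the variable names are made up for the example.

```python
old_n, new_n = 25_000, 100_000

# Search limit grows 4x, and the difficulty scales with its square.
length_factor = (new_n / old_n) ** 2

# About half of the cumulative weight is assumed to be removed.
weight_factor = 1 / 2

growth = length_factor * weight_factor
```

Here `length_factor` is 16 and `growth` is 8.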

I view the list as such. If someone wants to bite off something small, aim towards a conjecture with a lower relative difficulty. If someone has tons of resources and wants something hard, then they would aim towards a conjecture with a higher relative difficulty. If someone wants to find Top 5000 primes, aim towards a conjecture with higher minimum decimal lengths.
Old 2012-02-03, 21:05   #98
gd_barnes
 
May 2007
Kansas; USA

101×103 Posts

Quote:
Originally Posted by rogue View Post
The javascript responsible for the sorting looks at the data in the column to determine how it is sorted. Because those columns have a comma, they are treated as alphanumeric. As for this and the largest prime column, I can remove the sort, but it would require some changes to the script WRT how I build the tables. Basically the column headers need class="sorttable_nosort" added to them, but the way I'm building the html for the headers is in a single routine shared by all functions. I'll probably do it at some point, but let's get everyone to agree on the pages and content of those pages first.

I added something of possible interest called "Relative Difficulty", which you can see here. This has not been added to the main scripts. It's just something that might have potential. The "Relative Difficulty" column gives an idea as to how difficult it is to work on the conjecture given the remaining k and search limit. It's computed using both the weight of the k (up to whatever is listed on the main page) and the search limit (decimal length, not tested n). It weighs the decimal length so that doubling that length multiplies the difficulty by four. The big negative is that it doesn't have weights attached to the conjectures with more than 25 k. That can be done, but it would take a bit of work and I would lean towards storing pre-computed values for them, which wouldn't be hard, but would be a manual effort.
OK on the 1st para.

On the 2nd para., how can you know the weight of multiple remaining k? I can see how this would work for 1k conjectures but I can't see how it would work for more than 1k unless you individually calculated the weights of all of the k and averaged them.
Old 2012-02-03, 23:32   #99
gd_barnes
 
May 2007
Kansas; USA

101·103 Posts

Quote:
Originally Posted by rogue View Post
I updated the html on the link I just posted. It now has most of the other "Relative Difficulties" filled in. I think that some might want to debate the rankings and wonder how S63 got to the top of the Sierpinski list. It is purely due to the number of k remaining. As one might guess, taking on that conjecture would be a huge amount of work even though n=25000.

Compare how S406 and S428 are both tested to nearly the same n, yet have vastly different difficulties. This is due to the relative weight of the remaining k for those conjectures.

Compare how S53 and S785 have similar relative difficulties, even though S53 is only tested to n=25000 and S785 to n=100000. S53 has 23 k while S785 has one. The reason for the similar values is that taking a small range of S53 should take about the same amount of time as a small range of S785. Now if someone were to take S53 to n=100000, there would most likely be more than 1 k remaining, thus the relative weight of S53 would increase significantly. If half of the k were removed, then the relative difficulty of S53 would grow by a factor of 8, computed as ((100K/25K) ^ 2) / 2. The exponent of two is the weight I attach to the search limit. The divisor of two is due to the approximate halving of the cumulative Proth weight of the remaining k.

I view the list as such. If someone wants to bite off something small, aim towards a conjecture with a lower relative difficulty. If someone has tons of resources and wants something hard, then they would aim towards a conjecture with a higher relative difficulty. If someone wants to find Top 5000 primes, aim towards a conjecture with higher minimum decimal lengths.
It's interesting that you came up with a relative difficulty value. I set up something like this in a spreadsheet on my PC about a year ago for bases <=256. It is actually how I come up with bases to recommend. But I have to keep it manually updated. The main difference is that in my spreadsheet, I only consider base, # of k's remaining, and the test limit...not weights. I also add a "bias" towards smaller bases because I consider them a little more important even if they are somewhat more difficult to advance. Essentially I do (base * n search depth)^2 * # of k's remaining. I then apply the low-base bias to that. Effectively I assume that all k's are the same weight. It gives an OK estimate of difficulty but obviously not as accurate as considering the weight of the k's.
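
For comparison, the unweighted estimate described above can be written as a short Python sketch. The function name is hypothetical, and the low-base bias is left out because its form is not given here.

```python
def unweighted_difficulty(base, n_search_depth, k_remaining):
    """(base * n search depth)^2 * number of k's remaining, with every k weighted equally."""
    return (base * n_search_depth) ** 2 * k_remaining


# Illustrative inputs: base 53 tested to n=25000 with 23 k remaining.
estimate = unweighted_difficulty(53, 25_000, 23)

# Same base with the search depth doubled and the same number of k.
deeper = unweighted_difficulty(53, 50_000, 23)
```

Doubling the search depth quadruples the estimate, while doubling the number of k's only doubles it.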

I'm still curious to find out how you are computing the average weights for multiple k's. Do you have srsieve automatically run for every k in every base (for all bases with <= 25 k's)?


Gary

All times are UTC.


Tue Jul 27 10:32:26 UTC 2021 up 4 days, 5:01, 0 users, load averages: 1.74, 1.85, 1.86

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.