2009-07-29, 01:06  #12 
I quite division it
"Chris"
Feb 2005
England
31·67 Posts 
Yeah, I did spot it. I had done some analysis. I have an old (messy) spreadsheet somewhere with some data in. I think I tested loads of ks (mostly multiples of 15) to n=10k and then the interesting ones to 50k. Then looked for top-5000 primes in the best of them. They all went a bit 'jumpy' except for 3045 which had the most even distribution of primes and so became my fave.
Found the spreadsheet. Sheet one is a partial record of my searches. On sheet two you can see colour coded (by size) lists of primes for my fave ks. Cell L4 shows that I marked 3045 as being the most fruitful. Column AD shows the increases in size from prime to prime for k=3045. A beautiful distribution imho. The mean and standard deviation are in AE. The percentages in row 2 show some sort of score I gave them in comparison to k=1785. 'Ruff' means a 'jumpy' distribution. 15345+ was a Proth. (Wash my mouth out with soap.) Now I'm waiting for you to find a relationship between the 'performance' of a k and the number and size of its factors. In other words, there must be a reason why some ks perform better than others??? Last fiddled with by Flatlander on 2009-07-29 at 01:08 
2009-07-29, 01:43  #13 
May 2007
Kansas; USA
2×7^{2}×103 Posts 
That sounds interesting. I'll take a look at it later tonight.
There's no specific reason that I can think of why a k would be better than its cohorts of the same weight, with the exception of Mersennes, which have been proven to contain only factors of a specific form. Most of it would be random fluctuation, and the difficult part is telling what is random and what is not. To prove that a k with a weight of 3822 was "better" (as in candidates searched per prime) than another k with the same weight would likely require testing each k to n=1G or even 1T or higher. In most cases, there would need to be several hundred primes to get statistical significance.

Also, there are degrees of freedom to concern ourselves with. As it applies here, if you look at 100's or 1000's of k's with the same weight, there are bound to be some that are 2, 3, and possibly even 4 standard deviations from the mean, so that if you compare the highest and lowest of them, the deviation could seem huge on the surface but not necessarily mean a hill of beans about future primes on them. The key is to determine what is statistically significant for the number of k's that you are looking at. For 100 k's, 3 st. devs. from the mean for the highest amongst them might only give you an 80% confidence that that particular k will continue to produce above-average primes in the future. But if you had only 10 k's, 3 sd's could give you a 95% confidence level. (Made-up figures, but probably not too far off.) If you were looking at 1000 k's, I'm certain that having a few k's at 3 sd's would be very normal. You'd likely need 3.5-4 sd's to have reasonable statistical significance for any one k.

As I remember from stats class, for 1 degree of freedom, something 1 s.d. from the mean will occur about 32% of the time (counting both above and below the mean), 2 sd's from the mean will occur about 5% of the time, and 3 sd's about 1% of the time. But those figures change greatly as you get up to 100 or 1000 degrees of freedom. 
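Gary's selection-effect point is easy to sanity-check numerically. A rough Monte Carlo sketch (normal approximation; the function name, sample sizes, and threshold are mine, for illustration only, not his actual figures):

```python
import random

# How often does the *best* of N same-weight k's sit 3+ standard
# deviations above the mean purely by chance?  Each k's "performance"
# is modeled as an independent standard normal draw, so any extreme
# value here is pure selection effect, not a genuinely better k.
def chance_of_3_sigma_max(num_k, trials=5000):
    hits = 0
    for _ in range(trials):
        best = max(random.gauss(0, 1) for _ in range(num_k))
        if best >= 3.0:
            hits += 1
    return hits / trials

for num_k in (10, 100, 1000):
    print(num_k, chance_of_3_sigma_max(num_k))
```

For 10 k's a 3-sigma outlier is rare (roughly 1% of trials); for 1000 k's it becomes the expected outcome (roughly 75%), which is exactly why the apparent "best" k among many same-weight candidates tells you little about its future primes.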
It is my conjecture (or really speculation at this point) that high-weight k's, because they have fewer small factors below a typical sieve depth, will also have fewer large factors above a typical sieve depth. Therefore, I speculate that high-weight k's will give more top-5000 prime score per CPU hour than low-weight k's of approximately the same size if you take into account the total sieving plus prime-testing time. Gary Last fiddled with by gd_barnes on 2009-07-29 at 01:50 
2009-07-29, 02:02  #14 
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
4267_{10} Posts 
Did someone say factors?
Here are the full factorizations of 3045*2^n-1 with 1<n<252: http://factordb.com/search.php?query...F=1&v=n&pp=250 If you want a 'normal' k (I don't think there's anything too special about it anyway... it didn't give me a prime in the 600K<n<850K range, but I don't suppose that's too unlikely) to compare it to, here's 349*2^n-1 with 1<n<352 (minus a few >c90s that withstood a moderate amount of ECM; someone can work on 'em if they want, the ECM statuses are all in the DB, some are ready for QS/GNFS): http://factordb.com/search.php?query...F=1&v=n&pp=350

It's interesting how easy it is to spot factors that repeat at certain n==x mod y, e.g. for 3045 (| means "divides", N refers to 3045*2^n-1, n refers to... n):
11|N when n==4 mod 10
13|N when n==8 mod 12
61|N when n==8 mod 60
(these cover n={4, 14, 24, 34, 44, 54, 8, 20, 32, 44, and 56} mod 60)
and 17|N when n==7 mod 8 (lcm(60,8)=120, so I'd have to multiply all that stuff to include this case)

It doesn't seem so daunting to discover a covering set, if one exists. All you need is some small factors. Last fiddled with by MiniGeek on 2009-07-29 at 02:23 
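Those patterns can be reproduced without eyeballing the DB. A sketch (`divisibility_pattern` and `mult_order` are illustrative names of mine): a prime f divides 3045*2^n-1 exactly when 2^n lands on one fixed residue mod f, so the qualifying n form a single residue class modulo the multiplicative order of 2 mod f.

```python
# Find the (first n, repeat period) pattern for a small prime factor f
# of k*2^n - 1.  The period is ord_f(2), the multiplicative order of 2
# mod f, because 2^n cycles with exactly that period.
def mult_order(a, m):
    order, x = 1, a % m
    while x != 1:
        x = x * a % m
        order += 1
    return order

def divisibility_pattern(k, f, limit=200):
    # n-values below 'limit' where f divides k*2^n - 1
    hits = [n for n in range(1, limit) if (k * pow(2, n, f) - 1) % f == 0]
    if not hits:
        return None
    return hits[0], mult_order(2, f)

for f in (11, 13, 61, 17):
    print(f, divisibility_pattern(3045, f))
# reproduces the patterns above: 11 at n==4 mod 10, 13 at 8 mod 12,
# 61 at 8 mod 60, 17 at 7 mod 8
```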
2009-07-29, 10:57  #15 
May 2007
Kansas; USA
2×7^{2}×103 Posts 
You just went through exactly what I've gone through 1000 times with covering sets at CRUS, trying to find patterns that make full covering sets for specific k's in other bases. Occasionally I'll get one that gives a full covering set when combining it with algebraic factors. It's extremely rare to find one that unexpectedly gives a full covering set. What I usually end up with is a partial covering set like you did, with the n-values narrowed down much more than what you have. Alas, at CRUS I'm looking at low weights whereas you are looking at a very high weight.
If a prime factor f occurs at all, it will always repeat every f-1 n-values, or every (f-1)/q n-values, for k*b^n-c forms. q can be just about anything but is usually 1 or 2. The most interesting factors to find are very big ones where q is large and which hence occur far more frequently than every f-1 n-values. I've seen factors over 2000 occur every < 30 n. This is a normal thing for GFNs, which can only be prime if n is a power of 2, but for more random k-values it doesn't happen often; mostly on low weights.

Syd's DB is a boon for this. I used to do it manually using Alpertron's factorization applet. Now it's quick and easy to see factors that occur unusually frequently. An example of an exercise that I have gone through is for the lowest k without a prime: 2293*2^n-1 and its sister form 2^n-2293, which has the same covering set and occurrences of factors. If you look in the DB, you'll see that there is a specific covering set that covers 23 out of every 24 n. I've fully factored all except a couple of n-values for both forms for n<400. But the interesting part is that there is a secondary effort that I took on to attempt to factor that one n out of every 24 that doesn't have a known recurring factor. If you want to see what I've done so far with both sequences, look at:
2293*2^(24*n-1)-1
2^(24*n+1)-2293

This introduces a whole new way of looking at covering sets. If you only concentrate on every 24th value of these forms, you'll notice that on a typical sieve of them, they would be more "average" weight in that they contain far fewer small factors than the original form. Alas, for this k as well as many others I've observed that contain large #'s of small factors, that 24th occurrence of each n apparently contains an abnormally high # of large factors, and I speculate that is because the main form itself contains so many small factors. 
There seems to be something going on with a large # of large factors when there is also a large # of small factors on the original form itself, even though the n-values being looked at can contain only few small factors. Interestingly enough, even reduced to only looking at every 24th n-value, the factors still occur at the (f-1)/q frequency, as is the case if you looked at all n's. There are just very few small factors once you get outside of the main covering set. For instance, the factor of 2017 occurs every (2017-1)/144 = 14 exponents for these "every 24 n" forms. In the original form, 2017 occurs every (2017-1)/6 = 336 n. To have a q-value of 144 is huge, and it does happen occasionally in the original form itself. This frequency of the factor of 2017 within every 24th n is part of what makes k=2293 so hard to find a prime for. But the small factors of 23 and 29, which occur every 11 and 7 n respectively within every 24th n, are a bigger part of the reason. Most of the factors that DO occur within every 24th n occur much more frequently than they otherwise should. BTW, I've also tested 2^n-2293 up to n=200K. No primes there either so far! Gary Last fiddled with by gd_barnes on 2009-07-29 at 11:11 
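The (f-1)/q recurrence is the multiplicative order of 2 mod f in disguise, so q can be read straight off a small computation rather than counted in the DB. A sketch (`mult_order` is my own helper name); per Gary's figures above, for f = 2017 this should come out to a period of 336 with q = 6, which collapses to 336/24 = 14 steps of the every-24th-n form:

```python
# The period with which a prime f recurs as a factor of k*2^n - 1 (if it
# divides the form at all) is ord_f(2), the multiplicative order of 2
# mod f; Gary's q is then (f - 1) / ord_f(2).
def mult_order(a, m):
    order, x = 1, a % m
    while x != 1:
        x = x * a % m
        order += 1
    return order

f = 2017
period = mult_order(2, f)
print(period, (f - 1) // period)  # recurrence period in n, and Gary's q
```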
2009-07-29, 16:30  #16 
Just call me Henry
"David"
Sep 2007
Cambridge (GMT)
2^{4}×353 Posts 
Could this conversation about prime density/Nash weight be moved to its own thread?
It is an incredibly interesting subject which is worth investigating further. Admin edit: done Last fiddled with by gd_barnes on 2009-07-29 at 20:16 Reason: admin edit 
2009-07-29, 20:35  #17  
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
17×251 Posts 
Quote:
By the way, saying f-1 or (f-1)/q and then that q is usually 1 or 2 is partially redundant, since the q=1 case is clearly covered by "f-1". I think it'd be better to drop the separate "f-1" case and just include it as (f-1)/q with q=1. Or alternately, say that it will repeat every f-1 n-values, and more specifically every (f-1)/q n-values (with q usually 1 or 2).

Is it proven that for any f that divides k*b^n+c, f | k*b^n+c for every (f-1)/q n-values? From our older discussion on the topic (which I cannot locate), I thought this had to be somehow proven for each candidate factor, but it seems to hold up all the time, and I'm pretty sure you're saying this is so.

Maybe sieves already take advantage of this, maybe it would be terribly inefficient, maybe it is completely unnecessary, but I am still curious and feel I must ask: how good would it be to check for a divisor based on the n? E.g. if you discover that 7 | k*2^7-1, and find that it divides only when n==1 mod 6 (and not also with q=2 for n==1 mod 3), then you can remove all n==1 mod 6 candidates. This is simple enough, but is probably not worth the effort for such a small factor. But what if you were to do this for many factors? If you were to factor a small group of n's well beyond the normal depth (maybe to the extent of 0<n<500 being fully factored and n<10,000 factored to the normal depth of n=1,000,000, or something like that) and build a table of such exclusions, you could eliminate a great many n's for any n size without needing to directly handle such large numbers. Depending on the depths you presieved to and how far you're going, further traditional sieving may be best. Quote:
E.g. 2017 | 2293*2^(24*(14*n+1)-1)-1 for every n, but 2017 | 2^(24*(14*n-1)+1)-2293 for every n. (14*n+1 vs. 14*n-1.) This would mean that they would share the covering set's factors, but not the all-important fact of where they are that lets the covering set be complete and lets you ask where exactly you should search for more that you're missing. Or am I just missing something? 
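The table-of-exclusions idea sketched in a few lines (the helper name is hypothetical; the (factor, residue, modulus) triples are the ones MiniGeek listed above for k=3045 — any other k would need its patterns derived first):

```python
# Presieve by residue class: once you know a factor f hits k*2^n - 1
# exactly when n == r (mod m), you can strike those n-values without
# ever touching the big numbers themselves.
def sieve_by_patterns(n_range, patterns):
    # patterns: list of (factor, residue, modulus) triples
    return [n for n in n_range
            if not any(n % m == r for (_f, r, m) in patterns)]

# Patterns for k=3045 from the earlier post:
patterns_3045 = [(11, 4, 10), (13, 8, 12), (61, 8, 60), (17, 7, 8)]
survivors = sieve_by_patterns(range(1, 121), patterns_3045)
print(len(survivors))  # candidates left out of 120
```

In practice this only pays off with many such patterns, since a traditional sieve removes the small factors anyway; the gain Tim is asking about would come from large, frequently recurring factors.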

2009-07-29, 20:35  #18 
Just call me Henry
"David"
Sep 2007
Cambridge (GMT)
5648_{10} Posts 
I have attached a spreadsheet for ks up to 49 which compares Gary's formula with the Nash weights.
Sheet1 contains the data for where they are all currently at.
Sheet2 is a graph based on Sheet1 that compares Prime Density and Nash Weight.
Sheet3 is similar to Sheet1 but contains only data up to n=1M.
Sheet4 is similar to Sheet2 but is based on Sheet3 instead of Sheet1.
Any comments? 
2009-07-30, 01:50  #19 
May 2007
Kansas; USA
2·7^{2}·103 Posts 
A mathematical nitpick: in order to properly compare the ratings, they have to be converted to a percentage, because we are taking arbitrary candidates remaining (weight) and dividing by # of primes. Otherwise things that are based off of an average, such as the least-squares fit line, will be strongly skewed by k's with a very high rating (low prime density) as a result of having the bad luck to have few primes. Taken to the extreme, if you included k=2293 in there, its rating would be infinite since it has zero primes. It's kind of hard to average or fit a least-squares line to infinity. lol
Example: Thing A occurs 1 in every 1000 times, so its "rating" is 1000. Thing B occurs 1 in every 250 times, so its "rating" is 250. (I.e., similar to the way these ratings work.) If you average the two, you get that the average occurrence is 1 in every 625, but that would be incorrect. You have to invert them to a percentage, such as: Thing A occurs 1/1000 = .001 of the time. Thing B occurs 1/250 = .004 of the time. Averaging the two gives .0025. This converts to 1 in every 400, or a "rating" of 400. That would be correct. If you attempt to pick something out of hat A with a 1/1000 chance and hat B with a 1/250 chance, over the long run you'll pick that thing out of one of the random hats 1 in every 400 times. Therefore, for the spreadsheet to have a correct least-squares fit line, the ratings have to be inverted. The reason I chose it the way I did is that I didn't want ratings like .003, .004, etc., where the higher rating meant a more prime-dense k.

Well, what do you know: at least from this small sampling, the lower-weight k's have a slightly higher prime density. I even removed the Mersennes and the lower weights were still better. Alas, this is a very small sampling. More data is needed. I took your spreadsheet, changed the ratings by inverting them to percentages, removed the Mersennes, and deleted sheets 3 and 4. The first sheet has more data and should make for more accurate analysis. The above inversion could be done by inverting the original formula or simply taking 1 divided by the rating. I inverted the formula. For future comparison purposes and least-squares line fitting, the prime k-value density rating "percentage" would be:

# of primes / Nash weight / log (search depth)

Anything lower than a rating of 200 or higher than a rating percentage of .005 would be outstanding IF the k-value has more than ~30 primes so as to have statistical significance. Obviously Mersennes fit the "outstanding" category. Gary 
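Gary's inversion argument in code form (a trivial sketch; `combined_rating` is my name for it). What he describes is the harmonic mean of the "1 in N" ratings:

```python
# Gary's example: a 1-in-1000 rating and a 1-in-250 rating combine to
# 1-in-400 (the harmonic mean), not the naive arithmetic average of 625.
def combined_rating(ratings):
    rates = [1.0 / r for r in ratings]      # convert "1 in N" to a rate
    mean_rate = sum(rates) / len(rates)     # average the rates, not the N's
    return 1.0 / mean_rate                  # convert back to "1 in N"

print(round(combined_rating([1000, 250])))  # → 400
```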
2009-07-30, 05:53  #20 
Just call me Henry
"David"
Sep 2007
Cambridge (GMT)
2^{4}×353 Posts 
What sort of quantity of data do we need?
How many ks? Do we need an automated process? 
2009-07-30, 07:50  #21 
May 2007
Kansas; USA
2×7^{2}×103 Posts 
All 501 k's up to k=1001 should be sufficient to get close to statistical significance within weight ranges if there is any significance to be found.

2009-07-30, 08:33  #22 
Just call me Henry
"David"
Sep 2007
Cambridge (GMT)
1011000010000_{2} Posts 
