![]() |
Sieving, PRPing, and what I feel is a common misunderstanding
I sieve on a Linux box with an Athlon chip. I PRP on a dual-core Pentium-D. I sieve about 6 values for every one I PRP on one of my Pentium-Ds cores.
Most people would say,"Run the sub-project that removes the most values." So Athlons sieve, Pentium-4s PRP, P3s sieve, etc.. They're happy as long as their doing the thing that removes the most values. But there's a problem. Most of them haven't factored in the possibility of primes in their equation. Think about it. The numbers most likely to yield prime are the ones that have the most values left after sieving(More values, more chances to yield prime). The thing is, since there are a lot of values for these numbers, it's more likely that they'll yield a factor. So if you find 90% of your factors for numbers that yield prime below, say, n=750,000, you may have actually wasted time. You don't know for sure, it's a statistical crap shoot. Everybody can do whatever they want. But, statistically, a person should be PRPing long, LONG before they start running out of numbers to sieve. For me, if I'm sieving less than 5 values for every one I can PRP on the same machine, I'm going to be PRPing. |
Btw, the above post wasn't aimed at anyone in particular. I was simply feeling extra geeky at the time. :)
|
My response here assumes fixed k sieves, not fixed n sieves.
IF you are working on a project where you test all candidates in a range, such as the Ray Ballinger's Proth search, then your statement is completely incorrect. You should sieve until sieving removes candidates at about the same rate as the median PRP test. IF you are working on SoB, RieselSieve, or SR5, then the rules are different, but not by much. The point in these projects is that you are sieving many values of k in tandem. The cost of sieving additional k is not significant (normally), maybe 1-2%. IF you are working on a stand-alone k on another project where you want to stop at the first k, I would argue that you break your range of n into smaller ranges where each range represents a different FFT size. I would then sieve (the whole range of n) until the rate of removal is at the same rate as s PRP test for that FFT size. When that rate is met, PRP test that range of n. When PRP testing is done, continue sieving and repeat the process until a prime is found or the range is complete. An example of this might be Joe McLean's Extended Sierpinski Search. For your statements, you might find low-hanging fruit faster, but it will take longer to complete the range. |
and don't forget that finding a factor at n=600000 take almost the same time than finding a factor at n=50000 but prping is not the same (on my centrino 1400 it's 60 second for n=50000 and 100000 seconds for n=600000)
so sieving is very effective for removing large exponent. prp is very effective at removing small exponent. thats why people say try to sieve until rate is below prp then prp a little then sieve then ...you have to take the best choice. and you are true, prp remove a lot of candidate because we are only searching for one prime ... that's why people are doing prp in parralel. you can even consider stopping sieving when rate is half prp's rate (due to double prp check). in reality, you do what you want, every help is welcome. |
[QUOTE=rogue;94712]For your statements, you might find low-hanging fruit faster, but it will take longer to complete the range.[/QUOTE]
Whether or not the range is completed slower or faster depends on random chance and whether there's been a probability study performed that gone by. In the case of the base-5 project, my choice of 5 factors for every test being the dividing line was a number picked out of thin air. On the other hand, I did a bit of mathematical study of the Riesel problem and decided that they sieved way, WAY too far, considering the n-value they're at. I had to drop out of high school at the very beginning of my 11th grade year, so the math I do is very seat-of-your-pantsish. I know it works, but I couldn't explain the rationale to save my life. If you're interested, I could give the link to the thread I started over at Riesel Sieve involving what we're discussing. I had a lot of false starts, it's probably best that you start at the end of the thread and work your way up, not that the thread's that long. |
[QUOTE=jasong;94716]Whether or not the range is completed slower or faster depends on random chance and whether there's been a probability study performed that gone by. In the case of the base-5 project, my choice of 5 factors for every test being the dividing line was a number picked out of thin air.[/QUOTE]
Then you have NO statistical evidence that what you are doing is the fastest way to find a prime. [QUOTE=jasong;94716]On the other hand, I did a bit of mathematical study of the Riesel problem and decided that they sieved way, WAY too far, considering the n-value they're at.[/QUOTE] I can't speak for that project as I have not participated in it. I don't know what their current factor removal rate is. [QUOTE=jasong;94716]I know it works, but I couldn't explain the rationale to save my life. If you're interested, I could give the link to the thread I started over at Riesel Sieve involving what we're discussing. I had a lot of false starts, it's probably best that you start at the end of the thread and work your way up, not that the thread's that long.[/QUOTE] Unless you can clearly demonstrate your reasoning, you will be viewed as a crackpot. BTW, you have started many threads in that forum. I don't know which one you speak of. |
[QUOTE=rogue;94739]BTW, you have started many threads in that forum. I don't know which one you speak of.[/QUOTE]
In the quote before that, I offered to give a link.:rolleyes: [url=http://www.rieselsieve.com/forum/viewtopic.php?t=561]Here.[/url] There is an explanation of my method, although the main guy running the project(b2) didn't agree with my findings. |
I have two issues (actually more than two, but I'll point out two here).
1) You are focused on scoring more than finding primes. Although some people care about scoring, I believe that most are focused on finding primes because they are interested in solving the problem not in a score. 2) You assume that pr_prob.exe correctly evaluates the probability of finding primes. That is a terrible assumption. pr_prob.exe assumes the numbers being tested are random. I'm not an expert, but I would have to believe that the Proth weight or Nash weight of a specific form must be included any any such equation before it would have merit for these projects. There are others in this forum (Phil Carmody for one) who are far more qualified to calculate the probability of a number (given a specific form) being prime than myself. |
[QUOTE=rogue;94751]I have two issues (actually more than two, but I'll point out two here).
1) You are focused on scoring more than finding primes. Although some people care about scoring, I believe that most are focused on finding primes because they are interested in solving the problem not in a score.[/quote] When they created the scoring method, they meant for it to make a test that takes twice as long be worth twice as much, so the score is a good indicator, within maybe +/-10% of the time taken. It would be like taking the time taken and multiplying it by some constant. The proportions stay the same, like a triangle with sides 3 in./4 in./5 in. and one 4.5 in./6 in./7.5 in. [quote]2) You assume that pr_prob.exe correctly evaluates the probability of finding primes. That is a terrible assumption. pr_prob.exe assumes the numbers being tested are random. I'm not an expert, but I would have to believe that the Proth weight or Nash weight of a specific form must be included any any such equation before it would have merit for these projects. There are others in this forum (Phil Carmody for one) who are far more qualified to calculate the probability of a number (given a specific form) being prime than myself.[/QUOTE] that's a good point. I'm not sure how to include the weights of all the values. |
[QUOTE=jasong;94793]When they created the scoring method, they meant for it to make a test that takes twice as long be worth twice as much, so the score is a good indicator, within maybe +/-10% of the time taken. It would be like taking the time taken and multiplying it by some constant. The proportions stay the same, like a triangle with sides 3 in./4 in./5 in. and one 4.5 in./6 in./7.5 in.[/QUOTE]
I don't know how they do their scoring, but I know it isn't an exact science. They could easily skew the scoring to give members more credit for a PRP test than for finding a factor. This would put AMD 64 users at a disadvantage because the AMD 64 is better at sieving than PRP testing (compared to a P4). I suspect that finding a factor of a 2e6 bit number is not worth as much in scoring as performing a PRP test on that same number. I suspect that the "score" for a factor is based upon the size of the factor, not the size of the number it is a factor of. Again, this puts factorers at a disadvantage. |
I hope that I have made these points clear:
[LIST=1][*]The objectivity of scoring is limited. Those who coordinate a project choose how they want to assign points for a PRP test vs. finding a factor. From what I have seen sieving is not worth as much as PRP testing.[*]If you really care about scoring, find out exactly how points are awarded and gear your participation on sub-projects to that scale. As PRP testing appears to give higher scores, focus on that.[*]If you only care about scores and not about solving the problem, it will take LONGER to solve the problem.[*]pr_prob cannot be used to determine the probability that a number in the form k*b^n+/-1 is prime. I don't know if I could write a formal proof for this statement for all k/b/n, but I can easily prove it for some k/b/n.[/LIST] Here is one last point WRT scoring that is missed by many. If you have an AMD 64, the ratio of finding factors to PRP testing will be much different than on an Intel CPU. As I don't have an AMD 64 at my disposal and as I don't care about my "score" on this project, I don't know what that ratio is. So if there is any misunderstanding, it is WRT scoring and not WRT a faster proof to the Sierpinski or Riesel problems in any base. |
| All times are UTC. The time now is 09:31. |
Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.