mersenneforum.org Expected Time To Complete A Quest Function

2011-12-22, 15:56 #1
SaneMur
Jul 2011
117₁₀ Posts

Greetings, fellow primers. I had a thought recently (always a dangerous proposition!) about "ballparking" how long it might take to find a prime currently being sought, as a function of a few variables:

1. Your percentage sieved (or, perhaps better, your percentage of candidates remaining).
2. The number of exponents your hardware is capable of examining per day.
3. The number of unsieved exponents you might have to examine before finding a prime.

Let r = the fraction (decimal form) of candidates remaining.
Let n = the number of exponents the hardware can examine per day.
Let u = the number of unsieved exponents you would need to examine before finding a prime.
Let d = the number of days you will have to search to find a prime.

d = u/(n/r) = ur/n

So if you sieved a range of 1,000,000 exponents to a depth of 93% (70,000 candidates remained), your hardware can examine 500 of these exponents per day, and the likelihood is that only 1 prime exists among your candidates, then d = (1,000,000)(0.07)/500 = 140 days to find your prime.

Of course, the value of n will decrease as the exponent increases; this is just an average value. Other than that, are there any "holes" in this speculation?

Last fiddled with by SaneMur on 2011-12-22 at 15:57
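As a quick sanity check, the d = ur/n estimate can be written as a one-liner (a sketch; the function name is mine, not anything standard):

```python
def days_to_prime(u, r, n):
    """Expected days of testing: u exponents in the range, r the fraction
    of candidates surviving the sieve, n candidates testable per day."""
    return u * r / n

# The worked example from the post: 1,000,000 exponents, 7% remaining,
# 500 tests per day.
print(days_to_prime(1_000_000, 0.07, 500))  # ≈ 140 days
```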
2011-12-22, 16:03 #2
SaneMur
Jul 2011
165₈ Posts

I went through all of this to also see what the cost-benefit is for each additional percentage point of sieving. In the example above, each reduction of 1% saves 20 days of search time. In my case, it will take about 32 days to sieve the next 1%, so it is not worth sieving further and time is better spent working with the resulting set. But for each sieving scenario and depth this will, of course, be different. It's a convenient way to see whether you should spend more time sieving or get searching.
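To put numbers on that sieve-more vs. start-testing trade-off, a minimal sketch under the same d = ur/n model (helper name is mine):

```python
def days_saved_per_point(u, n):
    """Testing days saved by sieving away one more percentage point of a
    u-exponent range, when n candidates can be tested per day."""
    return u * 0.01 / n

# 20 days of testing saved vs. ~32 days to sieve the next 1%:
# better to stop sieving and start testing.
print(days_saved_per_point(1_000_000, 500))  # ≈ 20 days
```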
2011-12-23, 03:47 #3
lavalamp
Oct 2007
Manchester, UK
17×79 Posts

The traditional wisdom is to only start testing when candidates are being removed from the sieve at a slower rate than you can test them.

If you want to work out when this might be, you can assume that the sieve speed remains constant (basically true when you get into the high billions or trillions), and that the rate of removal from the sieve is proportional to 1/log(sieve depth).

Here is a completely artificial example which assumes I am correct about the 1/log(sieve depth) thing. I may very well not be, so please call me on it if I'm not.

Suppose you're at a sieve depth of 1T, and you have 10 candidates being removed every second. The sieve is progressing at a rate of ~250 million/s, and the smallest candidates you want to test take 0.125 seconds each. So you want to calculate, in this case, how long it will take for the removal rate to drop to 8 candidates per second (0.125 seconds/candidate).

You have two equations here:

k / log(10^12) = 10
k / log(x) = 8

I'll save the algebra, but it's relatively easy to see that:

x = 10^15

So now you know the sieve target is 1Q, and you can work out that it will take 46.3 days to reach it. Obviously you would then have to calculate the time to run the primality tests on the remaining candidates, which for most tests is a relatively simple matter. Calculation of your variable u could be a trickier proposition, though.
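The calculation above can be reproduced numerically; a minimal sketch under lavalamp's 1/log assumption (all numbers from the example, variable names are mine):

```python
import math

p1 = 1e12            # current sieve depth (1T)
removal_rate = 10.0  # candidates removed per second at depth p1
test_rate = 8.0      # tests per second (0.125 s per candidate)
sieve_speed = 2.5e8  # sieve progress in p per second

# Model: removal rate = k / log(p). Fit k at the current depth, then
# solve k / log(x) = test_rate for the target depth x.
k = removal_rate * math.log(p1)
x = math.exp(k / test_rate)            # ≈ 1e15, i.e. 1Q
days = (x - p1) / sieve_speed / 86400  # ≈ 46 days to reach it
print(x, days)
```

Note the choice of logarithm base cancels out, since only the ratio of the two rates matters.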
2011-12-23, 04:15   #4
axn

Jun 2003

3²·19·29 Posts

Quote:
 Originally Posted by lavalamp I may very well not be so please call me on it if I'm not.
Ok. Consider it done.

Quote:
 Originally Posted by lavalamp So you want to calculate, in this case, how long it will take for the sieve to drop to 8 candidates removed / second (0.125 seconds / candidate). You have two equations here: k / log(10^12) = 10 k / log(x) = 8 I'll save the algebra, but it's relatively easy to see that: x = 10^15
If you had the same number of candidates left in the sieve file, then the correct calculation is as simple as 10^12 / (8/10) = 1.25×10^12. The reason: a given potential factor p divides roughly 1/p of the candidates, so the number of factors found per second is likewise proportional to 1/p (not 1/log(p)).

But the number of candidates left in the sieve file won't be the same either. It'll be about 1 - log(p1)/log(p2) = 0.8% less (which in this case is small enough that we can ignore it, but if the ratio p2:p1 is much larger, then that'll add a small but significant reduction in the number of factors found).
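axn's version of the calculation, under the 1/p model, can be checked numerically; a minimal sketch (same numbers as above, variable names are mine):

```python
import math

p1 = 1e12            # current sieve depth
removal_rate = 10.0  # factors found per second at depth p1
test_rate = 8.0      # tests per second

# Under the 1/p model the target depth is just a ratio of the rates:
p2 = p1 * removal_rate / test_rate  # 1.25e12, not 1e15
# Second-order effect: the sieve file shrinks along the way, by about
# 1 - log(p1)/log(p2) of its candidates.
shrink = 1 - math.log(p1) / math.log(p2)
print(p2, shrink)  # shrink ≈ 0.008, i.e. ~0.8%
```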

2011-12-23, 06:41 #5
lavalamp
Oct 2007
Manchester, UK
1343₁₀ Posts

Hm, is that true even though the sieve does not eliminate all composite potential factors? Apologies if this is obvious, but it's a bit late here and there are soft cats, so I'm finding it a bit hard to brain.
2011-12-23, 09:37   #6
axn

Jun 2003

3²×19×29 Posts

Quote:
 Originally Posted by lavalamp Hm, is that true even though the sieve does not eliminate all composite potential factors?
Yes, mostly. You can do a minor correction for that using log(p). You can see this thread for some more info.

2011-12-23, 15:41   #7
SaneMur

Jul 2011

3²×13 Posts

Quote:
 Originally Posted by lavalamp The traditional wisdom is to only start testing when candidates are being removed from the sieve at a slower rate than you can test them.
Ah, yes, but I am also prime-hunting while the sieving is taking place, call it "live sieving" if you want.

Consider, in your example, how many primes you could have tested in those 40+ days. That complicates things a bit.

My sieve ranges are arbitrarily large, so that whatever I am prime-hunting is only a small subset of the total. That way, I can "start looking", and when I see the siever has reached a certain depth, I stop both processes, and re-hunt on the better-sieved file(s). Then I resume sieving, etc.

But no matter how deeply sieved and how many primes I find, there comes a point where the sun would burn out before I find the next one.

I am looking for "reasonable lengths of time" to be returned by my function, so that I don't waste CPU months hunting what won't be found.

2011-12-23, 17:45   #8
lavalamp

Oct 2007
Manchester, UK

53F₁₆ Posts

Quote:
 Originally Posted by SaneMur Ah, yes, but I am also prime-hunting while the sieving is taking place, call it "live sieving" if you want. Consider, in your example, how many primes you could have tested in those 40+ days. That complicates things a bit.
Unless you have a machine that is better at testing than sieving, and another machine that is better at sieving than testing, this approach is sub-optimal. Even if you do have two such machines, there's still only a narrow range in which it's better to have one sieving and one testing at the same time.

If you can remove candidates faster by sieving, then any cycles you spend on testing are slowing you down.

The only reason to run a few tests on numbers that aren't optimally sieved is so that you can figure out how long the tests take so you can optimally sieve the rest of the candidates.

2011-12-23, 23:41   #9
SaneMur

Jul 2011

3²·13 Posts

Quote:
 Originally Posted by lavalamp Unless you have a machine that is better at testing than sieving, and another machine that is better at sieving than testing, then this approach is sub-optimal. Even if you do have two such machines, there's still only a narrow range in which it's better to have one sieving and one testing at the same time.
There is another scenario where live sieving makes sense.

I am using bases > 2, and exponent ranges up to 10,000,000.

Of course, there is no way for me to get through all of them in time periods less than years, but I did this for a reason.

I sieve to some arbitrary depth, let's call it 1T.

No reason to carry on much further for the first 100,000 exponents in the list, I am sure we can both agree.

So, I split the prime testing up on 5 cores, and core 6 sieves while these others churn away.

I usually reach a sieve depth of 10T on one core by the time the other 5 cores finish. Now I have what is a "great baseline" to use for the parallel sieving operation.

I use all 6 cores of machine #2 to do the prime testing for 100,000 < n < 500,000. And, again, recall my exponent base is >2, so the tests take longer than for base 2.

On machine #1 I engage 5 cores for parallel sieving and one core for the prime testing.

When this parallel sieving completes, I have sieved to a depth of 100T and searched all prime candidates up to 500,000. At that depth the sieve usually needs more than 48 hours on a single core to eliminate even one more exponent, so there is no real need to sieve any longer.

At this point I have 12 cores at 5.0 GHz work on all of the parceled exponents n > 500,000 until "the sun burns out."

Had I waited to reach 100T from the start and done no prime searching during that time, I would not have made the best use of my time.

2011-12-24, 20:56 #10
Christenson
Dec 2010
Monticello
5×359 Posts

At the risk of getting RDS to yell at me, can you supply me with an example of the problem you are working on? And, by the way, we *do* have machines that are better at TF than at sieving... they are called GPUs.
2011-12-24, 22:14   #11
Mini-Geek
Account Deleted

"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17×251 Posts

Quote:
 Originally Posted by Christenson And, by the way, we *do* have machines that are better at TF than at sieving...they are called GPUs.
TF and sieving are (roughly) the same. I assume you meant "better at TF/sieving than testing". This is true, but the software to efficiently use GPUs does not currently exist for as wide a range of sieving problems as for CPUs. E.g. CPUs are still the best (AFAIK) for sieving CRUS work.

