mersenneforum.org  

Old 2011-12-22, 15:56   #1
SaneMur
 
Jul 2011

117₁₀ Posts
Default Expected Time To Complete A Quest Function

Greetings fellow primers,

I had a thought recently (always a dangerous proposition!) about "ballparking" how long it might take to find a prime currently being sought as a function of a few variables.

1. Your percentage sieved (or, perhaps better, your percentage of candidates remaining).

2. The number of exponents your hardware is capable of examining per day.

3. The number of unsieved exponents you might have to examine before finding a prime.

Let r = the decimal form of the percentage of remaining candidates.
Let n = the number of exponents the hardware can examine per day.
Let u = the number of unsieved exponents you would need to examine before finding a prime.

Let d = the number of days you will have to search to find a prime.

d = u/(n/r) = ur/n, since testing n candidates per day sweeps through n/r raw exponents per day.

So if you sieved a range of 1,000,000 exponents to a depth of 93% (70,000 candidates remained), your hardware can examine 500 of these candidates per day, and you expect only 1 prime among your candidates, then

d = (1000000)(.07)/500 = 140 days to find your prime.

Of course, the value of n will decrease as the exponent increases; this is just an average value.
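For concreteness, here is the estimate as a minimal sketch (variable names match the definitions above; the 93%-sieve example is reproduced):

```python
def days_to_prime(u, r, n):
    """Estimated days to find a prime.

    u -- size of the unsieved exponent range you expect to have to cover
    r -- fraction of candidates remaining after sieving (e.g. 0.07)
    n -- candidates your hardware can test per day
    """
    return u * r / n

print(days_to_prime(1_000_000, 0.07, 500))  # 140.0
```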

Other than that, are there any "holes" in this speculation?

Last fiddled with by SaneMur on 2011-12-22 at 15:57
SaneMur is offline   Reply With Quote
Old 2011-12-22, 16:03   #2
SaneMur
 
Jul 2011

165₈ Posts
Default

I went through all of this to also see what the cost-benefit is for each additional percentage point of sieving.

In the example above, each further 1% of sieving saves 20 days of search time (10,000 fewer candidates at 500 per day). In my case, it would take about 32 days to sieve that next 1%, so it is not worth sieving further and the time is better spent testing the resulting set.

But, for each sieving scenario and depth, this will, of course, be different. It's a convenient way to see whether you should spend more time sieving or get searching.
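The break-even test above can be written down directly; this is a sketch with made-up parameter names, using the 32-day vs 20-day numbers from my example:

```python
def worth_sieving_more(candidates_removed, test_rate_per_day, sieve_days):
    """True if removing `candidates_removed` more candidates by extra sieving
    (which costs `sieve_days`) beats simply testing those candidates."""
    days_saved = candidates_removed / test_rate_per_day
    return sieve_days < days_saved

# Sieving out the next 1% of a 1,000,000-exponent range removes ~10,000
# candidates, saving 20 test-days, but the sieving itself would take 32 days:
print(worth_sieving_more(10_000, 500, 32))  # False
```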
SaneMur is offline   Reply With Quote
Old 2011-12-23, 03:47   #3
lavalamp
 
lavalamp's Avatar
 
Oct 2007
Manchester, UK

17×79 Posts
Default

The traditional wisdom is to only start testing when candidates are being removed from the sieve at a slower rate than you can test them.

If you want to work out when this might be, you can assume that the sieve speed remains constant (basically true once you get into the high billions or trillions) and that the rate of removal from the sieve is proportional to 1/log(sieve depth).

Here is a completely artificial example which assumes I am correct about the 1/log(sieve depth) thing. I may very well not be, so please call me on it if I'm not.

Suppose you're at a sieve depth of 1T, and you have 10 candidates being removed every second. The sieve is progressing at a rate of ~250 million / s and the smallest candidates you want to test take 0.125 seconds.

So you want to calculate, in this case, how long it will take for the sieve to drop to 8 candidates removed / second (0.125 seconds / candidate).

You have two equations here:
k / log(10^12) = 10
k / log(x) = 8

I'll save the algebra, but it's relatively easy to see that:
x = 10^15

So now you know the sieve target is 1Q, and you can work out that it will take 46.3 days to reach it.
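Under the 1/log(sieve depth) assumption, that calculation can be sketched as follows (numbers are the ones from the example above; the base of the log cancels out of the target, so log10 is used for convenience):

```python
import math

current_depth = 1e12          # current sieve depth, 1T
removal_rate = 10.0           # candidates removed per second now
target_rate = 8.0             # stop when removal drops to this (= 1/0.125 s)
sieve_speed = 250e6           # sieve progress in p per second

# Model: rate = k / log10(p), so k = rate * log10(current_depth)
k = removal_rate * math.log10(current_depth)
target_depth = 10 ** (k / target_rate)
days = (target_depth - current_depth) / sieve_speed / 86400

print(target_depth)   # 1e+15, i.e. a sieve target of 1Q
print(round(days))    # ~46 days of sieving to get there
```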

Obviously you would then have to calculate the time to run the primality tests on the remaining candidates which for most tests is a relatively simple matter.

Calculation of your variable u could be a trickier proposition though.
lavalamp is offline   Reply With Quote
Old 2011-12-23, 04:15   #4
axn
 
axn's Avatar
 
Jun 2003

3²·19·29 Posts
Default

Quote:
Originally Posted by lavalamp View Post
I may very well not be so please call me on it if I'm not.
Ok. Consider it done


Quote:
Originally Posted by lavalamp View Post
So you want to calculate, in this case, how long it will take for the sieve to drop to 8 candidates removed / second (0.125 seconds / candidate).

You have two equations here:
k / log(10^12) = 10
k / log(x) = 8

I'll save the algebra, but it's relatively easy to see that:
x = 10^15
If you had the same number of candidates left in the sieve file, then the correct calculation is as simple as 10^12 / (8/10) = 1.25*10^12. The reason: a given prime p divides about 1/p of the candidates, so the number of factors found per second is likewise proportional to 1/p (not 1/log(p)).

But the number of candidates left in the sieve file won't be the same either. It'll be about 1 - log(p1)/log(p2) = 0.8% smaller, which in this case is small enough to ignore -- but if the ratio p2:p1 is much larger, that will add a small but significant reduction in the number of factors found.
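Putting the corrected model in code, using the same numbers as above (rate proportional to 1/p, plus the small log(p1)/log(p2) adjustment for the shrinking candidate file):

```python
import math

p1 = 1e12                   # current sieve depth, 1T
rate1, rate2 = 10.0, 8.0    # current and target removal rates (per second)

# Removal rate scales as 1/p, so the target depth is:
p2 = p1 * (rate1 / rate2)
print(p2)  # 1250000000000.0 -- i.e. 1.25T, not 1Q

# Between p1 and p2 the candidate file itself shrinks by roughly
# 1 - log(p1)/log(p2), which here is small enough to ignore:
shrink = 1 - math.log(p1) / math.log(p2)
print(round(shrink * 100, 2), "% fewer candidates")  # 0.8 % fewer candidates
```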
axn is offline   Reply With Quote
Old 2011-12-23, 06:41   #5
lavalamp
 
lavalamp's Avatar
 
Oct 2007
Manchester, UK

1343₁₀ Posts
Default

Hm, is that true even though the sieve does not eliminate all composite potential factors?

Apologies if this is obvious but it's a bit late here and there are soft cats, so I'm finding it a bit hard to brain.
lavalamp is offline   Reply With Quote
Old 2011-12-23, 09:37   #6
axn
 
axn's Avatar
 
Jun 2003

3²×19×29 Posts
Default

Quote:
Originally Posted by lavalamp View Post
Hm, is that true even though the sieve does not eliminate all composite potential factors?
Yes, mostly. You can do a minor correction for that using log(p). See this thread for some more info.
axn is offline   Reply With Quote
Old 2011-12-23, 15:41   #7
SaneMur
 
Jul 2011

3²×13 Posts
Default

Quote:
Originally Posted by lavalamp View Post
The traditional wisdom is to only start testing when candidates are being removed from the sieve at a slower rate than you can test them.
Ah, yes, but I am also prime-hunting while the sieving is taking place; call it "live sieving" if you want.

Consider, in your example, how many candidates you could have tested in those 40+ days. That complicates things a bit.

My sieve ranges are arbitrarily large, so that whatever I am prime-hunting is only a small subset of the total. That way, I can "start looking", and when I see the siever has reached a certain depth, I stop both processes, and re-hunt on the better-sieved file(s). Then I resume sieving, etc.

But no matter how deeply sieved and how many primes I find, there comes a point where the sun would burn out before I find the next one.

I am looking for "reasonable lengths of time" to be returned by my function, so that I don't waste CPU months hunting what won't be found.
SaneMur is offline   Reply With Quote
Old 2011-12-23, 17:45   #8
lavalamp
 
lavalamp's Avatar
 
Oct 2007
Manchester, UK

53F₁₆ Posts
Default

Quote:
Originally Posted by SaneMur View Post
Ah, yes, but I am also prime-hunting while the sieving is taking place, call it "live sieving" if you want.

Consider, in your example, how many primes you could have tested in those 40+ days. That complicates things a bit.
Unless you have a machine that is better at testing than sieving, and another machine that is better at sieving than testing, then this approach is sub-optimal. Even if you do have two such machines, there's still only a narrow range in which it's better to have one sieving and one testing at the same time.

If you can remove candidates faster by sieving, then any cycles you spend on testing are slowing you down.

The only reason to run a few tests on numbers that aren't optimally sieved is to measure how long the tests take, so that you can sieve the rest of the candidates to the optimal depth.

axn, I will read through that thread you linked and meditate on the matter.
lavalamp is offline   Reply With Quote
Old 2011-12-23, 23:41   #9
SaneMur
 
Jul 2011

3²·13 Posts
Default

Quote:
Originally Posted by lavalamp View Post
Unless you have a machine that is better at testing than sieving, and another machine that is better at sieving than testing, then this approach is sub-optimal. Even if you do have two such machines, there's still only a narrow range in which it's better to have one sieving and one testing at the same time.
There is another scenario where live sieving makes sense.

I am using bases > 2, and exponent ranges up to 10,000,000.

Of course, there is no way for me to get through all of them in anything less than years, but I did this for a reason.

I sieve to some arbitrary depth, let's call it 1T.

No reason to carry on much further for the first 100,000 exponents in the list, I am sure we can both agree.

So, I split the prime testing up on 5 cores, and core 6 sieves while these others churn away.

I usually reach a sieve depth of 10T on one core by the time the other 5 cores finish. Now I have what is a "great baseline" to use for the parallel sieving operation.

I use all 6 cores of machine #2 to do the prime testing for 100,000 < n < 500,000. And, again, recall my exponent base is >2, so the tests take longer than for base 2.

On machine #1 I engage 5 cores for parallel sieving and one core for the prime testing.

When this parallel sieving completes, I have sieved to a depth of 100T and searched all prime candidates up to 500,000. At that depth, more than 48 hours on a single core would usually be needed to remove even one further candidate, so there is no real need to sieve any longer.

At this point I have 12 cores at 5.0 GHz work on all of the parceled exponents n > 500,000 until "the sun burns out."

Had I waited to reach 100T before doing any prime searching, I would have made far worse use of the time.
SaneMur is offline   Reply With Quote
Old 2011-12-24, 20:56   #10
Christenson
 
Christenson's Avatar
 
Dec 2010
Monticello

5×359 Posts
Default

At the risk of getting RDS to yell at me, can you supply me with an example of the problem you are working on?

And, by the way, we *do* have machines that are better at TF than at sieving...they are called GPUs.
Christenson is offline   Reply With Quote
Old 2011-12-24, 22:14   #11
Mini-Geek
Account Deleted
 
Mini-Geek's Avatar
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17×251 Posts
Default

Quote:
Originally Posted by Christenson View Post
And, by the way, we *do* have machines that are better at TF than at sieving...they are called GPUs.
TF and sieving are (roughly) the same. I assume you meant "better at TF/sieving than testing". This is true, but the software to efficiently use GPUs does not currently exist for as wide a range of sieving problems as for CPUs. E.g. CPUs are still the best (AFAIK) for sieving CRUS work.
Mini-Geek is offline   Reply With Quote
