#45

May 2007
Kansas; USA
101×103 Posts
Quote:
The answer to your question is a clear no in all circumstances. Even if you have 20 cores testing and 1 core sieving, it's still more efficient to sieve to the true optimum depth all of the time (and possibly use a percentage of your cores on others' sieved files; see below). Why? Because sieving is only 5-10% of total test effort. If, as you said, you are only sieving everything to P=500G before testing to high limits, you're wasting quite a bit of testing time.

In other words, if you wanted to sieve and test all of the time, you should at all times use 1-2 cores sieving while 20 cores are testing, which means you'd spend 5-10% of your total resources sieving. Many people make the mistake of under-sieving because they are in a hurry to find primes.

All of that said, even if sieving were 50% of total test effort, you'd still be better off using that one sieving core all of the time to sieve to the optimum depth and ONE other core to test all of the time on your own sieved files (i.e. 50% of total CPU effort), leaving the remaining 18 cores to test what others have already sieved.

That is why we have many pre-sieved efforts here. Many people don't have good sievers, or they just don't like to sieve, while others like Lennart and Mathew like to sieve a lot. I'm kind of in the middle: I like to sieve for a while on 1-3 quads (I have 6 decent sieving quads), but it gets old after a period of time. I hope this puts it in logical perspective for you.

Edit: For any team drive such as this, we always want to minimize total CPU time, because there are very many varied interests here and there is always someone to do what others don't want to do.

Last fiddled with by gd_barnes on 2011-07-18 at 06:07 Reason: edit
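The 5-10% figure above can be sanity-checked in a couple of lines. This is just the arithmetic from the post's own example (20 cores testing, 1-2 sieving); none of it is measured on a real project.

```python
# A minimal check of the resource split described above: if sieving
# should be 5-10% of total effort, then 1-2 sieving cores alongside
# 20 testing cores is the right ballpark. Numbers are from the post,
# not measurements.
testing_cores = 20
for sieving_cores in (1, 2):
    total = testing_cores + sieving_cores
    share = sieving_cores / total
    print(f"{sieving_cores} sieving core(s): {share:.0%} of total effort")
```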
#46

May 2008
Wilmington, DE
2²×23×31 Posts
You guys are never going to convince me that sieving to 5T for testing to 200K is saving me time.
I've done plenty of testing on the 1kers to 200K. It takes me 1 day to sieve a base to 500B using 1 core. On an average-weight base I get about 4000 tests to run. It would take me another 9 days to get it to 5T. My average test time for 100K to 200K is 1.6 hours. That's 180 tests per day for 12 cores, so 22 days to test the entire 4000 tests.

By going to 5T on the sieve, I eliminated another 360 tests from the 4000. It would take me 2 days to test those 360; in other words, I spent 9 more days eliminating something I could have tested in 2 days. Where's the savings? Sure, if I had waited the 10 days for the sieve to finish, it would have taken me only 20 days to fully test the 3640 tests, instead of 22 days. Assuming I use the 13 cores serially (sieve a base, then test a base), it takes me 23 days to do it my way and 30 days to do it your way. Again, what am I missing here?

I believe the problem is that the sieving programs are skewing the reported time it takes to eliminate a test. Thousands of tests are eliminated up front and very few on the back end. What we really need to know is how long it takes to eliminate a test on the back end only. I've seen sieves report that the average elimination time per test is 10 minutes, when the program hasn't eliminated anything in over an hour or two.

Last fiddled with by MyDogBuster on 2011-07-18 at 08:29
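The serial wall-clock numbers in this post can be reproduced directly. Note that this is the view the later replies push back on: it counts elapsed days on one machine, not total CPU-hours.

```python
# Reproducing the wall-clock arithmetic above: 12 testing cores plus
# 1 sieving core, used serially (sieve a base, then test it).
tests_per_day = 12 * 24 / 1.6             # 1.6 h per test -> 180 tests/day

shallow = 1 + 4000 / tests_per_day        # 1 day sieving to 500B, 4000 tests
deep = 10 + (4000 - 360) / tests_per_day  # 10 days sieving to 5T, 3640 tests
print(f"shallow sieve: {shallow:.0f} days, deep sieve: {deep:.0f} days")
```

This prints 23 days versus 30 days, matching the post.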
#47

"Lennart"
Jun 2007
2⁵×5×7 Posts
I really hope the sieves don't end at n=200K!

That's a waste of time; the sieve should go to at least n=1M. Can you give me 6 bases so I can start sieving? I don't know which bases have a sieve file or not.

Lennart

Last fiddled with by Lennart on 2011-07-18 at 10:56
#48

"Lennart"
Jun 2007
10001100000₂ Posts
I have started sieving:

r67, r70, r103, r133 for n=100K-1M.

Lennart
#49

"Lennart"
Jun 2007
2⁵·5·7 Posts
I have started sieving r158 & r162:

r158 for n=100K-1M
r162 for n=50K-1M

Lennart

Last fiddled with by Lennart on 2011-07-18 at 13:05
#50

"Mark"
Apr 2003
Between here and the
2⁴×397 Posts
Quote:
As to what you are missing: you are sieving on one core while PRP testing on multiple cores. If you were to split the sieving across all of your cores (a range of 5e11 per core), it would take you one day to sieve to 6.5e12 and eliminate two days of PRP testing. In other words, it would take you 21 days with this method instead of 23 days to complete that range.

The easier thing to do (rather than switching all of your cores between PRPing and sieving) is to dedicate one core to sieving and sieve your next range to optimal depth while the others are PRPing your current range.

What I do (and I presume others do something similar), since I also have multiple cores (and am using PRPNet), is stop PRP testing on one core a few weeks before the range is done and begin sieving my next reservation on it. Once sieving completes, I load the file into a server, then switch that core back to PRP testing. Note that I do this with two PRPNet servers running and use a 50:50 split between the two in the clients. It takes slightly longer to complete the older reservation, but with single-k conjectures I want to avoid the possibility of idle clients if a prime were to be found quickly.

Last fiddled with by rogue on 2011-07-18 at 13:37
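The parallel-sieving arithmetic in the first paragraph works out as follows; the per-core sieve rate of 5e11 per day and the 4000/360/180 figures are taken from the earlier post, not measured here.

```python
# Splitting the sieve across all 13 cores instead of one: each core
# covers 5e11 of P-range per day (the rate quoted earlier), so the
# deeper limit is reached in one day instead of ten.
cores = 13
range_per_core_per_day = 5e11
depth_per_day = cores * range_per_core_per_day
print(f"{depth_per_day:.1e}")  # 6.5e+12 of P-range sieved per day

# That one day of sieving removes the 360 candidates (~2 days of PRP
# work at 180 tests/day), so the job takes 1 + 20 = ~21 days rather
# than 1 + 22 = ~23.
print(1 + (4000 - 360) / 180)  # ~21.2 days
```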
#51

May 2008
Wilmington, DE
2²·23·31 Posts
Quote:
One other factor, especially on the CRUS stuff, is that once we find a prime, the rest of the tests are eliminated. I know we can't count on finding a prime, but primes are found, and significant numbers of tests are eliminated.
#52

"Mark"
Apr 2003
Between here and the
6352₁₀ Posts
Quote:
On a typical search, sieving until the removal rate (time per factor found) reaches about 2/3 of the time of the longest PRP test is typically optimal (without a lot of time spent looking at FFT sizes, etc.). For CRUS, if about 30% of the k are removed when going from n to 2n, then a removal rate of about 1/2 of the longest test is better.
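The stopping rule above can be sketched as a small helper. The function name and the example inputs are purely illustrative, not from any actual sieve program:

```python
# Sketch of the stopping rule described above: sieve until the time to
# remove one more candidate exceeds a fraction of the longest PRP test
# in the file -- ~2/3 for a typical fixed search, ~1/2 when many k are
# expected to drop out before reaching the top of the range (as on
# CRUS). Function name and inputs are hypothetical.

def sieve_stop_threshold(longest_prp_secs, many_k_drop_out=False):
    if many_k_drop_out:
        return longest_prp_secs / 2
    return longest_prp_secs * 2 / 3

# With a 1-hour longest test:
print(sieve_stop_threshold(3600))        # 2400.0 seconds per factor
print(sieve_stop_threshold(3600, True))  # 1800.0 seconds per factor
```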
#53

Jun 2005
lehigh.edu
2¹⁰ Posts
Quote:
Why do our Teslas give computation errors for PrimeGrid's ppsieve and gcwsieve? For example:

Code:
LD_LIBRARY_PATH=/usr/local.hide/cuda/lib64:$LD_LIBRARY_PATH ./tpsieve-cuda-x86_64-linux -p 100000e9 -P 100001e9 -k 3 -K 9999 -n 2M -N 3M -c60 -t 4
tpsieve version cuda-0.2.3b (testing)
Compiled Jun 25 2011 with GCC 4.1.2 20080704 (Red Hat 4.1.2-48)
nstart=2000000, nstep=34
nstep changed to 32
tpsieve initialized: 3 <= k <= 9999, 2000000 <= n < 3000000
Sieve started: 100000000000000 <= p < 100001000000000
Thread 0 starting
Thread 1 starting
Thread 2 starting
Thread 3 starting
Detected GPU 1: Tesla C2070
Detected compute capability: 2.0
Detected 14 multiprocessors.
Detected GPU 3: Tesla C2050
Detected compute capability: 2.0
Detected 14 multiprocessors.
Detected GPU 0: Tesla C2070
Detected compute capability: 2.0
Detected 14 multiprocessors.
Detected GPU 2: Tesla C2050
Detected compute capability: 2.0
Detected 14 multiprocessors.
Computation Error: no candidates found for p=100000762311649.
Thread 0 completed
Waiting for threads to exit
Thread 2 completed
Thread 3 completed
Thread 1 completed
Sieve complete: 100000000000000 <= p < 100001000000000
Found 0 factors
count=31019409,sum=0x284af85735fd771f
Elapsed time: 15.42 sec. (0.02 init + 15.40 sieve) at 64939957 p/sec.
Processor time: 5.10 sec. (0.02 init + 5.08 sieve) at 197051081 p/sec.
Average processor utilization: 1.03 (init), 0.33 (sieve)

> Does
> LD_LIBRARY_PATH=/usr/local.hide/cuda/lib64:$LD_LIBRARY_PATH ./tpsieve-cuda-x86_64-linux -p 100000e9 -P 100001e9 -k 3 -K 9999 -n 2M -N 3M -c60 -t 1 -d X
> work? X = 1, 2, 3 or 4 // one card of the 4(?) in the box...

No, doesn't look like it --- no error message, but no factors found either (much less 73). -Bruce*

Code:
-t 1 -d 2
tpsieve version cuda-0.2.3b (testing)
Compiled Jun 25 2011 with GCC 4.1.2 20080704 (Red Hat 4.1.2-48)
nstart=2000000, nstep=34
nstep changed to 32
tpsieve initialized: 3 <= k <= 9999, 2000000 <= n < 3000000
Sieve started: 100000000000000 <= p < 100001000000000
Thread 0 starting
Detected GPU 2: Tesla C2050
Detected compute capability: 2.0
Detected 14 multiprocessors.
Thread 0 completed
Waiting for threads to exit
Sieve complete: 100000000000000 <= p < 100001000000000
Found 0 factors
count=31019409,sum=0x284af85735fd771f
Elapsed time: 58.21 sec. (0.02 init + 58.19 sieve) at 17186248 p/sec.
Processor time: 4.33 sec. (0.02 init + 4.30 sieve) at 232341768 p/sec.
Average processor utilization: 1.08 (init), 0.07 (sieve)
---
LD_LIBRARY_PATH=/usr/local.hide/cuda/lib64:$LD_LIBRARY_PATH ./tpsieve-cuda-x86_64-linux -p 100000e9 -P 100001e9 -k 3 -K 9999 -n 2M -N 3M -c60 -t 1 -d 0
tpsieve version cuda-0.2.3b (testing)
Compiled Jun 25 2011 with GCC 4.1.2 20080704 (Red Hat 4.1.2-48)
nstart=2000000, nstep=34
nstep changed to 32
tpsieve initialized: 3 <= k <= 9999, 2000000 <= n < 3000000
Sieve started: 100000000000000 <= p < 100001000000000
Thread 0 starting
Detected GPU 0: Tesla C2070
Detected compute capability: 2.0
Detected 14 multiprocessors.
Thread 0 completed
Waiting for threads to exit
Sieve complete: 100000000000000 <= p < 100001000000000
Found 0 factors
count=31019409,sum=0x284af85735fd771f
Elapsed time: 58.25 sec. (0.02 init + 58.23 sieve) at 17175351 p/sec.
Processor time: 4.48 sec. (0.02 init + 4.46 sieve) at 224418059 p/sec.
Average processor utilization: 1.02 (init), 0.08 (sieve)
(And greetings after such a long time since the start of the Rogue-Garo tables of Cunningham curve counts! Sorry for the off-topic inquiry. -bdodson)
#54

"Mark"
Apr 2003
Between here and the
2⁴×397 Posts
Quote:
#55

May 2007
Kansas; USA
101×103 Posts
Quote:
You are not thinking of CPU time. Think of it like this: if your sieving program eliminates 360 tests in one day, it will have saved 20 computers 18 tests each. Yes, those 20 computers could have done those 360 tests 20 times as fast (i.e. 18 tests each), but in the time that it took them, they could have been working on something else.

Here is what you are ending up with:

1. Let's say you sieve from P=500G to 1T, you eliminate 100 tests, and it takes you one day to do that (24 CPU hours).
2. Since that is clearly not far enough for sieving n=100K-200K, I think it's fair to say that each test would take at least 30 CPU minutes.
3. 100 tests * 30 CPU minutes = 50 CPU hours.

So you have taken 50 CPU hours to test what could have been eliminated in 24 CPU hours of sieving. Those additional 26 CPU hours could have been testing something else in the meantime, perhaps an already-well-sieved base 6 file or something like that.

This is a real example of what is happening to you. Please trust me: sieving n=100K-200K for high bases only to P=500G wastes a lot of overall CPU time. Lennart and Curtis (at RPS) could quickly confirm this. Even if you have only 1 core sieving for every 20 cores testing, that's enough. You just have to be patient in getting started.

Edit: Mark also had an excellent example that looked at the inefficiency in a slightly different way.

Gary

Last fiddled with by gd_barnes on 2011-07-18 at 18:06
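The CPU-hour comparison in the numbered list above reduces to two multiplications, using only the figures from the post:

```python
# Reproducing the CPU-time comparison above: one core-day of sieving
# (P = 500G to 1T) eliminates 100 tests of at least 30 CPU-minutes each.
sieve_cpu_hours = 24
tests_eliminated = 100
cpu_minutes_per_test = 30

prp_cpu_hours = tests_eliminated * cpu_minutes_per_test / 60
print(prp_cpu_hours)                    # 50.0 CPU-hours of testing avoided
print(prp_cpu_hours - sieve_cpu_hours)  # 26.0 CPU-hours freed for other work
```

The same 24 CPU-hours of work eliminates candidates worth 50 CPU-hours of testing, which is the whole argument for sieving to the optimum depth.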