Good idea, Mark, about having someone repeat my test of 3 years ago. Hardware and software have changed quite a bit in that time. Here is my thought:
Run a k=5M range, say P=30G-30.005G, to n=25K with no sieving, using the latest version of PFGW on a modern machine. There should be ~20-30 k's remaining. Run the range once with each of the following parameters:
1. -f0
2. -f10
3. -f30
4. -f50
5. -f100
My suggestion is to run -f30 first and get a baseline total timing. Then start running the others. I suggest running the "outliers" of -f0 and -f100 next. I think you'll fairly quickly see that -f0 takes way too long on k's that are remaining or have a prime at n>10K. You'll also likely see that -f100 takes too long for the large majority of the range but saves some time on the k's that are remaining or have a prime at n>10K (the reverse of -f0). Finally, test -f10 and -f50. Those should be close to -f30. Perhaps one of them will be faster with today's hardware/software.
After doing that, I suggest running a test doing what KEP has suggested. That is, run the range using -f0 to n=2500. Then sieve for n=2500-10K, test, sieve for n=10K-25K, and test again. See if that is faster than the fastest test above. Alternatively, the test with the sieving involved could be done before any of the non-sieving tests. Seeing the timings would be interesting. |
[QUOTE=gd_barnes;405266]Good idea Mark about having someone repeat my test of 3 years ago. Hardware and software has changed quite a bit in that time. Here is my thought:
Run a k=5M range, say P=30G-30.005G, to n=25K with no sieving using the latest version of PFGW on a modern machine. There should be ~20-30 k's remaining. Run the range once each with the following parameters: 1. -f0 2. -f10 3. -f30 4. -f50 5. -f100 My suggestion is to run -f30 first and get a baseline total timing. Then start running the others. I suggest running the "outliers" of -f0 and -f100 next. I think you'll fairly quickly see that -f0 is taking way too long on k's that are remaining or have a prime n>10K. You'll also likely see that -f100 takes too long for the large majority of the range but it saves some time on the k's remaining at primes n>10K (reverse of -f0). Finally test -f10 and -f50. Those should be close to -f30. Perhaps one of them will be faster with today's hardware/software. After doing that, I suggest running a test doing what KEP has suggested. That is run the range using -f0 to n=2500. Then sieve for n=2500-10K, test, sieve n=10K-25K, and then test. See if that is faster than the fastest test above. Alternatively the test with the sieving involved could be done before any of the non-sieving tests. Seeing the timings would be interesting.[/QUOTE]
I agree. I will run these four tests on the same laptop on which I conducted the other tests (range of 50M k):
-f10 to n=25000
-f30 to n=25000
-f0 to n=2500
-f0 to n=5000
Obviously, for the last two I will need to sieve and then continue to n=25000. I decided against running -f0 to n=25000 because I believe that the k remaining at n=25000 will have a significantly negative impact on run time. |
[QUOTE=rogue;405271]I agree. I will run these four tests on the same laptop which I conducted the other tests on (range of 50M k):
-f10 to n=25000 -f30 to n=25000 -f0 to n=2500 -f0 to n=5000 Obviously on the last two I will need to sieve and continue to n=25000. I decided against running -f0 to n=25000 because I believe that remaining k at n=25000 will have a significantly negative impact on run time.[/QUOTE] You might find that testing 50M k takes a little longer than you'd like for so many different tests. Likely there would be ~200-300 k remaining. You might want to consider doing 5M or 10M k. |
[QUOTE=gd_barnes;405285]You might find that testing 50M k takes a little longer than you'd like for so many different tests. Likely there would be ~200-300 k remaining. You might want to consider doing 5M or 10M k.[/QUOTE]
Already started. Going to 50M reduces the impact that smaller ranges would have on the results. I will have results in less than three weeks. I will provide results as each completes as they won't complete on the same day. I would think that the first one to complete will be the best choice. |
[QUOTE=rogue;405300]Already started. Going to 50M reduces the impact that smaller ranges would have on the results. I will have results in less than three weeks. I will provide results as each completes as they won't complete on the same day. I would think that the first one to complete will be the best choice.[/QUOTE]
Due to my inability to correctly count zeros, I only tested 5M k, not 50M. The two -f0 runs are done and I'm sieving those now. The -f10 and -f30 runs are only about 40% and 35% done, respectively. |
[QUOTE=gd_barnes;405258]Kenneth, I know I seem a little short with you on this but we've had issues in the past with you making huge reservations and then losing interest, which leaves a big mess for me to sort out. The Riesel base 3 attack project that was abandoned after 1-2 months was the biggest of them all. Why do you feel this need to make HUGE reservations? For that reason, I am unwilling to accept huge reservations from you. I do not care how you test your reservations. If sieving is faster for you, then go right ahead. I only ask that the reservations be reasonable sized. Why is it such a problem to reserve a 4G range, complete it, and reserve another 4G range, etc.[/QUOTE]
I know that interest has been an issue in the past, but the problem with the Riesel base 3 attack was that we did not have the starting bases script back then. The only thing wrong with the work I did was that too many k's remained (for reasons still unknown), but nothing was missed.
Anyway, I do understand your shortness, and as you may have seen in my private message, I'm not doing this to bother you. I therefore accept, initially, completing one 4G range at a time, as long as it does not mean too much idle time. When August 5th comes I'll reserve the lowest available 4G range, and then time will tell whether that works well or whether it gives too much idle time when ranges complete while I'm away at work.
So to be clear, I have read and fully understand your reasoning, and I do understand that I have a long history of wavering between different kinds of work, but my main reason for reserving such a large range was to keep idle computation time to a minimum. I may also add that I fully respect your admin rights and that you may have the toughest job of all of us working for and with CRUS, so for now I'm going to do sets of 4G. I hope I answered it all.
Take care.
PS. Could anyone tell me what kind of equipment can complete a 1G range using -f30 to n=25K in only 8-10 CPU weeks? My Haswell alone showed signs of needing 202 CPU days to test a range of 1G to n=2500, so it puzzles me a bit how you can do it so fast. |
[QUOTE=rogue;405262]KEP, I think that Gary's request is reasonable, unless the computers you are running on are not accessed regularly.[/QUOTE]
My computer is accessed daily and I accept (at least for now :wink:) Gary's request. However, I'm going to hold off on making an official reservation until I see what your timings show. If there is a -fXX setting that proves to be just as efficient on an FMA and an AVX machine as I believe -f0 and sieving is, then it is obviously a better choice :smile: Looking forward to seeing what you come back with :smile: Take care everyone. |
I have my first results. With -f0 to n=5000, it took 18 hours 28 minutes to complete that range of 5M k to n=25000. This is broken down as:
[FONT="Courier New"][SIZE="2"]
script:         15:04
sieve:           0:07
PRP test:        3:15
primality test:  0:02
[/SIZE][/FONT]
There were 161 k remaining when the script finished. Sieving to 1e8 resulted in about 113,000 PRP tests. The removal rate was about 800 per minute at 1e8, while PRP testing ran at about 600 tests per minute after sieving. I could probably have sieved to about 2e8 and shaved a few minutes off the time. With this setup it would take about 154 days for a single core to do a range of 1G.
I tried to use cllr with the StopOnPrimedK switch, but it crashes. It might be faster than pfgw with the number_primes setting, but I can't verify it. I sent the details to Jean.
After 20 hours and 24 minutes, -f10 to n=25000 is about 48% done with the range and -f30 to n=25000 is about 43% done. I have terminated both of those runs. Those are clearly at best half as fast as -f0 to n=5000.
You're probably wondering why I have not published results for -f0 to n=2500. It is still running and is not complete yet. The reason is that the script finished hours ago and that core remained idle for that time. Sieving and PRP testing started at about the same time I started sieving the run above. I still need to complete PRP testing of the range. I expect the results to be slightly better than the first. I will publish those later. |
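For anyone who wants to redo the extrapolation with their own timings, here is a minimal sketch of the arithmetic behind the figures above. It assumes runtime scales linearly with the size of the k range, which is the same assumption the 1G estimate makes. The function names are made up for illustration.

```python
def hhmm_to_minutes(text):
    """Convert an 'H:MM' duration such as '15:04' to minutes."""
    hours, minutes = text.split(":")
    return int(hours) * 60 + int(minutes)


def extrapolate_days(breakdown, tested_k, target_k):
    """Sum the per-stage timings and scale them linearly to a larger k range.

    Returns (total minutes for the tested range, single-core days for target_k).
    Linear scaling is an assumption, not a measured result.
    """
    total_minutes = sum(hhmm_to_minutes(t) for t in breakdown.values())
    scale = target_k / tested_k
    return total_minutes, total_minutes * scale / (60 * 24)


# Timings posted above for -f0 to n=5000 on a range of 5M k.
breakdown = {
    "script": "15:04",
    "sieve": "0:07",
    "PRP test": "3:15",
    "primality test": "0:02",
}

total_minutes, days = extrapolate_days(breakdown, 5_000_000, 1_000_000_000)
print(total_minutes // 60, "h", total_minutes % 60, "min")  # 18 h 28 min
print(round(days), "single-core days for 1G")               # 154
```

Swapping in a different breakdown or range size gives the matching estimate for another setup.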
I have my second results. With -f0 to n=2500, it took 16 hours 42 minutes to complete that range of 5M k to n=25000. This is broken down as:
[FONT="Courier New"][SIZE="2"]
script:         12:01
sieve:           0:13
PRP test:        4:25
primality test:  0:03
[/SIZE][/FONT]
There were 325 k remaining when the script finished. Sieving to 1e8 resulted in about 280,000 PRP tests. The removal rate was about 1200 per minute at 1e8, while PRP testing ran at about 1050 tests per minute after sieving. With this setup it would take about 139 days for a single core to do a range of 1G. I suspect that only going to n=500 might shave another 10 to 15 days off of that.
I am going to run two more tests with -f0, one to n=500 and one to n=1000. My biggest concern is that the .pfgw file created by srsieve will be unwieldy AND that terminating pfgw while it is running that file would create a problem.
If Jean can find and fix the problem with llr, that should improve the speed even more. IMO a 4-core machine should be able to complete a range of 1G in a month.
Would someone mind writing a .bat (Windows) or .sh (*nix) file that runs an entire range without intervention? When that script completes, it should output a pl_remain.txt and a pl_prime.txt file for Gary. |
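The remark in the first results about sieving deeper follows a common rule of thumb: keep sieving while the sieve removes candidates faster than the PRP tester can test them. Below is a minimal sketch of that comparison using the rates quoted in the two result posts. The rule as coded is an assumption for illustration, not something either program enforces, and it compares an instantaneous removal rate with an average test rate.

```python
def keep_sieving(removed_per_minute, tested_per_minute):
    """Rule of thumb (an assumption here): sieving still pays off while it
    removes candidates faster than PRP testing would dispose of them."""
    return removed_per_minute > tested_per_minute


# (removal rate at 1e8, PRP test rate after sieving), both per minute,
# as quoted in the two result posts.
runs = {
    "-f0 to n=5000": (800, 600),
    "-f0 to n=2500": (1200, 1050),
}

for label, (removed, tested) in runs.items():
    verdict = "sieve deeper" if keep_sieving(removed, tested) else "start testing"
    print(f"{label}: {verdict}")
```

By this measure both runs could have been sieved somewhat past 1e8, which fits the estimate that going to about 2e8 would have saved a few minutes.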
I stopped the runs to n=500 and n=1000. After running more than 10% of the range, the runtime extrapolated to a time very close to the n=2500 runtime and that was just to run the script.
My recommendation based upon everything I have run is to use -f0 to n=2500 with 5 processes on a 4-core machine (200M per process), sieve, then test the resulting file. The only gotcha is that it requires one to change line 1 of the .pfgw file output from srsieve to include {number_primes,$a,1} if using pfgw. I think it would be possible to complete a range of 1G in just over a month on a 4-core machine. |
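The line 1 change can be scripted so it is not forgotten between sieving and testing. This is a minimal sketch, not a tested part of anyone's workflow. The sample header and the " // " separator before the modifier are assumptions about what the srsieve output looks like, so check them against the PFGW ABC file format documentation and your own file before relying on it.

```python
MODIFIER = "{number_primes,$a,1}"


def add_number_primes(abc_text):
    """Append the number_primes modifier to line 1 of an ABC-format file.

    The ' // ' separator is an assumption about the header syntax.
    Leaves the text unchanged if the modifier is already present.
    """
    lines = abc_text.split("\n")
    header = lines[0].rstrip()
    if MODIFIER not in header:
        lines[0] = header + " // " + MODIFIER
    return "\n".join(lines)


# Hypothetical srsieve output: a header line followed by "k n" candidate pairs.
sample = "ABC $a*3^$b-1\n1234 2501\n1234 2507\n"
print(add_number_primes(sample).split("\n")[0])
```

To apply it to a real file, read the file into a string, pass it through `add_number_primes`, and write the result back. Running it twice is harmless because the modifier is only added once.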
Highly interesting results again! Now for a range of 1G the number of remaining k's is much bigger. How does this affect srsieve speed? I seem to remember it got quite slow when the number of k's went up into the tens of thousands.
|