[QUOTE=rogue;405166]KEP, please don't be so hasty. Gary, if someone wants to take on a large amount of work, we need to know if they can complete it in a reasonable amount of time with their resources. He is right that boredom is a factor, but at the same time seeing the light at the end of the tunnel allows one to keep focused on a task until it is completed.
Between the two of you, I think you can work out how you can submit results on a large reservation in a way that will not burden Gary. IMO, we need some concrete numbers so that we know the actual level of effort that you or anyone else is signing up for on this base, which is partly why I posted what I did earlier today. I am still running some tests to determine the optimal settings. One result I can say is that -f30 is not optimal when searching to n=25000. It is 10% slower than -f10. I will have more complete results later.[/QUOTE] :smile: Thanks for your comments. However, for now and until completion of R3 8G-13G to n=100K, I'm not gonna do any work on starting R3, and if I really have to work (using the starting bases script) from n=1 to n=25K, then my current decision is final. It just doesn't seem reasonable to me that I should spend ~202 CPU days on my Haswell to complete the k's to n=2500 (and many more days running to n=25K) when I can complete the part of testing each k to n=2500 in just 53 CPU days per G range. Either way, though I have long dreamt about seeing this base completely started, the chance of me changing my decision is only a little different from 0, but we will know for sure around August 5th. I had, by the way, planned to run 23 ranges with 1G in each range; I know that will give big files, but nothing that can't be handled by Google Drive, so it shouldn't really have burdened Gary more than usual. As another side note, I know from my testing last night on my Haswell that April next year was very reasonable. :smile:
[QUOTE=KEP;405170]:smile:
Thanks for your comments. However, for now and until completion of R3 8G-13G to n=100K, I'm not gonna do any work on starting R3, and if I really have to work (using the starting bases script) from n=1 to n=25K, then my current decision is final. It just doesn't seem reasonable to me that I should spend ~202 CPU days on my Haswell to complete the k's to n=2500 (and many more days running to n=25K) when I can complete the part of testing each k to n=2500 in just 53 CPU days per G range. Either way, though I have long dreamt about seeing this base completely started, the chance of me changing my decision is only a little different from 0, but we will know for sure around August 5th. I had, by the way, planned to run 23 ranges with 1G in each range; I know that will give big files, but nothing that can't be handled by Google Drive, so it shouldn't really have burdened Gary more than usual. As another side note, I know from my testing last night on my Haswell that April next year was very reasonable. :smile:[/QUOTE] Kenneth, nobody really cares which way you do it as long as the results are correct. The main reason why I used the script to n=25k is time. Not computation time but my personal time. I manage a large number of cores and there's no way to set up a PRPnet, so minimizing my manual work is sometimes more important than getting the best performance. On large tests this is not an issue, but on the very small stuff it is. That's why I haven't done nearly as much work for R3 as I had planned to do. As Rogue said, you and Gary will find a way to handle this. If -f10 or even -f0 and only scripting to n=2500 is the best way, then by all means do it this way. I won't, but as I said, this is for reasons that may not apply to you. I'll do some work on R3 when n=25000 is reached :smile:
I have some numbers that might change how people want to tackle this project. For each of these tests I tested a range of 100,000 k at 3e10, which is near the middle of the conjecture. These times are not "to the second", so some numbers would have changed for the better and others for the worse had I done so.
To give a baseline of the hardware, I have a laptop with a 4-core i7-3740QM at 2.7 GHz with an SSD. Note that those are physical cores. I first varied the trial factoring depth. No k remain after running this range.
[FONT="Courier New"][size=2]
-f0:  n=25000, 30 minutes -> 208 days
-f10: n=25000, 49 minutes -> 340 days
-f20: n=25000, 51 minutes -> 354 days
-f30: n=25000, 54 minutes -> 375 days
-f40: n=25000, 57 minutes -> 396 days
[/size][/FONT]
It is obvious that -f0 is the best choice when testing to n=25000; I provide a reason for that below. I was curious to see how n impacts the results, so I then ran another set of tests varying n.
[FONT="Courier New"][size=2]
-f0: n=500,  15 minutes -> 104 days (59 k remaining)
-f0: n=2000, 17 minutes -> 118 days (13 k remaining) -> 29.5 days per 1G (using 4 cores)
-f0: n=2500, 18 minutes -> 125 days (11 k remaining) -> 31.3 days per 1G (using 4 cores)
-f0: n=3000, 19 minutes -> 132 days ( 8 k remaining)
-f0: n=3500, 19 minutes -> 132 days ( 5 k remaining)
-f0: n=4000, 20 minutes -> 139 days ( 4 k remaining)
-f0: n=5000, 21 minutes -> 145 days ( 3 k remaining)
[/size][/FONT]
Note that these counts are for a range of 1e5 k, so for a full 1G range the number of remaining k will be roughly 10,000 times these numbers. Going to n=3000, for example, leaves approximately 80,000 k that need to be sieved and tested. This means that if you only test to n=3000, you would need to sieve and test those 80,000 k in less than 76 days (208 - 132) in order for it to be a benefit. I did not do any tests to estimate how long that would take. Sieving would obviously be very fast, probably taking only a few days, but I don't know how long testing would take. I have to presume that it would take much less time than 76 days. Looking at the numbers, it seems to me that going to n=5000 might be a better option: for only 13 more days you eliminate almost 60% of the k.
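The break-even reasoning above can be checked with a couple of lines of arithmetic (a sketch using the single-core day figures quoted above; the 80,000-k count is just the 10,000x scaling of the 8 k remaining at n=3000):

```python
# Break-even budget for stopping the starting-bases script at n=3000
# instead of running it all the way to n=25000 (per-1G, single-core
# day figures taken from the timings above).
script_to_n25000_days = 208      # full script run, -f0, to n=25000
script_to_n3000_days = 132       # script run stopped at n=3000
k_remaining_per_1g = 8 * 10_000  # ~8 k per 1e5 of k, scaled to 1G

# Time left for sieving + testing the ~80,000 survivors from n=3000 up
# to n=25000; stopping early only pays off if that work fits in here.
budget_days = script_to_n25000_days - script_to_n3000_days
print(budget_days, k_remaining_per_1g)  # prints: 76 80000
```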
To gauge the impact of hyper-threading (enabled on this machine) and the SSD, I ran more copies of pfgw than the physical number of cores.
[FONT="Courier New"][size=2]
-f0: n=5000, 6 concurrent copies -> 28 minutes -> 32.4 days per 1G
-f0: n=5000, 8 concurrent copies -> 41 minutes -> 35.6 days per 1G
[/size][/FONT]
There is a significant benefit to running more copies than physical cores. These numbers are an extrapolation of the above:
[FONT="Courier New"][size=2]
-f0: n=25000, 6 concurrent copies -> 40 minutes -> 46.3 days per 1G
[/size][/FONT]
My laptop could theoretically do about 8G in a year. KEP would need to dedicate 5 4-core machines (at least as fast as mine) to this endeavor to complete a range of 20G to n=25000 in 6 months. You are probably wondering now why trial division (controlled by the -f switch) has such a significant impact. The problem is that for each number requiring trial division, pfgw does long division by each potential factor. It does not use a sieve for this process, and it doesn't use other shortcuts that would depend on the trial division logic knowing the form of the number, so it cannot be any faster. A custom sieve would have to be written. IIRC, fermfact works like this, but it won't work on these numbers. As the numbers get larger, trial division takes longer because there are so many more potential factors. One last thing for some to consider is redirecting the output of pfgw to nul. The screen I/O on some systems could impact the performance of pfgw. It didn't on Windows, which is where I ran these tests, but it could on others.
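The difference described here between per-number trial division and a purpose-built sieve can be sketched in a few lines (a toy illustration, not pfgw's or fermfact's actual code; the candidates are k*3^n-1 for k in a small range, and the prime list is arbitrary):

```python
def trial_division_survivors(ks, n, primes):
    # pfgw-style: divide every candidate k*3^n - 1 by every small prime
    # separately; the same divisions are repeated for each candidate.
    return [k for k in ks if all((k * 3**n - 1) % p for p in primes)]

def sieve_survivors(k_lo, k_hi, n, primes):
    # sieve-style: for each prime p, solve k*3^n = 1 (mod p) once, then
    # strike out the whole residue class of k with a simple stride.
    alive = [True] * (k_hi - k_lo)
    for p in primes:
        r = pow(3, n, p)
        if r == 0:          # p = 3 can never divide k*3^n - 1
            continue
        k0 = pow(r, -1, p)  # k = (3^n)^-1 (mod p); needs Python 3.8+
        start = k_lo + ((k0 - k_lo) % p)  # first k >= k_lo in the class
        for k in range(start, k_hi, p):
            alive[k - k_lo] = False
    return [k_lo + i for i, a in enumerate(alive) if a]
```

Both give the same survivors, but the sieve does one modular inverse per prime instead of one long division per prime per candidate, which is why it scales so much better over a 1G range of k.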
[QUOTE=rogue;405182]
To gauge impact of hyper-threading (enabled on this machine) and the SSD, I ran with more copies of pfgw than the physical number of cores.
[FONT=Courier New][SIZE=2]
-f0: n=5000, 6 concurrent copies -> 28 minutes -> 32.4 days per 1G
-f0: n=5000, 8 concurrent copies -> 41 minutes -> 35.6 days per 1G
[/SIZE][/FONT]
[/QUOTE] Is this stating that running 6 at once each will take 32.4 days giving 1G/5.4 days, but running 8 at once each will take 35.6 days giving 1G/4.45 days? If so, what about 10, 12, etc.?
[QUOTE=henryzz;405190]Is this stating that running 6 at once each will take 32.4 days giving 1G/5.4 days but running 8 at once each will take 35.6 days giving 1G/4.45 days?
If so what about 10, 12 etc.[/QUOTE] No. It means that running 6 instances simultaneously will search a range of 1G in 32.4 days. That would be 166.6M per core over that period of time. 8 instances take longer and would only do 125M per core.
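The per-core figures follow directly from splitting the 1G range evenly across the instances (a quick arithmetic check of the numbers quoted above):

```python
# 6 instances share one 1G range and finish it together in 32.4 days,
# so each instance covers 1G/6 of the k range in that time; likewise
# 8 instances cover 1G/8 each (but take 35.6 days overall).
range_size = 1_000_000_000
per_instance_6 = range_size // 6  # 166,666,666 -> the "166.6M per core"
per_instance_8 = range_size // 8  # 125,000,000 -> the "125M per core"
```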
[QUOTE=rogue;405182]
I first varied the trial factoring depth. No k remain after running this range.
[FONT=Courier New][SIZE=2]
-f0:  n=25000, 30 minutes -> 208 days
-f10: n=25000, 49 minutes -> 340 days
-f20: n=25000, 51 minutes -> 354 days
-f30: n=25000, 54 minutes -> 375 days
-f40: n=25000, 57 minutes -> 396 days
[/SIZE][/FONT]
[...] You are probably wondering now why trial division (controlled by the -f switch) has such a significant impact. The problem is that for each number requiring trial division, it will do long division for each potential factor. pfgw is not using a sieve for this process and it doesn't do other shortcuts which would be dependent upon the trial division logic knowing the form of the number, so it cannot be any faster. A custom sieve would have to be written. IIRC, fermfact works like this, but it won't work on these numbers. As numbers get larger trial division takes longer because there are so many more potential factors. [/QUOTE] This is very interesting. So far my understanding was that it is much faster to trial factor a number to a (small) limit than to just perform a PRP test, so if the percentage of candidates eliminated by trial factoring is big enough, it should be worth doing. Will this be valid for higher n? If not, I don't know what it's good for. Thanks for doing all these evaluations! I might be tempted to do some work on R3...
[QUOTE=Puzzle-Peter;405173]Kenneth,
nobody really cares which way you do it as long as the results are correct. The main reason why I used the script to n=25k is time. Not computation time but my personal time. I manage a large number of cores and there's no way to set up a PRPnet, so minimizing my manual work is sometimes more important than getting the best performance. On large tests this is not an issue, but on the very small stuff it is. That's why I haven't done nearly as much work for R3 as I had planned to do. As Rogue said, you and Gary will find a way to handle this. If -f10 or even -f0 and only scripting to n=2500 is the best way, then by all means do it this way. I won't, but as I said this is for reasons that may not apply to you. I'll do some work on R3 when n=25000 is reached :smile:[/QUOTE] Thanks for your feedback. I'm still not quite sure (even with the math Rogue came up with) whether or not I'll take up the previous effort. However, I know from my own calculations that my Haswell can, using -f0, test a 1G range in 53 CPU days to n=2500, and that leaves 149 CPU days to complete 75000-85000 k's to n=25K (and that n-range shouldn't take that long to complete). However, I have now put 4 cores on my R3 8G-9G reservation and I have no idea how long those 4 cores from my Sandy Bridge will be occupied, so even though February in the first detailed plan was doable, I'm not sure how the new timetable would look. What I know for sure is that I have about 5 weeks to agree with Gary on a plan that suits my efficiency desire and doesn't burden him; that should be possible within the next 5 weeks :smile:
[QUOTE=rogue;405182]My laptop could theoretically do about 8G in a year. KEP would need to dedicate 5 4-core machines to this endeavor (at least as fast as mine) to complete a range of 20G to n=25000 in 6 months.
[/QUOTE] Thanks for your numbers. The main reason that I don't want to test much higher than n=2500 using -f0 is that I want to use my CPU resources as efficiently as possible, and trial factoring isn't as efficient as sieving (at least not on any of my systems). I'm able to complete the range k=40G-kMax to n=2500 in ~3.3467 Haswell CPU years when using -f0. Now I think I'll send Gary a PM and see if we can come to an agreement, because I really do desire to run this range: k=40,000,000,002 to kMax. But only if I can come to an agreement with Gary on how to complete this range :smile: Take care!
[QUOTE=Puzzle-Peter;405229]This is very interesting.
So far my understanding was that it is much faster to trial factor a number to a (small) limit than to just perform a PRP test, so if the percentage of candidates eliminated by trial factoring is big enough, it should be worth doing. Will this be valid for higher n? If not, I don't know what it's good for. Thanks for doing all these evaluations! I might be tempted to do some work on R3...[/QUOTE] It depends upon the base. As the base gets larger, trial factoring becomes more important because PRP tests take longer. As no k survived my test range, I suspect that k's that would survive would skew the results. It suggests that you want to use -f0 only for n below a certain value and -fxx for n above that value. This lends credence to testing only to n=5000 with the script, then sieving and testing the remaining k outside of the script.
Mark, your timings and estimates are incorrect. They way underestimate the total time needed. Of course -f0 is faster with [COLOR=red]no k's remaining[/COLOR]. Try testing a range of 10M k with all of those various switches. If you test a range that has multiple k's remaining at n=25K, you will see differently. With no pre-sieving done, the biggest amount of time is taken by k's with no primes or with primes at n>10K. You cannot just set it to -f0 and do no factoring for larger tests at n>10K; it takes way too long. About 3 years ago I tested numerous ranges of k's for myself using -f0, -f10, -f30, -f50, and -f100 with many k's remaining: -f30 was the fastest, with -f10 and -f50 slightly slower. Peter concurred with this when testing several 1G ranges himself. I invite anyone else to do such a test now. Perhaps things have changed, but I doubt it very much.
Kenneth, I know I seem a little short with you on this, but we've had issues in the past with you making huge reservations and then losing interest, which leaves a big mess for me to sort out. The Riesel base 3 attack project that was abandoned after 1-2 months was the biggest of them all. Why do you feel this need to make HUGE reservations? For that reason, I am unwilling to accept huge reservations from you. I do not care how you test your reservations. If sieving is faster for you, then go right ahead. I only ask that the reservations be reasonably sized. Why is it such a problem to reserve a 4G range, complete it, reserve another 4G range, and so on?
[QUOTE=gd_barnes;405258]Mark, your timings and estimates are incorrect. They way underestimate the total time needed. Of course -f0 is faster with [COLOR=red]no k's remaining[/COLOR]. Try testing a range of 10M k with all of those various switches. If you test a range that has multiple k's remaining at n=25K, you will see differently. With no pre-sieving done, the biggest amount of time is taken by k's with no primes or with primes at n>10K. You cannot just set it to -f0 and do no factoring for larger tests at n>10K; it takes way too long. About 3 years ago I tested numerous ranges of k's for myself using -f0, -f10, -f30, -f50, and -f100 with many k's remaining: -f30 was the fastest, with -f10 and -f50 slightly slower. Peter concurred with this when testing several 1G ranges himself. I invite anyone else to do such a test now. Perhaps things have changed, but I doubt it very much.[/QUOTE]
I agree that because no k were remaining, the numbers for the test I ran are skewed. Note that I did not know there would be 0 k remaining for that range when I started. That is why I suggest only testing to n=5000 (or lower), then sieving and testing what remains. It would be trivial to write a script to merge the results so that you get one pl_prime.txt file for each range of 1G. The key question is: how much time does it take to sieve and test the remaining k vs. how long it would take with the script? Would someone be willing to run a test on a large range (10M or more) to see which method is faster? I think it would benefit R3 and S3 to know the answer to that question. KEP, I think that Gary's request is reasonable, unless the computers you are running on are not accessed regularly.
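A merge script of the kind mentioned above could look something like this (a sketch; the per-sub-range directory layout and glob pattern here are assumptions for illustration, not an established CRUS convention):

```python
import glob

def merge_prime_files(pattern, out_path):
    # Collect the pl_prime.txt lines from every sub-range matching the
    # glob pattern, drop blanks and duplicates, and write one sorted
    # file covering the whole 1G range.
    lines = set()
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            lines.update(line.strip() for line in f if line.strip())
    with open(out_path, "w") as out:
        out.writelines(line + "\n" for line in sorted(lines))

# e.g. merge_prime_files("r3_*/pl_prime.txt", "pl_prime_merged.txt")
```

Note that sorted() orders the lines lexicographically; if the merged file must be ordered numerically by k, the sort key would need to parse k out of each line.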