I would like to help on sieve...
|
The siever is in the Yahoo group's file section for the 4 present members. Test it out and let me know if there are any bugs or anything anyone would like changed before the final release.
If there are no changes to be made by the end of the weekend, I can post the siever in this thread. All the files together, zipped, come to less than 100 KB. Also, [url]http://www.mersenneforum.org/showthread.php?t=6690[/url] should be in this sub-forum; ask the admin to move it. |
Which group, the primeform one?
Carlos |
No, it's called primecullenprime. The group is set to restricted, meaning you will have to wait for my approval to get access, but I'm online this evening. It would be cool if you reserved a chunk of 10G, taking about one week on a decent computer. (Or more, of course.) Yours H.
PS/edit: Dear Carlos, dear Citrix, I made some reservations for you. If you don't like them, I can undo them in no time. H. |
Please PM me the group link....
Carlos |
[url]http://tech.groups.yahoo.com/group/primecullenprime/[/url]
|
Thanks... for the moment I can only help with one AMD 64.
I really don't know how to run the client... where do I select the range to sieve? Can I use srsieve? |
Download the .zip file and extract it into a folder.
1. Open cullenwoodall.out and change the number after "// CW Sieved to:" to your starting number; save the file and close it.
2. Run MultiSieve and click Start; there is no need to change anything else.
3. When the sieve reaches the end of the range, click Stop and post the factors here.
4. Get a new range and repeat... |
Done. Thanks.
Carlos |
The people who are testing the sieve this weekend, could you guys report the following:
- time per factor
- # of millions done per hr
- CPU type
- range being tested |
[QUOTE=Citrix;102018]The people who are testing the sieve, this weekend, could you guys report the following.
Time per factor # of millions done per hr. CPU type. Range testing.[/QUOTE] Indeed, I have some doubts about the accuracy of these stats in the program. We'll see tomorrow. |
If it is alright with all involved, I'm going to sieve the 60-70 range.
|
As for the stats within the client:
I let it run for 12 hours on an Athlon XP 2000+; it found 31 factors, from 32,551M to 36,020M, which makes 3.5G, and it shows 1480 seconds per candidate. I will post some more figures this evening. H. |
Stats
The program ran for 2 hrs.
CPU: Intel Celeron 1.4 GHz.
Found 3 factors; time per factor = 40 min.
Covered about 650M: range 50,000M to 50,650M.
I think P-1 might be better, after we sieve to 100G, if the time per factor keeps rising so rapidly. :down:
edit: any bugs or changes anyone wants? |
Can the time it takes for the first factor to be found be taken into account in the time-per-factor calculation? That would make it more accurate.
That's the only improvement I can see. As for the factor density, your three factors can be a statistical deviation. And I think we should still continue sieving for a while, as 1) we find non-smooth factors as well, 2) the time to eliminate an average candidate by sieving is still much lower than the time for LLR, and finally 3) why not spend too much time on sieving rather than too much time on LLR, for a change? Never was a project so easy to over-sieve; I vote for this luxury.
BTW, questions: Should we keep searching for factors of candidates that already have an LLR residue? And how does sieve speed relate to the number of candidates in the list? Proportional? Logarithmic? Citrix? H. |
There are two stages to the algorithm:
Stage 1 takes about 2 sec per million of range; this is fixed and does not vary with the number of candidates.
Stage 2 takes about 14 sec per million. If we reduced the number of candidates by half, this would take 7 sec. So: proportional.
But since LLR and machines are not perfect, I think we should try to find a factor for all numbers even after they are LLRed; there might have been some error. There is no point in doing P-1 once they are LLRed. We can remove them from the sieve file once a candidate is double-checked. This is the same way PSP is set up.
If you want, you can sieve n=1.5-2M first and then the rest. Only the first 2 sec per million is duplicated this way, the rest is the same, but you will have ranges to LLR sooner. This method will require more bookkeeping effort.
Also, I think we should P-1 all candidates with low bounds, say B1=10000 and B2=100000, and quickly find all the low-lying factors. Perhaps ECM with low bounds. Then see how many candidates are left, and then sieve.
The time it takes to find the first factor is already taken into account in the time-per-factor calculation. |
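The two-stage timing Citrix describes lends itself to a quick cost model. A sketch (the per-million constants are just the figures quoted in this post, assumed rather than measured):

```python
# Stage 1 cost is fixed per unit of p-range; stage 2 scales with the
# fraction of candidates still in the sieve file.
STAGE1_SEC_PER_M = 2.0   # sec per 1M of p-range, candidate-independent
STAGE2_SEC_PER_M = 14.0  # sec per 1M at the current candidate count

def sieve_seconds(millions_of_p, candidate_fraction=1.0):
    return millions_of_p * (STAGE1_SEC_PER_M
                            + STAGE2_SEC_PER_M * candidate_fraction)

print(sieve_seconds(1000))       # 16000.0 s per 1G with the full list
print(sieve_seconds(1000, 0.5))  # 9000.0 s per 1G with half the list
```

This makes the "proportional" claim concrete: halving the candidate list drops stage 2 from 14 s/M to 7 s/M, while the 2 s/M of stage 1 stays fixed.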
Just finished my first range, 60-70.
The program ran for 21.5 hrs.
CPU: Opteron 248 @ 2.2GHz.
Found 38 factors.
Sieving rate: 1341.70 sec/candidate, 465M/hr. |
12h: 3.6G sieved, 25 factors found, 2500 seconds/factor |
[QUOTE=Citrix;102071]
But since LLR and machines are not perfect, I think we should try to find a factor for all numbers even when they are LLRed, there might have been some error. No point on doing p-1 once they are llred. We can remove them from the sieve file once a candidate is double checked. This is the same as how PSP is set up. [/QUOTE] Until here, we agree, just that I thought we could let the doublecheck to Ray Ballinger et al., if they agree, in the frame of their normal Cullen search. I didn't contact them yet; It would make things easier. [QUOTE=Citrix;102071] If you want you can sieve n=1.5-2M first and then the rest. Only the first 2 sec per million is duplicated in this , the rest is the same, but you will have ranges to LLR sooner. This method will require more book keeping effort. [/QUOTE] VETO. I am for computational efficiency, but the free-time efficiency counts as well. [QUOTE=Citrix;102071] Also I think we should p-1 all candiates with low bounds say B1=10000 and b2=100,000 and quickly find all the low lying factors. Perhaps ECM with low bounds. Then see how many candidates left and then sieve. The time it takes to find the first candidate is already taken into account to calculate time per factor.[/QUOTE] I don't see the the point of this. You want low lying factors? You sieve. You want efficiency? You let prime95 choose the bounds, and ECM is slower. You want to be sure not to miss some easy factors before LLR? Then you P-1. The whole philosophy is sieve-P-1-LLR. Or I missed you point, that's possible. Perhaps you wanted to propose some sophisticated P-1/sieve-mix that is even more efficient. Please explain. After all, everybody is free to do whatever he is pleased to do in this project, as long as it is halfway reasonable and doesn't cause too much work for bookkeeping(<--what a word, that!). Yours H. |
[quote]...bookkeeping(<--what a word, that!)...[/quote][URL]http://answers.yahoo.com/question/index?qid=20070315022759AALr5af[/URL]
|
[QUOTE=hhh;102089]Until here, we agree, just that I thought we could let the doublecheck to Ray Ballinger et al., if they agree, in the frame of their normal Cullen search. I didn't contact them yet; It would make things easier.
VETO. I am for computational efficiency, but the free-time efficiency counts as well. I don't see the the point of this. You want low lying factors? You sieve. You want efficiency? You let prime95 choose the bounds, and ECM is slower. You want to be sure not to miss some easy factors before LLR? Then you P-1. The whole philosophy is sieve-P-1-LLR. Or I missed you point, that's possible. Perhaps you wanted to propose some sophisticated P-1/sieve-mix that is even more efficient. Please explain. Yours H.[/QUOTE] It is OK with me if you want to delegate the double-checking work to Ray Ballinger et al. If so, then you can remove candidates from the sieve. Just ask all users with new machines to double-check 1-2 candidates before they reserve a new range, or to run Prime95 stress testing. One caveat: if the double checking missed a prime, you may have to PRP a long way before you find one more and settle the question. Consider SOB and their missed prime. But I leave the decision up to you.
For P-1, I looked at the 10 or so factors I found. Most of them could have been found within a few minutes of P-1 work, compared to 40 min on the sieve for each factor. I suggest we do some basic P-1 with low bounds like B1=10000 and B2=100000, then sieve with the remaining candidates, and then return to P-1 with larger bounds. Anyway, we should do whatever is most efficient.
Bookkeeping? I always thought it was two words. What are the roots of the word? |
[QUOTE=Citrix;102092]But I leave the decision upto you.[/QUOTE]
Anyway, we are going to think about DC only when we reach 5M or so. [QUOTE=Citrix;102092]For p-1, I looked at the 10 or so of the factors I found. Most of the factors could have been found within a few min of p-1 work compared to 40 min on the sieve for each factor.[/QUOTE] The problem with P-1 is all the candidates you test without finding a factor. [QUOTE=Citrix;102092]I suggest we do some basic p-1 with low bounds like b1=10000 and b2=100000. Then sieve with the remaining candidates and then return to p-1 with larger bounds. Anyway,we should do what ever is most efficient. [/QUOTE] Free-time veto again. If somebody wants to do P-1, he can, but then correctly, please. It's still fast enough. Yours H. |
[QUOTE=hhh;102093]
Free-time-Veto again. If somebody wants to do P-1, he can, but then correctly, please. It's still fast enough. [/QUOTE] I agree. :smile: I think there should be some LLR ranges in the LLR thread. Do you plan to go beyond 5M? I might be able to make the sieve program faster if you plan to. |
I plan to, if necessary, and if interest in the project makes it possible. I would wait a month, though, before we decide about that and before you start heavy work. Up to now I'm very happy that it is starting well.
As for the LLR ranges, I will post some as soon as ET_ has done his P-1 job. The sieve file will be updated this evening. H. |
[QUOTE=hhh;102115]
As for the LLR-ranges, I will put some as soon as ET_ has done his P-1 job. The sieve-file will be updated this evening. H.[/QUOTE] Done! :razz: Luigi |
Linux version
I have implemented the algorithm described by Citrix in [url=http://www.mersenneforum.org/showpost.php?p=92720&postcount=9]this[/url] post, with two small improvements:
1. When generating the table of powers 2^1, 2^2, ..., 2^max_gap (all mod p), I take advantage of the regularity of the gaps. For the project data all gaps are multiples of 6, so only 2^6, 2^12, ..., 2^max_gap (all mod p) need to be generated.
2. There are many more small gaps than large gaps, so while almost all of the entries in the first half of the above table are needed, in the second half there are many unused entries. I fill every entry in the first half, but only the needed entries in the second half.
Source and some binaries are [url=http://www.geocities.com/g_w_reynolds/gcwsieve/]here[/url]. To sieve for factors p in the range P0 < p < P1, run `gcwsieve -i <infile> -o <outfile> -f <factors_file> -p <P0> -P <P1>'.
It is a bit slower than the modified MultiSieve on the Windows machine that I tested it on, but it is better than nothing for Linux users. It may be faster on 64-bit Linux.
Very little testing has been done. I will try to fix any bugs reported, but I don't plan to spend much time making it faster unless I think of a big improvement to the main algorithm. Feel free to use anything in the source to improve MultiSieve (a lot of it is based on rogue's code anyway). |
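Improvement 1 can be illustrated in a few lines. This is a hypothetical sketch, not code from gcwsieve (which, per improvement 2, additionally leaves unused upper-half entries unfilled):

```python
def gap_power_table(ns, p):
    """Table of 2^gap mod p for stepping 2^n mod p between consecutive
    candidates. Since all gaps are multiples of 6, only 2^6, 2^12, ...,
    2^max_gap are generated, by repeated multiplication by 2^6 mod p."""
    gaps = [b - a for a, b in zip(ns, ns[1:])]
    assert all(g % 6 == 0 for g in gaps)
    step6 = pow(2, 6, p)
    table, acc = {}, 1
    for g in range(6, max(gaps) + 1, 6):
        acc = acc * step6 % p
        table[g] = acc
    return table

# Walk 2^n mod p across a (made-up) candidate list using only lookups.
ns, p = [18, 24, 36, 54], 1000003
t = gap_power_table(ns, p)
x = pow(2, ns[0], p)
for a, b in zip(ns, ns[1:]):
    x = x * t[b - a] % p
assert x == pow(2, ns[-1], p)  # same as computing 2^54 mod p directly
```

Storing one entry per multiple of 6 instead of one per integer shrinks the table sixfold for the same maximum gap.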
[QUOTE=geoff;102260]I have implemented the algorithm described by Citrix in [url=http://www.mersenneforum.org/showpost.php?p=92720&postcount=9]this[/url] post, with two small improvements:
1. When generating the table of powers 2^1, 2^2, ..., 2^max_gap (all mod p) I take advantage of the regularity of the gaps. For the project data all gaps are multiples of 6, so only 2^6, 2^12, ..., 2^max_gap (all mod p) need to be generated. 2. There are many more small gaps than large gaps, so while almost all of the entries in the first half of the above table are needed, in the second half there are many unused entries. I fill every entry in the first half, but only fill the needed entries in the second half. [/QUOTE] Modified MultiSieve already does this; I just didn't mention it in the algorithm. |
[QUOTE=geoff;102271]Reserving 150-160.[/QUOTE]
Do you plan to use your client? @hhh, please keep track of the program used for each range, in case there is a bug in one of the programs and we need to recheck ranges. |
150-160 done, 13 factors (using gcwsieve 1.0.0).
Reserving 160-200. [QUOTE=Citrix;102272]Do you plan to use your client?[/QUOTE] Yes. I have a P3/800 that I can only get to run from a floppy disk, this project is ideal because the sieve and factors files are small. It does 62kp/s for the range above. |
I get 62 kp/s with your program on a 2.63 GHz Intel Celeron. What is wrong with the computer?
Any way to speed it up? |
[QUOTE=Citrix;102465]I get 62kps with your program on a 2.63 Ghz Intel celeron. What is wrong with the computer?
Any way to speed it up?[/QUOTE] It seems to run slowly on P4 machines. I have added an SSE2 version of the main loop in version 1.0.1, but it is still quite slow with GCC 3.4, which is all I have for the Windows build. The Linux binaries built with GCC 4.1 are faster, but even there the P3 makes the P4 look sick. |
MultiSieve is also very slow on P4s. I suggest P-1/LLR for all users with P4s.
Any way to use the Linux version on Windows? |
[QUOTE=Citrix;102470]Multisieve is also very slow on P4's. I suggest P-1/LLR for all users with P4's.
Any way to use the linux version on windows?[/QUOTE] It may be possible to compile with cygwin-gcc 4.1, but I don't have a cross-compiler; it needs to be done from Windows. |
[QUOTE=geoff;102472]It may be possible to compile with cygwin-gcc 4.1, but I don't have a cross-compiler, it needs to be done from Windows.[/QUOTE]
I have an AMD X2 with Windows, Cygwin and GCC 3.4, and I'm already involved in this project, working on P-1 and planning to do sieving work. If it helps, I may try to compile with my environment; just let me know. HHH received my request to join the mailing list, but I have no clue about the executables. Luigi |
It is the .zip file in the file section of the Yahoo group. I need to modify it a little before the final release. I've just been lazy and need some incentive....:mally: :curtisc: :xmastree:
|
[QUOTE=Citrix;102465]I get 62kps with your program on a 2.63 Ghz Intel celeron. What is wrong with the computer?
[/QUOTE] Is your machine a Northwood- or Prescott-based Celeron? For gcwsieve version 1.0.2 on the 4825-term sieve file, at p=150e9:
Compiled with GCC 4.1.2 and run on my 2.88GHz Northwood Celeron (128KB L2) under Linux I get 89 kp/s, but compiled with GCC 3.4.5 on a 2.80GHz Celeron D (256KB L2) under Windows XP I get 63 kp/s.
If I compile with GCC 3.4.6 on Linux I still get 85 kp/s on the Northwood Celeron, so I think most of the difference is due to the SSE2 assembler not being well tuned for Prescott, rather than differences between GCC 3.4 and 4.1. (My P3/800 now gets 72 kp/s, 4 times faster per clock than the Celeron D.) |
40% faster modified MultiSieve released.
Please update. The file is still restricted to the Yahoo group. Version 1.1. Please provide stats, feedback and any bugs you notice. Thx :smile: |
I get 310 kp/s with gcwsieve 1.0.4 on an AMD 64 3000+.
Carlos |
I get 186 kp/s with gcwsieve 1.0.4 on an AMD 2200+
|
Can we bring the output formats of the sievers into line? (If there are no other, e.g. technical, objections.) Would be cool. Yours H. |
|
[QUOTE=hhh;103714]Can we bring into line the output formats of the sievers? (If there are no other (technical?)objections). Would be cool. Yours H.[/QUOTE]
The `--multisieve' switch will make gcwsieve write the factors and ABC files in the same format as MultiSieve. |
I think everyone should switch to gcwsieve now. I do not plan to update MultiSieve anymore (unless there is a use for it); it seems fruitless right now to try to improve on it.
:smile: |
We will reach 2T of sieving by the end of the month. I thought it would take longer to get there.
Carlos |
Me too. What is your factor/time ratio now? Do you still get a factor per day? How fast would your computers be at LLR?
In other words: how many times more efficient than LLR is sieving still? I'll make the update this afternoon, I think. Anyway, the speed increase will be less than 5%, I think. H. |
[quote=hhh;104177]Me too. What is your factor/time ratio now? Do you still get a factor per day? How fast would your computers be in LLR?
In other words: how many times sieving is still more efficient than LLR? I' make the update this afternoon, I think. Anyways, the speed increase will be less than 5%, I think. H.[/quote] 26433 sec/factor on my home machine. At work I have access to three 3.0GHz HT P4 machines and one 2.8GHz dual-core P4 machine, but I have never tried LLR on them. Carlos |
That means we can keep sieving for the moment. But we can start LLR as well and take the lower numbers out of the sieve.
|
What's our goal on sieve, going up to 10T, 20T....?
Carlos |
I don't think more than 5T.
2.5T might be enough. As long as you can find 1 factor a day on a fast Athlon, you should sieve; otherwise start with LLR. edit: If we are removing candidates as we LLR them, then this changes things and we might be able to go deeper, as the sieve would become faster. :smile: |
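Citrix's stopping rule can be written down directly. A sketch (the one-day threshold is his rule of thumb, and the stricter comparison against an LLR test time is the usual sieve/LLR break-even; both are assumptions, not project policy):

```python
SECONDS_PER_DAY = 86400

def keep_sieving(sec_per_factor, llr_sec_per_test=None):
    """Sieve while a factor is still cheap: either cheaper than the LLR
    test it would eliminate, or (Citrix's rule) at most one day's work."""
    if llr_sec_per_test is not None:
        return sec_per_factor < llr_sec_per_test
    return sec_per_factor <= SECONDS_PER_DAY

# Carlos reported 26433 sec/factor above; 32500 s is a hypothetical LLR time.
print(keep_sieving(26433))         # True: ~7.3 h per factor, keep sieving
print(keep_sieving(26433, 32500))  # True even against a 9 h LLR test
```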
We don't even need to LLR candidates to take them out; we can take them out just like that as well. That's what I'm going to do with the next import, as we find almost no factors below 2M anymore, thanks to P-1. As for the rest, I vote for sieving rather too deep than too shallow (native speaker assistance please!), as there are too many undersieved projects and I'd like to invert this trend.
But as usual, everybody is free to do whatever he pleases. H. |
I'm going to finish the current ranges then I am out.
Carlos |
[QUOTE=em99010pepe;104454]I'm going to finish the current ranges then I am out.
Carlos[/QUOTE] That's a pity. Thank you anyway for the nice boost you gave to this project. H. |
On my 2.9GHz P4, LLR with exponent 3,250,000 should take about 32500 seconds (9 hours) at 10.0 ms/bit. LLR with exponent 5,000,000 should take about 82500 sec (23 hours) at 16.5 ms/bit.
I think sieving up to 2.5T-3T is probably about right if we are not taking double-checking into account; maybe up to 5T allowing for double checks. If the project doesn't find a prime below exponent 5,000,000, then my guess is that there won't be a lot of interest in double-checking; people would be more interested in doing first-time tests on higher ranges to find the first prime. If a prime is found below exponent 5,000,000, then there could be more interest in double-checking, to prove that it is the smallest such prime. |
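The estimates above are just the exponent (in bits) times the time per bit; a quick check of the arithmetic:

```python
def llr_seconds(exponent_bits, ms_per_bit):
    # An LLR test does roughly one squaring per bit, so total time
    # scales as exponent * time-per-bit.
    return exponent_bits * ms_per_bit / 1000.0

print(llr_seconds(3_250_000, 10.0))  # 32500.0 s, about 9 hours
print(llr_seconds(5_000_000, 16.5))  # 82500.0 s, about 23 hours
```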
Tests at different levels take different amounts of time. I think we are still under-sieved for the tests with n=4M to 5M.
The best approach would be to continue sieving and to remove candidates as they are LLRed. For example, I think we are well sieved to n=2.5M, so we should assign all those candidates to LLR and then remove them from the sieve. This will make the sieve client much faster and the time per factor should drop, so we can effectively sieve beyond 2.5T.
I created a dat file with only the candidates above n=2.4M, if anyone wants to try it to see what speeds and time per factor we get. (see attached) :smile: |
Please decide which approach is better because I'm really inclined to move to another project.
Carlos |
Up to 2M we have sieved nicely and done a decent P-1, such that we find almost no factors by sieving anymore.
Up to 2.5M, I'm going to post a decent P-1 job as well. So I vote for simply deleting the lines below 2.5M from sieve.txt and continuing to sieve. This way we won't be getting more factors per unit of time, but the factors we find will be worth more on average.
For the moment, everybody feel free to use the sieve.txt Citrix posted; the next official release will be truncated at 2.5M anyway. This is to be sure that no cycles are wasted. Yours H. |
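Deleting the lines below 2.5M from sieve.txt is mechanical. A hypothetical sketch (the real sieve.txt layout may differ; this assumes each candidate line starts with the exponent n, and leaves header lines untouched):

```python
N_CUTOFF = 2_500_000

def truncate_sieve(lines, cutoff=N_CUTOFF):
    """Keep header lines and any candidate whose leading field (assumed
    to be n) is at or above the cutoff; drop the rest."""
    kept = []
    for line in lines:
        fields = line.split()
        if fields and fields[0].isdigit() and int(fields[0]) < cutoff:
            continue  # candidate below the cutoff: drop it
        kept.append(line)
    return kept

# Made-up sample: an ABC-style header plus two candidate lines.
sample = ["ABC $a*2^$a+1", "1500000 1", "2600000 1"]
print(truncate_sieve(sample))  # header and the 2600000 line survive
```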
To Geoff:
I removed the latest factors found on all attack fronts, sieving as well as P-1 factoring, and got a file with 2897 terms. How many terms do you have in yours? Carlos |
[QUOTE=em99010pepe;106348]I removed the latest factors found from all attack sides such sieving and P-1 factoring and got a file with 2897 terms.
How many terms do you have on yours? [/QUOTE] I had only removed the ones found by sieving; 2917 remain. I'll use your sieve file instead; the fewer terms the better :-) |
After completing the 2800k-2900k range of P-1 and almost finishing my sieve range, I updated my sieve file. 2889 terms left.
|
All right, here we go.
Sieving for the first stage is finished, and certainly our attention should now be directed at LLR and the remaining P-1 ranges that prepare it. Nevertheless, the sieving part of the next stage is going to take a longer time (given the increased worth of every factor), and it has already begun. I'm sieving the range 5M-25M right now, and the final number of tests to LLR will probably be around 28000.
But how will we proceed? I suggest that I keep going, crunching what I can on one machine. This would have the advantage of minimising the risk of human error in the factor transmission. Then, when LLR is approaching 4M, we could make the sieving public and push it a bit further together, say up to 5G, for the whole range 5M-25M. And finally, when LLR approaches 5M and we are actually going to need the numbers, we could split the range into slices of 1M and sieve them as the need advances.
What do you think? Better ideas? Is there anything to poll? Yours H. |
The FFT sizes used by LLR for the 5-10 million range will be 512k, 640k, 768k, 896k, 1024k. Instead of reducing the sieve by removing the lowest 1 million of the n range, we could reduce it just enough to exclude the smallest FFT size.
|
That's a good idea. Can you determine and post the changeover levels? Sieving to 100G will be finished around the 21st of June; we'll see then how to proceed. Yours H.
|
Here are the approximate break points between FFT sizes for SSE2 machines:
[code]
FFT size   Break point
--------   ------------------
448/512    4522000 / 4523000
512/640    5126000 / 5127000
640/768    6423000 / 6424000
768/896    7652000 / 7653000
896/1024   8912000 / 8913000
[/code] If sieving for the 5-10 million range reaches the high point for the 1.5-5 million range (4200G) well before LLR reaches the 448/512 break point, then maybe it would be worthwhile including the full 512k FFT range, i.e. to sieve 4.5-10 million. |
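For convenience, the break points can be turned into a lookup. A sketch that simply transcribes the table above (SSE2 sizes only; exact limits may vary by machine):

```python
# (upper n for this FFT size, size label), taken from the table above.
FFT_BREAKS = [(4522000, "448k"), (5126000, "512k"), (6423000, "640k"),
              (7652000, "768k"), (8912000, "896k")]

def fft_size(n):
    for upper, size in FFT_BREAKS:
        if n <= upper:
            return size
    return "1024k"  # everything above the last break point

print(fft_size(4_600_000))  # 512k: just above the 448/512 break
print(fft_size(9_000_000))  # 1024k
```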