2010-08-20, 16:10  #1 
Oct 2006
vomit_frame_pointer
2^{3}·3^{2}·5 Posts 
6,353+
This C178 GNFS ran a little faster thanks to jrk's suggestion that we try triple large primes on the algebraic side:
Code:
n: 3495720155744160390308702080911694892393254003819177989194364433392602119210358965833430295064663146579080152295928278131537398607330890608815038143831212204077824454760817944697
# norm 1.828473e-17 alpha -7.945069 e 9.914e-14
skew: 5242872.26
c0: 1415684462271305089012313057623015234842880
c1: 56489085491552175398550702459917716
c2: 355405802142076475701272154680
c3: 30573939026244353965597
c4: 18914955682611026
c5: 637985160
Y0: -5594293033634964367018772799829077
Y1: 10914669743009582011
rlim: 60000000
alim: 60000000
lpbr: 31
lpba: 31
mfbr: 62
mfba: 90
rlambda: 2.6
alambda: 3.6
The C178 split as P68 · P110:
Code:
P68:  76772819749620922859599019064869997729081875556728878881833848945739
P110: 45533304197302470775235834478317413598842411794792485571017691246712640554863455599919842904873592456844389323
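For anyone checking the result, the split is easy to verify directly. A quick sanity check in Python (the three integers are copied from the post above):

```python
# Sanity check: the reported P68 and P110 should multiply back to the C178.
n = 3495720155744160390308702080911694892393254003819177989194364433392602119210358965833430295064663146579080152295928278131537398607330890608815038143831212204077824454760817944697
p68 = 76772819749620922859599019064869997729081875556728878881833848945739
p110 = 45533304197302470775235834478317413598842411794792485571017691246712640554863455599919842904873592456844389323

assert len(str(n)) == 178 and len(str(p68)) == 68 and len(str(p110)) == 110
assert p68 * p110 == n
print("split verified")
```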
2010-08-22, 19:05  #2 
Jul 2003
So Cal
2,069 Posts 
3LP sieving: memory and speed savings!

2010-08-27, 02:37  #3  
Oct 2006
vomit_frame_pointer
2^{3}·3^{2}·5 Posts 
6,353+ log
Quote:
I sieved with 15e, on the algebraic side, 24M < Q < 164M. Jayson had the inspiration for the triple-large-prime approach a few days into the sieving, so about 17% of the relations came from a standard double-large-prime approach with higher factor base limits (rlim = alim = 120000000). Per Jayson, a better range might have been 40M through 180M or so. I am mostly swayed to this viewpoint, but I can't stand to watch the meager yields at the higher end. Low-Q sieving is one of many things I don't understand about this business.

Log attached: 6P353.log.gz

BTW, Greg: thanks for posting all your logs; they have been educational, so I'm happy to return the favor.

Last fiddled with by FactorEyes on 2010-08-27 at 03:18 

2010-08-27, 03:21  #4 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
3^{2}·17·61 Posts 
It is a bit difficult to run the whole project of this size twice (for comparison vs. 2LP), but would it be possible to devise a data filter that would emulate (not necessarily precisely, but somewhat closely) the removal* of lines with strictly 3LPs? It would be interesting to do that, because this line:

"Thu Jul 22 23:59:04 2010 found 49,137,133 duplicates and 173,981,192 unique relations"

appears to indicate a somewhat higher-than-usual redundancy rate. But I could be totally wrong. So it would be interesting to see whether a simulated equivalent 2LP set would have less redundancy. If it had 8% less redundancy, then it would look like a wash, right?

3LP is an impressive device, and it is not my intent to disrespect it. It deserves an extended postmortem, don't you think? You saved the dataset, right? You could dissect it in a number of ways with some perl scripts.

________
*I sieved a bit with both 3LP and the usual 2LP, and the sieving result with the latter is always nested inside the result of the former. So it could be reversed, for simulation purposes.

Last fiddled with by Batalov on 2010-08-27 at 03:27 
2010-08-27, 03:37  #5  
Oct 2006
vomit_frame_pointer
550_{8} Posts 
Quote:
The duplication rate did look high. When I saw it, I attributed it to the relatively small factor base limits, as if that excused it.

I can tell you that this took just under 10% less CPU time than I was expecting for this number. It was back-of-the-envelope, and I threw away the envelope, but a C172 I ran (2LP) took 31 days of sieving on all my resources. Assuming a doubling of sieving time for 5 extra digits (crude, I know) puts a C178 at around 31*2^(6/5) = 71.2 days. I finished the sieving in around 66 days. Figuring that the first 24M or so Q were sieved with 2 large primes puts the speed improvement at around 8-9%. In summary, it seems to be faster.

Another thing, for all those hesitant to join group factorizations due to memory constraints: this 3LP stuff takes 40% less memory.

EDIT: A lot of this is, obviously, pretty crude estimation on my part. By way of adding some more impressionistic and imprecise thoughts: it seemed as if block Lanczos took a bit longer than I would expect for a 31-bit GNFS job, but this is just an impression.

Last fiddled with by FactorEyes on 2010-08-27 at 03:41 
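The back-of-the-envelope estimate above is easy to reproduce. A small Python sketch of the arithmetic (the 31-day C172 baseline and the 66-day actual figure are from the post; the doubling-per-5-digits rule is the stated assumption):

```python
# Crude sieving-time extrapolation: assume sieving time doubles every 5 digits.
c172_days = 31                        # 2LP C172 baseline from the post
expected = c172_days * 2 ** (6 / 5)   # C172 -> C178 is 6 digits
actual = 66                           # days of sieving actually used
savings = 1 - actual / expected
print(f"expected ~{expected:.1f} days, actual {actual} days, "
      f"~{savings:.1%} less")
```

The raw figure comes out near 7.3%; netting out the ~17% of relations that were sieved with 2LP parameters pushes the 3LP-only improvement toward the 8-9% quoted.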

2010-08-27, 05:05  #6 
Oct 2006
vomit_frame_pointer
2^{3}×3^{2}×5 Posts 
Duplication rate
I just ran through my notes on other factorizations: 49137133 duplicates in 222996222 relations seems about right for a 31-bit GGNFS factorization: right around 22%.
EDIT: A local example is the fivemack project for 2^877-1, also a C178 31-bit GNFS. Code:
Wed Sep 9 11:47:24 2009 found 46267759 hash collisions in 226161090 relations
Wed Sep 9 11:51:08 2009 found 46475021 duplicates and 179686069 unique relations

Last fiddled with by FactorEyes on 2010-08-27 at 05:13 
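The two duplication rates quoted in this thread can be compared directly. A quick Python check using the counts from the two logs:

```python
# Duplicate fractions from the two filtering logs quoted in this thread.
jobs = {
    "6,353+ C178 (3LP)":    (49_137_133, 222_996_222),
    "fivemack C178 (2LP)":  (46_475_021, 226_161_090),
}
for name, (dups, total) in jobs.items():
    print(f"{name}: {dups / total:.1%} duplicates")
```

This gives roughly 22.0% for the 3LP job versus roughly 20.5% for the 2LP comparison job, consistent with "a bit higher than the usual redundancy rate" but not dramatically so.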
2010-08-27, 05:44  #7  
Jul 2003
So Cal
2,069 Posts 
I think the most remarkable thing about this log is that there's nothing in it that's remarkable. 20% dupes doesn't seem excessive at this size. It required the usual minimum number of relations. The filtering progressed smoothly. The matrix is a bit large, but that's unsurprising simply because there was little oversieving.
Quote:
I noticed that you cut the fb limit in half. Was this optimal, or did you just try it and find it to be good enough? Why did you lower the FB limit on the rational side as well even though you still used 2 large primes on that side? 

2010-08-27, 07:04  #8 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
2475_{16} Posts 
Just in case, here's a script that strips 3LP relations and leaves a simulated 2LP output (I've verified it on a small real set):
Code:
#!/usr/local/bin/perl -w
# a script to skip 3LP relations
use constant SIDE => 2;        # or 1 if sieving was r
use constant ALIM => 60000000;
while (<>) {
    my @t = split ':';
    my @s = split(',', $t[SIDE]);
    $s[$#s] =~ s/\s+$//;
    $q0 = hex($s[$#s]);
    $q0 = ALIM if ($q0 > ALIM);
    for ($i = $n = 0; $i < $#s; $i++) {
        $n++ if (hex($s[$i]) > $q0);
    }
    print $_ if $n < 3;
}
No need to filter; you can instead remdups the original set and the stripped one - what redundancy will you observe? Just curious. (I am not saying if it is going to be one way or another.)

_________
...And it will probably be inconclusive - the smaller set will have lower redundancy. Sure. If someone was inclined to run the equivalent 2LP test, then this stripped set would probably amount to 60% of the needed relations, and 40% more 2LP sieving would be needed to get the same volume of relations and then check the redundancy. That would be a fair test.

Hmmm... Could you pipe and prepare the "2LP" set, and wc -l it, then head <same size> full.set > 3LPsmall.set; and then remdups these two sets? At least they will have the same initial relation count. Could be fun.

Last fiddled with by Batalov on 2010-08-27 at 08:07 
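For those without Perl handy, here is a rough Python port of the same filter. It is a sketch under the same assumptions the Perl script makes: GGNFS relation lines of the form a,b:rational_primes:algebraic_primes, with comma-separated hex primes and the special-q appearing last in the sieved side's list.

```python
SIDE = 2           # field index of the sieved side (2 = algebraic), as in the Perl
ALIM = 60_000_000  # factor-base limit on that side

def keep_relation(line: str) -> bool:
    """True unless the relation carries three large primes on the sieved side."""
    primes = line.rstrip().split(':')[SIDE].split(',')
    # Last entry is taken as the special-q, capped at ALIM (same as the Perl).
    q0 = min(int(primes[-1], 16), ALIM)
    large = sum(1 for p in primes[:-1] if int(p, 16) > q0)
    return large < 3

# Filter a relation file the same way the Perl script does:
# with open("relations") as f:
#     kept = [line for line in f if keep_relation(line)]
```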
2010-08-27, 08:23  #9  
May 2008
2107_{8} Posts 
As I think Al already mentioned, the first few million Q were sieved with 2LP parameters, so those should definitely be omitted from the comparison. Otherwise you will get a wrong picture.
The reason I think it is better to start sieving at relatively higher Q is that smaller Q tend to produce more duplicates than higher Q do. For instance, if a sieving run goes from Q=20M to 200M, likely more than 2/3 of the relations near Q=20M will end up being duplicates (this is from looking at some relations and counting how many have large primes in the Q range). So factor in a 2/3 loss of yield for those first relations you find. It will probably slow you down.

I don't know how to determine the optimal sieving range for Q, but a gut feeling is that if the estimated range is so big that Q_max is more than, say, 5 times Q_min, then the range should be moved up some more. Quote:
My reaction to this result is (assuming we're sieving on the algebraic side): if I've understood right, a small special-Q lowers the algebraic norm (and increases the rational norm) by a smaller amount than a large special-Q does. So it is harder to factor algebraic norms with smaller special-Q.

In fact it seems that the ideal thing to do would be to have a larger algebraic fb for small Q and a smaller algebraic fb for large Q, to accommodate this difference (and just the opposite for the rational fb). But this introduces its own problems (w.r.t. duplication), and the GGNFS siever by default does not allow the fb limit on the sieving side to be greater than Q (it can be changed, of course). So 3LP is a sort of compromise which allows more work to be done on the algebraic norms at small Q without having to increase the fb limit for them. This is why I think the boost in yield is greater for small Q, and less so for larger Q. In fact it may be optimal to switch back over to 2LP once a certain Q has been reached, but this wasn't tried.

I also wondered if having the additional large primes floating around might help build cycles faster/earlier, since it provides more coverage of the large ideals. But I'm not knowledgeable enough to make this determination. 
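The parameters in the first post encode exactly this compromise: the cofactor bound mfba = 90 bits leaves room for three 31-bit large primes on the algebraic side, while mfbr = 62 allows only two on the rational side. A trivial check of that reading:

```python
# From the posted parameters: lpb = large-prime bound (bits), mfb = maximum
# cofactor size (bits) left after factor-base primes are removed.
lpba, mfba = 31, 90
lpbr, mfbr = 31, 62

assert 2 * lpba < mfba <= 3 * lpba  # algebraic cofactor may split into 3 LPs
assert lpbr < mfbr <= 2 * lpbr      # rational cofactor limited to 2 LPs
print("algebraic side: up to 3 large primes; rational side: up to 2")
```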

2010-08-27, 20:51  #10 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
22165_{8} Posts 
I've separated this subthread so it would be easier to find as a technique guide (rather than buried in the 6,353+ thread as a specific case).
It can probably be moved to Factoring! 
2010-08-27, 23:11  #11  
Oct 2006
vomit_frame_pointer
360_{10} Posts 
Ignorance <> bliss
Quote:
This, anyway, is how I conceive of it in my fevered imagination. I probably have no idea what I'm talking about. Quote:
I'll run this stuff in the next few hours and report back.

It might be worth running this one to completion with 2LP relations (chucking out the 3LP relations), but the principal complication here is that 2LP sieving should use larger (120M) FB bounds, which further complicates the comparison: we will miss some relations that we would have caught had we done the whole thing with that larger FB. Got that? 
