mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > NFS@Home

Reply
 
Thread Tools
Old 2018-12-04, 17:31   #1
jyb
 
jyb's Avatar
 
Aug 2005
Seattle, WA

22·7·61 Posts
Default Guidelines for choosing parameters

Quote:
Originally Posted by VBCurtis View Post
Time for my periodic reminder that choosing LP bounds that cause average yield below 1.0 is distinctly suboptimal. It is a fallacy that matrices from such jobs are easier and smaller than the same job with 1 higher LP bound; matrix size scales pretty steadily with input difficulty, and not very much at all with LP choice provided LP is 32 or lower.

Q-ranges of 100M or greater should use 31LP, as using 30LP would mean a yield around 1.0-1.2.
Q-ranges of 180M or greater should use 32LP, as using 31LP would mean a yield around 1.2-1.4.
Q-ranges of 320M or greater should likely use 33LP, or 15e with a much smaller Q-range.

An example of the last situation: I test-sieved 13*2^870-1 and found I'd need Q of 20-330M on 14e, but just 20-133M on 15e. I cut alim and rlim in half for the 15e tests.

The only reason I know of to stick to smaller LP choices is to save disk space on the server; I've been submitting jobs that are faster at 33LP as 32 or 32-33 hybrid to save 200M relations' worth of disk space.
Thank you for this concise and concrete set of guidelines. I'd like to explore this a little more, because I'm not sure I understand all the points you're making, and they seem to be at odds with the experience of at least some forum members.

In the example that prompted your post, I had the following trial sieve yields for 5K blocks:
Code:
 Q		30-bit		31-bit
---		------		------
20M		3380		6139
50M		4353		8371
80M		4410		9183
110M		4662		11010
140M		4003		10119
170M		3896		9901
200M		3907		9997
Now, with these values, I would probably suggest sieving 20M - 180M for 30-bit, and 20M - 160M for 31-bit. (Actually, I did suggest the former.) Okay, so that sounds like a slight win for 31-bit. But here's the thing: I ran these tests on the same otherwise-reasonably-unloaded machine, and here are the values I got for sec/rel:
Code:
Bits		Avg sec/rel	Total time
----		-----------	----------
30		0.20612		27826200s (assumes ~135M relations)
31		0.13659		34146785s (assumes ~250M relations)
The total CPU time is almost 23% greater with the 31-bit job. So although allowing larger large primes can cut down the required sieving range, that gain seems to be more than offset by the sieving being slower for any given range. Am I failing to understand something about how this works? Please correct me if so.

So that's the sieving; now let's talk about the post-processing. The filtering part is surely slower for a job with larger large primes, simply because there are many more relations. But of course the post-processing time is dominated by the linear algebra, so it really comes down to how big a matrix gets generated for the choices of large prime size.

According to you, it's a fallacy that there's any notable difference along these lines. I don't have enough experience of my own to either confirm or refute that, but I do note that others have found it impossible to post-process 31- and 32-bit jobs on their hardware, while finding 30-bit jobs manageable (see Carlos's post right after yours). I would guess that your argument is that this is because 31-bit jobs are 31-bit precisely because they are harder than 30-bit jobs, and it's simply this hardness that causes problems for Carlos, not the 31-bitness of them. Well, we're about to get a nice natural experiment in whether that's true: Carlos has 4p3_1335L reserved. It's a 30-bit job, but you apparently think that its hardness is more appropriate for a 31-bit job, in which case Carlos should have trouble with it, right?

Of course, what we really need is to sieve the otherwise-same job at two different bit-levels, then post-process both of them to see whether they do actually build matrices of about the same size. But I'm loath to propose using NFS@Home for redundant work in this way, and doing this at the 30- or 31-bit level is a bit taxing on my own resources. Still, if I have time soon, I might just give that a shot.

In any case, I make no claim to expertise in any of these areas, so I would greatly appreciate being corrected as to any misunderstandings I have. Thanks for your earlier explanation and for any future education you can give me.

Last fiddled with by jyb on 2018-12-04 at 17:40 Reason: Fixed typo
jyb is online now   Reply With Quote
Old 2018-12-04, 17:52   #2
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

13×367 Posts
Default

When comparing different LP sizes for a test-sieve, I've always used 70% more relations for the larger LP as enough to build an equivalent matrix. 250M is more than 1.7x 135M, which puts 31LP at a disadvantage in the comparison.

If the 30LP sec/rel is 1.7x the 31LP sec/rel, the projects should take the same amount of time, since 31LP requires 1.7x the relation count. In your example, the 30LP time is 1.51x the 31LP, which suggests your parameter choice is substantially better than 31LP. I've never experienced this outcome, but my personal projects have been of a rather narrow range of SNFS forms (and GNFS, not relevant for the present discussion).

The sieve range from 160M to 180M is likely to be pretty slow in sec/rel terms, and only the 30LP parameters need to sieve this range; the better yield of 31LP also means more of the relations are found in the more-efficient/faster range. If your average sec/rel number is weighted for yield and still shows a clear win for 30LP, then this job is an exception to my standard guidance.

I'll try to address more of your thoughts later today. Thanks for the detailed consideration!

Last fiddled with by VBCurtis on 2018-12-04 at 17:53
VBCurtis is online now   Reply With Quote
Old 2018-12-04, 17:55   #3
pinhodecarlos
 
pinhodecarlos's Avatar
 
"Carlos Pinho"
Oct 2011
Milton Keynes, UK

5×983 Posts
Default

Just a note on my side. I’ve done a lot of 31-bit post processing for NFS and RSALS before. My complain is that for the current laptop I’ve owned since 2013, I lost the ability to process the 31-bit jobs for the last two years and I bet is something on msieve routines since I’ve not updated my laptop since then (2013). With this I’m looking forward to process 4p3_1335L hoping I won’t need to max the number of relations. So far I’ve managed only once to run the latter for a 31-bit job.


PS. On this link follows an example of a succeeded post-processing. Nowadays msieve fails on my laptop at line (Tue Oct 18 04:42:57 2016 heaviest cycle: 7 relations), it goes nuts.
https://www.mersenneforum.org/showpo...11&postcount=3

Last fiddled with by pinhodecarlos on 2018-12-04 at 18:07
pinhodecarlos is offline   Reply With Quote
Old 2018-12-04, 18:37   #4
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

13×367 Posts
Default

Quote:
Originally Posted by chris2be8 View Post
Are the highlighted values correct? You normally want rlambda 3.6 or similar when using 3 large primes.

The same question applies to the next job (4p3_1335L).

Chris
lpbr/a is the size of large prime to retain; that is, if a relation's cofactor splits into a primes that are all smaller than 2^lpbr/a, we'll keep the relation.

mfbr/a is the size of cofactor to try splitting. The chances of a 93bit input splitting into 31*31*31bits is very small (compared to 30*31*32 or some other combination), so it's a bit wasteful to try to split 93-bit cofactors with 31LP. A rule of thumb for 3 large primes is 3*lbpr/a -3, so 31LP would use 90 for the 3-large-prime size. There are lots of ways to split 90 into 3 factors 31 or smaller, so most of the factorization effort on 90-bit inputs results in good relations. I haven't personally done any 31LP tasks with 3LP, and Rich has usually posted 91 for MFB with 31LP; my experiments with 33LP show 95 and 96 are usually competitive, and with 32LP 92 and 93 are usually competitive but 94 usually isn't.

Lambda is a parameter I don't understand well, and an expert may come correct me:
I believe it is the ratio of {bitsize of inputs to try splitting} to {lpb size}. If lpb is 31, then lambda under 3.0 would restrict GGNFS to trying to split inputs smaller than 93 bits. Lambda must be chosen greater than mfb/lpb, else the siever won't actually be trying to split the inputs you think it will.

What's not clear is why lambda is not best at 2.1 for 2LP settings and 3.1 for 3LP settings. Here is my fuzzy uncertain explanation: (experts?)
GGNFS attempts to split any input of size lambda(r/a) * lpb(r/a). This splitting involves trial division by each prime up to (r/a)lim, followed by trying to split the cofactor if the cofactor is smaller than mfb(r/a). So, by choosing lambda to be, say, 2.6 for a 31LP task, 2.6*31 = 80.6 bits. Factor out the small primes below alim/rlim, and if the leftover is 62 or smaller try to split it. So, lambda = 2.6 in this case allows for 18.6 bits of small primes.
The larger alim and rlim are, the higher the chance of "small" primes summing to more than 16 bits (31*2.5 - 62 is just under 16). However, too large a lambda means a bunch of time is wasted trial dividing large inputs into cofactors that don't fit our desired MFB range.

Now, I thought lambda was supposed to be the ratio of MFB to LPB, but if it was then 2.0 or 2.1 should always be best (or 3.0 or 3.1 for 3LP); this led to my speculation in the previous paragraph.

Summary: For 2LP tasks, lambda is 2.5 or 2.6 or 2.7 (use larger values with larger lim's). For 3LP, use 3.7 or 3.8 or maybe 3.9; I haven't done enough jobs to know for sure.

Last fiddled with by VBCurtis on 2018-12-04 at 18:41
VBCurtis is online now   Reply With Quote
Old 2018-12-04, 19:44   #5
jyb
 
jyb's Avatar
 
Aug 2005
Seattle, WA

22·7·61 Posts
Default

Quote:
Originally Posted by VBCurtis View Post
The sieve range from 160M to 180M is likely to be pretty slow in sec/rel terms, and only the 30LP parameters need to sieve this range; the better yield of 31LP also means more of the relations are found in the more-efficient/faster range. If your average sec/rel number is weighted for yield and still shows a clear win for 30LP, then this job is an exception to my standard guidance.
Actually, there was surprisingly little variance in the sec/rel data over the different ranges I checked. The block at 170M was slightly slower than the block at 140M, but not by much, and the block at 200M was better than both.
jyb is online now   Reply With Quote
Old 2018-12-04, 20:55   #6
jyb
 
jyb's Avatar
 
Aug 2005
Seattle, WA

22×7×61 Posts
Default

Quote:
Originally Posted by jyb View Post
Actually, there was surprisingly little variance in the sec/rel data over the different ranges I checked. The block at 170M was slightly slower than the block at 140M, but not by much, and the block at 200M was better than both.
Actually, scratch that last part. The variance was indeed small, but the block at 200M was in fact a little worse than the one at 170M. My initial data set was gathered with poor methodology.
jyb is online now   Reply With Quote
Old 2018-12-05, 16:59   #7
chris2be8
 
chris2be8's Avatar
 
Sep 2009

37708 Posts
Default

Most of what I know about [ar]lambda comes from reading the INSTALL.and.USE file shipped with the 64 bit lattice siever (the file's called cweb.INSTALL.and.USE with lasieve4).

The relevant part is:
Quote:
The selection of the third field (called lambda) of the last two lines requires
some care. If lambda is too large, then the trial division sieve will dominate
the run time without producing more useful sieve reports than the optimal
value. If it is too small, sieve reports will be lost, and you will be forced
to sieve more special q. Note that the yield per special q decreases as the
special q increases. The best procedure is to first fix the large prime bounds
for your project. Next, sieve some special q with a very liberal choice of
lambda (eg, 4.0). Next, resieve with smaller values of lambda, keeping the
smallest value for which you do not loose more than 1-2% compared to the
liberal choice of lambda. Larger losses are acceptable only if the decrease
of the run time per special q is so considerable that the effect is not
neutralized by the need to sieve more special q at the upper end of the range.
My personal rule of thumb is to set alambda to mfba/lpba+0.6 and rlambda to mfbr/lpbr+0.6 (this assumes the jobs too small to make test sieving as above worth while).

Chris
chris2be8 is offline   Reply With Quote
Old 2018-12-06, 17:01   #8
chris2be8
 
chris2be8's Avatar
 
Sep 2009

23·3·5·17 Posts
Default

After checking the INSTALL.and.USE file I found I'd missed some of the relevant info.

This should go *before* the part I quoted above:
Quote:
# Third field: Sieve report bound lambda
# All sieve reports for which the sieve value is at least
# log(abs(polynomial value))-lambda*log(factor base bound)
# will be investigated by the trial division sieve.
AFAIK the process is:
The sieve finds candidate relations.
If they pass the check based on lambda small factors are divided out by TF.
If the are then under MFB limits they are fully factored by QS.
If all the factors are under the large prime bound the relation is written to the output file.

Of course all checks are done on both sides. But I think that's enough to understand what the various parameters do.

Chris
chris2be8 is offline   Reply With Quote
Old 2018-12-06, 18:02   #9
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

13×367 Posts
Default

Quote:
Originally Posted by chris2be8 View Post
After checking the INSTALL.and.USE file I found I'd missed some of the relevant info.

This should go *before* the part I quoted above: AFAIK the process is:
The sieve finds candidate relations.
If they pass the check based on lambda small factors are divided out by TF.
If the are then under MFB limits they are fully factored by QS.
If all the factors are under the large prime bound the relation is written to the output file.

Of course all checks are done on both sides. But I think that's enough to understand what the various parameters do.

Chris
Aha! Thank you! This calculation shows how the factor base interacts with lambda.
I had wondered why large values for lambda helped small-Q sieving more than large-Q; recall that ggnfs reduces the factor base to Q-1 when sieving Q below the declared factored base.

So, a large lambda * small factorbase is similar to smaller lambda * full factor base.

In short, a large lambda "looks" really good if you test-sieve small Q values, but may waste TF time (that is, increase sec/rel without increase in yield) on large Q values. Neat.
VBCurtis is online now   Reply With Quote
Old 2018-12-07, 17:28   #10
chris2be8
 
chris2be8's Avatar
 
Sep 2009

111111110002 Posts
Default

That makes me wonder if adjusting lambda according to how much the factor base has been reduced would be worth doing?

And can you confirm the TF bound is the size of the factor base? That's another point I'm not sure of.

Chris
chris2be8 is offline   Reply With Quote
Old 2019-01-14, 17:17   #11
chris2be8
 
chris2be8's Avatar
 
Sep 2009

204010 Posts
Default

I've tested adjusting lambda according to how much the factor base bounds have been reduced and it looks promising. I'm ensuring lambda*log(factor base bound) is constant regardless of what the bounds are reduced to:
Code:
chris@linux-5hwg:~/bin> diff -u factMsieve.1.pl factMsieve.1.pl.test
--- factMsieve.1.pl     2019-01-11 15:24:30.000000000 +0000
+++ factMsieve.1.pl.test        2019-01-14 07:58:41.000000000 +0000
@@ -984,12 +984,16 @@
   }
   print OUTF "skew: $SKEW\n";
   my $sieverRL = $RLIM; # rlim passed to the siever
+  my $sieverRLAMBDA = $RLAMBDA; # rlambda passed to the siever
   my $sieverAL = $ALIM; # alim passed to the siever
+  my $sieverALAMBDA = $ALAMBDA; # alambda passed to the siever
   if ($LATSIEVE_SIDE) {
     if ($RLIM > $Q0) {
       $sieverRL = $Q0-1;
       unlink "$JOBNAME.afb.1"; # This probably won't exist anyway.
       $sieverAFBopt = '-F'; # Force AFB to be calculated
+      $sieverRLAMBDA = $RLAMBDA*log($RLIM)/log($sieverRL);
+      logwrite("Adjusted rlim from $RLIM to $sieverRL and rlambda from $RLAMBDA to $sieverRLAMBDA"); # DEBUG
     }
     else {
       $sieverAFBopt = '-k'; # keep AFB in $JOBNAME.AFB.1
@@ -999,6 +1003,8 @@
       $sieverAL = $Q0-1;
       unlink "$JOBNAME.afb.0"; # This probably won't exist anyway.
       $sieverAFBopt = '-F'; # Force AFB to be calculated
+      $sieverALAMBDA = $ALAMBDA*log($ALIM)/log($sieverAL);
+      logwrite("Adjusted alim from $ALIM to $sieverAL and alambda from $ALAMBDA to $sieverALAMBDA"); # DEBUG
     }
     else {
       $sieverAFBopt = '-k'; # keep AFB in $JOBNAME.AFB.0
@@ -1010,8 +1016,8 @@
   print OUTF "lpba: $LPBA\n";
   print OUTF "mfbr: $MFBR\n";
   print OUTF "mfba: $MFBA\n";
-  print OUTF "rlambda: $RLAMBDA\n";
-  print OUTF "alambda: $ALAMBDA\n";
+  print OUTF "rlambda: $sieverRLAMBDA\n";
+  print OUTF "alambda: $sieverALAMBDA\n";
   print OUTF "q0: $Q0\n";
   print OUTF "qintsize: $qrSize\n";
   print OUTF "#q1:$Q1\n";
In my first (and so far only) test case (rlim was 300000 for this number, reduced to start of each range when sieving below it).
Without patch:
Code:
chris@linux-5hwg:~/ggnfs/tests/factordb1/current> grep copycount tf69122916865_3_98_3_1555/sysout
copycount copied 21062 records from spairs.add.150001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 22200 records from spairs.add.160001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 24497 records from spairs.add.170001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 25521 records from spairs.add.180001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 26412 records from spairs.add.190001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 28107 records from spairs.add.200001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 29191 records from spairs.add.210001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 31297 records from spairs.add.220001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 30698 records from spairs.add.230001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 33504 records from spairs.add.240001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 33996 records from spairs.add.250001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 35762 records from spairs.add.260001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 35519 records from spairs.add.270001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 36943 records from spairs.add.280001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 36576 records from spairs.add.290001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 39409 records from spairs.add.300001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 39281 records from spairs.add.310001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 38391 records from spairs.add.320001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 37944 records from spairs.add.330001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 38235 records from spairs.add.340001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 37654 records from spairs.add.350001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 36762 records from spairs.add.360001 to >>tf69122916865_3_98_3_1555.dat
With patch:
Code:
chris@linux-5hwg:~/ggnfs/tests/factordb1/current> grep copycount tf69122916865_3_98_3_1555.2/sysout
copycount copied 31613 records from spairs.add.150001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 32199 records from spairs.add.160001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 33859 records from spairs.add.170001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 34077 records from spairs.add.180001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 34138 records from spairs.add.190001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 35000 records from spairs.add.200001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 35235 records from spairs.add.210001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 36696 records from spairs.add.220001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 35114 records from spairs.add.230001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 37226 records from spairs.add.240001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 36871 records from spairs.add.250001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 38022 records from spairs.add.260001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 36951 records from spairs.add.270001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 37863 records from spairs.add.280001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 36954 records from spairs.add.290001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 39409 records from spairs.add.300001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 39281 records from spairs.add.310001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 38391 records from spairs.add.320001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 37944 records from spairs.add.330001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 38235 records from spairs.add.340001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 37654 records from spairs.add.350001 to >>tf69122916865_3_98_3_1555.2.dat
Comparing the number of records from each range with the start of the range (the number after spairs.add.) I got a useful increase in yield sieving below the full bounds.

I'll do some more testing. Eg what happens if I start with larger lambda. And try some larger jobs.

Chris
chris2be8 is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Choosing the best polynomial wombatman Msieve 123 2013-08-27 11:27
Help choosing motherboard please. Flatlander GPU Computing 4 2011-01-26 08:15
Choosing the best CPU for sieving siew Factoring 14 2010-02-27 10:07
Guidelines for ECM Before Other Methods wblipp GMP-ECM 1 2005-04-19 15:58
Choosing amount of memory azhad Software 2 2004-10-16 16:41

All times are UTC. The time now is 14:25.

Fri May 7 14:25:42 UTC 2021 up 29 days, 9:06, 0 users, load averages: 3.73, 3.65, 3.28

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.