#1
jyb
Aug 2005
Seattle, WA
2·3³·31 Posts
In the example that prompted your post, I had the following trial sieve yields for 5K blocks:
Code:
Q      30-bit   31-bit
---    ------   ------
20M      3380     6139
50M      4353     8371
80M      4410     9183
110M     4662    11010
140M     4003    10119
170M     3896     9901
200M     3907     9997
Code:
Bits   Avg sec/rel   Total time
----   -----------   ----------
30     0.20612       27826200s (assumes ~135M relations)
31     0.13659       34146785s (assumes ~250M relations)
So that's the sieving; now let's talk about the post-processing. The filtering stage is surely slower for a job with larger large primes, simply because there are many more relations. But of course the post-processing time is dominated by the linear algebra, so it really comes down to how big a matrix gets generated for each choice of large-prime size.

According to you, it's a fallacy that there's any notable difference along these lines. I don't have enough experience of my own to either confirm or refute that, but I do note that others have found it impossible to post-process 31- and 32-bit jobs on their hardware, while finding 30-bit jobs manageable (see Carlos's post right after yours). I would guess your argument is that 31-bit jobs are 31-bit precisely because they are harder than 30-bit jobs, and it's simply this hardness that causes problems for Carlos, not the 31-bitness itself. Well, we're about to get a nice natural experiment in whether that's true: Carlos has 4p3_1335L reserved. It's a 30-bit job, but you apparently think its hardness is more appropriate for a 31-bit job, in which case Carlos should have trouble with it, right?

Of course, what we really need is to sieve the otherwise-same job at two different bit-levels, then post-process both to see whether they actually build matrices of about the same size. But I'm loath to propose using NFS@Home for redundant work in this way, and doing this at the 30- or 31-bit level is a bit taxing on my own resources. Still, if I have time soon, I might just give that a shot.

In any case, I make no claim to expertise in any of these areas, so I would greatly appreciate being corrected as to any misunderstandings I have. Thanks for your earlier explanation and for any future education you can give me.

Last fiddled with by jyb on 2018-12-04 at 17:40 Reason: Fixed typo
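For concreteness, the projected totals in the second table are just the average sec/rel multiplied by the relation target; a minimal Perl sketch of that arithmetic (the 135M and 250M targets are the assumptions stated in the table, not measured values):
Code:
#!/usr/bin/perl
use strict;
use warnings;

# Projected total sieve time = (avg sec/rel) * (relation target).
# The sec/rel averages and the 135M/250M targets are the figures
# assumed in the table above.
my %jobs = (
    30 => { sec_per_rel => 0.20612, target => 135e6 },
    31 => { sec_per_rel => 0.13659, target => 250e6 },
);
for my $bits (sort keys %jobs) {
    my $total = $jobs{$bits}{sec_per_rel} * $jobs{$bits}{target};
    printf "%d-bit: %.0f s total (%.2f CPU-years)\n",
        $bits, $total, $total / (365.25 * 86400);
}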
#2
"Curtis"
Feb 2005
Riverside, CA
5²×11×17 Posts
When comparing different LP sizes in a test-sieve, I've always taken 70% more relations for the larger LP as enough to build an equivalent matrix. 250M is more than 1.7x 135M, which puts 31LP at a disadvantage in this comparison.

If the 30LP sec/rel were 1.7x the 31LP sec/rel, the two projects should take the same amount of time, since 31LP requires 1.7x the relation count. In your example, the 30LP sec/rel is only 1.51x the 31LP figure, which suggests your 30LP parameter choice is substantially better than 31LP. I've never experienced this outcome, but my personal projects have covered a rather narrow range of SNFS forms (and GNFS, which isn't relevant to the present discussion).

The sieve range from 160M to 180M is likely to be pretty slow in sec/rel terms, and only the 30LP parameters need to sieve that range; the better yield of 31LP also means more of its relations are found in the more-efficient/faster range. If your average sec/rel number is weighted for yield and still shows a clear win for 30LP, then this job is an exception to my standard guidance.

I'll try to address more of your thoughts later today. Thanks for the detailed consideration!

Last fiddled with by VBCurtis on 2018-12-04 at 17:53
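A minimal Perl sketch of that break-even comparison, reusing the sec/rel averages from post #1 (the 1.7x multiplier is the rule of thumb above, not a measured constant):
Code:
#!/usr/bin/perl
use strict;
use warnings;

# If the larger-LP job needs ~1.7x the relations (the rule of thumb
# above), 30LP only wins while its sec/rel stays below 1.7x the
# 31LP figure.
my ($sec30, $sec31) = (0.20612, 0.13659);  # averages from post #1
my $rel30 = 135e6;
my $rel31 = 1.7 * $rel30;                  # "fair" 31LP target: 229.5M
printf "30LP total: %.0f s; 31LP total at 1.7x relations: %.0f s\n",
    $sec30 * $rel30, $sec31 * $rel31;
printf "sec/rel ratio 30/31: %.2f (break-even at 1.70)\n",
    $sec30 / $sec31;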
#3
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
13011₆ Posts
Just a note from my side. I've done a lot of 31-bit post-processing for NFS and RSALS before. My complaint is that on the laptop I've owned since 2013 I lost the ability to process 31-bit jobs about two years ago, and I bet it's something in the msieve routines, since I haven't changed the laptop itself since then (2013). With this in mind I'm looking forward to processing 4p3_1335L, hoping I won't need to max out the number of relations; so far I've managed that only once for a 31-bit job.

PS. The link below shows an example of a successful post-processing run. Nowadays msieve fails on my laptop at the line "Tue Oct 18 04:42:57 2016 heaviest cycle: 7 relations"; it goes nuts. https://www.mersenneforum.org/showpo...11&postcount=3

Last fiddled with by pinhodecarlos on 2018-12-04 at 18:07
#4
"Curtis"
Feb 2005
Riverside, CA
5²×11×17 Posts
mfbr/a is the size of cofactor to try splitting. The chance of a 93-bit input splitting into 31*31*31 bits is very small (compared to 30*31*32 or some other combination), so it's a bit wasteful to try to split 93-bit cofactors with 31LP. A rule of thumb for 3 large primes is 3*lpbr/a - 3, so 31LP would use 90 for the 3-large-prime size. There are lots of ways to split 90 bits into 3 factors of 31 bits or smaller, so most of the factorization effort on 90-bit inputs results in good relations. I haven't personally done any 31LP tasks with 3LP, and Rich has usually posted 91 for MFB with 31LP; my experiments with 33LP show 95 and 96 are usually competitive, and with 32LP 92 and 93 are usually competitive but 94 usually isn't.

Lambda is a parameter I don't understand well, and an expert may come correct me: I believe it is the ratio of {bit-size of inputs to try splitting} to {lpb size}. If lpb is 31, then lambda under 3.0 would restrict GGNFS to trying to split inputs smaller than 93 bits. Lambda must be chosen greater than mfb/lpb, else the siever won't actually be trying to split the inputs you think it will. What's not clear is why lambda is not best at 2.1 for 2LP settings and 3.1 for 3LP settings.

Here is my fuzzy, uncertain explanation (experts?): GGNFS attempts to split any input up to lambda(r/a) * lpb(r/a) bits. This splitting involves trial division by each prime up to (r/a)lim, followed by an attempt to split the cofactor if the cofactor is smaller than mfb(r/a). So, choosing lambda to be, say, 2.6 for a 31LP task gives 2.6*31 = 80.6 bits: factor out the small primes below alim/rlim, and if the leftover is 62 bits or smaller, try to split it. Lambda = 2.6 in this case thus allows for 18.6 bits of small primes. The larger alim and rlim are, the higher the chance of "small" primes summing to more than 16 bits (31*2.5 - 62 is just under 16). However, too large a lambda means a bunch of time is wasted trial-dividing large inputs into cofactors that don't fit our desired MFB range.

Now, I thought lambda was supposed to be the ratio of MFB to LPB, but if it were then 2.0 or 2.1 should always be best (or 3.0 or 3.1 for 3LP); this led to my speculation in the previous paragraph.

Summary: for 2LP tasks, lambda is 2.5 or 2.6 or 2.7 (use larger values with larger lims). For 3LP, use 3.7 or 3.8 or maybe 3.9; I haven't done enough jobs to know for sure.

Last fiddled with by VBCurtis on 2018-12-04 at 18:41
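To put numbers on those rules of thumb, here is a small Perl sketch; it encodes the heuristics exactly as stated in this post (not GGNFS's actual internals), with the 2LP mfb of 62 being the 2*31 case discussed above:
Code:
#!/usr/bin/perl
use strict;
use warnings;

# The post's heuristics written out as arithmetic (not GGNFS internals).

# 3-large-prime cofactor bound: mfb = 3*lpb - 3.
sub mfb_3lp { my ($lpb) = @_; return 3 * $lpb - 3; }

my $lpb = 31;
printf "3LP mfb for %dLP: %d bits\n", $lpb, mfb_3lp($lpb);  # 90

# lambda*lpb is the bit-size of inputs kept for trial division; the
# slack above the 2LP mfb (2*lpb = 62 here) is room for small primes.
for my $lambda (2.5, 2.6, 2.7) {
    my $thresh = $lambda * $lpb;
    printf "lambda %.1f: split inputs up to %.1f bits, %.1f bits of slack\n",
        $lambda, $thresh, $thresh - 2 * $lpb;
}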
#5
jyb
Aug 2005
Seattle, WA
2·3³·31 Posts
#6
jyb
Aug 2005
Seattle, WA
2×3³×31 Posts
Actually, scratch that last part. The variance was indeed small, but the block at 200M was in fact a little worse than the one at 170M. My initial data set was gathered with poor methodology.
#7
Sep 2009
2²·3·167 Posts
Most of what I know about [ar]lambda comes from reading the INSTALL.and.USE file shipped with the 64-bit lattice siever (the file's called cweb.INSTALL.and.USE with lasieve4).

The relevant part is:
Quote:

Chris
#8
Sep 2009
2²×3×167 Posts
After checking the INSTALL.and.USE file I found I'd missed some of the relevant info.

This should go *before* the part I quoted above:
Quote:

The sieve finds candidate relations. If they pass the check based on lambda, small factors are divided out by TF. If they are then under the MFB limits, they are fully factored by QS. If all the factors are under the large prime bound, the relation is written to the output file. Of course, all checks are done on both sides. But I think that's enough to understand what the various parameters do.

Chris
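Here is that pipeline as runnable pseudocode in Perl, a sketch of one reading of Chris's summary; the real siever works with approximate logarithms rather than exact bit counts, and the final large-prime check is only noted in a comment:
Code:
#!/usr/bin/perl
use strict;
use warnings;

# Sketch of the per-side pipeline described above; the real siever
# works on approximate logs, not exact bit counts.
sub keep_candidate {
    my ($pre_tf_bits, $post_tf_bits, $lpb, $mfb, $lambda) = @_;
    # 1. A sieve report survives only if its cofactor (before trial
    #    division) is under lambda * lpb bits.
    return 0 if $pre_tf_bits > $lambda * $lpb;
    # 2. Factor-base primes are divided out by TF; the remainder must
    #    fit under mfb before QS is attempted on it.
    return 0 if $post_tf_bits > $mfb;
    # 3. QS splits the remainder; the relation is written out only if
    #    every factor is below 2^lpb (not modelled here, as that needs
    #    the actual factors).
    return 1;
}

# Example: 31LP side with 2LP mfb of 62 and lambda 2.6.
print keep_candidate(78, 60, 31, 62, 2.6) ? "kept\n" : "dropped\n";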
#9
"Curtis"
Feb 2005
Riverside, CA
4675₁₀ Posts
I had wondered why large values for lambda helped small-Q sieving more than large-Q; recall that ggnfs reduces the factor base to Q-1 when sieving Q below the declared factor base bound. So a large lambda with a small factor base is similar to a smaller lambda with the full factor base.

In short, a large lambda "looks" really good if you test-sieve small Q values, but may waste TF time (that is, increase sec/rel without any increase in yield) on large Q values. Neat.
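One rough way to quantify "similar", anticipating the lambda*log(lim) invariant Chris tests two posts below (the lim values here are purely illustrative):
Code:
#!/usr/bin/perl
use strict;
use warnings;

# "Large lambda * small factor base looks like smaller lambda * full
# factor base": the lambda preserving lambda*log(lim) shrinks as lim
# grows back to its declared value.
my $lambda_small = 3.0;        # lambda that looked good at small Q
my $lim_small    = 150_000;    # reduced factor base bound (Q0-1)
my $lim_full     = 1_000_000;  # declared factor base bound
my $equiv = $lambda_small * log($lim_small) / log($lim_full);
printf "lambda %.2f over lim %d ~ lambda %.2f over lim %d\n",
    $lambda_small, $lim_small, $equiv, $lim_full;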
#10
Sep 2009
2²·3·167 Posts
That makes me wonder whether adjusting lambda according to how much the factor base has been reduced would be worth doing.

And can you confirm that the TF bound is the size of the factor base? That's another point I'm not sure of.

Chris
#11
Sep 2009
2²·3·167 Posts
I've tested adjusting lambda according to how much the factor base bounds have been reduced, and it looks promising. I'm ensuring lambda*log(factor base bound) stays constant regardless of what the bounds are reduced to:
Code:
chris@linux-5hwg:~/bin> diff -u factMsieve.1.pl factMsieve.1.pl.test
--- factMsieve.1.pl	2019-01-11 15:24:30.000000000 +0000
+++ factMsieve.1.pl.test	2019-01-14 07:58:41.000000000 +0000
@@ -984,12 +984,16 @@
     }
     print OUTF "skew: $SKEW\n";
     my $sieverRL = $RLIM; # rlim passed to the siever
+    my $sieverRLAMBDA = $RLAMBDA; # rlambda passed to the siever
     my $sieverAL = $ALIM; # alim passed to the siever
+    my $sieverALAMBDA = $ALAMBDA; # alambda passed to the siever
     if ($LATSIEVE_SIDE) {
         if ($RLIM > $Q0) {
             $sieverRL = $Q0-1;
             unlink "$JOBNAME.afb.1"; # This probably won't exist anyway.
             $sieverAFBopt = '-F'; # Force AFB to be calculated
+            $sieverRLAMBDA = $RLAMBDA*log($RLIM)/log($sieverRL);
+            logwrite("Adjusted rlim from $RLIM to $sieverRL and rlambda from $RLAMBDA to $sieverRLAMBDA"); # DEBUG
         } else {
             $sieverAFBopt = '-k'; # keep AFB in $JOBNAME.AFB.1
@@ -999,6 +1003,8 @@
             $sieverAL = $Q0-1;
             unlink "$JOBNAME.afb.0"; # This probably won't exist anyway.
             $sieverAFBopt = '-F'; # Force AFB to be calculated
+            $sieverALAMBDA = $ALAMBDA*log($ALIM)/log($sieverAL);
+            logwrite("Adjusted alim from $ALIM to $sieverAL and alambda from $ALAMBDA to $sieverALAMBDA"); # DEBUG
         } else {
             $sieverAFBopt = '-k'; # keep AFB in $JOBNAME.AFB.0
@@ -1010,8 +1016,8 @@
     print OUTF "lpba: $LPBA\n";
     print OUTF "mfbr: $MFBR\n";
     print OUTF "mfba: $MFBA\n";
-    print OUTF "rlambda: $RLAMBDA\n";
-    print OUTF "alambda: $ALAMBDA\n";
+    print OUTF "rlambda: $sieverRLAMBDA\n";
+    print OUTF "alambda: $sieverALAMBDA\n";
     print OUTF "q0: $Q0\n";
     print OUTF "qintsize: $qrSize\n";
     print OUTF "#q1:$Q1\n";
Without patch:
Code:
chris@linux-5hwg:~/ggnfs/tests/factordb1/current> grep copycount tf69122916865_3_98_3_1555/sysout
copycount copied 21062 records from spairs.add.150001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 22200 records from spairs.add.160001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 24497 records from spairs.add.170001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 25521 records from spairs.add.180001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 26412 records from spairs.add.190001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 28107 records from spairs.add.200001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 29191 records from spairs.add.210001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 31297 records from spairs.add.220001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 30698 records from spairs.add.230001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 33504 records from spairs.add.240001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 33996 records from spairs.add.250001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 35762 records from spairs.add.260001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 35519 records from spairs.add.270001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 36943 records from spairs.add.280001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 36576 records from spairs.add.290001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 39409 records from spairs.add.300001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 39281 records from spairs.add.310001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 38391 records from spairs.add.320001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 37944 records from spairs.add.330001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 38235 records from spairs.add.340001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 37654 records from spairs.add.350001 to >>tf69122916865_3_98_3_1555.dat
copycount copied 36762 records from spairs.add.360001 to >>tf69122916865_3_98_3_1555.dat
With patch:
Code:
chris@linux-5hwg:~/ggnfs/tests/factordb1/current> grep copycount tf69122916865_3_98_3_1555.2/sysout
copycount copied 31613 records from spairs.add.150001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 32199 records from spairs.add.160001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 33859 records from spairs.add.170001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 34077 records from spairs.add.180001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 34138 records from spairs.add.190001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 35000 records from spairs.add.200001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 35235 records from spairs.add.210001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 36696 records from spairs.add.220001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 35114 records from spairs.add.230001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 37226 records from spairs.add.240001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 36871 records from spairs.add.250001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 38022 records from spairs.add.260001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 36951 records from spairs.add.270001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 37863 records from spairs.add.280001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 36954 records from spairs.add.290001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 39409 records from spairs.add.300001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 39281 records from spairs.add.310001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 38391 records from spairs.add.320001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 37944 records from spairs.add.330001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 38235 records from spairs.add.340001 to >>tf69122916865_3_98_3_1555.2.dat
copycount copied 37654 records from spairs.add.350001 to >>tf69122916865_3_98_3_1555.2.dat
I'll do some more testing, e.g. what happens if I start with a larger lambda, and try some larger jobs.

Chris
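For anyone who wants to check the adjustment by hand: the invariant the patch maintains is lambda * log(lim) = constant, so the value passed to the siever is lambda * log(lim) / log(Q0-1). A standalone sketch of the same arithmetic (the lim and Q0 values below are illustrative, not taken from this job):
Code:
#!/usr/bin/perl
use strict;
use warnings;

# Standalone version of the patch's adjustment: keep
# lambda * log(factor base bound) constant when the siever reduces
# the bound to Q0-1.
sub adjust_lambda {
    my ($lambda, $lim, $q0) = @_;
    return $lambda if $q0 > $lim;  # bound not reduced, nothing to do
    return $lambda * log($lim) / log($q0 - 1);
}

# Illustrative values only, not taken from the job above:
my ($alambda, $alim) = (2.6, 1_000_000);
for my $q0 (150_001, 250_001, 350_001) {
    printf "Q0 %d: alambda %.2f -> %.4f\n",
        $q0, $alambda, adjust_lambda($alambda, $alim, $q0);
}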