mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > Factoring

Reply
 
Thread Tools
Old 2020-10-18, 23:35   #1
charybdis
 
charybdis's Avatar
 
Apr 2020

50510 Posts
Default NFS cofactorization with 3LP

When sieving with 2 large primes, one tends to find that the optimal value of mfb is slightly less than 2*lpb. Presumably this is because a composite cofactor close to 2^(2*lpb) is unlikely to split into two primes that are both smaller than 2^lpb. For example, for lpb=32 the optimum value of mfb with CADO-NFS appears to be 60.

Similarly, when sieving with 3 large primes, the optimum value of mfb is slightly less than 3*lpb. However, what happens if we get a cofactor that is just below 2^(2*lpb)? Such a cofactor is still unlikely to contribute to a relation: as in the 2LP case, it is probably not the product of two primes both smaller than 2^lpb, and it will also be too small to be the product of three large primes. So again it should make sense to set a double large prime bound smaller than 2^(2*lpb).

Neither GGNFS nor CADO appear to have an option to set a double large prime bound separately from mfb when using 3LP. Does anyone here know what is actually done? Is a value computed automatically, perhaps from the mfb given?
charybdis is offline   Reply With Quote
Old 2020-10-18, 23:56   #2
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

5×17×59 Posts
Default

I've wondered about this also, though my curiosity was about cofactors slightly bigger than 2*mfb but still too small to possibly split into 3 large primes.
VBCurtis is online now   Reply With Quote
Old 2020-10-19, 00:05   #3
charybdis
 
charybdis's Avatar
 
Apr 2020

5·101 Posts
Default

Quote:
Originally Posted by VBCurtis View Post
I've wondered about this also, though my curiosity was about cofactors slightly bigger than 2*mfb but still too small to possibly split into 3 large primes.
CADO definitely deals with these properly - see this snippet from sieve/las-cofactor.cpp:

Code:
  nd = mpz_get_d (n);
  B = (double) scs.lim;
  kB = B * B;
  for (klpb = lpb; klpb < s; klpb += lpb, kB *= B)
    {
      /* invariant: klpb = k * lpb, kB = B^(k+1) */
      if (nd < kB) /* L^k < n < B^(k+1) */
	return 0;
    }
I couldn't find anything related to the double large prime bound, but there's a LOT of code for the siever so there's every chance I just haven't found it.
charybdis is offline   Reply With Quote
Old 2020-10-19, 00:33   #4
charybdis
 
charybdis's Avatar
 
Apr 2020

5·101 Posts
Default

Should have realised that I could just search in a relations file to find an answer to my question

Here's one from a job with lim1=175M, lpb1=32, mfb1=90:
178218979181,5545:2,b,fb,445,3b21,3c47b,166eeb,b38e5f,c2091937:3,3,3,5,71,89,265,9e3f5,c5a6b,451d0b,1c9c589,45458a5,f085ca3b,fb78b42f
That's two large 32-bit prime factors on the algebraic side, so the cofactor wasn't far off 2^64. It sure looks like CADO is just using a bound of 2^(2*lpb).

It feels like there could be a decent speedup from setting the 2LP bound separately in this situation, and I'd be surprised if no-one had thought of this before, so is there some reason why this wouldn't actually work in practice?
charybdis is offline   Reply With Quote
Old 2020-10-19, 02:36   #5
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

501510 Posts
Default

Well, standard practice for 2LP side is to use mfb of 2* lpb, or 2 * lpb-1. We're the weird ones using very non-standard settings for our params files. Take a look at the factory params files for, say, C130-145 to see what I mean.

So, it would not surprise me that nobody thought to do what you're suggesting. Edit: I don't speak code well enough to answer my own question: Could you just modify that formula above to exclude, say, 64-bit cofactors when LP = 32? Just that one change should provide a fair bit of speedup.

Last fiddled with by VBCurtis on 2020-10-19 at 02:38
VBCurtis is online now   Reply With Quote
Old 2020-10-19, 18:59   #6
charybdis
 
charybdis's Avatar
 
Apr 2020

1111110012 Posts
Default

Quote:
Originally Posted by VBCurtis View Post
So, it would not surprise me that nobody thought to do what you're suggesting. Edit: I don't speak code well enough to answer my own question: Could you just modify that formula above to exclude, say, 64-bit cofactors when LP = 32? Just that one change should provide a fair bit of speedup.
I've stuck in a couple of lines to exclude cofactors between 61 and 64 bits. 60 might be a bit lower than optimal but it matches the bound on the rational side. The siever appears to be running normally; yield and rels/sec are down, but I can't compare that with anything until I know how many fewer relations will be needed to form a matrix.

Of course, I wouldn't like having to recompile every time I change the bounds, so if this experiment isn't a complete failure I'll get in contact with the authors.

Last fiddled with by charybdis on 2020-10-19 at 19:01
charybdis is offline   Reply With Quote
Old 2020-11-03, 12:30   #7
charybdis
 
charybdis's Avatar
 
Apr 2020

7718 Posts
Default

Bounds of 32/60/90 on the algebraic side are quite clearly worse than 32/64/90, by something like 10-15% depending on whether you take matrix size into account or not.

I'll try 32/62/90 on the next job.
charybdis is offline   Reply With Quote
Old 2020-11-19, 20:50   #8
charybdis
 
charybdis's Avatar
 
Apr 2020

5×101 Posts
Default

32/62/90 performs fairly similarly to 32/64/90, it's maybe a couple of percent slower but hard to tell for sure. There is a slight reduction in relations needed, but the lower yield means more sieving has to be done at higher Q values where the sieving is slower. There isn't any noticeable improvement in matrix size.

In hindsight, I shouldn't have been expecting the algebraic side to behave too similarly to the rational side: the larger norms mean that more of the relations have two large primes close to the bound on the algebraic side than on the rational side.

I could try 32/63/90, but it's hard to imagine it'll be more than a percent or two faster than 32/64/90, so I'll just stick with the defaults from now on.
charybdis is offline   Reply With Quote
Old 2020-12-07, 12:45   #9
jasonp
Tribal Bullet
 
jasonp's Avatar
 
Oct 2004

3×1,181 Posts
Default

Try asking the developers in the cado-nfs mailing list, they are very responsive and I know they have thought pretty deeply about cofactorization strategies.

For extra credit: given a rational and algebraic pair of numbers to factorize, determine as fast as possible if it is worth the effort to try factorizing both.
jasonp is offline   Reply With Quote
Old 2020-12-07, 15:39   #10
charybdis
 
charybdis's Avatar
 
Apr 2020

5×101 Posts
Default

Quote:
Originally Posted by jasonp View Post
Try asking the developers in the cado-nfs mailing list, they are very responsive and I know they have thought pretty deeply about cofactorization strategies.

For extra credit: given a rational and algebraic pair of numbers to factorize, determine as fast as possible if it is worth the effort to try factorizing both.
I figured in the end that it wasn't worth it. The time spent on 2LP cofactorization is tiny, so the savings from having mfb below 2*lpb actually come from reducing the number of relations needed to build a matrix. But the larger norms on the algebraic side mean that yield drops much more if you drop algebraic mfb by one bit than if you drop rational mfb by one bit (and this will apply to the hypothetical algebraic 2LP bound too).

One useful thing that we do seem to have discovered is that it can be optimal to set the mfb bounds asymmetrically (algebraic higher than rational) even when both lpbs are the same; Curtis has run some tests in the c100-c120 range that show this to be the case.

Last fiddled with by charybdis on 2020-12-07 at 15:40
charybdis is offline   Reply With Quote
Old 2020-12-08, 15:11   #11
charybdis
 
charybdis's Avatar
 
Apr 2020

5·101 Posts
Default

I had another idea: turns out you can get a tiny speedup by ignoring cofactors right at the bottom of the 3LP range - say, the bottom 2 bits of it. It looks like the improvement won't be any more than 1%, so I'm not sure if it's worth bothering the developers about this.
charybdis is offline   Reply With Quote
Reply

Thread Tools


All times are UTC. The time now is 22:09.


Thu Oct 28 22:09:53 UTC 2021 up 97 days, 16:38, 0 users, load averages: 1.39, 1.45, 1.35

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.