mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Factoring (https://www.mersenneforum.org/forumdisplay.php?f=19)
-   -   ksieve: a kilobit-scale NFS line siever (https://www.mersenneforum.org/showthread.php?t=4812)

jasonp 2005-10-10 06:12

ksieve: a kilobit-scale NFS line siever
 
I expect that sometime this week I'll be releasing the first draft of a classical siever that is capable of handling the sieving for kilobit-scale NFS factorizations. The early announcement is so that others can tell me if I'm missing something fundamental.

The package will be able to handle sieve lines up to size 2^64, and factor base limits of 2^32 with large primes up to 2^40. Large primes will use an MPQS subsystem that can factor integers up to 2^120. The siever itself is based on the classical siever in GGNFS, with many modifications and cleanups; it looks like the memory consumption is about 5% larger than that needed to store the factor base.
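For concreteness, the core of a classical line siever looks roughly like this (a toy Python sketch, not the actual code; the (p, r) factor-base format and the threshold test are simplifications, and real code works with scaled integer logs, not floats):

```python
from math import log2

def line_sieve(interval_len, fb, threshold):
    """fb is a list of (p, r) pairs: prime p divides the norm at positions
    a == r (mod p) on this line. Returns positions whose accumulated log
    exceeds the report threshold (candidates for trial division)."""
    logs = [0.0] * interval_len
    for p, r in fb:
        lp = log2(p)
        for a in range(r % p, interval_len, p):
            logs[a] += lp
    return [a for a, s in enumerate(logs) if s >= threshold]

# toy run: with all roots at 0, only multiples of all four primes survive
fb = [(2, 0), (3, 0), (5, 0), (7, 0)]
cands = line_sieve(210, fb, log2(210) - 0.5)
```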

The test case I'm working with is RSA1024, using the parameters and polynomials from Shamir and Tromer's TWIRL paper. With a factor base limit just under 2^32, each factor base has over 200 million entries, and the two combined take up 3.2 [i]gigabytes[/i] of storage. That's two orders of magnitude larger than the size of current cutting-edge NFS work, and it's surprisingly difficult to write code that can scale up that much.

No, I don't seriously plan to use this for an actual kilobit factorization; this is more for testing purposes and to get an idea of the scale a real effort would need (with a sieve interval of 10^15 like the TWIRL paper wants, a single sieve line will probably take weeks). It's also because a suite of tools that will someday be powerful enough to actually do this sort of work will have to start somewhere.

The big thing I'm worried about now is whether 2^32 is enough of a factor base. If it's not then a [i]lot[/i] of code will have to change.

jasonp

xilman 2005-10-10 07:30

[QUOTE=jasonp]The big thing I'm worried about now is whether 2^32 is enough of a factor base. If it's not then a [i]lot[/i] of code will have to change.[/QUOTE]
Do you mean kilobit SNFS or kilobit GNFS? The difference is quite important!

From my investigations of the difficulty of kilobit SNFS, a 32-bit FB bound is easily adequate. Experiments suggest that 28-bit bounds are sufficient there, and 27 bits may be as well, though that is probably too small for efficient computation.

I'm not so sure about kilobit GNFS. My gut feeling is that 2^32 is on the small side, but I would have to investigate much more carefully before I can say anything more precise.

A small amount of work has been done on the requirements for kilobit GNFS, including the TWINKLE and TWIRL papers. I'm a minor co-author of another paper, the great majority of the work for that being done by Arjen Lenstra. I'll see whether it has appeared on the web somewhere and post the URL if so. Otherwise, I can mail you a personal copy.

Paul

R.D. Silverman 2005-10-10 09:28

[QUOTE=xilman]Do you mean kilobit SNFS or kilobit GNFS? The difference is quite important!

From my investigations of the difficulty of kilobit SNFS, a 32-bit FB bound is easily adequate. Experiments suggest that 28-bit bounds are sufficient there, and 27 bits may be as well, though that is probably too small for efficient computation.

I'm not so sure about kilobit GNFS. My gut feeling is that 2^32 is on the small side, but I would have to investigate much more carefully before I can say anything more precise.

A small amount of work has been done on the requirements for kilobit GNFS, including the TWINKLE and TWIRL papers. I'm a minor co-author of another paper, the great majority of the work for that being done by Arjen Lenstra. I'll see whether it has appeared on the web somewhere and post the URL if so. Otherwise, I can mail you a personal copy.

Paul[/QUOTE]

2^32 should be OK for SNFS, but it is too small for GNFS, which will require 2^35 to 2^37.

jasonp 2005-10-10 15:48

[QUOTE=xilman]Do you mean kilobit SNFS or kilobit GNFS? The difference is quite important![/QUOTE]
I was hoping the effort could handle GNFS, but Bob's response seems to indicate that's wishful thinking at the moment.
[QUOTE=xilman]
A small amount of work has been done on the requirements for kilobit GNFS, including the TWINKLE and TWIRL papers. I'm a minor co-author of another paper, the great majority of the work for that being done by Arjen Lenstra. I'll see whether it has appeared on the web somewhere and post the URL if so. [/QUOTE]
Do you mean "Factoring Estimates for a 1024-bit Modulus"? I think CiteSeer had it.

Could an actual siever sharpen the estimates from there?

jasonp

R.D. Silverman 2005-10-10 16:08

[QUOTE=jasonp]I was hoping the effort could handle GNFS, but Bob's response seems to indicate that's wishful thinking at the moment.

Do you mean "Factoring Estimates for a 1024-bit Modulus"? I think CiteSeer had it.

Could an actual siever sharpen the estimates from there?

jasonp[/QUOTE]

While I was at RSA (circa 2000), I built a very crude, out-of-core siever from my own line siever on a 64-bit Alpha, simply to allow me to run some *small* experiments. I was using 3 large primes (split by Pollard rho and then SQUFOF; slow!). The LP bound was 50 times the factor base bound. I tried various size factor bases, starting at 2^28 primes (bound = 2^32.2) up to 2^35 (bound = 2^39.6), and estimated the amount of sieving needed to gather enough relations. I sampled the yield rate for about 25 values of b, starting near the origin and going out to about 2^30.

A *very* rough estimate gave a minimum somewhere between 2^33 and 2^34 primes in the factor base (for EACH polynomial). Note, however, that the response curve is shallow near the optimum, so a factor of about 2 either way will not have a dramatic effect. There was, however, a distinct increase in sieve time over the optimum near 2^28 primes. 2^28 primes is DEFINITELY too few.

jasonp 2005-10-10 16:30

[QUOTE=R.D. Silverman]While I was at RSA (circa 2000), I built a very crude, out-of-core siever from my own line siever on a 64-bit Alpha, simply to allow me to run some *small* experiments.

A *very* rough estimate gave a minimum somewhere between 2^33 and 2^34 primes in the factor base (for EACH polynomial)[/QUOTE]
With factor bases that big it will be a long time before computers commonly have enough memory to keep the entire computation in memory. So I guess out-of-core sieving is a requirement. You can probably reduce the amount of disk IO by using a high-radix discrete priority queue. Another possibility is to maintain a heap of the last N sieve values whose accumulated log is high enough, maybe N=10e6, then read the factor base once starting from the largest primes and do the sieving only over heap entries.

Either way will be a mess :)
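The high-radix routing can be sketched in a few lines of Python (a toy in-memory illustration only; the bucket layout and interface here are made up, and a real out-of-core version would spill buckets to disk):

```python
from math import log2

def bucket_sieve(interval_len, fb, bucket_bits):
    """Two-pass sieve: route each (offset, log) hit into a bucket keyed by
    the high bits of its position, then replay one region at a time so the
    random-access working set is a single bucket, not the whole interval."""
    B = 1 << bucket_bits
    buckets = [[] for _ in range((interval_len + B - 1) // B)]
    for p, r in fb:                       # pass 1: sequential factor-base scan
        lp = log2(p)
        for a in range(r % p, interval_len, p):
            buckets[a >> bucket_bits].append((a & (B - 1), lp))
    logs = [0.0] * interval_len
    for i, bkt in enumerate(buckets):     # pass 2: each region fits in fast memory
        base = i << bucket_bits
        for off, lp in bkt:
            logs[base + off] += lp
    return logs
```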

jasonp

R.D. Silverman 2005-10-10 16:54

[QUOTE=jasonp]With factor bases that big it will be a long time before computers commonly have enough memory to keep the entire computation in memory. So I guess out-of-core sieving is a requirement. You can probably reduce the amount of disk IO by using a high-radix discrete priority queue. Another possibility is to maintain a heap of the last N sieve values whose accumulated log is high enough, maybe N=10e6, then read the factor base once starting from the largest primes and do the sieving only over heap entries.

Either way will be a mess :)

jasonp[/QUOTE]

I was not concerned, at that time, with making the code run quickly...

I only wanted to gather smoothness data.

BTW, it *might* be the case that we want to consider FOUR large primes, and not merely three. However, it is very unlikely that a composite near 2^160 will split into 4 primes near 2^40... Experiments will have to be done. We don't even have any experience with 3 large primes on both sides.

Note also that 32-bit processors can't even address the primes in the factor bases...

akruppa 2005-10-10 17:51

I suppose an ECM routine optimised for input numbers around 10^50 might come in handy here...
We have been meaning to rewrite the EC arithmetic in GMP-ECM with mpn primitives to reduce overhead. We could try to add a few other tricks to allow rapidly testing many relatively small numbers to small bounds (i.e. precomputing and keeping optimal addition chains for the desired primes ≤ B1).
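The precompute-once point can be illustrated with Pollard p-1 stage 1 instead of full elliptic-curve arithmetic (the batching principle is the same; this is a toy sketch, not GMP-ECM code, and the function names are made up):

```python
from math import gcd, log

def stage1_exponent(B1):
    """Product of the largest power of each prime <= B1; built once per B1
    and then reused against every cofactor we want to probe."""
    sieve = [True] * (B1 + 1)
    E = 1
    for p in range(2, B1 + 1):
        if sieve[p]:
            for m in range(p * p, B1 + 1, p):
                sieve[m] = False
            E *= p ** int(log(B1, p))   # largest power of p not exceeding B1
    return E

def pminus1_stage1(n, E):
    """One cheap smoothness probe: gcd(2^E - 1, n)."""
    return gcd(pow(2, E, n) - 1, n)

E = stage1_exponent(100)            # computed once
# 101 - 1 = 100 is 100-smooth; 1019 - 1 = 2 * 509 is not, so only 101 is found
f = pminus1_stage1(101 * 1019, E)
```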

Alex

jasonp 2005-10-10 17:53

[QUOTE=R.D. Silverman]I was not concerned, at that time, with making the code run quickly...

I only wanted to gather smoothness data.[/QUOTE]
Sure; come to think of it, I doubt the code I'm working on should be used for more than that.
[QUOTE=R.D. Silverman]
BTW, it *might* be the case that we want to consider FOUR large primes, and not merely three. However, it is very unlikely that a composite near 2^160 will split into 4 primes near 2^40... Experiments will have to be done. We don't even have any experience with 3 large primes on both sides.[/QUOTE]
Franke's lattice siever routinely uses three large primes, but the implementation is limited to 32-bit large primes. Franke et al. have written a paper on hardware ECM for finding large primes, and the size targeted was 200-bit composites with 5 large primes. That sounds like major overkill unless they meant the rational and algebraic norms together.

Regarding the cutoffs for trying to factor the leftover composite, do you think the 3-large-prime bounds from earlier NFS papers would still apply to 40-bit large primes? i.e. according to estimates from the CWI folks, with a FB limit of 2^32 and 40-bit large primes the number of wasted factorizations would be lowest for inputs in the 102-112 bit range (and 64-71 bits for two large primes).
[QUOTE=R.D. Silverman]
Note also that 32-bit processors can't even address the primes in the factor bases...[/QUOTE]
Another reason for an out-of-core implementation :)

jasonp

xilman 2005-10-10 18:06

[QUOTE=akruppa]I suppose an ECM routine optimised for input numbers around 10^50 might come in handy here...
We have been meaning to rewrite the EC arithmetic in GMP-ECM with mpn primitives to reduce overhead. We could try to add a few other tricks to allow rapidly testing many relatively small numbers to small bounds (i.e. precomputing and keeping optimal addition chains for the desired primes ≤ B1).

Alex[/QUOTE]
While true, I think I would want to see how such an implementation compares with an MPQS similarly optimized for the same sized integers. Each has its benefits over the other.

QS gets you all the factors for essentially the same cost as getting the first. ECM, on the other hand, may produce a too-large (probable) prime cofactor and so remove the need to spend time digging out other factors.

The QS sievers descended from the ancient factoring-by-email progenitor have long used ECM, and very successfully.


Paul

akruppa 2005-10-10 18:21

[QUOTE=xilman]
QS gets you all the factors for essentially the same cost as getting the first. ECM, on the other hand, may produce a too-large (probable) prime cofactor and so remove the need to spend time digging out other factors.
[/QUOTE]

We can use both. If the residual is so large that it needs at least 3 large primes to be smooth to the desired bounds, run a few ECM curves first. If they find nothing, the number probably doesn't have three (or more) suitable prime factors and can be discarded. If a factor is found, divide it out and repeat. Once the cofactor is < lpb^2, let MPQS do the rest.
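As a sketch of that policy (ecm_try and mpqs_factor are hypothetical stand-ins for the real routines, and a real implementation would also check primality of the pieces):

```python
def cofactorize(n, lpb, ecm_try, mpqs_factor):
    """Try to split residual n into large primes <= lpb; None means discard."""
    factors = []
    while n > lpb * lpb:            # needs 3+ large primes: probe with ECM first
        f = ecm_try(n)              # a few cheap curves at small B1
        if f is None:
            return None             # probably no small-enough factor: discard
        factors.append(f)
        n //= f
    if n > 1:
        factors.extend(mpqs_factor(n))   # cofactor < lpb^2: MPQS finishes
    return factors if all(f <= lpb for f in factors) else None
```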

The complexity of ECM is quite a bit better than that of QS if one can assume that the number has a prime factor < N[sup]1/3[/sup].
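In standard L-notation (heuristic asymptotics, not specific to this thread), the point is that ECM's cost depends on the size of the factor p found while QS's depends on the full number N:

```latex
\text{ECM: } L_p\!\left[\tfrac{1}{2},\sqrt{2}\right]
  = \exp\!\left((\sqrt{2}+o(1))\sqrt{\ln p\,\ln\ln p}\right),
\qquad
\text{QS: } L_N\!\left[\tfrac{1}{2},1\right]
  = \exp\!\left((1+o(1))\sqrt{\ln N\,\ln\ln N}\right).
```

With p < N^(1/3) the exponent in the ECM cost is roughly sqrt(2/3) ≈ 0.82 times the QS exponent, so ECM wins asymptotically in that regime.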

Alex

