mersenneforum.org  

2011-10-13, 05:46   #1
Sleepy

Size optimization

I have noticed that for larger problems like RSA768 or bigger, the polynomial selection has a very strong tendency to report negative skew in the initial size optimization. Is this to be expected when we move to larger composites?
2011-10-13, 10:45   #2
jasonp (Tribal Bullet)

That's not expected at all; can you give an example?
2011-10-14, 07:59   #3
Sleepy

Using Msieve v1.49, I got this result from an RSA768 stage 1 hit during initial optimization. The skew is quite close to 0 after the routine.

Code:
norm 6.9891978e+34 skew 7.533692
-95874395962566283976879517727145211059
+2122472720518479292507
+87840551615038595739331407354941471553
-35337242809776485267112309864495751433
-3856114479612131191693344932738535682
+1020669703390328498616179428060004185
-1563710833682459157669146
-456
+1584
On an RSA896 stage 1 hit, I got this result.

Code:
12 3535308466027048639492187 570107532450270477086892611321876406536982348

norm 1.5492085e+43 skew 3.243352
-570107532450270477086892611321876406536982348
+3535308466027048639492187
-442512866685373337176887522011622592629342461
+1783619376608838135560536234346985353366253293
+130308875328793669965090832180115584266425892
-232742995070824773894200971356601413407818738
+123811301456959059442938409267
+6
+12
When I tried some RSA1024 hits, most of the results turned out to have negative skew.

Code:
norm 1.0000000e+50 skew -21968.720494
-1497020286335533624272472469866417926156994242006493
+3060129714738514697695981
-61512974071785246992630097516374839936951467724821651725
+3224769695783746561958228856476108502074711656233260565
-41608949586908930017300382371452135437782031835809457
-123217001150933869214201846566669926297110497087536
-2518560936318891493017569856712661
-2683
+12

norm 1.0000000e+50 skew -142410668.715203
-1333693454126410609876682984700939586082833248126732
+4154713801963641711545561
-1044189187320323999592363385700694274967127262451502777716728767989
+348063062440139044513995512405594594231066311863890859490518423312
-7867305474845877618631173929204554872401418193374191
+490468982860015174126338691966369496986648447647714
-3287287660421733701627942163036395
-433
+24
Using the latest SVN version with -np2 on that same RSA896 stage 1 hit triggered an infinite loop when the skew went negative in the routine. :(

So I was wondering whether, for very large composites, there is a greater chance of getting a negative skew during size optimization.
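
My understanding is that the skew is a positive scale factor chosen to minimize a skewed measure of the coefficient sizes, so a negative value should never come out of a properly constrained search. A toy C sketch of what I mean (it uses the simple measure max_i |c_i|*s^(i - d/2) with made-up coefficients; this is not Msieve's actual norm or code):

Code:
/* Toy illustration only -- not Msieve's code or its exact norm.  We minimize
 * the simple measure max_i |c_i| * s^(i - d/2) over s > 0 with a log-scale
 * grid search; the search never leaves the positive skews. */
#include <math.h>
#include <stdio.h>

#define DEG 6

static double skewed_norm(const double c[DEG + 1], double s) {
    double worst = 0.0;
    for (int i = 0; i <= DEG; i++) {
        double t = fabs(c[i]) * pow(s, i - DEG / 2.0);
        if (t > worst)
            worst = t;
    }
    return worst;
}

int main(void) {
    /* made-up degree-6 coefficients c[0] .. c[6] */
    double c[DEG + 1] = { -1.5e36, 3.2e33, -4.1e29, 1.2e25,
                          -2.3e20, 9.9e14, 1.0e9 };
    double best_s = 1.0, best_norm = skewed_norm(c, 1.0);

    /* search only over positive skews, logarithmically spaced */
    for (double logs = -2.0; logs <= 12.0; logs += 0.01) {
        double s = pow(10.0, logs);
        double n = skewed_norm(c, s);
        if (n < best_norm) {
            best_norm = n;
            best_s = s;
        }
    }
    printf("norm %.7e skew %.6f\n", best_norm, best_s);
    return 0;
}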
2011-10-14, 10:53   #4
jasonp (Tribal Bullet)

I've never tried to get poly selection to work for >768-bit inputs; I just assumed it would be broken :) It's on my to-do list to fix.
2011-10-14, 16:30   #5
jasonp (Tribal Bullet)

Actually, there have been many changes since v1.49 was released that hopefully make polynomial selection for these large jobs behave better. Can you retry with the latest SVN trunk?
2011-10-15, 15:08   #6
jasonp (Tribal Bullet)

Okay, I fixed a few problems in the size optimization; the code's notion of 'infinitely large size' needed updating to cope with 1024-bit inputs :)

Try the latest SVN and see if that works better. By the way, what did you do to get those 1024-bit stage 1 hits, and how long did it take?
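
For what it's worth, one way a too-small notion of 'infinity' could produce dumps like the RSA1024 ones above (where the norm is pinned at exactly 1.0000000e+50) is that once every candidate exceeds the sentinel, the objective the skew optimizer sees is completely flat, and an unconstrained search can drift anywhere, including into negative skew. A toy sketch of that general failure mode (an illustration only, not the actual Msieve code or the actual fix):

Code:
/* Toy failure-mode sketch -- not Msieve's code.  If the objective is clamped
 * at a sentinel "infinite" norm that every candidate exceeds, a naive 1-D
 * search over the skew sees a flat function and can walk s negative. */
#include <stdio.h>

#define CAP 1e50   /* hypothetical "infinitely large size" sentinel */

static double clamped_norm(double s) {
    /* toy model of a norm that is far above the sentinel for every skew */
    double true_norm = 1e55 + 1e40 * (s - 5000.0) * (s - 5000.0);
    return true_norm < CAP ? true_norm : CAP;
}

int main(void) {
    double s = 1.0, step = -100.0;
    for (int i = 0; i < 1000; i++) {
        /* every step "succeeds" because the clamped objective never worsens */
        if (clamped_norm(s + step) <= clamped_norm(s))
            s += step;
    }
    printf("norm %.7e skew %.6f\n", clamped_norm(s), s);
    return 0;
}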
2011-10-17, 06:49   #7
Sleepy

I see... Thanks! I will try it now =) Hmm, it takes about 2.5 hours per hit on average. I was using the SVN version and ran it on my P4 desktop. I added a 309-digit entry using 1.5E+40 as my bound, but that's just a random guess :P

Also, is there any reference I can look at for the size optimization algorithm? I would like to read up on it :) By the way, the root sieve implemented in Msieve seems quite different from the one described in Murphy's thesis, especially for degree 6. Is there a new algorithm?
2011-10-17, 13:41   #8
jasonp (Tribal Bullet)

There's no documentation for any of it. Look for slides from the CWI winter workshop, where I'll be giving some details. The short answer is that the algorithm is the same as described by Murphy, but modified to sieve over lattices instead of lines. With degree 6 (and even the larger degree 5 problems), the root sieve would take forever if you didn't limit the search space somehow.

Here's an excerpt from my reply to someone else who asked about this:
Code:
To determine the bounds on the size of a rotation coefficient, the code
chooses a multiple of the bound on the L2 norm of the polynomial and
then looks at how large the coefficient can get before the polynomial
size exceeds that bound. This is tricky since the skew and translation
can be adjusted to delay the inevitable. Different bounds on the
norm can be tried successively so that small rotations that increase
the size only a little can be mixed in with larger rotation coefficients
that affect the polynomial size more.

The code you were reading proceeded one J1 at a time, found the bound
on J0 and ran the root sieve on that J0 'line', continuing to the next
J1 until the J0 line size became too small. In that way the code could
rigorously explore even oddly shaped (i.e. rotated elliptical rather than
rectangular) search spaces. That worked fine until people started
needing huge root sieve problems to be solved, and the code was taking
hours for a single one. Also, degree 6 problems have such a huge
search space that it is infeasible to rigorously search even the
smallest problems in this way.

The updated root sieve works in stages now; the general idea is to
choose a lattice size L that is the product of (powers of) many small
primes, and to sieve over the J0(i) = R + i*L, where R is a residue
class mod L that happens to have many roots modulo the primes dividing
L. In this way we limit the sieve to rotations that have a good root
score modulo the smallest primes, then find a subset that also have
a good root score modulo larger primes.

For degree 4 we only need to do the root sieve over J1=0 and a few R
lattices. For small degree 5 problems we can vary J1 over its entire
range and save the highest-scoring few R lattices, then lattice sieve
over those. For degree 6 and the larger degree 5 problems, this is
still too much work, so L also has to be broken up into several pieces.

For large degree 5 problems we can sieve over an L0 x L0 grid and
find the highest-scoring residue classes R0, then choose a grid size
L1 coprime to L0 and find the highest-scoring residue classes
R1 + i*L1, with the constraint that all such lattice points fall
on R0 + (J0*j+J1*k)*L0 for one of the R0 from the first stage.
In practice L0 is the product of the first few (powers of) small
primes and L1 is the product of the next small primes. The result
is a heap of the best-scoring residue classes R2 modulo L0*L1, where
R2 is formed via the chinese remainder theorem, and each residue
class modulo L0*L1 is sieved in the conventional way. This method
hopefully still finds the best rotations but is L0 times faster
than searching each J1 value with a lattice size of L0*L1.

Degree 6 treats L as L0*L1*L2 but otherwise works the same way.
Here it is important to allow as much precomputation as possible,
because the middle stage has to search an L1 x L1 grid for each
of the high-scoring residue classes in the first L0 x L0 x L0 grid.

Once the actual lattice sieve finds a high-scoring rotation,
it tries to vary the translation and skew to reduce the size
of the resulting polynomial. The rotation then gets a score that
combines the size and root scores, and goes into a heap that
remembers the top 200 such rotations. After the lattice sieving
finishes, any rotations in the heap are applied to the actual
polynomial and the translation and skew are varied to optimize
the Murphy score.
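To make the residue-class bookkeeping concrete, here is a small standalone sketch with made-up numbers (illustrative only; this is not the Msieve code): it combines a high-scoring class R0 mod L0 with a class R1 mod L1 by the Chinese remainder theorem and lists the rotations J0 = R2 + i*L0*L1 that the conventional sieve would then visit.

Code:
/* Illustration only, with made-up values -- not the Msieve code. */
#include <stdint.h>
#include <stdio.h>

/* modular inverse of a mod m via the extended Euclidean algorithm
 * (assumes gcd(a, m) == 1) */
static int64_t inv_mod(int64_t a, int64_t m) {
    int64_t t = 0, newt = 1, r = m, newr = a % m;
    while (newr != 0) {
        int64_t q = r / newr, tmp;
        tmp = t - q * newt; t = newt; newt = tmp;
        tmp = r - q * newr; r = newr; newr = tmp;
    }
    return (t % m + m) % m;
}

int main(void) {
    int64_t L0 = 8 * 9 * 5 * 7;      /* 2520 = 2^3 * 3^2 * 5 * 7 */
    int64_t L1 = 11 * 13 * 17;       /* 2431, coprime to L0      */
    int64_t R0 = 1234;               /* made-up high-scoring class mod L0 */
    int64_t R1 = 321;                /* made-up high-scoring class mod L1 */

    /* CRT: find R2 with R2 = R0 (mod L0) and R2 = R1 (mod L1) */
    int64_t L = L0 * L1;
    int64_t k = ((R1 - R0) % L1 + L1) * inv_mod(L0 % L1, L1) % L1;
    int64_t R2 = R0 + L0 * k;

    printf("R2 = %lld (mod %lld)\n", (long long)R2, (long long)L);

    /* the conventional root sieve then only visits these rotations */
    for (int64_t i = 0; i < 5; i++)
        printf("J0 candidate: %lld\n", (long long)(R2 + i * L));
    return 0;
}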

2011-10-17, 19:39   #9
debrouxl

In less than 2 hours of CPU time on one core of a Core 2 Duo T7200 (2 GHz), msieve r646 found ~2000 polynomials (!) for RSA-1024.

At the beginning, msieve wrote "expecting poly E from 1.11e-21 to 1.28e-21". The best e score is
# norm 4.371900e-24 alpha -9.729326 e 1.702e-22 rroots 4

About one third of the polynomials have skew 1001.00 or 1002.62, with e scores around 1e-23.
2011-10-17, 22:08   #10
jasonp (Tribal Bullet)

The 'expecting poly' line was recently added by Serge, and it is a complete extrapolation for inputs this large. Likewise, my sieving bounds are pretty loose, so a lot of hits get generated. I've also walked through some of the choices that stage 1 makes when initializing, and it appears to cut off a lot of the potential search space because stage 1 was specialized to top out at 32-bit numbers a while ago. Running stage 1 for 1024-bit GNFS really needs key pieces of support code updated to handle 48- to 64-bit numbers, and that's quite tricky.
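
To give a sense of the kind of support code that implies (just an illustration of the general issue, not what is actually in Msieve): arithmetic specialized to 32-bit moduli can form every product inside one 64-bit word, while 48- to 64-bit moduli need a 128-bit intermediate (or a Montgomery/Barrett-style reduction where that is unavailable). For example:

Code:
/* Illustration only -- not Msieve code.  A 64-bit modular multiply needs a
 * 128-bit product (here the GCC/Clang unsigned __int128 extension). */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

static uint64_t mulmod64(uint64_t a, uint64_t b, uint64_t n) {
    return (uint64_t)((unsigned __int128)a * b % n);
}

int main(void) {
    uint64_t n = 0xffffffffffc5ULL;   /* an arbitrary 48-bit modulus */
    /* (n-1)*(n-2) = 2 (mod n), so this prints 2 */
    printf("%" PRIu64 "\n", mulmod64(n - 1, n - 2, n));
    return 0;
}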
2011-10-18, 01:02   #11
jrk

Quote:
Originally Posted by jasonp
because stage 1 was specialized to top out at 32-bit numbers a while ago. Running stage 1 for 1024-bit GNFS really needs key pieces of support code updated to handle 48- to 64-bit numbers, and that's quite tricky.
The restriction means that we need m0/a[d-2] to be at most about 2^64. This would be easier to achieve by switching to degree 7, which would likely yield better polynomials for RSA1024 anyway.
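
As a rough back-of-the-envelope illustration of why degree 7 relaxes that condition (assuming m0 ~ (N/a_d)^(1/d) and a hypothetical leading coefficient a_d near 2^30 for a 1024-bit N):

Code:
degree 6:  m0 ~ 2^((1024-30)/6) ~ 2^166,  so m0/a[4] <= 2^64 needs a[4] of roughly 2^102 or more
degree 7:  m0 ~ 2^((1024-30)/7) ~ 2^142,  so m0/a[5] <= 2^64 needs a[5] of roughly 2^78 or more
Under those assumptions the same 2^64 bound is met with an a[d-2] about 24 bits smaller at degree 7.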