mersenneforum.org Using p=2 for sieving (Quadratic sieve algorithm)
 User Name Remember Me? Password
 Register FAQ Search Today's Posts Mark Forums Read

 2007-02-27, 16:34 #1 R1zZ1   Feb 2007 5 Posts Using p=2 for sieving (Quadratic sieve algorithm) Hi guys, I'm trying to implement the Quadratic Sieve algorithm (MPQS) and I don't know how to use p=2 in sieving, because I noticed that without this prime in factor base I have not enough B-smooth relations and my algorithm is too much slow. The problem is that the Hensel Lemma doesn't hold with p=2. Thanks in advance
2007-02-27, 16:59   #2
jasonp
Tribal Bullet

Oct 2004

24×13×17 Posts

Quote:
 Originally Posted by R1zZ1 I'm trying to implement the Quadratic Sieve algorithm (MPQS) and I don't know how to use p=2 in sieving, because I noticed that without this prime in factor base I have not enough B-smooth relations and my algorithm is too much slow. The problem is that the Hensel Lemma doesn't hold with p=2.
You can avoid sieving with 2 entirely, make the cutoff for smooth relations more generous to compensate, then attempt to trial divide by 2 every relation whose log meets the cutoff. In fact, doing this for all primes smaller than a given bound (i.e. 30) will make the entire sieving stage a little (possibly a lot) faster.

jasonp

 2007-02-27, 17:52 #3 akruppa     "Nancy" Aug 2002 Alexandria 2,467 Posts Dear Jason, does Msieve sieve with powers of small primes, i.e. sieve with p^k > 30 even though p<30 ? If yes, does doing vs. not doing it have an appreciable effect on yield? Thanks, Alex
2007-02-27, 18:15   #4
R.D. Silverman

Nov 2003

22×5×373 Posts

Quote:
 Originally Posted by jasonp You can avoid sieving with 2 entirely, make the cutoff for smooth relations more generous to compensate, then attempt to trial divide by 2 every relation whose log meets the cutoff. In fact, doing this for all primes smaller than a given bound (i.e. 30) will make the entire sieving stage a little (possibly a lot) faster. jasonp
My code did the following:

Before one sieves, one initializes an array to 0. As part of the initialization
procedure, one can simply set the relevant bytes in this array to log 2 [scaled, of course] instead of 0. This takes no extra time.

2007-02-27, 18:25   #5
jasonp
Tribal Bullet

Oct 2004

DD016 Posts

Quote:
 Originally Posted by akruppa does Msieve sieve with powers of small primes, i.e. sieve with p^k > 30 even though p<30 ? If yes, does doing vs. not doing it have an appreciable effect on yield?
The QS implementation does not sieve with any primes less than 256, and makes the log-cutoff 30 bits smaller to compensate. The powers of all the small primes are divided out of relations that meet the cutoff, and the log value is incremented to reflect the factors that a sieve value is known to have. Then the resulting log value is compared to the unmodified cutoff. No sieving with powers takes place; it would complicate the calculation of sieve roots, and the overwhelming majority of duplicate factors will be of primes less than 256 anyway.

The 'pad' of 30 was chosen so that the yield found in a few test factoizations came out to be about the same as the yield when small primes are not skipped. I wouldn't be surprised if more tuning was needed; the decision was made years ago and msieve is very different now. The bound of 256 is a lot larger than is usual for the small prime variation; it had to be that big because trial division by primes < 2^17 uses integer reciprocals, and the reciprocal values could not be computed for the small primes and still fit in 32 bits. Profiling shows that the smallest primes never take more than 10% of the total runtime (this tends to 0% for factorizations above about 70 digits), so it's okay to be conservative and test lots of sieve values. Currently about 1/3 of the sieve values making the first cutoff also make the second cutoff.

The NFS code uses resieving, and in that case it's important that as few relations as possible survive the log cutoff. Switching sieve roots is also easy, thus the NFS code sieves with all p^k for p^k < 1400

jasonp

2007-02-27, 19:05   #6
R1zZ1

Feb 2007

1012 Posts

Quote:
 Originally Posted by jasonp You can avoid sieving with 2 entirely, make the cutoff for smooth relations more generous to compensate, then attempt to trial divide by 2 every relation whose log meets the cutoff. In fact, doing this for all primes smaller than a given bound (i.e. 30) will make the entire sieving stage a little (possibly a lot) faster. jasonp
Thx a lot!

Can you approximately indicate me how much i have to increase the cutoff to compensate avoiding of all primes smaller than 30?

Last fiddled with by R1zZ1 on 2007-02-27 at 19:07

2007-02-28, 01:13   #7
jasonp
Tribal Bullet

Oct 2004

24×13×17 Posts

Quote:
 Originally Posted by R1zZ1 Can you approximately indicate me how much i have to increase the cutoff to compensate avoiding of all primes smaller than 30?
The smart way is to estimate the average contribution of each ignored prime to a random sieve value, and to increase the bound by the total contribution of all the skipped primes. For prime p with two sieve roots (i.e. most of them), it's easy to show that the contribution of p to a random sieve value is 2*log2(p)/(p-1).

However, I don't think you should use the smart way. What you want is the average contribution to a random *smooth* sieve value, and these contain a higher proportion of small p compared to random sieve values. Nobody has looked seriously at approximating this, so in the end it likely is preferable to decide on the amount of 'padding' by experiment.

Good luck,
jasonp

 2007-03-02, 21:59 #8 R1zZ1   Feb 2007 5 Posts YES! Thank you for helping me and my group to do this very very interesting project! I've founded a lot of interesting threads on this forum that helped us to do a good implementation of quadratic sieve algorithm. Now my mpqs implementation works and i was able to factorize 40+ digits very fast. The bottleneck is linear algebra solved with matker of pari/gp library: 10 minutes for ~1500 relations... Hence I'm forced to maintain the limit of factor base (B) very small to have a little number of relations and speed up linear algebra phase. Any ideas to solve this without implementing block lanczos myself (probably too much hard for my capabilities) and so factorize more digits faster? This project is for university course of number theory held by RenĂ© Schoof at University of Tor Vergata, Rome (Computer Engineer). My teacher is famous, do you know him? http://www.mat.uniroma2.it/~schoof/ Last fiddled with by R1zZ1 on 2007-03-02 at 22:26
2007-03-02, 23:42   #9
jasonp
Tribal Bullet

Oct 2004

67208 Posts

Quote:
 Originally Posted by R1zZ1 Now my mpqs implementation works and i was able to factorize 40+ digits very fast. The bottleneck is linear algebra solved with matker of pari/gp library: 10 minutes for ~1500 relations... Any ideas to solve this without implementing block lanczos myself (probably too much hard for my capabilities) and so factorize more digits faster?
I've heard that pari's linear algebra is slow, but 10 minutes for a small problem like that is crazy. It should be taking milliseconds. If you can use C code, you're welcome to borrow mine, or any of the publicly available implementations in this forum. Also google for 'SIMPQS' for another MPQS implementation in the development stage, that has integrated the block lanczos code from msieve.

jasonp

2007-09-11, 19:59   #10
smoking81

Sep 2007

1210 Posts

Quote:
 Originally Posted by jasonp The smart way is to estimate the average contribution of each ignored prime to a random sieve value, and to increase the bound by the total contribution of all the skipped primes. For prime p with two sieve roots (i.e. most of them), it's easy to show that the contribution of p to a random sieve value is 2*log2(p)/(p-1). However, I don't think you should use the smart way. What you want is the average contribution to a random *smooth* sieve value, and these contain a higher proportion of small p compared to random sieve values. Nobody has looked seriously at approximating this, so in the end it likely is preferable to decide on the amount of 'padding' by experiment. Good luck, jasonp
hi!
Can you please help me with an example?

Let's suppose that my (60-digit) n is = 340282366920938463463300000000009999999999974607431768211457,
then the primes < 30 in my FB are -1 2 23 29. Furthermore I extimated that
P(=#|FB|)=3000 and M=300'000

Which cutoff for searching smooth relations would you advise in this case?
Is it possible to put it in a formula (eg: log (M*sqrt(N))+??) or otherwise how would you suggest to calculate the padding?
thanks really a lot!bye

2007-09-11, 20:58   #11
bsquared

"Ben"
Feb 2007

19×179 Posts

Quote:
 Originally Posted by smoking81 hi! Can you please help me with an example? Let's suppose that my (60-digit) n is = 340282366920938463463300000000009999999999974607431768211457, then the primes < 30 in my FB are -1 2 23 29. Furthermore I extimated that P(=#|FB|)=3000 and M=300'000 Which cutoff for searching smooth relations would you advise in this case? Is it possible to put it in a formula (eg: log (M*sqrt(N))+??) or otherwise how would you suggest to calculate the padding? thanks really a lot!bye
I'll chime in...
First a comment:

The size of your FB (3000) and M seem small, at least compared with what my implementation uses. But from everything I've read, these can be very implementation specific. If I force mine to use your values, it is about 2x slower.

The small prime pad I get from trial sieving a couple blocks using only the small primes in the FB (i.e. < 50), and finding the average and standard deviation afterwards and adding them together to get the pad (in bits). (I read this method somewhere). It seems to work ok, but it also doesn't change much and so is probably not necessary.

 Thread Tools

 Similar Threads Thread Thread Starter Forum Replies Last Post mickfrancis Factoring 2 2016-05-06 08:13 mickfrancis Factoring 5 2016-03-31 06:21 Sam Kennedy Factoring 20 2013-01-09 16:50 ThiloHarich Factoring 13 2009-01-04 18:19 ThiloHarich Factoring 5 2006-07-14 09:51

All times are UTC. The time now is 12:30.

Sat Apr 10 12:30:58 UTC 2021 up 2 days, 7:11, 1 user, load averages: 1.66, 2.05, 2.12

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.