mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Factoring (https://www.mersenneforum.org/forumdisplay.php?f=19)
-   -   Sieving polynomials in the Quadratic Sieve (https://www.mersenneforum.org/showthread.php?t=11180)

ThiloHarich 2008-12-23 13:38

Sieving polynomials in the Quadratic Sieve
 
Hi, I have a dumb question.
In MPQS we determine the sieve interval M of a function q(x) = (ax + b)^2 - N such that |q(x)| <= N for -M <= x <= M.
Why do we do this?
In fact we consider q'(x) = (ax + b)^2 mod N, which means that the value of q'(x) is bounded by N anyway.
Can't we just use one polynomial with an unlimited range M?

Another dumb question: doesn't it make sense to look at values q(x) = x^2 mod N for x near sqrt(l * N)?
Can't we just look at values q(x) = ceil(sqrt(l*N))^2 mod N?
This ensures that the generated values |q(x)| are small:
|q(x)| <= (sqrt(l*N) + 1)^2 mod N
= (l*N + 2*sqrt(l*N) + 1) mod N
= 2*sqrt(l*N) + 1
= O(sqrt(N)) (for small l)
So we have a much better chance to find smooth values.
Of course we cannot sieve here, but we can use Bernstein's GCD method.
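For example, a quick Python sketch of this bound (N here is an arbitrary composite I picked for illustration):

```python
from math import isqrt

# Take x = ceil(sqrt(l*N)); then q = x^2 - l*N satisfies
# 0 <= q <= 2*sqrt(l*N) + 1, and x^2 mod N equals q whenever q < N.
def small_residue(N, l):
    t = isqrt(l * N)
    x = t + (0 if t * t == l * N else 1)   # ceil(sqrt(l*N))
    return x, x * x - l * N

N = 1000003 * 1000033                       # arbitrary composite
for l in range(1, 20):
    x, q = small_residue(N, l)
    assert 0 <= q <= 2 * isqrt(l * N) + 3   # q = O(sqrt(l*N))
    assert x * x % N == q                   # holds while q < N
```

So the residues are on the order of sqrt(l*N) instead of N, as long as l stays small.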

jasonp 2008-12-23 16:34

MPQS needs a way to decide when it is better to switch to another polynomial because the sieve values of the current polynomial are getting too large, making relations too unlikely to be found at a useful rate. You can sieve each polynomial as long as you want, but eventually you will sieve 100x as long to find 2x as many relations; you'd find more relations in the same time by switching to another polynomial before that point. Also, knowing in advance how far you want to sieve lets you choose a*x+b so that the sieve forms a parabola whose average value is zero (i.e. with approximately equal numbers of positive and negative values), so that the worst-case size of a polynomial value is two or three bits smaller than you'd otherwise have to deal with.

Your second question is interesting, but I suspect you'll find that you have to perform a lot of computation to sample a large enough arithmetic progression of polynomial values to compare favorably with the number of samples you can get in a modest size sieve.

bsquared 2008-12-23 16:47

Forgive me for being so slow, but what does the 'l' refer to in your notation? For instance here:

[quote]sqrt (l * N)[/quote]

fivemack 2008-12-23 18:50

I think his idea is, instead of looking at values of (ax+b)^2 for various carefully chosen a and b, to look at values of ceil(sqrt(xN))^2 for lots of x.

When I try this, the values seem to have much worse smoothness properties than the quadratic sieve; I suspect it's a reinvention of the continued-fraction method without the advantages of using continued-fraction approximants.

My vague belief is that taking numbers that get bigger than N and reducing them mod N leaves them of size very close to N (expectation one bit shorter) while destroying any arithmetic properties that they have.
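That expectation is easy to check numerically (a throwaway sketch; the modulus and sample count are arbitrary):

```python
import random

random.seed(1)                          # deterministic sketch
N = 1000003 * 1000033                   # arbitrary composite modulus
# Reduce numbers much larger than N mod N and look at their size:
vals = [random.randrange(N, N * N) % N for _ in range(10000)]
avg_bits = sum(v.bit_length() for v in vals) / len(vals)
deficit = N.bit_length() - avg_bits     # comes out around 1 bit
```

The residues are uniform in [0, N), so on average they are only about one bit shorter than N, and there is no structure left to exploit.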

ThiloHarich 2008-12-23 21:13

@bsquared I mean the square root of an (integer) multiple of N (maybe I should have used 'i' instead of 'l').
In Fermat's method we choose x to be near the square root of N, so the result is near N, and so mod N the value is small.
Since we always compute mod N, we can look for values which will be near 2*N, 3*N, ....
So the preimage is something like the rounded square root of i*N, i \in N.
For x = round(sqrt(i*N)) we look at q(x) = x^2 mod N.

@jasonp the (intermediate) values q(x) will never exceed N^2 since we always calculate mod N. Of course the x might get bigger.
But we start off with x ~ sqrt(N), and the sieve interval is far below sqrt(N) as well, so we end up far below N as well.

@fivemack unfortunately I have only a rough idea of CFRAC. Does anybody have a good, short introduction to it?

jasonp 2008-12-23 21:38

[QUOTE=ThiloHarich;154779]
@jasonp the (intermediate) values q(x) will never exceed N^2 since we always calculate mod N. Of course the x might get bigger.
But we start off with x ~ sqrt(N), and the sieve interval is far below sqrt(N) as well, so we end up far below N as well.
[/QUOTE]
I wasn't clear about what I meant. If the sieve interval has a million sieve values, you only have to perform a few thousand single-precision operations to determine most of the smoothness properties of all of those sieve values. Your method would require, at the very least, computing the product of all one million multiple-precision sieve values before using Bernstein's remainder tree, and just that would be more work than the sieving. Sieving does work proportional to the factor base size with a very small proportionality constant; factoring the sieve values directly does work proportional to (at best) the number of sieve values.

You [i]can[/i] use both a sieve and a remainder tree on the survivors, and I suspect this would be faster than sieving alone when you allow three or more large primes in relations.
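For concreteness, here is a toy version of the batch smoothness test (not Bernstein's optimized variant; the helper names are mine):

```python
def product_tree(xs):
    """Levels of pairwise products, leaves first."""
    tree = [list(xs)]
    while len(tree[-1]) > 1:
        lvl = tree[-1]
        tree.append([lvl[i] * lvl[i + 1] if i + 1 < len(lvl) else lvl[i]
                     for i in range(0, len(lvl), 2)])
    return tree

def remainders(P, xs):
    """P mod x for every x in xs, via a remainder tree."""
    tree = product_tree(xs)
    rems = [P % tree[-1][0]]
    for lvl in reversed(tree[:-1]):
        rems = [rems[i // 2] % v for i, v in enumerate(lvl)]
    return rems

def batch_smooth(xs, primes):
    """True for each x that factors completely over `primes`.
    x is smooth iff x divides P^(2^e) with 2^e >= x, where P = prod(primes)."""
    P = 1
    for p in primes:
        P *= p
    flags = []
    for x, r in zip(xs, remainders(P, xs)):
        for _ in range(x.bit_length()):
            r = r * r % x            # r = P^(2^k) mod x after k squarings
        flags.append(r == 0)
    return flags
```

The product of the factor base is computed once; after that, detecting the smooth survivors costs roughly one remainder tree per batch, which is why it only pays off on survivors rather than on raw sieve values.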

PS: Regarding your original question, it is a misnomer to say that MPQS has "multiple polynomials". Any relation found by MPQS would be found with ordinary QS using the single ordinary QS polynomial. What MPQS really does is sample the ordinary QS polynomial at very widely separated values of x, where the sampling is structured so that the values examined are known to be divisible by some big numbers. This is just like using a lattice sieve for NFS problems, except the lattice is 1-dimensional instead of 2-dimensional.

ThiloHarich 2008-12-24 00:35

@jasonp on my second question you might be right. There might be a lot of computation to get the corresponding values. Maybe CFRAC is just an approximation/calculation of the sqrt(i*N) values.
But I think we can reduce the effort.
If we have a sieve interval M, we can calculate sqrt(N) to a precision of log(M) digits after the decimal point. Since M is much smaller than sqrt(N) this can be done easily, and it only has to be done once.
So we just have the multiplication (of this value) with sqrt(i). If we use only i = j^2, this just means that we have to calculate j * sqrt(N).
If we calculate sqrt(N) * 2^k for some k, we can calculate the values easily with the Gray-code style calculation we used for computing the b's in SIQS.

On my first question I'm not really satisfied with your answer.
I do not see that we get many fewer relations if we have big values for x, since we always calculate q(x) mod N. So the values should be N/2 on average. We lose just a very small linear factor, which should be smaller than 2 (if we calculate the average size of the resulting factors of the original parabola). Why can't we use just one polynomial?

jasonp 2008-12-24 00:50

[url="http://factorization.blogspot.com/2007/11/cfrac.html"]first google hit for CFRAC[/url]

I'll have to think more about your method, but I still think Tom is right. Bernstein has proposed using a remainder tree for CFRAC too.
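The key point of CFRAC, since you asked for an introduction: the continued-fraction expansion of sqrt(N) produces convergents h with h^2 ≡ ±Q (mod N) and Q < 2*sqrt(N). A bare-bones sketch of the standard recurrence (N must not be a perfect square; variable names are mine):

```python
from math import isqrt

def cfrac_residues(N, steps):
    """Continued-fraction expansion of sqrt(N) (N not a perfect square).
    Returns (h, Q, sign) triples with h^2 ≡ sign * Q (mod N)
    and 0 < Q < 2*sqrt(N)."""
    a0 = isqrt(N)
    m, d, a = 0, 1, a0
    h_prev, h = 1, a0 % N               # convergent numerators, kept mod N
    out = []
    for i in range(1, steps + 1):
        m = d * a - m
        d = (N - m * m) // d            # d is Q_i, bounded by 2*sqrt(N)
        a = (a0 + m) // d
        out.append((h, d, -1 if i % 2 else 1))   # h_{i-1}^2 ≡ ±Q_i (mod N)
        h, h_prev = (a * h + h_prev) % N, h
    return out
```

CFRAC then trial-divides the Q_i over the factor base; since Q_i < 2*sqrt(N), they are far likelier to be smooth than residues of size ~N.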

bsquared 2008-12-24 03:50

1 Attachment(s)
[quote=ThiloHarich;154805]

On my first question I'm not really satisfied with your answer.
I do not see that we get much fewer relation if we have big values for x since we always calculate q(x) mod N. So the values should be 1/2 N in average. we loose just a very small linear factor which should be smaller then 2 (if we might calculate the average size of the resulting factors of the original parabola). Why cant we use just one polynomial?[/quote]

We can choose the polynomial such that the "null" (root) of the parabola always falls in the sieve interval, to either side of our starting x. Attached is a graph of sieve locations x where we find a relation, along with the size in bits of the q(x) from which it came. This was generated from 1 polynomial having ~78 million total sieve locations. The average bit size of q(x) is ~140 (the input N has a size of 232 bits for this example).

If instead we use 100 polynomials, each with a 1/100th-sized sieve interval, we again sieve over 78 million sieve locations, but now the average bit size of q(x) is ~132. In practice I find about twice the number of relations in less time than it takes to sieve one giant sieve interval. The lower average size of q(x) occurs because the parabola does not have to be so "spread out" to force its null into a huge sieve interval. This will be illustrated in the next post, because I can only attach one thing to a post...

- ben.

bsquared 2008-12-24 03:52

1 Attachment(s)
And the 100 poly case...

So there is no reason why we can't sieve over 1 polynomial. In either case the size of the q(x) is much less than N (slightly bigger than sqrt(N)). The q(x) are just a bit smaller for smaller sieve intervals, and in practice it's faster.
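The size effect can be modeled without doing any sieving (a sketch; the modulus, interval, and the stand-in b are arbitrary, and a real MPQS polynomial would need b^2 ≡ N mod a):

```python
from math import isqrt

def avg_bits(N, M, npolys, samples=2000):
    """Average bit size of |q(x)| / a over one polynomial's interval,
    where q(x) = (a*x + b)^2 - N and a ~ sqrt(2N) / (M / npolys)."""
    Mp = max(M // npolys, 1)           # half-interval per polynomial
    a = max(isqrt(2 * N) // Mp, 1)
    b = a // 2                         # stand-in only (see note above)
    total = count = 0
    step = max(1, (2 * Mp) // samples)
    for x in range(-Mp, Mp + 1, step):
        v = abs((a * x + b) ** 2 - N) // a
        if v:
            total += v.bit_length()
            count += 1
    return total / count

N = (2**58 + 27) * (2**58 + 51)        # arbitrary ~116-bit composite
one = avg_bits(N, 1 << 25, 1)          # one polynomial, huge interval
hundred = avg_bits(N, 1 << 25, 100)    # 100 polynomials, 1/100th intervals
# `hundred` comes out several bits smaller than `one`
```

The reported values scale like Mp * sqrt(N), so splitting the interval over 100 polynomials shaves about log2(100) ~ 6.6 bits off the average.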

ThiloHarich 2008-12-24 20:51

@bsquared I think you argue from the fact that we choose a search interval M and then determine the factor a (of the resulting residues) to be something like sqrt(N)/M for one polynomial.
If we use more polynomials we can reduce M (simply split it over more polynomials), which means that we can increase a, if we restrict the search to the region around the roots of the polynomial.

I want to choose 'a' very big, independent of M.
If we choose a to be something like N/M we might still get M different values for this polynomial. And we have a known (chosen) big factor a ~ N/M in the residues.
So an unknown factor of at most M remains. Since M is subpolynomial (in the running time of the algorithm) this might be easy to determine.
The only issue I see is setting up the a^-1 mod p of the factor base to do the sieving.
But we can use a product of factor base primes like we did in SIQS. Setting up one polynomial should be easier than setting up multiple polynomials.
So where is my mistake?
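For what it's worth, the per-prime root setup I mean would look something like this (a sketch restricted to p ≡ 3 mod 4, where the modular square root is a single pow; names are mine):

```python
def sieve_roots(a, b, N, p):
    """Solve (a*x + b)^2 ≡ N (mod p): x ≡ a^(-1) * (±t - b) mod p,
    where t^2 ≡ N (mod p).  Assumes p ≡ 3 (mod 4) and p does not divide a."""
    t = pow(N, (p + 1) // 4, p)        # sqrt of N mod p, valid for p ≡ 3 mod 4
    if t * t % p != N % p:
        return []                      # N is not a quadratic residue mod p
    a_inv = pow(a, -1, p)              # the a^-1 mod p from above (Python 3.8+)
    return sorted({a_inv * (t - b) % p, a_inv * (-t - b) % p})
```

These are the two sieve locations per prime; for general p one would use Tonelli-Shanks instead of the single pow.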

Concerning CFRAC, I did not find a concrete description of the algorithm. But from what I guess of the theory behind it, they approximate sqrt(N), which results in values near some i * sqrt(N) as well. But I think they will not hit every i. This is my dumb interpretation of it.

On the algorithm I suggested, it might be done like this:
What about calculating s = sqrt(N) in the normal way up to a reasonable precision, and then calculating
q_0 = s
q_i = q_{i-1} + s
This is just an ordinary addition. Then we round it and factor the remainder over the factor base.
Improvement: for i = 2*j, set q_{2j} = 2 * q_j.
So this will not cost us much compared with the determination of the factors of this value.
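A minimal sketch of that scheme in fixed-point arithmetic (the precision and N are arbitrary choices; each step is a single big-integer addition):

```python
from math import isqrt

def addition_chain_candidates(N, count, prec=64):
    """q_i ~ i * sqrt(N), held as an integer scaled by 2^prec.
    Round to x and reduce x^2 mod N; since x ~ sqrt(i^2 * N),
    the residue lies within about i*sqrt(N) of 0 or N."""
    s = isqrt(N << (2 * prec))          # floor(sqrt(N) * 2^prec)
    half = 1 << (prec - 1)
    q, out = 0, []
    for i in range(1, count + 1):
        q += s                          # one addition per step
        x = (q + half) >> prec          # round(i * sqrt(N))
        out.append((x, x * x % N))
    return out
```

Note that i * sqrt(N) = sqrt(i^2 * N), so this chain only hits the square multipliers l = i^2, matching the i = j^2 remark above.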

ThiloHarich 2008-12-25 19:27

Of course we have to calculate round(q_i)^2 mod N to determine the factors over the factor base. The squaring is very time consuming.
Maybe we can improve here if we apply some restrictions.

ThiloHarich 2009-01-02 08:53

If we have a sieving interval M, the resulting size of the values of q(x) is in O(M * sqrt(N)).
For QS, M is O(exp(sqrt(log(N) * log log(N)))) and the values are bounded by O(N).
So I expect the running time to be decreased only by a constant factor in the exponent (something like sqrt(2)), which is far away from the NFS.
But shouldn't the same hold for CFRAC as well?
I'm going to implement it. I already have an MPQS in place, so let's see how it compares.

ThiloHarich 2009-01-04 18:19

The answer to my first question is rather simple: if q(x) gets greater than N we have to reduce it by applying mod N to the result. But then we cannot be sure that 'a' still divides q(x), so we have to limit M and choose different a's.


