
mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Math (https://www.mersenneforum.org/forumdisplay.php?f=8)
-   -   prime factorization algorithms? (https://www.mersenneforum.org/showthread.php?t=14803)

MathBoy 2011-01-22 18:39

OK, in your paper

I am having trouble understanding

(iv) and (v).

What does adding the value of [log(p_i)] have to do with anything?
I get that if r1, r2 are the 2 solutions to Q(x) = 0 mod p_i, then there exist other solutions of the form r1 + k*p_i for all k belonging to Z.

I also get how to find a factor base quickly using the Legendre symbol, and using the Tonelli–Shanks algorithm or Pocklington's algorithm to find the y value in y^2 = n mod p for the factor base.

(v) I get the M*sqrt(N) approximation from the fact that Q(x) = (x + [sqrt(n)])^2 - n, so when x is small we get Q(x) approx. = 2*x*[sqrt(n)].

I don't get where these log calculations in steps (iv) and (v) come from.

In the example here: [URL="http://en.wikipedia.org/wiki/Quadratic_sieve"]http://en.wikipedia.org/wiki/Quadratic_sieve[/URL]
Data collection part

I get this part completely: they use no log calculations and just look for the entries in the V array that become 1. That implies those V(x) elements factor completely over the factor base, which they call B-smooth.

But going by this, how do we know we are going to have enough B-smooth numbers to get FB+1 relations, so that the matrix has non-trivial solutions? This would depend on whether we chose our bound B and sieve interval M correctly.

Either way, I would like to know how the log calculations in your paper relate to the example on Wikipedia; it seems they're doing something completely different in a few stages of the data collection part.

R.D. Silverman 2011-01-23 00:40

[QUOTE=MathBoy;248420]OK, in your paper

I am having trouble understanding

(iv) and (v).

What does adding the value of [log(p_i)] have to do with anything?
I get that if r1, r2 are the 2 solutions to Q(x) = 0 mod p_i, then there exist other solutions of the form r1 + k*p_i for all k belonging to Z.

[/QUOTE]

Suppose that q(x) := Ax^2 + 2Bx + C = a mod n for x = x_0.

We want a to factor over the factor base. If

a = p1 * p2 * p3 * ...

then log(a) = log(p1) + log(p2) + log(p3) + ...

The sieve accumulates the logs of the primes p_i at locations x_j such
that q(x_j) = 0 mod p_i, i.e. p_i divides a. Thus, by accumulating the
logs of the primes in the "right locations", you can attempt to factor
many values of q(x) all at once.
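As a toy sketch of that accumulation (my own illustration, not code from the paper; the roots of Q(x) = 0 mod p are assumed to have been found already, e.g. by Tonelli–Shanks):

```python
import math

def log_sieve(M, factor_base_roots):
    """Accumulate log(p) at every location x in [0, M) with Q(x) = 0 mod p.

    factor_base_roots: list of (p, roots) pairs, where roots are the
    solutions of Q(x) = 0 mod p.  Every x = r + k*p lies in the same
    residue class as a root r, so p divides Q(x) at all those locations.
    """
    A = [0.0] * M
    for p, roots in factor_base_roots:
        logp = math.log(p)
        for r in roots:
            # step through the arithmetic progression r, r+p, r+2p, ...
            for x in range(r % p, M, p):
                A[x] += logp
    return A
```

A location x whose accumulated sum comes close to log|Q(x)| is then a candidate for splitting completely over the factor base.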

[QUOTE]

But going by this, how do we know we are going to have enough B-smooth numbers to get FB+1 relations, so that the matrix has non-trivial solutions? This would depend on whether we chose our bound B and sieve interval M correctly.

Either way, I would like to know how the log calculations in your paper relate to the example on Wikipedia; it seems they're doing something completely different in a few stages of the data collection part.[/QUOTE]

My paper gives values of B that work for various sized composites.

As for how they were first found? Experimentation.

MathBoy 2011-01-23 02:05

[QUOTE]Suppose that q(x) := Ax^2 + 2Bx + C = a mod n for x = x_0.

We want a to factor over the factor base. If

a = p1 * p2 * p3 * ...

then log(a) = log(p1) + log(p2) + log(p3) + ...

The sieve accumulates the logs of the primes p_i at locations x_j such
that q(x_j) = 0 mod p_i, i.e. p_i divides a. Thus, by accumulating the
logs of the primes in the "right locations", you can attempt to factor
many values of q(x) all at once.
[/QUOTE]

Confused. I get the properties of logs, etc., but when I look at the example given on Wikipedia, the data collection step uses nothing about logs, only the elements in V that are 1.

I guess I don't see where these logs are and why we are using them.

Sorry if I am not seeing the obvious.

axn 2011-01-23 02:54

[QUOTE=MathBoy;248584]Confused. I get the properties of logs, etc., but when I look at the example given on Wikipedia, the data collection step uses nothing about logs, only the elements in V that are 1.

I guess I don't see where these logs are and why we are using them.

Sorry if I am not seeing the obvious.[/QUOTE]

[url]http://en.wikipedia.org/wiki/Quadratic_sieve#Checking_smoothness_by_sieving[/url]

It clearly states where the logs are being added.

MathBoy 2011-01-23 05:19

Well, where are they using the log(p) stuff in this example?

[URL="http://en.wikipedia.org/wiki/Quadratic_sieve#Example_of_basic_sieve"]http://en.wikipedia.org/wiki/Quadratic_sieve#Example_of_basic_sieve[/URL]

I am assuming A[] is the V in this example.

WraithX 2011-01-23 05:44

[QUOTE=MathBoy;248634]Well, where are they using the log(p) stuff in this example?

[URL="http://en.wikipedia.org/wiki/Quadratic_sieve#Example_of_basic_sieve"]http://en.wikipedia.org/wiki/Quadratic_sieve#Example_of_basic_sieve[/URL]

I am assuming A[] is the V in this example.[/QUOTE]

I don't know where you got A[], but let's use it for this example. On that Wikipedia page, they divide the entries in V[] by the primes in the factor base to see whether an entry in V[] becomes 1.

Another way to do this is to set all the values in the array (let's call our array A[]) to zero. Then, for the prime 2 (from the example factor base), we find the first relation that has 2 as a factor (each relation corresponds to one location in A[]). For that location in A[] (let's call it x), we increment A[x + 2*k] (0 <= k <= len(A)/2) by log(2).

Then, for the prime 17 (the next prime in the example factor base), we find the first relation that has 17 as a factor. For that location in A[] (let's call it y), we increment A[y + 17*j] (0 <= j <= len(A)/17) by log(17). And so on.

At the end, we will have an array A[] with some zero entries (not smooth at all, according to the factor base), while all the other entries represent different levels of smoothness. At this point, we choose a minimum smoothness to decide which entries in A[] correspond to relations we want to keep. So, if we want the entries in A[] to be > 5.7 (pick a number here; choosing it is another algorithm), we search through the array for entries > 5.7, and then we know which relations to keep.

MathBoy 2011-01-23 07:10

OK, but if you do it that way, then you don't know whether the nonzero entries can be completely factored over your factor base.

In the Wikipedia example, a 1 in V indicates that that element of V was completely factored over the factor base.

How do you know this if you use the log-addition method?

I would assume the only way is to sum up all the log(p_i) over your factor base,
then search your A[] for entries equal to that sum of log(p_i) over all FB elements.

If this is how it's done, then I get it.

jasonp 2011-01-23 15:14

Using logarithms is a performance optimization; you have to factor a huge number of integers much faster than by trial dividing each one.

If you know that a small prime p divides a number x, you also know that it cannot divide the next p-1 numbers after x. This means you can skip trial division by p for a while. That's a long-winded way of describing a sieve procedure.

The sieve needs one entry for each x. You need to find which of the x are almost completely a product of small primes p. You could make each entry an integer and multiply it by p whenever p divides the corresponding value, or you can work in logarithms and perform an addition of log(p). The latter is much faster and more space-efficient than the former, and is what gives the sieve-based methods a great deal of their power.

MathBoy 2011-01-23 16:37

OK, but what I am trying to get at is that in the example on Wikipedia, you know that an entry of 1 in V means the number completely factored over the FB.

How do you know the equivalent when using logs?

Using a certain bound (say > 5.5) doesn't mean the numbers in A[] with this property completely factor over the FB.


I do understand that using logs is computationally better and saves time and space.
I just don't understand how you determine which elements completely factor over the FB.

Also, in the Wikipedia example, an entry of 1 in V means the element completely factored, but this will only catch elements that are products of primes in the set, not products of primes with powers different from 1. So it will give you at most 2^pi(d) numbers in V that completely factor, where d is your randomly chosen bound. There could be tons more that have higher powers, etc.

axn 2011-01-23 17:34

[QUOTE=MathBoy;248731]OK, but what I am trying to get at is that in the example on Wikipedia, you know that an entry of 1 in V means the number completely factored over the FB.

How do you know the equivalent when using logs?

Using a certain bound (say > 5.5) doesn't mean the numbers in A[] with this property completely factor over the FB.


I do understand that using logs is computationally better and saves time and space.
I just don't understand how you determine which elements completely factor over the FB.
[/QUOTE]

For the love of god, read
[QUOTE="the wiki page"]At the end of the factor base, any A[] containing a value [B]above a threshold of roughly log(n)[/B] will correspond to a value of y(x) which splits over the factor base. The information about exactly which primes divide y(x) has been lost, but it has only small factors, and there are many good algorithms (trial division by small primes, SQUFOF, Pollard rho, and ECM are usually used in some combination) for factoring a number known to have only small factors.[/QUOTE]
My bold.
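In code, the confirmation step the wiki paragraph describes could look like this (a sketch of the plain trial-division variant only; the function name is mine):

```python
def splits_over_fb(value, factor_base):
    """Trial-divide |value| by the factor-base primes.

    Returns the exponent vector if value factors completely over the
    factor base, or None if a cofactor survives (a false positive from
    the sieve, which is simply discarded).
    """
    v = abs(value)
    exps = []
    for p in factor_base:
        e = 0
        while v % p == 0:
            v //= p
            e += 1
        exps.append(e)
    return exps if v == 1 else None
```

So the log threshold only nominates candidates; this check (or SQUFOF, rho, ECM on the remaining cofactor) is what actually establishes smoothness and recovers the exponents for the matrix.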

fivemack 2011-01-23 18:52

[QUOTE=MathBoy;248731]OK, but what I am trying to get at is that in the example on Wikipedia, you know that an entry of 1 in V means the number completely factored over the FB.

How do you know the equivalent when using logs?

Using a certain bound (say > 5.5) doesn't mean the numbers in A[] with this property completely factor over the FB.
[/QUOTE]

That's fine; the point for all factor-base algorithms is that, if you've picked B correctly, you're looking for things of which there exist significantly more than the number you need. So it's worth making rough-and-ready choices to make the sieving faster, even at a price of sometimes getting false negatives and positives.

You take all the sievepoints that score highly enough, and then factor the associated number by trial division (or by some more clever method - a lot of the interesting bits of the GNFS sievers are in the choice of that method); it may be that it doesn't factor completely over the factor base. If it does factor completely, use it; if it doesn't, go on to the next high-scoring sievepoint.

The really good trick here is the large-primes variant - you sieve for things that have some factors in the factor base, then do a full factorisation by a cleverer method, and *add to the factor base* the small-enough extra factors you find that lie outside it. You can handle them specially rather than by full dense-matrix methods (for instance, if a prime appears only once in the entire job you just throw it away), and this technique turns out to be absolutely essential for getting adequate performance out of NFS and QS.
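A minimal sketch of the bookkeeping behind that single-large-prime idea (my own illustration, not from any particular siever): partials are keyed by their one large prime, and two partials sharing the same large prime q multiply into a relation where q appears squared, so q drops out of the exponent vector mod 2.

```python
def combine_partials(partials):
    """partials: iterable of (large_prime, relation) pairs, each relation
    smooth over the factor base except for one prime just beyond it.

    Pairs sharing the same large prime are matched into full relations;
    a large prime seen only once is simply thrown away.
    """
    seen = {}   # large prime -> first partial relation seen with it
    full = []
    for q, rel in partials:
        if q in seen:
            full.append((seen[q], rel))  # q^2 divides the product
        else:
            seen[q] = rel
    return full
```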

