mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > Factoring

Reply
 
Thread Tools
Old 2012-11-04, 02:27   #1
Sam Kennedy
 
Sam Kennedy's Avatar
 
Oct 2012

1228 Posts
Default Quadratic Sieve by Hand

I'm trying to figure out how the quadratic sieve works by doing it by hand. I'm pretty sure I'm just making stuff up, this is what I have figured out so far:

Let n = 703
k = floor(sqrt(703)) = 26
the upper limit = floor(sqrt(703*2)) = 37

Then I calculated r^2 - n, using the values 27 to 37.
27 -> 26
28 -> 81
29 -> 138
30 -> 197
31 -> 258
32 -> 321
33 -> 386
34 -> 453
35 -> 522
36 -> 593
37 -> 666

Next I factored each of the above numbers:
26 = 2^1 * 13^1
81 = 3^4
138 = 2^1 * 3^1 * 23^1
197 = prime
258 = 2^1 * 3^1 * 43^1
321 = 3^1 * 107^1
386 = 2^1 * 193^1
453 = 3^1 * 151^1
522 = 2^1 * 3^2 * 29^1
593 = prime
666 = 2^1 * 3^2 * 37^1

For my factor base I will use the primes up to 107, so I exclude any numbers with prime factors greater than that, this gives the vector matrix:

1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
1,2,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,2,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

I then calculate each of the values in that matrix mod 2, giving:

1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

Now I'm unsure of what to do next, or if any of the above is correct?

If someone could give an example of factoring a number using the quadratic sieve, keeping the number theory to a minimum, that would be really helpful
Sam Kennedy is offline   Reply With Quote
Old 2012-11-04, 11:44   #2
LaurV
Romulan Interpreter
 
LaurV's Avatar
 
Jun 2011
Thailand

874210 Posts
Default

you have to add them mod 2 componentwise in all combinations possible until you get a sum which is 0 (vector 0). You already have this in the second vector, giving you the fact that 28^2=3^4, or (28-9)(28+9)=0 mod 703 and you just factored your number.

edit: this "add them componentwise" can be done in a more or less cleverer way (imagine when you solve systems of equations by reduction)

Last fiddled with by LaurV on 2012-11-04 at 11:47
LaurV is offline   Reply With Quote
Old 2012-11-04, 17:05   #3
Sam Kennedy
 
Sam Kennedy's Avatar
 
Oct 2012

1228 Posts
Default

Did I do everything else correctly?

I'm not sure what 'adding componentwise' is, I haven't studied Maths past high school level, and I've forgotten the majority of what I learned anyway.

Are you hinting at Gaussian Elimination? That's the only method I've read about for the last step, I don't understand how it works though :(
Sam Kennedy is offline   Reply With Quote
Old 2012-11-04, 18:29   #4
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

11100001101012 Posts
Default

Quote:
Originally Posted by Sam Kennedy View Post
Did I do everything else correctly?

I'm not sure what 'adding componentwise' is, I haven't studied Maths past high school level, and I've forgotten the majority of what I learned anyway.

Are you hinting at Gaussian Elimination? That's the only method I've read about for the last step, I don't understand how it works though :(
Yes, this is the extended equivalent of solving a system of linear equations.

(Note: The standard convention both on Wikipedia and in mathematics in general is to write the vectors as columns in the matrix, not as rows as you did. Such is used in the following links.)

In the case of sieving methods of factorization (QS, NFS), what you do in general is find a combination of those vectors which sum to zero, corresponding to a congruence of squares from which the number factorizes.

This is the realm of linear algebra, typically a first or second year course for a physics or math undergraduate student. (It's the extended study of systems of linear equations.) You can either do that with (a variant of) Gaussian elimination, reducing the matrix to reduced row echelon form, from which the nullspace can be read off trivially.

As LaurV pointed out, the second row in the matrix is a vector in the nullspace, i.e. take the linear combination of (1*second vector + 0*all others) and you get zero. That means that linear combination corresponds to a square number that factors completely over the factor base. In this particular case, this means that 1*3^4 + 0*others is the square of a number (3^2=9) that factors completely over the primes <= 107. Since you initially calculated 28^2==81[703], this means that (28)^2 == (3^2)^2 [703], and thus 28^2 and 9^2 form a congruence of squares, whence their sum and difference, 19 and 37, are the factors of n.

In general, though, you will need more vectors (i.e. more smooth numbers over the factor base) than primes in the factor base to guarantee that some combination sums to zero; in this case, since your factor base has 28 primes, you'd need 29 numbers factored over the factor base to guarantee the existence of a null vector, though obviously in this case in wasn't necessary.

Additionally, to expand on what LaurV meant by "more or less cleverly", Msieve doesn't use Gaussian elimination, but rather the Block Lanczos method, which takes advantage of the fact that 1) the matrix is only composed of zeros and ones, not any real number, and 2) the matrix tends to be very, very sparse (i.e. composed almost entirely of zeros with very few ones as entries).

Last fiddled with by Dubslow on 2012-11-04 at 18:33 Reason: adding a link
Dubslow is offline   Reply With Quote
Old 2012-11-04, 23:00   #5
Sam Kennedy
 
Sam Kennedy's Avatar
 
Oct 2012

10100102 Posts
Default

I'm trying to understand the gaussian elimination process, the description I was given is "Starting with the first column on the left hand side, we find the first string with a 1 in that column and add it modulo 2 to each of the succeeding strings with a 1 in that column. We then eliminate this first string with a 1 in the first column"

So given the following (matrix on left, identity matrix on right):
00110 100000
11010 010000
11101 001000
10000 000100
10000 000010
11101 000001

I find the second row is the first to have a 1 in the first column, I add this row mod 2 to the row below: 11010 + 11101 mod 2 = 00111
Then I do the same with the identity matrix: 010000 + 001000 mod 2 = 011000, and repeat for each of the rows below with a 1 in the first column, after the first step I have:

00110 100000
00000 010000
00111 011000
01010 010100
01010 010010
00111 010001

Is that correct?

Also, I wasn't sure what to do after you have added two rows to cancel each other out. I have my table (as in my original post):

r -> r^2 - n

I know if I have a single 0 vector, I can calculate the factors as: left hand side +/- square root of right hand side, but what about when I have to use more than one row?
Sam Kennedy is offline   Reply With Quote
Old 2012-11-04, 23:18   #6
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts
Default

Quote:
Originally Posted by Sam Kennedy View Post
I'm trying to understand the gaussian elimination process, the description I was given is "Starting with the first column on the left hand side, we find the first string with a 1 in that column and add it modulo 2 to each of the succeeding strings with a 1 in that column. We then eliminate this first string with a 1 in the first column"

So given the following (matrix on left, identity matrix on right):
00110 100000
11010 010000
11101 001000
10000 000100
10000 000010
11101 000001

I find the second row is the first to have a 1 in the first column, I add this row mod 2 to the row below: 11010 + 11101 mod 2 = 00111
Then I do the same with the identity matrix: 010000 + 001000 mod 2 = 011000, and repeat for each of the rows below with a 1 in the first column, after the first step I have:

00110 100000
00000 010000
00111 011000
01010 010100
01010 010010
00111 010001

Is that correct?
You put the sum of the second and third rows into the third row, the sum of the second and fourth rows into the fourth row, etc., leaving the second row as is. Then you put the second row on top.

It should look like:

11010 010000
00110 100000
00111 011000
01010 010100
01010 010010
00111 010001

Quote:
Originally Posted by Sam Kennedy View Post
Also, I wasn't sure what to do after you have added two rows to cancel each other out. I have my table (as in my original post):

r -> r^2 - n

I know if I have a single 0 vector, I can calculate the factors as: left hand side +/- square root of right hand side, but what about when I have to use more than one row?
In general, the ith vector represents the parity of the power of each prime-from-the-factorbase in the factorization of the number ri2 mod n. Thus, if after finding the nullspace of the matrix, you see that v1 + v3 + v4 + v9 + ... = the zero vector (or whichever vectors are the solution) then that means that the sum of those power-of-primes is even, which means that when you multiply those r12 * r32 * r42 * r92 * ... = Y, that is a square number. Then reduce Y mod n. If X = r1 * r3 * r4 * r9 * ... (reduced mod n), then your congruence is X^2 == Y mod n, where Y is (by construction) a square number. Then X+sqrt(Y) and X-sqrt(Y) are possible factors. (I got my knowledge of such things from section 1.2 of this paper.)

Last fiddled with by Dubslow on 2012-11-04 at 23:29 Reason: I'm pretty sure this is correct
Dubslow is offline   Reply With Quote
Old 2012-11-04, 23:59   #7
Sam Kennedy
 
Sam Kennedy's Avatar
 
Oct 2012

5216 Posts
Default

Thank you very much for taking the time to explain all of this, I'm finally getting to grips with how this sieve works! :)

Now I just need to find some paper and write down everything before I forget!
Sam Kennedy is offline   Reply With Quote
Old 2012-11-05, 00:18   #8
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3×29×83 Posts
Default

See also the paper referenced here. (It's about SIQS though, not plain QS, and is probably out of your league, as it's out of mine at the moment.)

For teh lulz: in the paper, he mentions a 3.5M matrix being somewhat impratical, while my computer is currently doing a 4.16M matrix, using 1.5 GB of memory, in about a day.

Last fiddled with by Dubslow on 2012-11-05 at 00:35 Reason: lol
Dubslow is offline   Reply With Quote
Old 2012-11-05, 19:07   #9
Sam Kennedy
 
Sam Kennedy's Avatar
 
Oct 2012

5216 Posts
Default

Quick question, using the matrix I posted above, should the final result be (after first pass):

00001 111000
00110 100000
11100 110000
10000 000100
10000 000010
00000 001001
Sam Kennedy is offline   Reply With Quote
Old 2012-11-07, 00:19   #10
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3×29×83 Posts
Default

Quote:
Originally Posted by Sam Kennedy View Post
Quick question, using the matrix I posted above, should the final result be (after first pass):

00001 111000
00110 100000
11100 110000
10000 000100
10000 000010
00000 001001
I'm not sure what you mean by "pass". If you mean after eliminating the ones in the first column, then it should look like what I posted a few above, i.e.

11010 010000
00110 100000
00111 011000
01010 010100
01010 010010
00111 010001.

Or have you made other progress since this post?
Dubslow is offline   Reply With Quote
Old 2012-11-07, 02:26   #11
Sam Kennedy
 
Sam Kennedy's Avatar
 
Oct 2012

2·41 Posts
Default

By one pass I meant after eliminating the fifth column.

I have written my own version of the quadratic sieve, but the code is horrible and it barely works. Instead of moving the row to the top, I just changed all the values to -1 and ignored it, probably why it doesn't work.

The data collection stage is also really slow, as I test each number for smoothness, attempting to factor it over the factor base, even with a factor base as small as 100, it is incredibly slow.

I'm going to start again from scratch in a few days, this has to be by far the most challenging thing I've attempted to code.

Here is a summary of what my program does:
-Input n
-
Input smoothness bound (upper limit for factor base)
-Use sieve of Eratosthenes to create factor base
-Starting at the ceiling of the square root of n (let's call the value r), calculate r^2 mod n
-
Divide r repeatedly by each factor in the factor base, if the value is 1, save r, otherwise increment and start again, until there are as many values of r as in the factor base, plus one
-Create vector matrix
-Do my horrible version of gaussian elimination
-Check for zero vectors, then attempt to factor, if no zero vectors, do gaussian elimination again
-Use gcd to factor number

Somehow, my code works on small numbers... but I'm assuming the way I'm doing gaussian elimination and collecting values for r are what is causing the program to fail.
-
Sam Kennedy is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Java Quadratic Sieve Ilya Gazman Factoring 3 2016-02-22 11:32
Finding B in Quadratic Sieve paul0 Factoring 3 2011-09-22 17:12
Possible improvement of quadratic sieve Random Poster Factoring 4 2010-02-12 03:09
Factoring in the Quadratic Sieve ThiloHarich Factoring 47 2007-01-08 14:12
Quadratic Sieve in wikipedia.de ThiloHarich Factoring 5 2006-07-14 09:51

All times are UTC. The time now is 22:12.

Sat Sep 26 22:12:26 UTC 2020 up 16 days, 19:23, 0 users, load averages: 1.55, 1.55, 1.52

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.