mersenneforum.org GF(2) Matrix request
 Register FAQ Search Today's Posts Mark Forums Read

 2007-11-06, 15:13 #1 oslik   Nov 2007 22 Posts GF(2) Matrix request Hi everyone. I'm now in a process of writing and testing the linear algebra step for GNFS (or QS), whoose goal is to solve linear system on GF(2). My realization is a realization of Block-Coppersmith algorithm and is intended for use on parallel computers (using MPI-interface). Block-Coppersmith can be parallelized much better, than Block-Lanscoz, which is now used in GGNFS and msieve programs, so that may be a good option for handling really large numbers. As I understood from this forum, many of you are doing hard job in factoring large integers using GGNFS or msieve programs, sometimes sieving takes a very long time. My question is, that if someone can give me some test matrices for linear algebra step, I will appreciate it very much. I hope, that after some testing I will be able to present a version of Block-Coppersmith for clusters and maybe even for distributed computations (I have some ideas and progs for that). So, I ask for matrices, of any size :-) Thank you in advance, Ivan.
2007-11-06, 16:35   #2
jasonp
Tribal Bullet

Oct 2004

5×709 Posts

Quote:
 Originally Posted by oslik My question is, that if someone can give me some test matrices for linear algebra step, I will appreciate it very much. I hope, that after some testing I will be able to present a version of Block-Coppersmith for clusters and maybe even for distributed computations (I have some ideas and progs for that). So, I ask for matrices, of any size :-)
I hope you have lots of disk space :)

The largest matrix I have on disk has size about 4M, which is only moderately large considering the size of current NFS efforts. With enough compression it should fit on a single CD. Send me email to coordinate delivery if you're interested; others here can offer far larger matrices

2007-11-06, 16:54   #3
fivemack
(loop (#_fork))

Feb 2006
Cambridge, England

33×239 Posts

Quote:
 Originally Posted by oslik So, I ask for matrices, of any size :-) Thank you in advance, Ivan.
In what format do you want these matrices? Tens of millions of lines of 'X Y\n'?

And what sort of size are you really after? The standard processing programs tend not to write out matrices in full, though obviously they could be persuaded to; a large factorisation I have lying around is

Code:
Constructed a 2916896 x 2917143 matrix of weight 202031153 in 82 minutes
Solved in 23.5 hours (two CPUs), using 1.5G memory
but writing out that matrix would take most of a CD in compressed form. I have significantly smaller ones available, too, from 120 rather than 206-digit SNFS.

2007-11-06, 18:57   #4
jasonp
Tribal Bullet

Oct 2004

5×709 Posts

Quote:
 Originally Posted by fivemack In what format do you want these matrices? Tens of millions of lines of 'X Y\n'? And what sort of size are you really after? The standard processing programs tend not to write out matrices in full, though obviously they could be persuaded to; a large factorisation I have lying around is
Since the latest msieve version does checkpoint and restart, the whole matrix is in fact written to disk, in a completely undocumented binary format.

Actually, maybe it would be a valuable exercise to document all the on-disk formats of the intermediate files. That would let other programs interoperate, and it's a big win to use the same relation format as GGNFS :)

 2007-11-06, 20:27 #5 henryzz Just call me Henry     "David" Sep 2007 Liverpool (GMT/BST) 176616 Posts would it be possible for someone to write a parellized siever as well so all the slowest bits can be shared
 2007-11-06, 21:18 #6 akruppa     "Nancy" Aug 2002 Alexandria 2,467 Posts Sieving already is embarrasingly parallel. Just assign different ranges of b-values or special-q values to different machines. Alex
 2007-11-06, 21:41 #7 frmky     Jul 2003 So Cal 5·479 Posts I can send a 6.4M matrix your way once I know the format you want, and how to put an msieve matrix in that format. Perhaps msieve 1.29's binary format would be easiest. Greg Last fiddled with by frmky on 2007-11-06 at 21:42
 2007-11-07, 08:17 #8 oslik   Nov 2007 1002 Posts Thank you for you replies. I've PM-ed everyone with details :-) The format I'm looking for can be any format I can read from file and convert in to GGNFS matrix format (I looks very convenient for handling sparse matrices). I'll report the results as soon as they are available and hope to present them and even the code :-)
 2007-11-07, 14:51 #9 jasonp Tribal Bullet     Oct 2004 1101110110012 Posts Greg, the easiest way to produce the matrix file is to rerun the filtering using msieve 1.29, then start the linear algebra. This will produce msieve.dat.mat; for linear algebra experiments that will not use the dependencies that are generated, you only need this file. Apologies that this will take hours though :( In case anyone is interested, here's the binary format (it's really easy): The matrix file is an array of 32-bit words, with byte ordering determined by the machine that produced the matrix (it will be little-endian for just about everybody). The nonzero matrix entries are stored by column, because they are easy to generate that way. The first word in the file is the number of rows, the second is the number of dense rows ('D'), and the third is the number of matrix columns. Each column is then dumped out. For each column, there is a 32-bit word indicating the number of sparse entries in the column, then a list of that many nonzero column positions (each a 32-bit word, with words in arbitrary order), then (D+31)/32 words for the dense rows in that column. The latter should be treated as an array of D bits in little-endian word order. Rows are numbered from 0 to (number of rows - 1). Rows 0 to D-1 are considered dense, with a set bit in the bitfield indicating a nonzero entry for that column. The sparse entries in each column are numbered from D onward. Functions dump_matrix and read_matrix, in gnfs/gf2.c, give sample code for writing to / reading from the matrix file. Does the above make sense? Does anyone think it would be a good idea to compress the above somehow? For example, for matrices with < 16M rows and columns (this is enough for 200-digit GNFS) all the 32-bit quantities can be stored as 24-byte integers, saving 1/4 of the disk space.
 2007-11-07, 15:27 #10 oslik   Nov 2007 22 Posts Yes it maybe worth for large matrices.Another idea may be to "zip" a matrix somehow, because if you store a sparse matrices (from GGNFS factorization) it is smaller in zip format, so there is a room for improvement :-). Maybe you can store not indices but their differences in each column (not the index, but the distance to the next non-zero entry), so fewer bits for the matrix storage. Last fiddled with by oslik on 2007-11-07 at 16:09
2007-11-07, 19:21   #11
frmky

Jul 2003
So Cal

5·479 Posts

Quote:
 Originally Posted by jasonp Apologies that this will take hours though :(
A few hours of time is the least of my worries, especially if it will lead to an efficient, parallel, public LA implementation.
Quote:
 Originally Posted by jasonp Does anyone think it would be a good idea to compress the above somehow?
I personally don't think it really matters. If there are lots of 0's, then that just means it will compress better when shipped from one place to another. Hard drives are large now, so the space during a run doesn't matter. It will take a bit longer to access, but as long as it is rarely accessed on disk during a run, it doesn't seem to make much difference.

Greg

 Similar Threads Thread Thread Starter Forum Replies Last Post Xyzzy Lounge 13 2017-02-21 18:29 Damian Puzzles 18 2012-12-08 19:01 fivemack Factoring 11 2009-08-18 17:39 buan Homework Help 3 2007-07-17 15:07 R.D. Silverman NFSNET Discussion 6 2006-03-19 17:56

All times are UTC. The time now is 00:18.

Mon Jun 27 00:18:14 UTC 2022 up 73 days, 22:19, 1 user, load averages: 1.75, 1.37, 1.15