mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Msieve (https://www.mersenneforum.org/forumdisplay.php?f=83)
-   -   CWI format --> GGNFS format (https://www.mersenneforum.org/showthread.php?t=13734)

jasonp 2010-08-16 15:46

Wow, busy today.

Bob: Msieve will read a single field for the (a,b) pair, then a set of fields for the rational factors until it hits a colon, then a set of fields for the algebraic factors until it hits a null, \n or \r character (thus it will work for unix, windows or mac text files or even concatenated files from different OSes). Factors that occur multiple times only need to appear in the text once. Factors should be separated by commas only, and no count of the factors is needed.

Msieve has always required a single text file with all relations.

Also, both GGNFS and msieve use algebraic ideals of the form a - b \alpha, which is the opposite of the CWI convention. The input parser requires that relation 'b' values be positive.

For the factor base file, coefficients can appear in any order and missing coefficients are treated as zero. The degree of the polynomial is just that of the largest nonzero coefficient. The SKEW line is not necessary (it's treated as 1.0 if not found)

The filtering assigns a number to each line in the relation file and uses that number to uniquely identify it throughout the postprocessing. The output from the filtering is not human readable, it is only a '*.cyc' file with each matrix column and the relation numbers that appear in that column. The filtering does all the work in one pass, it is not expected that cycle files are improved iteratively.

For large problems the LA writes a '*.chk' checkpoint file.

I checked the converter binary on the web page and got the same error, it was linked against a 3-year old msieve library and there's no way I'd be able to rebuild it now. You may be right that it would be quicker to build your own converter, but if you're going to be playing with the msieve source we'll have to get the library compiled eventually :)

R.D. Silverman 2010-08-16 16:12

[QUOTE=jasonp;225682]Wow, busy today.

Bob: Msieve will read a single field for the (a,b) pair, then a set of fields for the rational factors until it hits a colon, then a set of fields for the algebraic factors until it hits a null, \n or \r character (thus it will work for unix, windows or mac text files or even concatenated files from different OSes). Factors that occur multiple times only need to appear in the text once. Factors should be separated by commas only, and no count of the factors is needed.

Msieve has always required a single text file with all relations.

Also, both GGNFS and msieve use algebraic ideals of the form a - b \alpha, which is the opposite of the CWI convention. The input parser requires that relation 'b' values be positive.

For the factor base file, coefficients can appear in any order and missing coefficients are treated as zero. The degree of the polynomial is just that of the largest nonzero coefficient. The SKEW line is not necessary (it's treated as 1.0 if not found)

The filtering assigns a number to each line in the relation file and uses that number to uniquely identify it throughout the postprocessing. The output from the filtering is not human readable, it is only a '*.cyc' file with each matrix column and the relation numbers that appear in that column. The filtering does all the work in one pass, it is not expected that cycle files are improved iteratively.

For large problems the LA writes a '*.chk' checkpoint file.

I checked the converter binary on the web page and got the same error, it was linked against a 3-year old msieve library and there's no way I'd be able to rebuild it now. You may be right that it would be quicker to build your own converter, but if you're going to be playing with the msieve source we'll have to get the library compiled eventually :)[/QUOTE]

I got the converter to work. Sort of. I fixed the format problems in
msieve.fb then ran the code. It runs fine on most relations, but causes
a seg fault on relations that have more than 9 factors for either polynomial.

It handles:

0133 r1 r2 r3 a1 a2 a3

just fine, but barfs on

01A3 r1 r2 r3 ....r10 a1 a2 a3

i.e. there are 10 factors of the rational polynomial, and the convert
code does not seem to handle the hex 'A' .

R.D. Silverman 2010-08-16 16:21

[QUOTE=R.D. Silverman;225685]I got the converter to work. Sort of. I fixed the format problems in
msieve.fb then ran the code. It runs fine on most relations, but causes
a seg fault on relations that have more than 9 factors for either polynomial.

It handles:

0133 r1 r2 r3 a1 a2 a3

just fine, but barfs on

01A3 r1 r2 r3 ....r10 a1 a2 a3

i.e. there are 10 factors of the rational polynomial, and the convert
code does not seem to handle the hex 'A' .[/QUOTE]

Also, CWI format allows comment fields in the data that begin with #.
The converter barfs on these as well. I use these to make sure that I
never get output files from different projects mixed up. When I see
#C(2,2166L) at the head of my file, I know which project it is
from. CWI post processing ignores these comment fields. cwi2gg barfs
on them.

R.D. Silverman 2010-08-16 16:32

[QUOTE=jasonp;225682]Wow, busy today.

Bob: Msieve will read a single field for the (a,b) pair, then a set of fields for the rational factors until it hits a colon, then a set of fields for the algebraic factors until it hits a null, \n or \r character (thus it will work for unix, windows or mac text files or even concatenated files from different OSes). Factors that occur multiple times only need to appear in the text once. Factors should be separated by commas only, and no count of the factors is needed.

Msieve has always required a single text file with all relations.

Also, both GGNFS and msieve use algebraic ideals of the form a - b \alpha, which is the opposite of the CWI convention. The input parser requires that relation 'b' values be positive.

For the factor base file, coefficients can appear in any order and missing coefficients are treated as zero. The degree of the polynomial is just that of the largest nonzero coefficient. The SKEW line is not necessary (it's treated as 1.0 if not found)

The filtering assigns a number to each line in the relation file and uses that number to uniquely identify it throughout the postprocessing. The output from the filtering is not human readable, it is only a '*.cyc' file with each matrix column and the relation numbers that appear in that column. The filtering does all the work in one pass, it is not expected that cycle files are improved iteratively.

For large problems the LA writes a '*.chk' checkpoint file.

I checked the converter binary on the web page and got the same error, it was linked against a 3-year old msieve library and there's no way I'd be able to rebuild it now. You may be right that it would be quicker to build your own converter, but if you're going to be playing with the msieve source we'll have to get the library compiled eventually :)[/QUOTE]


I plan on using my own NFS siever since msieve does not have an NFS
lattice siever. If you ever do implement one, I would like to help out
with the source code.


Note:

The CWI filter code does do its work in multiple passes. It requires
a lot of manual input. Set filter parameters, do a pass, see the results,
reset parameters, do another pass etc.

It is also horribly slow and uses a LOT of memory. It also does
frequent reallocs(), which cause WINDOZE a lot of problems; the realloc
frequently fails. Here is a data point.

I filtered the data for 2,1191- using the CWI filter code. One of the
input parameters is 'maxrels', the maximal number of relations to combine
together to form an output relation. With maxrels set at 11, I managed
to form a matrix file of 7.1M rows and average lit bits of 80 per row.
With maxrels set higher I could not get the filter code to run at all
because of the realloc() problems, even on a machine with 4GB and XP-64.

Greg is doing the final processing. He managed to squeeze the matrix
down to 6.5M rows, with an average of about 105 lit bits using msieve.

jasonp 2010-08-16 16:34

[QUOTE=R.D. Silverman]
I think I found a bug. It works on most inputs.
I will not work on inputs (seemingly) that have a
HEX character in the first input field:
[/QUOTE]
Yes, I now remember that it cannot handle a factor count of 10 or more. I think the converter was written before someone (Jes Hansen?) sent me the CWI header file that gave the file format, so I only had example output to work from.

Also, the LA includes the quadratic character rows in the input matrix (the QS and NFS codes use the same LA implementation), and the preprocessing in the LA pulls out the 48 most dense rows for a separate Gauss elimination pass. Usually the portion of the matrix that the Lanczos iteration actually solves has 65-85 nonzeros per column. What the CWI filtering calls maxrels is fixed at 28 in msieve.

Much more detail on the filtering is available [url="http://cado.gforge.inria.fr/workshop/slides/papadopoulos.pdf"]here[/url]

frmky 2010-08-16 17:55

My code, too, probably can't handle more than 9 factors. Richard always used a larger print bound of 1000000 so that wasn't an issue for the NFSNet relations. It also has nasty and unnecessary dependencies on the GGNFS code.

The cleanest thing to do would be to read in the polys in whatever format, take Jason's relation reading code fixed for >9 factors, divide out the provided factors, then just use a bit of trial factoring to pull out the tiny factors followed by Pollard Rho or ECM to find the rest. One more hitch is that CWI rational and algebraic factors can come in either order, depending on which poly was specified first. In GGNFS, the rational factors always come first.

As a concrete example, here are equivalent CWI and GGNFS relations for 2,1191-. My code includes all factors, but those <1000 can be omitted.

[CODE]0155 556389 873452 13901 29671 42829 53279 71569 27361 128761 152539 548851 4165627;
0135 -385643 376992 18553 19531 1029881 12739 12757 95923 105607 798961;
0157 -11665601 481884 37013 37649 67231 98389 3369059 11119 13711 17659 659881 705559 1442971 2120917;
0174 12191 3427 13619 28549 37021 48973 60133 70919 422111 51487 60091 119101 4600399;
0156 884621 6709141 10847 11689 12973 22783 4472021 21163 24391 520309 1417489 4054387 4307473;
0166 -4314838 12265509 20219 23473 30853 48679 90197 3689869 11287 237157 748387 1997029 4656061 4763149;
0157 -693025 2335558 39301 65927 77813 96557 5300963 13903 42853 64171 73681 437743 2307541 4518061;
0146 211638 2879759 35053 41651 55201 3470611 40531 159337 167119 1114381 2451721 8552737;
0167 5964581 4301494 12791 23209 23509 68729 87943 3799219 24043 32233 110899 777103 1006393 2854309 7358233;

556389,873452:364d,73e7,a74d,d01f,11791,2f,67,d3,167,761,b03,eb1,258d:6ae1,1f6f9,253db,85ff3,3f8ffb,2,4f,61,6d,c1
-385643,376992:4879,4c4b,fb6f9,5,2f,e3,1bb,1cf,25f,287,971,9ad,18eb,1a2f:31c3,31d5,176b3,19c87,c30f1,2,2a1,397,bdd
-11665601,481884:9095,9311,1069f,18055,336863,5,7,3d,df,42d,63d,ad9,14e3:2b6f,358f,44fb,a11a9,ac417,16049b,205cd5,2,7,7f,2e3
12191,3427:3533,6f85,909d,bf4d,eae5,11507,670df,7,6b,7f,14b,1d3:c91f,eabb,1d13d,46324f,3,7,175
884621,6709141:2a5f,2da9,32ad,58ff,443cd5,5,d,17,ad,fb,6a3,b89,1a35,1df9:52ab,5f47,7f075,15a111,3ddd73,41ba11,3,7,7f,142f
-4314838,12265509:4efb,5bb1,7885,be27,16055,384d8d,2,7,1d,9d,df,11b,5ab,67f:2c17,39e65,b6b63,1e78e5,470bbd,48ae0d,7,d,67,eb9
-693025,2335558:9985,10187,12ff5,1792d,50e2e3,7f,15d,1bb,295,3b9,41b,2507:364f,a765,faab,11fd1,6adef,2335d5,44f0ad,2,3
211638,2879759:88ed,a2b3,d7a1,34f513,2,13,29,3d,df,67f,1163,164f,1ecb,1f01:9e53,26e69,28ccf,11010d,256909,828121,d,6cd
5964581,4301494:31f7,5aa9,5bd5,10c79,15787,39f8b3,b,1bb,26b,52f,16e5,1855:5deb,7de9,1b133,bdb8f,f5b39,2b8da5,704719,2,3,d
[/CODE]

R.D. Silverman 2010-08-16 18:02

[QUOTE=jasonp;225690]Yes, I now remember that it cannot handle a factor count of 10 or more. I think the converter was written before someone (Jes Hansen?) sent me the CWI header file that gave the file format, so I only had example output to work from.

[/QUOTE]

This limitation is a MAJOR problem. Many relations have more than 10
prime factors for one or more of the polynomials.

It looks as if I will have to write my own converter.

xilman 2010-08-16 18:06

[quote=frmky;225701]The cleanest thing to do would be to read in the polys in whatever format, take Jason's relation reading code fixed for >9 factors, divide out the provided factors, then just use a bit of trial factoring to pull out the tiny factors followed by Pollard Rho or ECM to find the rest. [/quote]The CWI suite has a separate program, factorrelations, to find the "tiny" divisors. Quite frequently it would be used in several passes with successively smaller bounds on what constitutes "tiny", interleaved with filtering passes. The idea was to save file space and disk IO by filtering on large primes before going on to smaller.


Paul

R.D. Silverman 2010-08-16 18:09

[QUOTE=frmky;225701]My code, too, probably can't handle more than 9 factors. Richard always used a larger print bound of 1000000 so that wasn't an issue for the NFSNet relations. It also has nasty and unnecessary dependencies on the GGNFS code.

The cleanest thing to do would be to read in the polys in whatever format, take Jason's relation reading code fixed for >9 factors, divide out the provided factors, then just use a bit of trial factoring to pull out the tiny factors followed by Pollard Rho or ECM to find the rest. One more hitch is that CWI rational and algebraic factors can come in either order, depending on which poly was specified first. In GGNFS, the rational factors always come first.

As a concrete example, here are equivalent CWI and GGNFS relations for 2,1191-. My code includes all factors, but those <1000 can be omitted.

[CODE]0155 556389 873452 13901 29671 42829 53279 71569 27361 128761 152539 548851 4165627;
0135 -385643 376992 18553 19531 1029881 12739 12757 95923 105607 798961;
0157 -11665601 481884 37013 37649 67231 98389 3369059 11119 13711 17659 659881 705559 1442971 2120917;
0174 12191 3427 13619 28549 37021 48973 60133 70919 422111 51487 60091 119101 4600399;
0156 884621 6709141 10847 11689 12973 22783 4472021 21163 24391 520309 1417489 4054387 4307473;
0166 -4314838 12265509 20219 23473 30853 48679 90197 3689869 11287 237157 748387 1997029 4656061 4763149;
0157 -693025 2335558 39301 65927 77813 96557 5300963 13903 42853 64171 73681 437743 2307541 4518061;
0146 211638 2879759 35053 41651 55201 3470611 40531 159337 167119 1114381 2451721 8552737;
0167 5964581 4301494 12791 23209 23509 68729 87943 3799219 24043 32233 110899 777103 1006393 2854309 7358233;

556389,873452:364d,73e7,a74d,d01f,11791,2f,67,d3,167,761,b03,eb1,258d:6ae1,1f6f9,253db,85ff3,3f8ffb,2,4f,61,6d,c1
-385643,376992:4879,4c4b,fb6f9,5,2f,e3,1bb,1cf,25f,287,971,9ad,18eb,1a2f:31c3,31d5,176b3,19c87,c30f1,2,2a1,397,bdd
-11665601,481884:9095,9311,1069f,18055,336863,5,7,3d,df,42d,63d,ad9,14e3:2b6f,358f,44fb,a11a9,ac417,16049b,205cd5,2,7,7f,2e3
12191,3427:3533,6f85,909d,bf4d,eae5,11507,670df,7,6b,7f,14b,1d3:c91f,eabb,1d13d,46324f,3,7,175
884621,6709141:2a5f,2da9,32ad,58ff,443cd5,5,d,17,ad,fb,6a3,b89,1a35,1df9:52ab,5f47,7f075,15a111,3ddd73,41ba11,3,7,7f,142f
-4314838,12265509:4efb,5bb1,7885,be27,16055,384d8d,2,7,1d,9d,df,11b,5ab,67f:2c17,39e65,b6b63,1e78e5,470bbd,48ae0d,7,d,67,eb9
-693025,2335558:9985,10187,12ff5,1792d,50e2e3,7f,15d,1bb,295,3b9,41b,2507:364f,a765,faab,11fd1,6adef,2335d5,44f0ad,2,3
211638,2879759:88ed,a2b3,d7a1,34f513,2,13,29,3d,df,67f,1163,164f,1ecb,1f01:9e53,26e69,28ccf,11010d,256909,828121,d,6cd
5964581,4301494:31f7,5aa9,5bd5,10c79,15787,39f8b3,b,1bb,26b,52f,16e5,1855:5deb,7de9,1b133,bdb8f,f5b39,2b8da5,704719,2,3,d
[/CODE][/QUOTE]


How did you convert the CWI data that I sent you? Many of the
relations would have had polynomials with more than 9 factors.

R.D. Silverman 2010-08-16 18:11

[QUOTE=R.D. Silverman;225707]How did you convert the CWI data that I sent you? Many of the
relations would have had polynomials with more than 9 factors.[/QUOTE]

Also, how long did the conversion take?

R.D. Silverman 2010-08-16 18:22

[QUOTE=xilman;225706]The CWI suite has a separate program, factorrelations, to find the "tiny" divisors. Quite frequently it would be used in several passes with successively smaller bounds on what constitutes "tiny", interleaved with filtering passes. The idea was to save file space and disk IO by filtering on large primes before going on to smaller.


Paul[/QUOTE]

Yes. I use this program. I run the filter code on large/larger primes,
refactor, then run the filter code on the fully factored relations.

I did in fact run it on the data that I sent Greg. Many of the
resulting relations would have had norms with more than 9 prime
factors. So I wonder how he converted my data??? He tells me that
he has filtered the data and will be starting the LA shortly.

I am having other problems with the CWI-->GGNFS relation converter
as well. It seems to reject many valid relations that the CWI refactor
code has no problem with. I do not know why.

I am going to give up on the converter. Instead, I will create a version
of my siever that emits relations in GG format. This will take time.

This will take time.


All times are UTC. The time now is 23:25.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.