mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Factoring (https://www.mersenneforum.org/forumdisplay.php?f=19)
-   -   Running GGNFS (https://www.mersenneforum.org/showthread.php?t=9645)

Andi47 2010-05-31 05:07

[QUOTE=Batalov;216714]For use with factMsieve.pl, the file should contain "type: gnfs|snfs" no matter what it was created with. The perl script usually simply stops when this line is missing, and the reason is clear - gnfs and snfs parameters are very different, and [I]guessing[/I] will get you wrong results one way or another. So the perl script doesn't guess and stops - with a suggestion to add this line (unsure why it hasn't stopped in your case).

[SIZE=1]For those who don't use the script, this line is unimportant, but note that these users run sims and set all the parameters themselves (or others do, for distributed projects). That's a different story.[/SIZE][/QUOTE]

Hmmm... maybe you missed my edit in the posting above, which I made after taking a second look:

The [I]original[/I] polynomial file (test.poly) did contain the line "type: gnfs"

[CODE]name: test
type: gnfs
n: 236708764512270870267910594603939528512072109211410346931645230023427790084558423739104638523341257857996001764863899983660894883
skew: 332552.31
Y0: -8222033980857525894160773
Y1: 91488489895543
c0: -9222601885280082007413455156084
c1: 109094498156246601498372704
c2: 3395885860600535601449
c3: -6049174266708986
c4: -29393783262
c5: 6300
[/CODE]

FactMsieve.pl created 8 files (test.job.T1, ..., test.job.T8) from this, which contained the strange parameters above.

Questions:

1.) The line
[CODE] $DIGS = ($type eq "gnfs") ? $realDIGS/0.72 : $realDIGS;
[/CODE]

Does it mean: "if $type is equal to 'gnfs', then set $DIGS to $realDIGS/0.72, else set $DIGS to $realDIGS"?

2.) The composite to be factored is a c129. When I take a closer look at
[CODE] if($DIGS>=160) { # the table parameters are easily splined; the table may be not needed at all --SB.
$RLIM= $ALIM = int(0.07*10**($DIGS/60.0)+0.5)*100000;
$LPBR= $LPBA = int(21+$DIGS/25);
$MFBR= $MFBA = ($DIGS<190) ? 2*$LPBR-1 : 2*$LPBR;
$RLAMBDA= $ALAMBDA= ($DIGS<200) ? 2.5 : 2.6;
$QINTSIZE=$QSTEP = 100000;
$classicalA = 4000000;
$classicalB = 400;
$paramDIGS = $realDIGS;
[/CODE]

it seems that $DIGS had a value of 179.16666666667, as r/alim was set to 6800000, lpbr/a was 28, mfbr/a was 55, and the lambdas were 2.5.
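For those following along, the quoted spline can be checked by transliterating it into Python (function name is mine; the values reproduce those observed above):

```python
def spline_params(digs):
    """Transliteration of the quoted factMsieve.pl spline
    (the $DIGS >= 160 branch) into Python, for checking
    which parameters it produces. Function name is mine."""
    rlim = int(0.07 * 10 ** (digs / 60.0) + 0.5) * 100000   # $RLIM = $ALIM
    lpbr = int(21 + digs / 25)                              # $LPBR = $LPBA
    mfbr = 2 * lpbr - 1 if digs < 190 else 2 * lpbr         # $MFBR = $MFBA
    rlambda = 2.5 if digs < 200 else 2.6                    # $RLAMBDA = $ALAMBDA
    return rlim, lpbr, mfbr, rlambda

# gnfs c129 under the 0.72 rule: $DIGS = 129 / 0.72 = 179.1666...
print(spline_params(129 / 0.72))  # -> (6800000, 28, 55, 2.5)
```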

My [I]guess[/I] after taking this close look: $DIGS [I]should be[/I] set to $realDIGS for GNFS and to $realDIGS/0.72 for SNFS, but it [I]is[/I] the other way round.

Correct? If not, please correct me.

Batalov 2010-05-31 06:49

0. Yes, I missed that; tl;dr. It makes sense that there's no such line in the .job files, and if the line had been missing from the .poly file, the script would have failed.

1. Yes. However note that $DIGS is a local variable and its meaning is (supposed to be) "effective SNFS complexity".

2. No. If ($type eq "gnfs"), then $realDIGS was the size of the number; otherwise it was the SNFS difficulty ($realDIGS is $_[0], which in Perl means the first supplied parameter; $type is $_[2], which similarly means the third parameter). This expression makes $DIGS correspond to an idealized SNFS case, for which M.Kamada's (and yours truly's) splines are used.[FONT=Fixedsys][SUP][FONT=Fixedsys]*[/FONT][/SUP][/FONT]

3. The way the script is written, it is supposed to work in the dumbest mode - with GGNFS's old def-par files. Because those were short, the script extends them in the areas where data were lacking. The script never overrides what is already put in the .poly file, so when you want to use your own parameters, I recommend putting them in the .poly file.

There's a lot in the script that requires knowledge of perl. There are some books, so I will not even try to describe anything here; topics to read are subs, lexical scope (i.e. $type [I]is[/I] a different variable in different parts of the code; in a sub, the local definition eclipses the external one); eq vs. ==, etc. Or you can see the python script, if that's your [URL="http://www.youtube.com/watch?v=sMZwZiU0kKs"]language of choice[/URL].

_________
*This was done because there was always much more data for snfs trials and errors, see [URL="http://homepage2.nifty.com/m_kamada/math/graphs.htm"]these charts[/URL] ; the gnfs data was always sparse, harder to fit, so it was easier to think in terms "this gnfs is fairly well equivalent to such and such snfs", and then use the well researched snfs parameters.
I have later come to like the following decision boundary
[FONT=Fixedsys](1) gnfs_size <=compare=> snfs_diff * 0.56 + 30[/FONT]
over the older
[FONT=Fixedsys](2) gnfs_size <=compare=> snfs_diff * 0.72 [/FONT]
(0.72 for the easy {aliquot-like} gnfs jobs, then sliding to 0.7, then 0.68 in different difficulty areas)
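The two decision boundaries can be compared with a quick Python check (function names are mine; both results appear later in this thread):

```python
def snfs_equiv_old(gnfs_size):
    """Older rule (2): gnfs_size ~ snfs_diff * 0.72."""
    return gnfs_size / 0.72

def snfs_equiv_new(gnfs_size):
    """Newer rule (1): gnfs_size ~ snfs_diff * 0.56 + 30."""
    return (gnfs_size - 30) / 0.56

# Effective SNFS difficulty of a gnfs-129 under each rule:
print(round(snfs_equiv_old(129), 4))  # -> 179.1667
print(round(snfs_equiv_new(129), 4))  # -> 176.7857
```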

Andi47 2010-05-31 07:19

[QUOTE=Batalov;216754]2. No. If ($type eq "gnfs"), then $realDIGS was the size of the number; otherwise it was the SNFS difficulty ($realDIGS is $_[0], which in Perl means the first supplied parameter; $type is $_[2], which similarly means the third parameter). This expression makes $DIGS correspond to an idealized SNFS case, for which M.Kamada's (and yours truly's) splines are used.[FONT=Fixedsys][SUP][FONT=Fixedsys]*[/FONT][/SUP][/FONT][/quote]

Hmmmm... so do I understand correctly - the script calculates that GNFS-129 is ~as difficult as SNFS-179.16666 and thus takes parameters which are usually taken for SNFS-179? Hmmm... I think I will rise "(if $DIGS>160") to "(if $DIGS>220)" in my local copy of my script.

[QUOTE]3. The way the script is written, it is supposed to work in the dumbest mode - with GGNFS's old def-par files. Because those were short, the script extends them in the areas where data were lacking. The script never overrides what is already put in the .poly file, so when you want to use your own parameters, I recommend putting them in the .poly file.
[/QUOTE]

That's true if I choose to factor e.g. a Cunningham or Homogeneous Cunningham number. But for aliquot sequences one often prefers to "fire and forget" (the aliqueit script does several factorization methods (Rho, ECM, P+/-1, QS, GNFS) in a row).

A nice compromise (over just raising the limit for letting factmsieve choose the params) would be to check whether def-par.txt holds suitable parameters (i.e. for a composite of similar size (+/- 3 digits) to the one being factored), and if not, let the script calculate its own parameters. But I don't know how to implement that, neither in Perl nor in Python.

[QUOTE]There's a lot in the script that requires knowledge of perl. There are some books, so I will not even try to describe anything here; topics to read are subs, lexical scope (i.e. $type [I]is[/I] a different variable in different parts of the code; in a sub, the local definition eclipses the external one); eq vs. ==, etc. Or you can see the python script, if that's your [URL="http://www.youtube.com/watch?v=sMZwZiU0kKs"]language of choice[/URL].[/QUOTE]

Unfortunately, my Python skills are even worse than my Perl skills.


[quote]I have later come to like the following decision boundary
[FONT=Fixedsys](1) gnfs_size <=compare=> snfs_diff * 0.56 + 30[/FONT]
over the older
[FONT=Fixedsys](2) gnfs_size <=compare=> snfs_diff * 0.72 [/FONT]
(0.72 for the easy {aliquot-like} gnfs jobs, then sliding to 0.7, then 0.68 in different difficulty areas)[/QUOTE]

I will probably change it accordingly in my local copy of factmsieve.pl. (For GNFS-129, $DIGS will be 176.7857 instead of 179.1666. Hmmm... that would still be lpbr/a 28, even though SNFS-177 seems to be just below the 27/28 crossover.)

Batalov 2010-05-31 08:10

You got the spirit of it.

I don't strongly object to the 27/28 crossover placement. It may be higher or lower. Usually, around it, jumping up is almost effortless; the time will be roughly the same; don't overthink it. I'd do 28 for a gnfs-129, and M.Kamada's examples show [URL="http://hpcgi2.nifty.com/m_kamada/f/c.cgi?q=89999_189"]the same[/URL]. (He does have his own opinions about mfba/r's, which is fine. Everyone is entitled.)
I am sure that you can do 27 just as well. I'd say sim them - but that would contradict the fire-and-forget model.

If I were awakened in the middle of the night and asked where the bit-level jump points were, I'd tell you without blinking an eye: "175, 200, 225, 250, 275 in SNFS terms" (that's for starting 28, 29, 30, 31, 32). I got very used to this mnemonic (but it's just a starting point for simming). And I know that in other people's books I sit on the low side; they use 32-bit even in the 260s. I am only now planning my very first 32-bit job (B+D); all of the previous ones I have cut lower, and they were all 31-bit tops, with the 31/32-bit exception of the gnfs-180, - and they all worked fine.
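The mnemonic above can be sketched as a lookup (a toy illustration; the function name is mine, and below SNFS-175 it simply extrapolates down to 27 bits):

```python
import bisect

# Jump points quoted above: starting large-prime bits go
# 28, 29, 30, 31, 32 at SNFS difficulties 175, 200, 225, 250, 275.
def starting_lpb(snfs_diff):
    """Rule-of-thumb starting lpbr/lpba (just a starting
    point for simming). Function name is mine."""
    jumps = [175, 200, 225, 250, 275]
    return 27 + bisect.bisect_right(jumps, snfs_diff)

print(starting_lpb(179))  # -> 28 (e.g. the SNFS equivalent of a gnfs-129)
print(starting_lpb(260))  # -> 31
```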

So, I recommend adventure. Do this one with 27 if you are so inclined; later, when you get another 129, try 28. You know what I mean. There's more than one way to skin a cat.

jrk 2010-06-01 17:49

Speaking of GNFS/SNFS default parameters: why are they tabulated based on the size of the number being factored instead of, say, the Murphy E value of the polynomial?

Batalov 2010-06-01 18:34

Good question.
I guess because nobody volunteered to do that.

EdH 2010-06-01 19:45

Adding an extra computer to gain relations
 
I have a computer (single core, 2.4GHz) working on a c119. I am contemplating trying to cut its time by sieving on a separate machine and combining the relations. Do I have the following right:

- I plan to use the same sieve and poly file as the original
- I plan to choose the bounds sufficiently higher than what is currently running, so the two machines won't cover the same section (not really sure how concerned I should be, or how to judge)
- I plan to append the new test.dat (or, whatever I name it) to the original test.dat
- I expect the Python script to find the extra relations and act accordingly

Are there pitfalls I'm overlooking?

Thanks for any thoughts.

Mini-Geek 2010-06-02 00:45

[QUOTE=EdH;216994]- I plan to use the same sieve and poly file as the original[/QUOTE]
Same sieve file? Do you mean it'll be writing to a file named test.dat on the other computer? That doesn't really matter; what's more important is the moment you combine them. But if you avoid using the name test.dat, you won't have to worry about renaming it when you go to put it in the same folder as the original test.dat to append them together.
And yes, definitely use the same polynomial file.
[QUOTE=EdH;216994]- I plan to choose the bounds sufficiently higher than what is currently running, so the two machines won't cover the same section (not really sure how concerned I should be, or how to judge)[/QUOTE]
You'll definitely want to make sure they sieve different ranges. You can use a command line like the one from [URL="http://www.mersenneforum.org/showthread.php?t=13328"]this team sieve[/URL] to make it a bit easier. Be sure to use all the right parameters for your job, most importantly the right gnfs-lasieve version for your number. The easiest way to find that is to see what the script chose to use on the first computer.
If you're not careful, you'll be duplicating work, and the useless duplicates will be ignored when you go to filter.
[QUOTE=EdH;216994]- I plan to append the new test.dat (or, whatever I name it) to the original test.dat[/QUOTE]
Yep, this sounds good. Just make sure the Python script isn't writing, or about to write, to test.dat; otherwise the append will either fail to open the file for writing or, worse, write the data all mixed together, ruining those lines. (To be safest, first shut down the Python script and wait for it to finish appending everything to test.dat.)
[QUOTE=EdH;216994]- I expect the Python script to find the extra relations and act accordingly[/QUOTE]
Yep, it will see the extra relations the next time it checks the file and act accordingly. Just keep in mind that it won't know which ranges it needs to avoid; it just blindly keeps running as it was. So if it is about to duplicate work by reaching the q-values you started on the secondary machine, you'll want to stop it, adjust it, and start it again.
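The "sieve different ranges" advice above amounts to partitioning the special-q interval between the machines; a minimal sketch (the helper name is mine; each resulting pair would map to the siever's start/count options):

```python
def split_q_range(q0, q1, machines):
    """Carve the special-q interval [q0, q1) into contiguous,
    non-overlapping sub-ranges, one per machine. Helper name
    is mine; each (start, end) pair maps to a siever start of
    'start' and a count of (end - start)."""
    step = (q1 - q0) // machines
    ranges = []
    for i in range(machines):
        lo = q0 + i * step
        # last machine absorbs any remainder so q1 is reached
        hi = q1 if i == machines - 1 else lo + step
        ranges.append((lo, hi))
    return ranges

print(split_q_range(3500000, 5500000, 2))
# -> [(3500000, 4500000), (4500000, 5500000)]
```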

EdH 2010-06-02 13:50

"Same sieve file" meant "same lattice siever", which in this case is 13e. Here is a view of the current sieving operation:
[code]
. . .
-> Estimated minimum relations needed: 8.7e+06
-> cleaning up before a restart
-> Running lattice siever ...
-> entering sieving loop
-> making sieve job for q = 2000000 in 2000000 .. 2100000 as file test.job.T0
-> Lattice sieving algebraic q from 2000000 to 2100000.
-> gnfs-lasieve4I13e -k -o spairs.out.T0 -v -n0 -a test.job.T0
FBsize 149256+0 (deg 5), 283145+0 (deg 1)
total yield: 567878, q=2100001 (0.02738 sec/rel)
. . .
Found 567879 relations, 6.5% of the estimated minimum (8700000).
-> making sieve job for q = 2100000 in 2100000 .. 2200000 as file test.job.T0
-> Lattice sieving algebraic q from 2100000 to 2200000.
. . .
Found 1131303 relations, 13.0% of the estimated minimum (8700000).
-> making sieve job for q = 2200000 in 2200000 .. 2300000 as file test.job.T0
-> Lattice sieving algebraic q from 2200000 to 2300000.
. . .
Found 1696383 relations, 19.5% of the estimated minimum (8700000).
-> making sieve job for q = 2300000 in 2300000 .. 2400000 as file test.job.T0
-> Lattice sieving algebraic q from 2300000 to 2400000.
. . .
Found 2264328 relations, 26.0% of the estimated minimum (8700000).
-> making sieve job for q = 2400000 in 2400000 .. 2500000 as file test.job.T0
-> Lattice sieving algebraic q from 2400000 to 2500000.
-> gnfs-lasieve4I13e -k -o spairs.out.T0 -v -n0 -a test.job.T0
. . .
[/code]
Estimating 6.5% per 1e5 range, I figured 3.5e6 and 2e5 for the values needed. With test.poly in a subdirectory named tmp, this gave me the following:
[code]
gnfs-lasieve4I13e -a tmp\test.poly -o tmp\additions.txt -f 3500000 -c 200000
[/code]
But I was given back this:
[code]
Please set all bounds to reasonable values!
[/code]
I then tried several different values and kept receiving the same message.

In reviewing my test.poly file, I see that a lot of values are missing compared to the poly files we use for the team sieves. My test.poly:
[code]
name: test
type: gnfs
n: 48580809819699200329708555632318129178027387020506913072044091953387200862671848975524614933791220018550035318044657829
skew: 54940.12
Y0: -79189672477676783946937
Y1: 6360030448193
c0: -4051842427045653038354536200
c1: -93642941254012491376278
c2: 6007458903024306275
c3: 13211933274479
c4: -1154858551
c5: 15600
[/code]
Since the Python script adds the -k and -n0 options, I tried them as well, but no luck. And I can't seem to find the explanation for the option switches. :sad:

Should I construct some values for the ones omitted in the poly file, or should I look at something else?

Thanks!

Mini-Geek 2010-06-02 16:24

I think I see the problem. Looks like I was a little bit mistaken in my advice. :blush:
test.poly doesn't, as you've seen, contain all the bounds and such that the lattice sievers need. What does? Why, the file that the script passes to the sievers instead of test.poly, of course! That's test.job.T0.

So delete test.poly, copy test.job.T0 there as test.poly, and remove the lines referring to q0, qintsize, and q1 (you'll be setting those on the command line). Then when you run the command line, it should all work.
Or, if you prefer, leave those q lines in there and use them instead of setting that stuff on the command line. As implied by q1 being commented out, the only ones that really have an effect are q0 and qintsize, which correspond to -f and -c; q1 just makes it a bit easier to see how far those lines mean the sieving will go.
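The recipe above (strip the q lines from the job file so the range can be given with -f and -c instead) can be sketched as follows; the helper name is mine, and it assumes the q lines begin with "q0", "qintsize" or "q1", as in GGNFS job files:

```python
def job_to_poly(job_text):
    """Drop the q0/qintsize/q1 lines from a .job file's text so
    the q-range can be given on the command line via -f and -c.
    Assumes those lines begin with 'q0', 'qintsize' or 'q1';
    helper name is mine."""
    drop = ("q0", "qintsize", "q1")
    kept = [line for line in job_text.splitlines()
            if not line.lstrip().startswith(drop)]
    return "\n".join(kept)

job = "n: 123\nrlim: 6800000\nq0: 3500000\nqintsize: 200000\n"
print(job_to_poly(job))  # keeps only the n: and rlim: lines
```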

EdH 2010-06-02 16:54

[quote=Mini-Geek;217131]. . . That's test.job.T0. . . [/quote]
Which I should have been able to recognize, had I been observant. :sad:
[code]
gnfs-lasieve4I13e -k -o spairs.out.T0 -v -n0 [B]-a test.job.T0[/B]
[/code]
I changed q0 and qintsize to 3500000 and 200000, respectively, and ran:
[code]
gnfs-lasieve4I13e -a tmp\test.job.T0 -o tmp\additions.txt
[/code]
And all seems to be working, although a lot slower than expected. I might just change qintsize to 1e5 and restart. I can always run it again if it finishes in good time.

Thanks much.

