mersenneforum.org


mklasson 2009-03-31 12:05

pol51 going nuts
 
My def-nm-params.txt contains the lines
[code]
104,47,1,4,2.55E+015,1.18E+014,1.30E+012,1.71E-009
105,52,1,4,3.74E+015,1.69E+014,1.80E+012,1.50E-009
106,58,1,4,5.47E+015,2.41E+014,2.50E+012,1.32E-009
[/code]

Despite this, pol51 will occasionally go nuts and keep running for far too long.

On a c105 = 969285843405686434922111180198247129811119779490366570602072510950842965916129679209384200464095292408427
I just now caught it spewing
[code]
-> Searching leading coefficients from 244489001 to 244490000.
=> "c:/PROGRA~2/ggnfs/bin/pol51m0b.exe" -b test.polsel.Fluffputer.3748 -v -v -p 4 -n 3.74E+015 -a 244489 -A 244490 > test.polsel.Fluffputer.3748.log
ull5a5p-comp
-> Searching leading coefficients from 244490001 to 244491000.
=> "c:/PROGRA~2/ggnfs/bin/pol51m0b.exe" -b test.polsel.Fluffputer.3748 -v -v -p 4 -n 3.74E+015 -a 244490 -A 244491 > test.polsel.Fluffputer.3748.log
[...]
[/code]

after 3.5 hours... No better poly was found after the first 13 minutes. I can only hope it would have stopped at some point and not continued forever. Is this a known problem, and would it eventually have stopped on its own? I'd hate to go away for a week and come back to find pol51 had wasted 4 CPU-weeks.
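For reference, the -p and -n values in the logged command line match the fourth and fifth fields of the 105 row, and each -a/-A pair covers 1000 leading coefficients. The sketch below (Python rather than Perl, purely for illustration) reproduces that mapping. The field-to-flag correspondence is an assumption inferred only from the matching numbers in the log, not from the factMsieve.pl source, and the helper names are hypothetical.

```python
# Rows copied from the def-nm-params.txt excerpt in this post.
ROWS = """\
104,47,1,4,2.55E+015,1.18E+014,1.30E+012,1.71E-009
105,52,1,4,3.74E+015,1.69E+014,1.80E+012,1.50E-009
106,58,1,4,5.47E+015,2.41E+014,2.50E+012,1.32E-009
"""


def row_for_digits(text, digits):
    """Return the fields of the row whose first field equals the digit count."""
    for line in text.splitlines():
        fields = line.strip().split(",")
        if fields and fields[0] == str(digits):
            return fields
    return None


def pol51_args(fields, a_start):
    """Build the -p/-n/-a/-A arguments in the form seen in the log.

    Assumption: field 4 feeds -p and field 5 feeds -n (inferred from the log).
    """
    return ["-p", fields[3], "-n", fields[4],
            "-a", str(a_start), "-A", str(a_start + 1)]


def a5_range(a_start):
    """Leading coefficients covered by one -a value (units of 1000, per the log)."""
    return a_start * 1000 + 1, (a_start + 1) * 1000


fields = row_for_digits(ROWS, 105)
print(" ".join(pol51_args(fields, 244489)))
print(a5_range(244489))
```

The printed range matches the "Searching leading coefficients from 244489001 to 244490000" line, which suggests the search had already advanced through roughly 244 million leading coefficients.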

I'm using the svn340 release.

edit: I'm running factMsieve.pl btw.

jasonp 2009-03-31 13:04

These leading coefficients are far too big for a C105 (I'd give up after a coefficient of 100000 or so). I suspect that beyond a certain point there's a fatal error but the Perl script just keeps spawning pol51m0b again.

mklasson 2009-04-01 12:43

On a similar note, I apparently got 33 files similar to
[code]
2009-03-31 10:45 0 badsched.test.job.0.4313807.4008667
2009-03-31 10:47 0 badsched.test.job.0.4316717.2717482
2009-03-31 10:53 0 badsched.test.job.0.4323947.3366609
2009-03-31 10:53 0 badsched.test.job.0.4325059.4011825
2009-03-31 10:54 0 badsched.test.job.0.4325393.2565123
[...]
[/code]
yesterday morning. I have no idea what they mean, but they don't look friendly...

Running the same setup as in previous post, factMsieve.pl etc.

test.poly:
[code]
name: test
n: 35373631192423512915418120852463046769722622393931134744531110598833584457248300247665784285065284152054474974571377279662403
skew: 63647.61
# norm 1.15e+017
c5: 126420
c4: -17975579263
c3: -4010545972620842
c2: 64557889599813041127
c1: 3229497510574796277726897
c0: -47806795774904077274785354044
# alpha -5.31
Y1: 35844769583447
Y0: -775127852616210677728069
# Murphy_E 1.47e-010
# M 482218219503405099638123000357002934503417822156871448744662127640645442594711301370820249192570539180457687464720228950693
type: gnfs
rlim: 8000000
alim: 8000000
lpbr: 27
lpba: 27
mfbr: 51
mfba: 51
rlambda: 2.5
alambda: 2.5
qintsize: 100000
[/code]

fivemack 2009-04-01 13:01

badsched files mean that an estimate for the size of an array didn't come out right, and the scheduling routine ran out of space.

They tend to occur when the special-Q limit on the side you're sieving is close to a power of two (2^22 in your case).

You lose some relations, but it's not catastrophic.
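A quick way to see how close a given special-Q is to a power of two is sketched below in Python. It assumes that the first number in each badsched file name (for example 4313807) is the special-Q value; that reading of the file name is an assumption, and the function name is hypothetical.

```python
import math


def nearest_power_of_two(q):
    """Return (exponent, relative offset) for the power of two nearest to q on a log scale."""
    exponent = round(math.log2(q))
    relative_offset = q / 2 ** exponent - 1
    return exponent, relative_offset


# Values taken from the badsched file names listed earlier in the thread
# (assumed here to be special-Q values).
for q in (4313807, 4316717, 4323947, 4325059, 4325393):
    exponent, offset = nearest_power_of_two(q)
    print(f"q={q}: nearest power is 2^{exponent}, offset {offset:+.1%}")
```

All five listed values sit about 3% above 2^22 = 4194304.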


The message you're getting from pol51m0b ("ull5a5p-comp") means that 5 * a5 * 2^64 divided by a certain product of primes doesn't fit into 64 bits (i.e. that a5 is too big), and obviously the script ought to have stopped when it started getting this message.
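The arithmetic behind that condition can be sketched as follows (in Python, whose integers do not overflow). The prime product used here is a made-up placeholder chosen only so that the example has one case on each side; the product pol51m0b actually uses is not given in this thread, and the function name is hypothetical.

```python
UINT64_LIMIT = 2 ** 64


def quotient_fits_in_64_bits(a5, prime_product):
    """True if floor(5 * a5 * 2^64 / prime_product) is below 2^64."""
    quotient = (5 * a5 * UINT64_LIMIT) // prime_product
    return quotient < UINT64_LIMIT


# Placeholder only: NOT the product pol51m0b really uses.
HYPOTHETICAL_PRIME_PRODUCT = 101 * 103 * 107 * 109

# A small leading coefficient, like the c5 in test.poly earlier in the thread.
print(quotient_fits_in_64_bits(126420, HYPOTHETICAL_PRIME_PRODUCT))
# A leading coefficient from the runaway search log.
print(quotient_fits_in_64_bits(244489001, HYPOTHETICAL_PRIME_PRODUCT))
```

Since the quotient is below 2^64 exactly when 5 * a5 is smaller than the prime product, every a5 past that threshold triggers the same error, whatever the real product is.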

mklasson 2009-04-01 13:09

Thank you both. Too bad I have no Perl skills whatsoever.

10metreh 2009-04-01 13:41

I have sieved around 2^20 several times, and never got a badsched. How often do they occur?

fivemack 2009-04-01 13:54

fix
 
OK, the problem is line 543

[code]
if ($res) {$H=$HH; next;} # lambda-comp and ull5-comp related errors can be simply skipped
[/code]

which jumps back to the top of the loop *without* testing for the termination condition. Easy to fix: replace that line with

[code]
unless ($res) {
[/code]
and close the } after line 612 ( unlink "$pname.log"; ). Could someone with SVN commit access check that that's right and commit it?
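To make the control-flow difference concrete, here is a toy model of the loop in Python (not the actual Perl). It assumes one pol51m0b run per iteration and replaces the script's time limit with an iteration count; the function and parameter names are hypothetical.

```python
def search(run_failed, limit, skip_check_on_error):
    """Model the search loop and return how many iterations were executed.

    run_failed: sequence of booleans, True when the simulated run errored.
    limit: iteration count standing in for the script's time limit.
    skip_check_on_error: True models the original `next`, False models the patch.
    """
    elapsed = 0
    for failed in run_failed:
        elapsed += 1
        if failed and skip_check_on_error:
            continue  # original behaviour: termination test is never reached
        if not failed:
            pass  # polynomial parsing would happen here (omitted)
        if elapsed >= limit:
            break  # termination test at the end of the loop body
    return elapsed


# Three good runs, then nothing but errors (a5 has become too big).
runs = [False] * 3 + [True] * 1000

print(search(runs, limit=10, skip_check_on_error=True))
print(search(runs, limit=10, skip_check_on_error=False))
```

The first call only stops because the simulated list of runs is finite; the real script has no such bound. The second call stops as soon as the limit is reached.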

Batalov 2009-04-01 18:56

This is not the problem.
The problem is the wrong parameters in def-nm-params.txt for _easy_ numbers. Edit your copy of def-nm-params.txt (lower the values in the pertinent rows); no Perl skills necessary.

No matter what one fixes in the Perl script (instead of editing def-nm-params.txt), the failure will follow. pol51 will either fail too early and not produce any polynomials (which is how it behaved before), or it may loop for too long (which is how it behaves now). That's where you simply stop the script and run the same line again - because at the time of this second run there [I]will[/I] be a .poly file in the directory already.

Between the script immediately terminating on the very first lambda-comp/ull error without producing any polynomial at all, and the script looping _and_ producing the polynomial, I easily choose the latter. How about you?

An alternative solution is to patch pol51. Hard. Try it.

Once the def-nm-params.txt parameters are interpolated a bit better, the correct solution is to get rid of the file altogether and instead estimate the parameters in the Perl script, on the fly.

jasonp 2009-04-01 19:19

[QUOTE=Batalov;167651]An alternative solution is to patch pol51. Hard. Try it.
[/QUOTE]
That's quite an understatement. I managed to figure out how everything in pol51 works, except for the core knapsack computation. One of the big strengths of Kleinjung's improved algorithm is that it's much much simpler to implement.

fivemack 2009-04-01 19:19

The problem is that, once a5 is too big, pol51 will only ever produce ull5a5p-comp errors. The script's error handling, as it stands, skips the terminate-after-given-time test whenever pol51 produces an error, so if pol51 produces nothing but errors the script never finishes. All my suggested patch does is skip only the polynomial-parsing stage when pol51 produces an error, rather than both the polynomial-parsing and terminate-after-given-time stages.

My patch doesn't stop on the first error, which I agree is silly behaviour. It just means that, if pol51 keeps producing errors, the polynomial search will at least terminate after the given amount of time.

Batalov 2009-04-01 21:00

Oh, true! I never checked how the script keeps track of time; I thought it was by catching a timer signal, but indeed the check is at the end of that loop. Committing now. SVN 347.

[SIZE=1]I've also had another version that was halving $normmax on an error and bootstrapping once, but that wasn't very effective.[/SIZE]

Along the other line of attack: may I assume that the best "normmax*" parameters that you, Tom, were testing quite thoroughly (afair) were merged into the msieve source, and grab them from there?

I was going to test one thing at a time: the snfs parameter hack separately from the def-nm-params.txt retirement hack. I was going to try to wrap this one up over the weekend.

Thank you, Tom!

Serge

