mersenneforum.org

GGNFS problems on Windows (https://www.mersenneforum.org/showthread.php?t=14703)

Andi47 2011-01-15 09:31

1 Attachment(s)
[QUOTE=Brian Gladman;246492]Hi Andi,

I didn't see this crash earlier for some reason, but I do now. This turns out to be a different bug and, this time, one in the actual lattice sieve code. I hope that I have now corrected this one in the GGNFS SVN. I have sent you updated binaries for test.

Brian[/QUOTE]

Got them and tested just now: the win32 binary seems to work properly (total yield: 263, q=43505381 (1.04532 sec/rel)), but the x64 binary prints some "Too large!" messages and then crashes.

[code]> gnfs-lasieve4I16e -M 1 -a 2_956+.poly -o test12.out -f 43505239 -c 100
Warning: lowering FB_bound to 43505238.
Too large!
Too large!
Too large!
Too large!
Too large!
Too large!
Too large!
*crash* [/code]

Edit: can you please also send me the binaries of the 13e sievers so that I can run a relatively quick test of SVN403 (or 404 if you do another bugfix) by SNFSing a homogeneous Cunningham number?

Edit2: I have just noticed that both binaries (win32 and x64) yield some relations, and the x64 output is the bigger(!) file. I have attached them now (zipped).

Brian Gladman 2011-01-15 09:58

My fix was wrong - I'll update in SVN when I have tested the correction.

I corrected my fix but there is still a difference between the win32 and x64 versions. The win32 version runs for a short time and returns, apparently normally. The x64 version gives much more output and does not seem to terminate :-)

I think we need someone who understands the siever code better than I do to take a look at the changes I have made to see if they make sense or whether they are wrong.

Brian

Brian Gladman 2011-01-15 12:17

There is a bug in gnfs-lasieve4e.c but removing it does not solve the problems with Windows binaries.

[CODE]{
    unsigned long small_factors[10];

    nf = rho_factor(small_factors, large_factors[s1]);
    for (i = 0; i < nf; i++) {
        mpz_set_ui(large_primes[s1][i], small_factors[i]);
        if (mpz_sizeinbase(large_primes[s1][i], 2) > max_primebits[s1]) {
            n_mpqsvain[s1]++;
            break;
        }
    }
    if ((i >= nf) && (nf >= 1))
        nlp[s1] = nf;
    else {
        nlp[s1] = 0;
    }
}
[/CODE]

The function rho_factor() returns -1 if there are no factors. Because the variable 'i' in the following loop is unsigned, the comparison 'i < nf' converts nf (= -1) to the maximum unsigned value, so the loop is entered when it should not be.
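The pitfall can be reproduced in isolation. What follows is a minimal sketch and not the GGNFS code: the helper names and the iteration cap are hypothetical, and it assumes the index is an `unsigned int` while `nf` is a plain `int`, as the description suggests.

```c
/* Illustrative sketch only: loop_runs_unsigned and loop_runs_signed are
   hypothetical helpers, not functions from gnfs-lasieve4e.c. */

/* Counts loop iterations when an unsigned index is compared with a signed
   bound. `cap` stops the loop early so the demo terminates even when the
   bound has wrapped around. */
unsigned loop_runs_unsigned(int nf, unsigned cap)
{
    unsigned i, runs = 0;

    /* nf is implicitly converted to unsigned here, so -1 becomes UINT_MAX. */
    for (i = 0; i < nf && runs < cap; i++)
        runs++;
    return runs;
}

/* Same loop with a signed index: a negative bound means zero iterations. */
unsigned loop_runs_signed(int nf, unsigned cap)
{
    unsigned runs = 0;
    int i;

    for (i = 0; i < nf && runs < cap; i++)
        runs++;
    return runs;
}
```

With a bound of -1, `loop_runs_unsigned(-1, 5)` keeps going until it hits the cap, whereas `loop_runs_signed(-1, 5)` never enters the loop. Declaring the index as a signed type, or testing `nf > 0` before the loop, would be two possible ways to guard the original fragment.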

This is what caused the bug that Andi has reported but, sadly, correcting it fails to fix the Windows code. The win32 code appears to work but the x64 version runs for a long time and then crashes.

For the moment I have reverted the SVN code to its original state (i.e. one that includes the above bug) but I will update to a corrected version if others who know more about the sieve code than I do agree that this is indeed a bug.

Brian

Andi47 2011-01-16 08:10

[QUOTE=Brian Gladman;246517]There is a bug in gnfs-lasieve4e.c but removing it does not solve the problems with Windows binaries.
[/QUOTE]

What about moving this one into a new thread named "bug in gnfs-lasieve4e.c"? In recent months I have got the impression that some people think the problems / crashes are caused by a compiler problem and therefore might not look closely into a "problems in windows" thread...

Brian Gladman 2011-01-16 08:48

Ok, I have done this.

Brian

Batalov 2011-01-16 09:03

In the src/experimental/lasieve4_64/ branch, gnfs-lasieve4e.c is different and doesn't try rho (this whole fragment appears to be removed, not even undef'd).

Maybe you could
[FONT=Arial Narrow]#undef TRY_RHO_ON_FAILURES[/FONT]

Most mpqs failures turned out to be on prime squares (except for very large inputs, this increases recently with 3LPs); see the next fragment, after
[FONT=Arial Narrow]/* did it fail on a square? */ comment.[/FONT]

Isn't rho_factor() also going to fail when mpqs_factor() has already failed on a q[SUP]2[/SUP]? You can also comment out the
[CODE]/* did it fail on a square? */
mpz_sqrtrem(large_primes[s1][0], large_primes[s1][1], large_factors[s1]);
if (mpz_sgn(large_primes[s1][1]) == 0) { /* remainder == 0? */
    mpz_set(large_primes[s1][1], large_primes[s1][0]);
    nlp[s1] = 2;
    if (verbose > 1) {
        fprintf(stderr, " mpqs on a prime square ");
        mpz_out_str(stderr, 10, large_primes[s1][0]);
        fprintf(stderr, "^2 ");
    }
    continue;
}
[/CODE]
part. I haven't tried it on Windows, and if mpz_sqrtrem in the Windows gmplib implementation fails, it will become obvious. It is an informative decoration anyway: instead of saying "mpqs failed for ###", it tries to save this relation, which is very insignificant (~<10[SUP]-4..-5[/SUP] relations).

Brian Gladman 2011-01-16 10:53

If I undefine TRY_RHO_ON_FAILURES, I then get for Andi's example:

mpqs factored 13561050935197071784639392397
mpqs factored 1171639110529703960025298763
mpqs factored 18533607707643278029061973209
mpqs factored 348989124076403221792624571
mpqs factored 2127838875975350313372722227
mpqs factored 2113969886967600561263133229
mpqs factored 51656567840362049130624673
mpqs factored 4323710232844063674410971
mpqs factored 18866333724005152007221811
mpqs factored 22639765297739040862595531
mpqs factored 16663867208938759657469423
mpqs factored 3661335120371304346850392519
mpqs failed for 973546285233621101(a,b): 58334026229 1692
mpqs factored 453262486784922268510652017
mpqs factored 20466601638854628186181699
mpqs factored 7853965905319013866295653
mpqs factored 4556774026948178070809312941
mpqs failed for 96828168343638203(a,b): 33220359399 202
mpqs factored 403304008280603690562533723
mpqs factored 852736408617441935152097561
mpqs factored 2989143349288270243959229019
mpqs factored 348138507751972267578731983
.......

apparently indefinitely. If I take out the 'did it fail on a square?' bit, I then get a list of mpqs failures, again apparently indefinitely.

Brian

Batalov 2011-01-16 21:11

[QUOTE=Brian Gladman;246716]...
mpqs failed for 973546285233621101(a,b): 58334026229 1692
...
mpqs failed for 96828168343638203(a,b): 33220359399 202
[/QUOTE]
Ah, good, the mpz_sqrtrem works as planned then. These two (above) are not squares.
[FONT=Arial Narrow]96828168343638203 = 340563137 * 284317819[/FONT]
[FONT=Arial Narrow]973546285233621101 = 533380361 * 1825238341[/FONT]
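Both claims can be checked with plain 64-bit integer arithmetic, since the two cofactors fit in a `uint64_t`. The sketch below is only an illustration: `isqrt_u64`, `is_perfect_square` and `is_product` are hypothetical helpers standing in for the `mpz_sqrtrem` test, and none of them comes from the siever.

```c
#include <stdint.h>

/* Hypothetical helper: floor(sqrt(n)) by binary search, avoiding floating
   point. The root of a 64-bit value always fits in 32 bits. */
uint64_t isqrt_u64(uint64_t n)
{
    uint64_t lo = 0, hi = 0xFFFFFFFFu;

    while (lo < hi) {
        uint64_t mid = lo + (hi - lo + 1) / 2;  /* round up so lo advances */
        if (mid * mid <= n)
            lo = mid;
        else
            hi = mid - 1;
    }
    return lo;
}

/* Mirrors the "remainder == 0?" test: n is a square iff n - r*r == 0. */
int is_perfect_square(uint64_t n)
{
    uint64_t r = isqrt_u64(n);
    return n - r * r == 0;
}

/* Checks a claimed two-factor split. Assumes p and q are both below 2^32,
   so the product cannot overflow 64 bits. */
int is_product(uint64_t n, uint64_t p, uint64_t q)
{
    return p * q == n;
}
```

Calling `is_product(96828168343638203ULL, 340563137ULL, 284317819ULL)` and `is_perfect_square(96828168343638203ULL)` checks the first split and the non-square claim; the second number can be checked the same way.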

I'll try to compare the codes if/when I have time...

You could use the src/experimental/lasieve4_64/ branch as a reference branch (because it doesn't hang or crash; it is also newer, but is 'genetically modified' for Linux/asm). There's an even newer branch on the forum (codename lasieve5), but it is not adapted for the GGNFS file formats.


[COLOR=green]P.S. The truly original source is in .w files, and the PDF file made from its TEX portion is added to the SVN for convenience.[/COLOR]

Brian Gladman 2011-01-16 22:38

I would certainly be grateful for help in tracking this one down. It is very disappointing to find a bug and then discover that it doesn't solve the problem :-(

I don't think it is the compiler since I have used the Intel compiler with exactly the same outcome.

Brian

