mersenneforum.org

R.D. Silverman 2006-03-09 11:56

CWI Matrix Solver
 
Hi,

Has anyone else ever encountered an error from the CWI solver
that says "too many orthogonal vectors" near the very end of the
computation?

I just got this message while doing the LA for 2,815-.

It is possible that I have some kind of memory/hardware problem.
This is the second time I have seen this. The last time I re-filtered
the data, re-ran the solver and things were fine.

Last night I tried backing up the computation to the last saved
checkfiles, but still had the same problem at the end.

I did some memory diagnostics with the BIOS, but turned up nothing.

Tonight, I will modify the software to ignore the condition and continue
the computation anyway, but I do not expect success. I will burn
a CD, transfer the data to another machine and try re-running with the
same matrix, rather than re-filtering. This will tell me if it is likely a
software bug or a hardware problem.

But I will have to wait another 8 days for the result.

Ideas, anyone?
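
[Editorial sketch, not part of the original post.] When the same matrix is moved to a second machine to separate a hardware fault from a software bug, it helps to confirm that the copy is bit-identical first. The Python below is a minimal illustration of that check. The function names `file_digest` and `same_contents` are hypothetical, and nothing here is part of the CWI suite.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so multi-gigabyte matrix files
        # do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_contents(path_a, path_b):
    """True if both files hash to the same SHA-256 digest."""
    return file_digest(path_a) == file_digest(path_b)
```

Computing `file_digest` on each machine and comparing the two hex strings by eye works just as well when the files are not reachable from one host. A matching digest only rules out corruption during transfer. It says nothing about memory errors during the solve itself.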

wblipp 2006-03-10 02:19

[QUOTE=R.D. Silverman]It is possible that I have some kind of memory/hardware problem.

I did some memory diagnostics with the BIOS, but turned up nothing.[/QUOTE]

Have you run the prime95 torture test for an extended period on this machine? That seems to turn up more problems than all the other memory diagnostics combined.

R.D. Silverman 2006-03-10 13:42

[QUOTE=wblipp]Have you run the prime95 torture test for an extended period on this machine? That seems to turn up more problems than all the other memory diagnostics combined.[/QUOTE]

I am re-solving the matrix on another machine with the exact same
data. If it succeeds, I will know that I have a hardware problem and
look into it further.

When not solving matrices my NFS code runs constantly (both threads)
on the machine that is having problems. This code makes heavy use
of memory, yet it has never had a problem. And it tortures the processor
as well.

I was wondering if anyone else had ever encountered this problem.

xilman 2006-03-14 13:09

I've seen it, and discussed it with Peter Montgomery. We never came to any firm conclusions, other than that the matrix should not be too over-square ([B]very[/B] unlikely in your case, given your experience, though I mention it for the possible edification of other readers) and that sometimes there are too many duplicated rows in the matrix. Peter modified the code to look for the latter. A few are normal, usually corresponding to factor base primes which are factors of the full SNFS number.

The only practical solution we ever found was to refilter and produce a slightly different matrix. I will be slightly surprised if subsequent runs produce a different and more useful answer. If they do, it may be an indication that the initial random vectors chosen at the start of the algorithm turned out not to be useful. I don't remember whether I've ever seen that happen in practice.


Paul
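
[Editorial sketch, not part of the original post.] The duplicated-row check mentioned above can be illustrated in a few lines of Python. This is only a sketch under stated assumptions: the matrix over GF(2) is given as a list of rows, each row being a list of the column indices of its nonzero entries with no index repeated. That layout and the name `find_duplicate_rows` are hypothetical. This is not the CWI file format, nor Peter Montgomery's actual code.

```python
from collections import defaultdict


def find_duplicate_rows(rows):
    """Group indices of rows that have identical sets of nonzero columns.

    rows: list of lists of column indices (assumed sparse GF(2) layout,
    each index appearing at most once per row).
    Returns a list of groups, each a list of two or more row indices.
    """
    groups = defaultdict(list)
    for i, cols in enumerate(rows):
        # frozenset ignores the order in which the indices are listed
        groups[frozenset(cols)].append(i)
    return [idx for idx in groups.values() if len(idx) > 1]


rows = [[0, 3, 5], [1, 2], [5, 0, 3], [4]]
print(find_duplicate_rows(rows))  # [[0, 2]]
```

A handful of groups would be expected per the post above. A large number of them would be a reason to refilter.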

R.D. Silverman 2006-03-14 14:01

The matrix has 3392K rows and 3395K columns. Very nearly square.

The same matrix is now running on a different machine. It will finish
Saturday.

R.D. Silverman 2006-03-14 15:05

Would anyone like to place bets on whether this computation will
have the same problem as the first? Will switching machines solve
the problem?

N.B. I would hate to have to reformulate the matrix and do the reduction
yet another time.....

Meanwhile, filtering for 2,833+ is nearly done and sieving for 2,1406M is
in progress.

R.D. Silverman 2006-03-19 17:56

Switching machines did solve the problem. 2,815- finished yesterday.
It had prime factors of 57, 67, and 72 digits.

