#1 |
|
Jul 2003
So Cal
2106₁₀ Posts |
Code:
Sat Dec 12 18:52:20 2009 commencing linear algebra
Sat Dec 12 18:52:24 2009 read 9998652 cycles
Sat Dec 12 18:53:00 2009 matrix is 9998458 x 9998652 (2905.1 MB) with weight 865972158 (86.61/col)
Sat Dec 12 18:53:00 2009 sparse part has weight 651574365 (65.17/col)
Sat Dec 12 18:53:00 2009 saving the first 48 matrix rows for later
Sat Dec 12 18:53:08 2009 matrix is 9998410 x 9998652 (2775.9 MB) with weight 674657217 (67.47/col)
Sat Dec 12 18:53:08 2009 sparse part has weight 627702934 (62.78/col)
Sat Dec 12 18:53:08 2009 matrix includes 64 packed rows
Sat Dec 12 18:53:08 2009 using block size 65536 for processor cache size 3072 kB
Sat Dec 12 18:53:55 2009 commencing Lanczos iteration (4 threads)
Sat Dec 12 18:53:55 2009 memory use: 3054.1 MB
Sat Dec 12 18:54:01 2009 restarting at iteration 129392 (dim = 8182080)
Sat Dec 12 18:55:28 2009 linear algebra at 81.8%, ETA 57h44m
Mon Dec 14 22:21:54 2009 lanczos error (dim = 9998318): not all columns used
Mon Dec 14 22:21:54 2009 lanczos halted after 158116 iterations (dim = 9998318)
Mon Dec 14 22:21:54 2009 linear algebra failed; retrying...
Mon Dec 14 22:21:54 2009 commencing Lanczos iteration (4 threads)
Mon Dec 14 22:21:54 2009 memory use: 3054.1 MB
Mon Dec 14 22:21:54 2009 restarting at iteration 157351 (dim = 9950034)
Mon Dec 14 22:23:09 2009 linear algebra at 99.5%, ETA 1h18m
Mon Dec 14 23:41:38 2009 lanczos error (dim = 9998318): not all columns used
Mon Dec 14 23:41:38 2009 lanczos halted after 158116 iterations (dim = 9998318)
Mon Dec 14 23:41:38 2009 linear algebra failed; retrying...
Mon Dec 14 23:41:38 2009 commencing Lanczos iteration (4 threads)
Mon Dec 14 23:41:38 2009 memory use: 3054.1 MB
Mon Dec 14 23:41:38 2009 restarting at iteration 157351 (dim = 9950034)
Mon Dec 14 23:42:53 2009 linear algebra at 99.5%, ETA 1h18m
Tue Dec 15 01:01:33 2009 lanczos error (dim = 9998318): not all columns used
Tue Dec 15 01:01:33 2009 lanczos halted after 158116 iterations (dim = 9998318)
Tue Dec 15 01:01:33 2009 linear algebra failed; retrying...
Tue Dec 15 01:01:33 2009 commencing Lanczos iteration (4 threads)
Tue Dec 15 01:01:33 2009 memory use: 3054.1 MB
Tue Dec 15 01:01:34 2009 restarting at iteration 157351 (dim = 9950034)
Tue Dec 15 01:02:49 2009 linear algebra at 99.5%, ETA 1h18m
Edit: Or perhaps I'll try disabling that error and seeing if the dependencies are good.
Edit 2: Nope, didn't help:
Code:
Tue Dec 15 02:32:35 2009 lanczos halted after 158117 iterations (dim = 9998319)
Tue Dec 15 02:33:21 2009 lanczos error: only trivial dependencies found
Tue Dec 15 02:33:22 2009 BLanczosTime: 754
Tue Dec 15 02:33:22 2009 elapsed time 00:12:36
Last fiddled with by frmky on 2009-12-15 at 10:35 |
#2 |
|
Jul 2003
So Cal
2106₁₀ Posts |
Attached is the log file, in case any clues lie in the filtering. Using Serge's matrix dump program, I verified that the matrix has no empty or duplicate columns.
#3 |
|
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
3⁶·13 Posts |
I know that I know nothing (and most probably even less than Socrates and most other Greeks), but this:
Tue Dec 1 23:21:21 2009 found 23495926 cycles, need 10213258
may be a sign of under-removal, and then the algorithm may have gotten into unique territory where debugging will offer low return (because it will probably never happen again?). It then produced a matrix where the old code usually would have said "matrix can improve, retrying":
heaviest cycle: 12 relations
Right? The full merge overdid its function and made it sparse-ski. Do you think what I think? Removing 20M relns from the top and /sigh/ restarting from -nc1... This is only on instincts; I am a savage savant in this business.
P.S. Actually, in your 2,908+ and in 5,383+ you had heaviest cycles with 12 as well, and both turned out fine! (But they didn't have this overwhelming "found/need" abundance, and neither spent that much time in full merge: barely 15 minutes, not 1.5 days... something was rotten in this state.) Hmmm...
Last fiddled with by Batalov on 2009-12-16 at 06:38 Reason: (looked at many large and average logs) |
#4 |
|
Tribal Bullet
Oct 2004
3,541 Posts |
It's possible for even a bug-free Lanczos implementation to fail with this error; you have to choose a subset of the remaining matrix columns that can form a small invertible matrix, and if there are very few remaining matrix columns that may not be possible. At least that's what I'd say if I had to make up a half-baked reason why it happened. Do you still have the matrix? I can try to take a look. Agree with Serge that the very long merge time is suspicious.
That being said, these sorts of errors have become gratifyingly rare; in fact this is the first one I can remember seeing in the last year or so. Preprocessing the matrix has really improved in latter-day msieve versions to cut down on Lanczos failures. Could you post the filtering results for the rerun too? Last fiddled with by jasonp on 2009-12-17 at 14:18 |
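To make the "small invertible matrix" condition concrete: the matrix lives over GF(2), so invertible means full rank under XOR arithmetic. The sketch below is not msieve's code. It is a minimal illustration with hypothetical names (`gf2_rank`, `gf2_is_invertible`), assuming a square matrix of at most 64 columns packed one row per 64-bit word.

```c
#include <assert.h>
#include <stdint.h>

/* Rank over GF(2) of an n x n bit matrix (n <= 64).
   rows[i] holds row i; bit j of rows[i] is the entry in column j.
   The input is copied, so the caller's data is left untouched. */
int gf2_rank(const uint64_t *rows, int n)
{
    uint64_t work[64];
    int rank = 0;

    assert(n >= 0 && n <= 64);
    for (int i = 0; i < n; i++)
        work[i] = rows[i];

    for (int col = 0; col < n && rank < n; col++) {
        uint64_t mask = (uint64_t)1 << col;
        int pivot = -1;

        /* look for an unused row with a 1 in this column */
        for (int i = rank; i < n; i++) {
            if (work[i] & mask) {
                pivot = i;
                break;
            }
        }
        if (pivot < 0)
            continue;   /* no pivot available in this column */

        /* move the pivot row into place, then clear the column
           in the rows below it (addition over GF(2) is XOR) */
        uint64_t tmp = work[rank];
        work[rank] = work[pivot];
        work[pivot] = tmp;
        for (int i = rank + 1; i < n; i++) {
            if (work[i] & mask)
                work[i] ^= work[rank];
        }
        rank++;
    }
    return rank;
}

/* A square matrix is invertible over GF(2) exactly when it has full rank. */
int gf2_is_invertible(const uint64_t *rows, int n)
{
    return gf2_rank(rows, n) == n;
}
```

For example, the three rows 011, 101, 110 XOR to zero, so that 3 x 3 matrix has rank 2 and is not invertible.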
#5 |
|
Jul 2003
So Cal
2106₁₀ Posts |
Here's the log of the rerun with a more normal amount of relations.
#6 |
|
Jul 2003
So Cal
100000111010₂ Posts |
And for your pleasure, here's a filtering run with an even more absurd number of relations. 32-bit LPs were used (as a test) on a number where 31-bit LPs would have been more appropriate. I just let full merge run for a long time, and it produced a very light matrix. For this run, the filtering source was modified to remove up to 2M cliques at a time rather than 400K, but an earlier run that used 400K had the same results until it was interrupted by an unexpected reboot. I'm of course rerunning this with fewer relations.
#7 |
|
Tribal Bullet
Oct 2004
3,541 Posts |
The second case basically ran out of 2-way cliques and still had a huge amount of excess left over. The merge phase only had to build a small matrix, but once it created that matrix it realized the matrix was very large and extremely sparse. So it kept doing more merging to reduce the matrix size. Every time it decided to do 2000 more merges, it forgot about the 500 heaviest ideals. Eventually, after a few days, it ran out of ideals to merge; in that time the forgotten heaviest ideals added up to 18M, which is far more than necessary.
I guess the full merge should only forget about the heaviest ideals if they really are heavy. That won't make the merging any faster (in fact the opposite will happen :) but at least merging will not end prematurely. Greg, would you be willing to rerun if I point out the change to make? |
#8 | |
|
Jul 2003
So Cal
2·3⁴·13 Posts |
Quote:
And look for a PM concerning the 6p323 matrix. Last fiddled with by frmky on 2009-12-18 at 02:48 |
#9 |
|
Tribal Bullet
Oct 2004
3541₁₀ Posts |
On line 510 of common/filter/merge.c, replace
Code:
for (i = 0; i < 500; i++) {
with
Code:
for (i = 0; inactive_heap.num_ideals > 0 &&
            inactive_heap.worst_bin > 100 && i < 500; i++) {
Last fiddled with by jasonp on 2009-12-18 at 03:50 |
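The effect of the extra loop conditions can be seen in a toy model. Everything below is hypothetical: `toy_heap_t`, its weight-indexed bins, and the meaning given to `worst_bin` (the weight of the heaviest ideal still tracked) are assumptions made for illustration, not msieve's actual data structure. Only the shape of the loop condition is taken from the change above.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_WEIGHT 256

/* Hypothetical stand-in for the heap of ideals, binned by weight. */
typedef struct {
    unsigned bins[TOY_MAX_WEIGHT]; /* bins[w] = number of ideals of weight w */
    unsigned num_ideals;           /* total ideals still tracked */
    unsigned worst_bin;            /* largest w with bins[w] > 0, 0 if empty */
} toy_heap_t;

void toy_heap_init(toy_heap_t *h)
{
    memset(h, 0, sizeof(*h));
}

/* Add `count` ideals of the given weight (weights start at 1). */
void toy_heap_add(toy_heap_t *h, unsigned weight, unsigned count)
{
    assert(weight >= 1 && weight < TOY_MAX_WEIGHT);
    h->bins[weight] += count;
    h->num_ideals += count;
    if (count > 0 && weight > h->worst_bin)
        h->worst_bin = weight;
}

/* Drop one ideal from the heaviest non-empty bin. */
void toy_heap_drop_heaviest(toy_heap_t *h)
{
    assert(h->num_ideals > 0);
    h->bins[h->worst_bin]--;
    h->num_ideals--;
    while (h->worst_bin > 0 && h->bins[h->worst_bin] == 0)
        h->worst_bin--;
}

/* Forget up to max_forget ideals, but only while the heaviest
   remaining ideal is heavier than min_weight. Returns how many
   ideals were actually forgotten. */
unsigned toy_forget_heaviest(toy_heap_t *h, unsigned max_forget,
                             unsigned min_weight)
{
    unsigned i;

    for (i = 0; h->num_ideals > 0 &&
                h->worst_bin > min_weight && i < max_forget; i++) {
        toy_heap_drop_heaviest(h);
    }
    return i;
}
```

With `min_weight` set to 0 the loop forgets `max_forget` ideals whenever that many remain, like the original `i < 500` version; with 100 it stops as soon as the heaviest remaining ideal weighs 100 or less.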
#10 |
|
Jul 2003
So Cal
100000111010₂ Posts |
I made the change, and it is running now with all 580M relations.
In the meantime, I trimmed nearly 90M relations and then ran the filtering. It behaved more normally. The log is attached. In the log, you will see the changes made to remove 2M cliques per pass while >200M relations are left, dropping to 1M cliques while >100M relations remain, then down to the usual 400K. This seems to work well, and significantly speeds up the filtering. |
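The stepped clique schedule described above can be written as a small lookup. This is only a sketch of the thresholds quoted in the post; `cliques_per_pass` and `check_schedule` are hypothetical names, and how the real filtering source is wired up is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* How many cliques to remove in the next pass, given how many
   relations are still left (thresholds as quoted in the post). */
uint32_t cliques_per_pass(uint64_t relations_left)
{
    if (relations_left > 200000000ULL)
        return 2000000;   /* more than 200M relations left: 2M cliques */
    if (relations_left > 100000000ULL)
        return 1000000;   /* more than 100M relations left: 1M cliques */
    return 400000;        /* otherwise the usual 400K */
}

/* Spot checks at a few relation counts. */
void check_schedule(void)
{
    assert(cliques_per_pass(580000000ULL) == 2000000);
    assert(cliques_per_pass(150000000ULL) == 1000000);
    assert(cliques_per_pass(50000000ULL) == 400000);
}
```

The comparisons are strict, so exactly 200M relations already falls into the 1M step and exactly 100M into the 400K step.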
#11 |
|
Jul 2003
So Cal
2·3⁴·13 Posts |
That change made essentially no difference.