mersenneforum.org  

2020-08-21, 14:56   #12
EdH
"Ed Hall"
Dec 2009
Adirondack Mtns
7351₈ Posts

I swapped over to a newer commit (Aug 5) and remembered why I wasn't using it. It won't communicate properly with clients:
Code:
ERROR:root:Invalid workunit file: Error: key STDOUT not recognized
I wonder if this is a conflict between commits, meaning the clients have to be on a commit closer to the server's. If so, I won't be able to use later commits, because I still have some Core2 machines...
2020-08-21, 15:12   #13
RedGolpe
Aug 2006
Monza, Italy
73 Posts

It seems the good guys at INRIA are already looking into my report. They don't seem to require more information for now.
2020-08-21, 16:20   #14
EdH
"Ed Hall"
Dec 2009
Adirondack Mtns
11×347 Posts

I'll read the posts when I get my digest version. For now, I'm going to run my September commit and see what shows up later. I'll check the latest git again later on and see if the client communication issue has disappeared.
2021-05-03, 10:18   #15
bur
Aug 2020
3×5×19 Posts

Unfortunately, I ran into that error on a C153 which ran over the weekend:

Code:
Warning:Command: Process with PID 849626 finished with return code -6
Error:Filtering - Duplicate Removal, removal pass: Program run on server failed with exit code -6
Error:Filtering - Duplicate Removal, removal pass: Command line was: /home/florian/Math/cado-nfs/build/florian-Precision-3640-Tower/filter/dup2 -poly ./workdir/AL30081984/1971-C153/c155.poly -nrels 62519376 -renumber ./workdir/AL30081984/1971-C153/c155.renumber.gz ./workdir/AL30081984/1971-C153/c155.dup1//0/dup1.0.0000.gz ./workdir/AL30081984/1971-C153/c155.dup1//0/dup1.0.0001.gz > ./workdir/AL30081984/1971-C153/c155.dup2.slice0.stdout.4 2> ./workdir/AL30081984/1971-C153/c155.dup2.slice0.stderr.4
Error:Filtering - Duplicate Removal, removal pass: Stderr output (last 10 lines only) follow (stored in file ./workdir/AL30081984/1971-C153/c155.dup2.slice0.stderr.4):
Error:Filtering - Duplicate Removal, removal pass:      antebuffer set to /home/florian/Math/cado-nfs/build/florian-Precision-3640-Tower/utils/antebuffer
Error:Filtering - Duplicate Removal, removal pass:      [checking true duplicates on sample of 750234 cells]
Error:Filtering - Duplicate Removal, removal pass:      Allocated hash table of 75023359 entries (286MiB)
Error:Filtering - Duplicate Removal, removal pass:      Constructing the two filelists...
Error:Filtering - Duplicate Removal, removal pass:      2 files (2 new and 0 already renumbered)
Error:Filtering - Duplicate Removal, removal pass:      Reading files already renumbered:
Error:Filtering - Duplicate Removal, removal pass:      Reading new files (using 3 auxiliary threads for roots mod p):
Error:Filtering - Duplicate Removal, removal pass:      terminate called after throwing an instance of 'renumber_t::corrupted_table'
Error:Filtering - Duplicate Removal, removal pass:        what():  Renumber table is corrupt: cannot find p=0x4a2bfa9, r=0xd70340 on side 1; note: vp=0x4a2bfb6, vr=0xd70340
Error:Filtering - Duplicate Removal, removal pass:
Traceback (most recent call last):
  File "./cado-nfs.py", line 122, in <module>
    factors = factorjob.run()
  File "./scripts/cadofactor/cadotask.py", line 6131, in run
    last_status = task.run()
  File "./scripts/cadofactor/cadotask.py", line 3845, in run
    raise Exception("Program failed")
Exception: Program failed
Restarting with parameters.snapshot.0 didn't help.

It seems I can still use the relations by having msieve continue the work? How would I do that?

According to https://www.mersenneforum.org/showth...48&page=21#227 it seems I can cat the gz files and have msieve process them. But if one of the files is apparently corrupted, how do I find out which one? They all have a size between 3 and 7 MB. I did a zcat | grep and the missing 4a2bfa9 prime is present in some relation, but does that help?

Please don't tell me all is lost...
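
The zcat | grep search mentioned above can be sketched in Python. This is only a rough illustration under assumptions: the relation files are taken to be gzip-compressed text with one relation per line, primes written in lowercase hex without the 0x prefix, and lines starting with # treated as comments. The function name scan_relation_files is made up for this example. It also reports files whose gzip stream cannot be read at all, which is a different kind of corruption from the renumber table error in the log.

```python
import gzip
import re
import zlib
from pathlib import Path


def scan_relation_files(directory, prime_hex):
    """Scan the *.gz files in `directory` for a prime given in hex.

    Returns (hits, unreadable):
      hits       -- {filename: number of lines containing the prime as a whole token}
      unreadable -- filenames whose gzip stream could not be fully decompressed
    """
    # Match the prime only as a complete hex token, not inside a longer number.
    token = re.compile(
        r"(?<![0-9a-f])" + re.escape(prime_hex.lower()) + r"(?![0-9a-f])"
    )
    hits, unreadable = {}, []
    for path in sorted(Path(directory).glob("*.gz")):
        count = 0
        try:
            with gzip.open(path, "rt", errors="replace") as handle:
                for line in handle:
                    if line.startswith("#"):
                        continue  # assumed comment line, not a relation
                    if token.search(line):
                        count += 1
        except (OSError, EOFError, zlib.error):
            # Not a gzip file, truncated, or damaged compressed data.
            unreadable.append(path.name)
            continue
        if count:
            hits[path.name] = count
    return hits, unreadable
```

A hit only shows which files mention the prime. It does not by itself prove that a file is damaged.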

Last fiddled with by bur on 2021-05-03 at 11:07
2021-05-03, 12:13   #16
bur
Aug 2020
285₁₀ Posts

So I just ignored the cado error message and used the relations with msieve. In case someone has the same problem in the future:

All required files are in workdir/cxxx.upload.
First combine all gz compressed relations into one rels.dat:
Code:
zcat *.gz > rels.dat
Then use convert_poly in cado-nfs/build/machine/misc to convert the cnnn.poly file to cnnn.fb:
Code:
convert_poly -if cado -of msieve < c155.poly > c155.fb
I suggest copying both files to a new directory so nothing gets accidentally modified. Create a cnnn.n file with the number to be factored and then run:
Code:
../msieve/msieve -i c155.n -s rels.dat -l c155msieve.log -nf c155.fb -t 10 -nc1
../msieve/msieve -i c155.n -s rels.dat -l c155msieve.log -nf c155.fb -t 10 -nc2
../msieve/msieve -i c155.n -s rels.dat -l c155msieve.log -nf c155.fb -t 10 -nc3
Currently I'm at the -nc2 step and it's performing LA with an ETA of 2:20 hours.

For the sake of completeness: if not enough relations are found, see https://www.mersenneforum.org/showth...48&page=21#230 for how to make cado-nfs do more sieving. After that it should be possible to use msieve as explained above.
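
For anyone who prefers to script the steps above, here is a minimal Python sketch. It assumes the same file names as in the commands above (rels.dat, c155.n, c155.fb). The helper names combine_relations and msieve_commands are hypothetical. It only uses the msieve options already shown in this post and leaves the convert_poly step to the shell.

```python
import gzip
import shutil
from pathlib import Path


def combine_relations(upload_dir, out_file):
    """Python equivalent of `zcat *.gz > rels.dat`; returns the number of files read."""
    files = sorted(Path(upload_dir).glob("*.gz"))
    with open(out_file, "wb") as out:
        for path in files:
            with gzip.open(path, "rb") as src:
                shutil.copyfileobj(src, out)  # stream, so large files stay off the heap
    return len(files)


def msieve_commands(msieve, name, threads=10):
    """Build the -nc1, -nc2 and -nc3 command lines as argument lists."""
    base = [
        msieve,
        "-i", f"{name}.n",           # file holding the number to be factored
        "-s", "rels.dat",            # combined relations
        "-l", f"{name}msieve.log",   # log file
        "-nf", f"{name}.fb",         # converted polynomial
        "-t", str(threads),
    ]
    return [base + [stage] for stage in ("-nc1", "-nc2", "-nc3")]
```

Each returned list can be passed to subprocess.run(cmd, check=True) from the directory holding the files, so a failing stage stops the sequence.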
2021-05-03, 13:04   #17
charybdis
Apr 2020
11×31 Posts

Quote:
Originally Posted by bur
Code:
../msieve/msieve -i c155.n -s rels.dat -l c155msieve.log -nf c155.fb -t 10 -nc1
../msieve/msieve -i c155.n -s rels.dat -l c155msieve.log -nf c155.fb -t 10 -nc2
../msieve/msieve -i c155.n -s rels.dat -l c155msieve.log -nf c155.fb -t 10 -nc3
-nc performs all of -nc1, -nc2, -nc3 in succession.
2021-05-03, 13:09   #18
EdH
"Ed Hall"
Dec 2009
Adirondack Mtns
11·347 Posts

Good post!

I thought I had posted a "How I ..." on using CADO-NFS for poly/sieving and Msieve for LA, but apparently I've been slacking. This is how I run all my larger jobs. I had originally written my own conversion (for the .fb), before I learned of the provided one.

For some of my scripts, I do a check for *.cyc after the -nc1 step. The scripts use the existence of that file to tell whether filtering succeeded or not. Then the scripts can either call -nc2 or call for more sieving.

Not sure if you know this (you probably do), but if -nc2 is interrupted, use -ncr to continue. If you use -nc2 again, it will start LA from scratch.
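
The decision logic described above could look roughly like this in Python. It is not taken from the actual scripts. The function name next_msieve_step and the "more-sieving" return value are invented for illustration, and the caller has to know whether an earlier -nc2 run was interrupted.

```python
from pathlib import Path


def next_msieve_step(workdir, la_was_interrupted=False):
    """Pick the next action after -nc1, based on whether a *.cyc file exists."""
    if not any(Path(workdir).glob("*.cyc")):
        # No cycle file: filtering did not succeed, so more relations are needed.
        return "more-sieving"
    # -ncr resumes an interrupted linear algebra run; -nc2 starts it from scratch.
    return "-ncr" if la_was_interrupted else "-nc2"
```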
2021-05-03, 13:32   #19
bur
Aug 2020
3·5·19 Posts

Thanks, it's basically your linked post with the small addition of how to convert poly to fb. I'm glad this error can easily be worked around, otherwise I'd be quite nervous on longer jobs.

Not sure why cado-nfs chokes on the rels while msieve has no problem with them.

Quote:
This is how I run all my larger jobs.
Why is that? Is msieve faster on those steps?

Quote:
-nc performs all of -nc1, -nc2, -nc3 in succession.
Yes, and EdH already mentioned that in his post. I still used the separate steps since I wasn't sure it would work at all, given the corruption cado-nfs reported.
2021-05-03, 13:47   #20
charybdis
Apr 2020
101010101₂ Posts

Quote:
Originally Posted by bur
Not sure why cado-nfs chokes on the rels while msieve has no problem with them.
I don't think there's anything wrong with the relations; it's a bug in the way CADO's duplicate removal processes them. And if a few relations are bad, then msieve will just ignore them.

Quote:
Why is that? Is msieve faster on those steps?
The most time-consuming part of the postprocessing, the linear algebra (-nc2), is substantially faster with msieve than with CADO. In addition, CADO uses much more memory than msieve during the filtering stage, so a given machine will be able to run larger numbers with msieve than with CADO.
2021-05-03, 14:16   #21
bur
Aug 2020
100011101₂ Posts

Ah, that's good to know!

Maybe a stupid question, but since msieve is open source, why isn't the cado-nfs linear algebra implementation simply taken from msieve?
2021-05-03, 14:45   #22
VBCurtis
"Curtis"
Feb 2005
Riverside, CA
4,861 Posts

CADO's algorithm features less interprocess communication during the (longest) first stage of matrix solving than msieve, which allows jobs to be split among machines fruitfully. This allows larger jobs to be run on regular hardware.

An ideal solution would be to have an -msieve flag in CADO which runs the matrix using msieve within the cado-nfs.py wrapper.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.