mersenneforum.org


VBCurtis 2020-04-16 06:00

Yes, but it's hard to understand how. It seems to be the database side that's failing, yet bigger numbers gather relations and hand out WUs more slowly than smaller jobs. So, things *should* function better, because the rate of transactions is smaller... at least, I would think? (well, I do think... but I'm wrong)

In the Cunningham 2330L C207 thread last summer, one of the CADO contributors suggested a different database backend. I have zero experience with such things, so I didn't consider exactly what he suggested, but that's why I think it's something about the server/backend rather than a sieve parameter; certainly, one might trigger the other for reasons unknown.

EdH 2020-04-16 13:43

I think I might have to set my laziness aside and join the CADO-NFS mailing list to see if I can gain a little more insight into the package. I'm a very amateur programmer, so I can't readily figure out in-depth programs, but I do sometimes try alterations.

A "pet peeve" with CADO-NFS is its manner of crashing (dying) instead of gracefully ending if you use "tasks.filter.run = false". Now I have 30+ pieces of "farm" equipment that are still looking for assignments. I'll probably have to restart without the "false" and crash that run to "gracefully" stop the clients, rather than going to each one to issue CTRL-C.

Another difficulty is that I was unable to get a remote Colab install to run las on its own because it didn't like something about the roots1 data.

I do realize it's ME, but a lot of the things I keep trying appear to be just the opposite of what the programmers intended. If I just had a better understanding (and a longer attention span). . .

EdH 2020-04-23 21:10

This is speculation, but I think I have discovered why CADO-NFS stops issuing WUs.

If WUs are tardy, the server waits for the late WUs until timedout is reached, at which point it reissues them. But it appears that if too many are late, it stops sending out any others until the late ones are caught up.

I have changed my timedout to 43200 (12 hours) to cover the sleeping time for some of my machines and have had no noticeable instances of WUs not being handed out for my current run.
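As a toy illustration only (this is not CADO-NFS code, and every name in it is made up), the reissue rule being guessed at here can be sketched as: a WU is left alone until it has been outstanding longer than the timeout, and only then becomes eligible to be handed out again. With a 12-hour timeout, a machine that sleeps for 8 hours still gets to return its WU.

```python
from datetime import datetime, timedelta

# 12 hours expressed in seconds, matching the 43200 value mentioned above.
TIMEOUT_SECONDS = 12 * 60 * 60


def is_timed_out(assigned_at, now, timeout_seconds=TIMEOUT_SECONDS):
    """Return True once a WU has been outstanding longer than the timeout.

    Hypothetical helper: it models the guessed-at behaviour,
    not the actual server logic.
    """
    return (now - assigned_at).total_seconds() > timeout_seconds


assigned = datetime(2020, 4, 23, 9, 0, 0)
# A client that slept for 8 hours is still inside the window.
print(is_timed_out(assigned, assigned + timedelta(hours=8)))
# After 13 hours the WU would be eligible for reissue.
print(is_timed_out(assigned, assigned + timedelta(hours=13)))
```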

Let's see if this theory is disproven now that I have posted it. . .

Dylan14 2020-04-29 17:11

So I ran into a problem with the cado-nfs-client.py file. In particular, it couldn't find my las executable:

[code]FileNotFoundError: [Errno 2] No such file or directory: "'/home/dylan/bin/cado/cado-nfs/build/dylan-xps159570/sieve/las'"[/code]

Looking into the code and adding a print statement in the run_command function, I got this as the command_list:

[CODE]["'/home/dylan/bin/cado/cado-nfs/build/dylan-xps159570/sieve/las'", '-poly', "'download/sean198.c198.poly'", '-q0', '11290000', '-A', '30', '-q1', '11292000', '-lim0', '536000000', '-lim1', '536000000', '-lpb0', '33', '-lpb1', '33', '-mfb0', '64', '-mfb1', '95', '-ncurves0', '30', '-ncurves1', '15', '-fb1', "'download/sean198.c198.roots1.gz'", '-out', "'dylan-xps159570.79cb1ba5.work/sean198.c198.11290000-11292000.gz'", '-t', '6', '-stats-stderr'][/CODE]

The path inside all the quotes is indeed the location of my las executable, but somehow all the quotes confused it. So, to fix this I did the following:
(*) imported the shlex module
(*) replaced
[CODE]command_list = command if isinstance(command, list) else command.split(" ")[/CODE]
with
[code]command_list = shlex.split(command_str)[/code]

and then this worked (at least to start the run up. Waiting to see if it submits successfully - which it did!)
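A minimal sketch of why the two approaches differ, using a made-up command string modelled on the one above (the path and file names are placeholders, not the real ones). Splitting on spaces leaves the shell-style quote characters inside each argument, so a program launched from that list is looked up under a name that literally starts and ends with a quote; shlex.split parses the quoting the way a POSIX shell would and drops it.

```python
import shlex

# Hypothetical command string with shell-style quoting around the paths.
command_str = "'/opt/cado/build/sieve/las' -poly 'download/job.poly' -q0 11290000 -t 6"

# Naive split on spaces: the quote characters stay in the arguments.
naive = command_str.split(" ")

# shlex.split applies POSIX shell quoting rules and strips the quotes.
parsed = shlex.split(command_str)

print(naive[0])   # still wrapped in single quotes
print(parsed[0])  # plain path, usable as an executable name
```

Note that shlex.split also keeps a quoted argument containing spaces together as one item, which a plain split(" ") would break apart.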

henryzz 2020-04-30 07:14

You might want to report this error to the devs. It looks like the sort of thing that could be broken python 2/3 compatibility. I believe CADO uses python 2 but they aim for it to work in 3 as well. I don't have much python experience so I may be wrong.

Dylan14 2020-04-30 14:34

[QUOTE=henryzz;544259]You might want to report this error to the devs. It looks like the sort of thing that could be broken python 2/3 compatibility. I believe CADO uses python 2 but they aim for it to work in 3 as well. I don't have much python experience so I may be wrong.[/QUOTE]


Just submitted the issue on their repository. Waiting for a response.

Dylan14 2020-04-30 16:42

The issue is fixed in a new commit; basically, it was a version conflict between a "new" client and an "old" server.

EdH 2020-08-26 19:55

I'm happy to mention that I have the git version:
[code]
commit a1dbe6b800a0bef436f0723e62c5b502955e0c2a
Author: Emmanuel Thomé <Emmanuel.Thome@inria.fr>
Date: Sat Aug 22 18:56:34 2020 +0200
[/code]running on all my machines, including the Core2 ones that wouldn't compile last year.:smile:

VBCurtis 2020-08-26 20:41

Excellent! That means I can update my Core2-Xeons this winter for another factorization. They sat unused this past winter, as the version of CADO installed on them wouldn't work as a client with the current version as server and the then-current version wouldn't compile. I missed out on all that "free" room heating!

EdH 2020-08-26 21:25

[QUOTE=VBCurtis;555057]Excellent! That means I can update my Core2-Xeons this winter for another factorization. They sat unused this past winter, as the version of CADO installed on them wouldn't work as a client with the current version as server and the then-current version wouldn't compile. I missed out on all that "free" room heating![/QUOTE]
My Core2s are Intel Quads (Q8400 and Q9???). They wouldn't compile the then-current git last year, but worked fine as clients with an earlier commit until just recently, when there was a change that forced me to update everything: none of my clients would communicate with the newer server version, and I was experiencing an occasional failure with the one I was running. ATM, all is well.

EdH 2020-08-27 18:01

[QUOTE=VBCurtis;555057]Excellent! That means I can update my Core2-Xeons this winter for another factorization. They sat unused this past winter, as the version of CADO installed on them wouldn't work as a client with the current version as server and the then-current version wouldn't compile. I missed out on all that "free" room heating![/QUOTE]
I must have jinxed it! Don't upgrade just yet. More later.

