mersenneforum.org (https://www.mersenneforum.org/index.php)
-   CADO-NFS (https://www.mersenneforum.org/forumdisplay.php?f=170)
-   -   CADO NFS (https://www.mersenneforum.org/showthread.php?t=11948)

VBCurtis 2018-02-02 04:29

[QUOTE=EdH;479057]On a different note:

Am I correct in thinking the polyselect is distributed as well as sieving? Is LA distributed also? If not, How large a number should I be able to handle with 12GB of RAM? (I might be able to increase that to 16GB, if necessary.)
Ed[/QUOTE]

Polyselect is distributed, and also multi-threaded; it will use precisely the same resources as sieving (well, less memory). The postprocessing steps are not distributed; they run only on the host/server machine. Memory use is higher than msieve, but I haven't run a job larger than low 140s yet so I don't have a firm idea of how it scales or whether GNFS-170 will fit into 12GB. My guess is that you're good into the 160s with 12GB, but maybe not 170s.
I'm running GNFS-139 now with I=13, with each siever instance taking ~500MB virt ~250MB res according to "top". I=14 means 4x the memory use per instance, so 1GB minimum. I'm not sure I'll be running an I=15 job unless we do a group-job in the 180s.
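(If I recall the CADO docs correctly, las sieves a region of roughly 2^(2I-1) points per special-q, so each step up in I quadruples the region: 2^(2*14-1) / 2^(2*13-1) = 4. That's where the 4x figure comes from.)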

EdH 2018-02-02 18:31

[QUOTE=Dubslow;479058][URL]https://askubuntu.com/questions/421642/libc-so-6-version-glibc-2-14-not-found[/URL]

How old is that Debian client?[/QUOTE]
Ancient!:smile:

OK, my Ubuntu i7 jumped right in and is working along, although, oddly, the ETA has not changed on the server.

I watched as I added four threads to fully occupy the client and none of the additions affected the ETA. The terminals were affected, in that as I added another terminal, all the others appeared to slow down. hmmm...

BTW, I had missed the ".local" portion of the address for my local machine, so now it is working by name.

Thanks for all the help. I'll play a bit and see where it leads me.

EdH 2018-02-02 23:48

Disappointment and too much lack of understanding!

The trial with the client failed again due to too many errors. Maybe it has to do with communication timeouts, but if the two machines are on a LAN, they should be able to communicate pretty fast.

I'm now trying to see if the factoring will complete with only the server, which it seems to be doing. But, I have noticed that with the current invocation line, I have 16 instances of las running. Is this because I have tasks.threads=2 and --server-threads=8?

I think I have to increase the WU size. It just seems to be iterating at too high a rate - just about every three seconds, if that.

VBCurtis 2018-02-03 02:09

Which client failed? The old debian client was throwing errors about not having current libraries, so that's a lost cause unless you update the OS.

2.3.0 is good at figuring out how many threads your system has; I use the overrides because I have other jobs running and don't want CADO to use every available thread. 16 instances of las suggests your params file has the number of slaves set improperly; I use two (I think it's called slaves.nrclients, in the first section of parameters where you put the input number). Combined with client-threads=2, you'd have two 2-threaded instances of polyselect and then las running on the server itself. If you set slaves.nrclients = 4, you should get 4 2-threaded instances, which would use the entire i7.

The sieve parameters section has the setting for size of sieving work; I've stuck to 10000 for no particular reason, and for a 100-digit input they'll go by pretty fast (not 3 seconds fast, though!). Poly select also has a parameter for this; adrange, I believe it's called. If you're doing a small test run, adrange can be set to 1/10th of admax so that workunits are larger than a couple seconds.
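From memory, the relevant lines look something like this (values purely illustrative; double-check the exact spellings against the shipped params files):
[code]
tasks.sieve.qrange = 10000
tasks.polyselect.admax = 100e3
tasks.polyselect.adrange = 10e3
[/code]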

If you do complete a factorization, you should know that I've been working on improved parameter files; c95 through c115 have been improved by 20-30% over default. Those files will be in the development version (perhaps presently, perhaps soon; I haven't updated my beta via git since sending the files to the developers).

I'm working on improvements to c120 through c140 presently.

EdH 2018-02-03 03:48

Well, running just the server machine failed as well!:sad:
[code]
Error:Lattice Sieving: Program run on localhost+2 failed with exit code 1
Error:Lattice Sieving: Stderr output follows (stored in file /tmp/work/test_runC.upload/test_runC_sieving_4306000-4308000.5bh7ukx1.stderr0):
b'ERROR: The special q (23 bits) is larger than the large prime bound on side 1 (22 bits).\n You can disable this check with the -allow-largesq argument,\n It is for instance useful for the descent.\n'
Error:Lattice Sieving: Exceeded maximum number of failed workunits, maxfailed=100
Traceback (most recent call last):
File "./cado-nfs.py", line 122, in <module>
factors = factorjob.run()
File "./scripts/cadofactor/cadotask.py", line 5429, in run
last_status, last_task = self.run_next_task()
File "./scripts/cadofactor/cadotask.py", line 5504, in run_next_task
return [task.run(), task.title]
File "./scripts/cadofactor/cadotask.py", line 2843, in run
self.submit_command(p, "%d-%d" % (q0, q1), commit=False)
File "./scripts/cadofactor/cadotask.py", line 1505, in submit_command
self.wait()
File "./scripts/cadofactor/cadotask.py", line 1576, in wait
if not self.send_request(Request.GET_WU_RESULT):
File "./scripts/cadofactor/cadotask.py", line 1367, in send_request
return super().send_request(request)
File "./scripts/cadofactor/patterns.py", line 66, in send_request
return self.__mediator.answer_request(request)
File "./scripts/cadofactor/cadotask.py", line 5572, in answer_request
result = self.request_map[key]()
File "./scripts/cadofactor/wudb.py", line 1578, in send_result
was_received = self.notifyObservers(message)
File "./scripts/cadofactor/patterns.py", line 32, in notifyObservers
if observer.updateObserver(message):
File "./scripts/cadofactor/cadotask.py", line 2858, in updateObserver
if self.handle_error_result(message):
File "./scripts/cadofactor/cadotask.py", line 1647, in handle_error_result
raise Exception("Too many failed work units")
Exception: Too many failed work units
[/code]I had no clients running. This was a C100 trial.

The "old Debian" is actually on a flaky machine - a Core2 Quad that has to have the memory reseated every few days.

VBCurtis 2018-02-03 06:13

Try setting I = 12 in the params file. The error indicates you're running out of Q; however, the "too many failed workunits" suggests something seriously failing in your install, so changing I to 12 (equivalent of 12e siever rather than 11e) won't change the workunit-failing situation.
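In the params file that should be a one-line change, if I have the name right:
[code]
tasks.I = 12
[/code]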

My CADO install has a publicly-accessible server; PM me your IP address (last 3 digits don't matter) and I'll add you to my server's whitelist. That way, you can confirm the client software is working.

I double-checked my home machine's files, and my general parameters look as follows:
[code]name = c92a
N = 13123169399981357427979584830444634510645722298048711728728973540444487774736313087181371287
slaves.hostnames = localhost
slaves.nrclients = 1
tasks.verbose = 1[/code]
The rest of the file can be left unchanged, and CADO will run using a single two-threaded process via command line: ./cado-nfs.py a.c92
(I named the param file a.c92 in the main CADO directory).

EdH 2018-02-03 17:00

Thanks Curtis,

Let me try one thing at a time for the moment and see if I can get the server part to complete the C100. When I originally compiled cado-nfs it ran fine for the example composite (C59?).

I just thought of something, though. The C100 is from the db's list of composites. I have run no ECM and don't know the structure of the factors. If it has small factors, would that cause a problem? Maybe I need to construct a composite with larger factors.

On a side note of interest (to me, at least), I've been running the Debian machine for months, if not years. I have to keep it relatively close and reseat the memory every few days, so I let it do short jobs. Since I ran it as a client, I can't get it to run at all. To be fair, I have no monitor, keyboard or mouse on it to examine anything, but it "used" to come up headless without an issue after reseating the memory.

VBCurtis 2018-02-03 18:12

I once ran a job through CADO without pretesting, and found an 11-digit factor; it's a safe bet that NFS doesn't care about the size of the cofactors, theoretically or in practice.

xilman 2018-02-03 19:31

[QUOTE=VBCurtis;479187]I once ran a job through CADO without pretesting, and found an 11-digit factor; it's a safe bet that NFS doesn't care about the size of the cofactors, theoretically or in practice.[/QUOTE]NFS certainly doesn't.

A recent mailing to the cado-nfs list was from someone who complained about the time taken to factor a C180 (or so, this from memory) which was the product of a P2 and a P178. From this, we can deduce that CADO doesn't look for small factors. The conclusion should have been obvious to anyone paying even the slightest attention to the software.

EdH 2018-02-03 20:23

Well, here's another instance that didn't work:

I decided to start from scratch and d/l'ed the git version. It compiled with no issues. I then copied the parameters file into the cado-nfs directory as workparams and changed the following within:
[code]
name = test1_run1
sourcedir=${HOME}/Math/cado-nfs
[/code]It had trouble with ${HOSTNAME}. I guess it wasn't exported properly, so I changed that.
[code]
builddir=$(sourcedir)/build/math90
[/code]No other changes were made. I ran using:
[code]
./cado-nfs.py workparams
[/code]Two instances of the siever ran according to top. It completed successfully in a very short time frame.

So, I changed:
[code]
name = test1_run2
N = <C85 from factordb composite list>
[/code]It ran for a long time, again using only 2 instances of las according to top. But, it eventually failed in the same manner as before:
[code]
Info:Lattice Sieving: Marking workunit test1_run2_sieving_4296000-4298000 as not ok (98.7% => ETA Sat Feb 3 14:40:41 2018)
Info:Lattice Sieving: Resubmitting workunit test1_run2_sieving_4296000-4298000 as test1_run2_sieving_4296000-4298000#2
Info:HTTP server: 127.0.0.1 Sending workunit test1_run2_sieving_4302000-4304000 to client localhost
Info:HTTP server: 127.0.0.1 Sending workunit test1_run2_sieving_4304000-4306000 to client localhost+2
Info:HTTP server: 127.0.0.1 Sending workunit test1_run2_sieving_4306000-4308000 to client localhost
Info:HTTP server: 127.0.0.1 Sending workunit test1_run2_sieving_4308000-4310000 to client localhost+2
Info:Lattice Sieving: Adding workunit test1_run2_sieving_4318000-4320000 to database
Info:Lattice Sieving: Adding workunit test1_run2_sieving_4320000-4322000 to database
Error:Lattice Sieving: Program run on localhost failed with exit code 1
Error:Lattice Sieving: Stderr output follows (stored in file /tmp/work/test1_run2.upload/test1_run2_sieving_4298000-4300000.dri5sfxg.stderr0):
b'ERROR: The special q (23 bits) is larger than the large prime bound on side 1 (22 bits).\n You can disable this check with the -allow-largesq argument,\n It is for instance useful for the descent.\n'
Error:Lattice Sieving: Exceeded maximum number of failed workunits, maxfailed=100
Traceback (most recent call last):
File "./cado-nfs.py", line 122, in <module>
factors = factorjob.run()
File "./scripts/cadofactor/cadotask.py", line 5441, in run
last_status, last_task = self.run_next_task()
File "./scripts/cadofactor/cadotask.py", line 5516, in run_next_task
return [task.run(), task.title]
File "./scripts/cadofactor/cadotask.py", line 2855, in run
self.submit_command(p, "%d-%d" % (q0, q1), commit=False)
File "./scripts/cadofactor/cadotask.py", line 1505, in submit_command
self.wait()
File "./scripts/cadofactor/cadotask.py", line 1581, in wait
if not self.send_request(Request.GET_WU_RESULT):
File "./scripts/cadofactor/cadotask.py", line 1367, in send_request
return super().send_request(request)
File "./scripts/cadofactor/patterns.py", line 66, in send_request
return self.__mediator.answer_request(request)
File "./scripts/cadofactor/cadotask.py", line 5584, in answer_request
result = self.request_map[key]()
File "./scripts/cadofactor/wudb.py", line 1578, in send_result
was_received = self.notifyObservers(message)
File "./scripts/cadofactor/patterns.py", line 32, in notifyObservers
if observer.updateObserver(message):
File "./scripts/cadofactor/cadotask.py", line 2870, in updateObserver
if self.handle_error_result(message):
File "./scripts/cadofactor/cadotask.py", line 1652, in handle_error_result
raise Exception("Too many failed work units")
Exception: Too many failed work units
[/code]

RichD 2018-02-03 21:45

I recall reading somewhere that if the project name had an underscore within, it caused problems later on. CADO parses the three pieces of the WU name to decide how to handle it. I assume it has been fixed with the version you are using. (or change the name)

EdH 2018-02-03 21:55

I think my problem was trying to copy parameters into the main directory and just changing a couple things. I seem to have it working properly without using a parameters file at this point. Even polyselect and the right number of sievers seem to be running. The C85 was factored fine. Next, if this C100 succeeds, I'll start trying to add options.

Thanks for everyone's help.

EdH 2018-02-03 21:56

[QUOTE=RichD;479194]I recall reading somewhere that if the project name had an underscore within, it caused problems later on. CADO parses the three pieces of the WU name to decide how to handle it. I assume it has been fixed with the version you are using. (or change the name)[/QUOTE]
I remember seeing something about that as well. Maybe that did have bearing, since I'm now not using the parameters file that had the setting.

EdH 2018-02-03 23:33

OK, so here's where I am now:

I have a running server/client setup by "cat"ing the c110 parameter file onto my workparams file, which otherwise consisted only of:
[code]
N = 1188994805567322501851273099675616403261749446772493394217359815098305093295029641819505406307228193
tasks.workdir = /tmp/work/
slaves.hostnames = localhost
sourcedir=${HOME}/Math/cado-nfs
builddir=$(sourcedir)/build/math90

server.whitelist = 192.168.0.0/24
server.ssl = no
server.port = 13579
slaves.nrclients = 2
[/code]I used:
[code]
./cado-nfs.py workparams
[/code] to start the run. I have four instances running on the separate client machine and they all seem to be running correctly. The only "problem" I see, is that my server is overrun with sievers. I have no less than 20 las entries (via top) on the server. I have tried different values of --server-threads=2, 4, 8 with no change.

But, happily, it completed a C100 within 21 minutes while I wrote this post.:smile:

VBCurtis 2018-02-04 07:01

This sounds like progress!

Have you made sure none of those las instances are from previous factorization attempts? I've seen las continue hammering away for quite a while after a server has been killed. I would be surprised if your server invocation fired up 10 las processes!

xilman 2018-02-04 07:20

[QUOTE=RichD;479194]I recall reading somewhere that if the project name had an underscore within, it caused problems later on. CADO parses the three pieces of the WU name to decide how to handle it. I assume it has been fixed with the version you are using. (or change the name)[/QUOTE]Guess who found and reported that one ...

EdH 2018-02-06 04:22

[QUOTE=VBCurtis;479225]This sounds like progress!

Have you made sure none of those las instances are from previous factorization attempts? I've seen las continue hammering away for quite a while after a server has been killed. I would be surprised if your server invocation fired up 10 las processes![/QUOTE]
I had noticed the persistence of the sievers in prior runs. The latest two tests were started well after the last instance of las disappeared from the top list. No instances appeared again until after the polyselect finished and the 20 instances appeared for a second time.

Something has pulled me away from this project for a short time, but hopefully I can get back to it soon.

EdH 2018-02-25 20:12

OK, I need more education. I tried to use the params.c95 file to "enhance" the factoring of a c94. The basic run without the params.c95 took very little time:
[code]
Total cpu/elapsed time for entire factorization: 4240.37/752.941
[/code]I ran the command line:
[code]
./cado-nfs.py --parameters parameters/factor/params.c95 1975636228803860706131861386351317508435774072460176838764200263234956507563682801432890234281
[/code]It appeared like it was going to work:
[code]Info:root: No database exists yet
Info:root: Created temporary directory /tmp/cado.lf6yfpql
Info:Database: Opened connection to database /tmp/cado.lf6yfpql/c95.db
Info:root: Set tasks.threads=8 based on detected logical cpus
Info:root: tasks.polyselect.threads = 2
Info:root: tasks.sieve.las.threads = 2
...
Info:Complete Factorization: Factoring 1975636228803860706131861386351317508435774072460176838764200263234956507563682801432890234281
...
Info:Polynomial Selection (size optimized): Starting
Info:Polynomial Selection (size optimized): 0 polynomials in queue from previous run
Info:Polynomial Selection (size optimized): Adding workunit c95_polyselect1_0-10000 to database
Info:Polynomial Selection (size optimized): Adding workunit c95_polyselect1_10000-20000 to database
Info:Polynomial Selection (size optimized): Adding workunit c95_polyselect1_20000-30000 to database
Info:Polynomial Selection (size optimized): Adding workunit c95_polyselect1_30000-40000 to database
Info:Polynomial Selection (size optimized): Adding workunit c95_polyselect1_40000-50000 to database
Info:Polynomial Selection (size optimized): Adding workunit c95_polyselect1_50000-60000 to database
Info:Polynomial Selection (size optimized): Adding workunit c95_polyselect1_60000-70000 to database
Info:Polynomial Selection (size optimized): Adding workunit c95_polyselect1_70000-80000 to database
[/code]But, it never returned and there was no evidence via top of any cado-nfs procedures running.

edit: I see that the file I was trying to use is, in fact, the one it defaults to if no parameters file is given. But, shouldn't it still work the way I tried to invoke it?

VBCurtis 2018-02-25 20:47

Default behavior does what you seem to be trying to do manually: CADO looks in the parameters folder for a default params file matching the exact number of digits of your input, then rounds to the nearest 5 digits and looks for a params file of that size.

I think by directing CADO to the place it usually looks, you're also triggering behavior that expects other items in the file.

Generally, we either put the composite on the command line and let CADO choose the params file itself; or we edit the params file with info unique to a single factorization, save the file in the main CADO-NFS directory, and invoke CADO with "./cado-nfs.py samplec94.file" (where samplec94.file is mostly the c95.params file with any edits you wish to make).

See the c90.params for more details, I believe.

EdH 2018-02-25 20:56

Thanks Curtis,

I see I was editing when you posted. I had discovered that I was trying to call the same file as the default called. I hadn't caught that at first. I guess I can understand the rest.

If I wanted to factor a c100 but add other things, then the best approach would be either to copy the c100 file and add the number and the extra things to the copy, or to supply the extras via the command line and let CADO-NFS default to the c100. Does that sound right?

Ed

edit: I had noticed the extra descriptions in the c90 file, but thanks for pointing me in that direction.

EdH 2018-02-25 23:44

OK, I stepped back and reread my earlier posts and found the stoppage. I needed to add the line slaves.hostnames = localhost. Apparently, without that option, it wasn't starting any work on the host machine.

My next question is: why does the default start only 2 threads each for polyselect and the sievers on the host? The README says:
[code]
... where the option '-t 2' tells how many cores (via threads) to use on the
current machine (for polynomial selection, sieving, linear algebra, among
others). [B]It is possible to set '-t all' (which, in fact, is the default)[/B]
to use all threads on the current machine.
[/code]However, when I run, even with -t all, I get:
[code]
Info:root: Set tasks.threads=8 based on detected logical cpus
Info:root: tasks.polyselect.threads = 2
Info:root: tasks.sieve.las.threads = 2
[/code]and it shows only 200% for the polyselect and las threads listed in top.
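I suppose I could force it per task on the command line, using the names from that log (the syntax is my guess):
[code]
./cado-nfs.py <N> tasks.polyselect.threads=4 tasks.sieve.las.threads=4
[/code]But shouldn't '-t all' handle that on its own?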

Thanks for all.

KingsAlpaca 2018-02-26 15:19

Need help on factorization
 
I am trying to use Cado-nfs to factorize a 172-digit number, and I have finished the sieving stage. However, due to an unexpected disk problem, my operation was interrupted. I am trying to restart this process, but it looks like the program started again from the very beginning. I just wonder if there is a way to let it start again at the stage after sieving? Many thanks!

KingsAlpaca 2018-02-26 16:25

I found some instructions in /scripts/cadofactor/README, which look like the solution to the problem in my previous post. However, for the command:

[QUOTE]If you want to import already computed relations, use:

tasks.sieve.import=foo

where "foo" is the name of the relation file to be imported (in CADO-NFS
format).[/QUOTE]

I am not sure what the relation files look like or where I can find them. I feel like they are in cxxx.upload, but there are a lot of files in there.
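If I understand the README, the command would be of the form:
[code]
./cado-nfs.py <my params file> tasks.sieve.import=foo
[/code]with "foo" being a single relation file, but I cannot tell which of the many files in cxxx.upload that would be.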

KingsAlpaca 2018-02-26 17:51

New problem
 
I managed to make it run again from the stopping point. However, a new problem has appeared and I have no idea about it.

The program was interrupted again! I would be grateful if anybody can help. Here is the error output:

[CODE]Error:Filtering - Merging: Program run on server failed with exit code 1
Error:Filtering - Merging: Command line was: /home/ubuntu/cado-nfs/build/ip-172-31-6-213/filter/merge -mat /tmp/cado.jtfg_n5g/c170.purged.gz -out /tmp/cado.jtfg_n5g/c170.history.gz -maxlevel 40 -keep 160 -skip 32 -target_density 170.0 -t 72 > /tmp/cado.jtfg_n5g/c170.merge.stdout.1 2> /tmp/cado.jtfg_n5g/c170.merge.stderr.1
Error:Filtering - Merging: Stderr output follows (stored in file /tmp/cado.jtfg_n5g/c170.merge.stderr.1):
b'# Warning: parameter verbose_flags is checked by this program but is undocumented.\nError: maxlevel should be positive and less than 32\nUsage: /home/ubuntu/cado-nfs/build/ip-172-31-6-213/filter/merge <parameters>\nThe available parameters are the following:\n -mat input purged file\n -out output history file\n -keep excess to keep (default 160)\n -skip number of heavy columns to bury (default 32)\n -maxlevel maximum number of rows in a merge (default 10)\n -target_density stop when the average row density exceeds this value (default 170.0)\n -resume resume from history file\n -mkztype controls how the weight of a merge is approximated (default 1)\n -wmstmax controls until when a mst is used with -mkztype 2 (default 7)\n -forbidden-cols list of columns that cannot be used for merges\n -force-posix-threads (switch)\n -path_antebuffer path to antebuffer program\n -v verbose level\n -t number of threads\n'
Traceback (most recent call last):
File "./cado-nfs.py", line 122, in <module>
factors = factorjob.run()
File "./scripts/cadofactor/cadotask.py", line 5459, in run
last_status, last_task = self.run_next_task()
File "./scripts/cadofactor/cadotask.py", line 5534, in run_next_task
return [task.run(), task.title]
File "./scripts/cadofactor/cadotask.py", line 3911, in run
raise Exception("Program failed")
Exception: Program failed[/CODE]

Dubslow 2018-02-26 18:02

[quote]Error: maxlevel should be positive and less than 32[/quote]

From your pasted code.

Also, gah the error printing needs some formatting work.

KingsAlpaca 2018-02-26 18:37

Thanks Dubslow,

I changed my "tasks.filter.maxlevel" to 32 and indeed it looks OK now, although I have no idea why the initial value of "tasks.filter.maxlevel" was 40.

I also noticed the problem after posting the previous message. I am still a newbie on factorization. I will try to improve in the future! Thanks again for your advice.

EdH 2018-02-28 23:17

I'm coming up with many things to question and some machines that don't work as clients even though they seem to run well standalone. Is it better to address everything here or would it be better to use the discussion group?

I'm kind of up in the air over whether I will replace all my factmsieve.py distributed scripts with CADO-NFS. My scripts are cumbersome and need rewriting, but they may have an edge on CADO-NFS. My overall scripts distribute ECM as well and initiate poly selection on an ancient GPU while ECM is running. I could still use the ECM portion and replace the gnfs part with CADO-NFS. If I read the READMEs correctly, I could still run the poly selection on the GPU and import it to CADO-NFS. Perhaps I will experiment with that...
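(If I'm reading scripts/cadofactor/README right, the import would be something like [code]tasks.polyselect.import=rsa130.poly[/code] on the command line, with the msieve poly converted to CADO's .poly format first; the parameter name and filename here are my guesses from the README.)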

Any insights?

Dubslow 2018-02-28 23:30

What edge may your scripts have over CADO?

EdH 2018-03-01 04:22

[QUOTE=Dubslow;481209]What edge may your scripts have over CADO?[/QUOTE]
I was thinking mostly of factmsieve.py and ggnfs having an edge over CADO. As to the scripts, they run continuously, waiting to perform either ECM segments or sieving, and going idle during off-times. Whether they exist or not is of no significance to the host, and the main script doesn't really care if they aren't working properly, other than possibly receiving bad relations that may need to be culled. There is also a minimum of communication, consisting mainly of work requests and acknowledgments.

I must admit to not having run the scripts for a few months and I would have to familiarize myself with them again to better describe them. And I did have occasional collisions that duplicated a bit of work. They need some refinement.

I used them mostly as an automated Aliqueit system. I ran quite a few sequences with them a while back, with over 20 machines of various levels contributing to the whole.

EdH 2018-03-01 21:41

I currently have one host machine and three client machines all running properly, as far as I can tell. This, to me, means I must have something right. All are i7 2600 systems.

However, I have an i7 920 system that runs fine by itself, but won't play well with others. It mostly works as a host, but the client shows messages like:
[code]
ERROR:root:Existing file download/c95.roots.gz has wrong checksum ... Deleting file.
[/code]When I try to run it as a client, it won't work at all:
[code]
INFO:root:Attaching file math69.1ba8791a.work/c95.94000-96000.gz to upload
ERROR:root:Could not read file math69.1ba8791a.work/c95.94000-96000.gz: [Errno 2] No such file or directory: 'math69.1ba8791a.work/c95.94000-96000.gz'
INFO:root:Attaching stderr for command 0 to upload
INFO:root:Sending result for workunit c95_sieving_94000-96000 to http://math79.local:13531/cgi-bin/upload.py
INFO:root:Cleaning up for workunit c95_sieving_94000-96000
INFO:root:Removing result file math69.1ba8791a.work/c95.94000-96000.gz
ERROR:root:Could not remove file: [Errno 2] No such file or directory: 'math69.1ba8791a.work/c95.94000-96000.gz'
INFO:root:Removing workunit file download/WU.math69.1ba8791a
INFO:root:Downloading http://math79.local:13531/cgi-bin/getwu?clientid=math69.1ba8791a to download/WU.math69.1ba8791a (cafile = None)
INFO:root:download/c95.roots.gz already exists, not downloading
INFO:root:download/c95.poly already exists, not downloading
INFO:root:download/las already exists, not downloading
INFO:root:Result file math69.1ba8791a.work/c95.96000-98000.gz does not exist
INFO:root:Running 'download/las' -I 13 -poly 'download/c95.poly' -q0 96000 -q1 98000 -lim0 3660220 -lim1 2758600 -lpb0 23 -lpb1 23 -mfb0 22 -mfb1 45 -ncurves0 2 -ncurves1 13 -fb 'download/c95.roots.gz' -out 'math69.1ba8791a.work/c95.96000-98000.gz' -t 2 -stats-stderr
ERROR:root:Command resulted in exit code 132
ERROR:root:Stderr: Illegal instruction (core dumped)
[/code]Any words of wisdom?

henryzz 2018-03-01 22:37

Is the client downloading the siever from the server? Can you stop that behaviour?

VictordeHolland 2018-03-01 22:42

It's alive
 
1 Attachment(s)
CADO-NFS works on my ARM board (Odroid-U2)!
Compiles with just a simple 'make' command with GCC 6.3
So kudos to the people [B]making[/B] the [B]make[/B] file!

Tested it with RSA-100

[code]PID27408 2018-03-01 18:32:52,010 Info:Square Root: finished
PID27408 2018-03-01 18:32:52,011 Info:Square Root: Factors: 40094690950920881030683735292761468389214899724061 37975227936943673922808872755445627854565536638199
PID27408 2018-03-01 18:32:52,012 Info:Square Root: Total cpu/real time for sqrt: 419.33/145.825
PID27408 2018-03-01 18:32:52,014 Info:Polynomial Selection (size optimized): Total time: 1138.84
PID27408 2018-03-01 18:32:52,016 Info:Polynomial Selection (root optimized): Aggregate statistics:
PID27408 2018-03-01 18:32:52,016 Info:Polynomial Selection (root optimized): Total time: 338.18
PID27408 2018-03-01 18:32:52,016 Info:Polynomial Selection (root optimized): Rootsieve time: 336.08
PID27408 2018-03-01 18:32:52,017 Info:Generate Factor Base: Total cpu/real time for makefb: 5.19/1.45861
PID27408 2018-03-01 18:32:52,017 Info:Generate Free Relations: Total cpu/real time for freerel: 145.56/40.3167
PID27408 2018-03-01 18:32:52,020 Info:Lattice Sieving: Total CPU time: 28356.3s
PID27408 2018-03-01 18:32:52,020 Info:Filtering - Duplicate Removal, splitting pass: Total cpu/real time for dup1: 16.03/43.1087
PID27408 2018-03-01 18:32:52,021 Info:Filtering - Duplicate Removal, splitting pass: Aggregate statistics:
PID27408 2018-03-01 18:32:52,021 Info:Filtering - Duplicate Removal, splitting pass: CPU time for dup1: 42.7s
PID27408 2018-03-01 18:32:52,022 Info:Filtering - Duplicate Removal, removal pass: Total cpu/real time for dup2: 80.83/58.5983
PID27408 2018-03-01 18:32:52,023 Info:Filtering - Singleton removal: Total cpu/real time for purge: 53.38/32.7059
PID27408 2018-03-01 18:32:52,023 Info:Filtering - Merging: Total cpu/real time for merge: 292.17/277.597
PID27408 2018-03-01 18:32:52,024 Info:Filtering - Merging: Total cpu/real time for replay: 25.63/30.9723
PID27408 2018-03-01 18:32:52,024 Info:Linear Algebra: Total cpu/real time for bwc: 12419.8/0.000614882
PID27408 2018-03-01 18:32:52,025 Info:Linear Algebra: Aggregate statistics:
PID27408 2018-03-01 18:32:52,025 Info:Linear Algebra: Krylov: WCT time 1882.77
PID27408 2018-03-01 18:32:52,026 Info:Linear Algebra: Lingen CPU time 586.27, WCT time 163.55
PID27408 2018-03-01 18:32:52,027 Info:Linear Algebra: Mksol: WCT time 1043.0
PID27408 2018-03-01 18:32:52,027 Info:Quadratic Characters: Total cpu/real time for characters: 16.38/7.23538
PID27408 2018-03-01 18:32:52,028 Info:Square Root: Total cpu/real time for sqrt: 419.33/145.825
PID27408 2018-03-01 18:32:52,204 Info:Complete Factorization: Total cpu/elapsed time for entire factorization: 43307.6/12677.9[/code]

So
Poly Select: 1475 sec (24 min 35 sec)
Sieving: 28356 CPUsec (~about 2h clock time)
Filtering: ~486 sec (~8 min)
LA: 1883 + 164 (586 CPUsec) + 1043 sec (~52 clock min)
SQR: 419 CPUsec (146 sec wall clock)
Total: 43,307 CPUsec / 12,678 sec WCT

Or about 3.5h real time, not bad for such a tiny board with an outdated processor (quadcore Cortex-A9) and memory (LPDDR2-880).

EdH 2018-03-02 04:37

[QUOTE=henryzz;481268]Is the client downloading the siever from the server? Can you stop that behaviour?[/QUOTE]
I had tried figuring out how to use --bindir= and had it wrong. I think I have it figured out now. Thanks!
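In case anyone else hits this, the client line that seems right is roughly (paths from memory):
[code]
./cado-nfs-client.py --server=http://math79.local:13531 --bindir=$HOME/Math/cado-nfs/build/math69
[/code]so each client runs its own locally built las instead of downloading the server's binary.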

henryzz 2018-03-02 12:20

[QUOTE=VictordeHolland;481269]CADO-NFS works on my ARM board (Odroid-U2)!
...
Or about 3.5h real time, not bad for such a tiny board with an outdated processor (quadcore Cortex-A9) and memory (LPDDR2-880).[/QUOTE]
How does that compare with the pure c version of the ggnfs siever?

EdH 2018-03-02 19:36

[QUOTE=EdH;481290]I had tried figuring out how to use --bindir= and had it wrong. I think I have it figured out now. Thanks![/QUOTE]
This was also the issue with my ancient Debian Core 2 machine. It was trying to run the Ubuntu i7 binaries. It seems fine now that it's running its own...

It looks like I'm nearing a point where I'll run a test with RSA-130.

Thanks for all the help everyone.

EdH 2018-03-02 21:12

If a moderator is near, could you remove one of my duplicates, please?

Double posting was caused by network troubles.

Thanks!

VBCurtis 2018-03-02 22:12

If you do get around to running RSA130, consider running it twice; I have a draft of new parameters for C130, and would like an independent tester to confirm my settings are faster than default. If you're interested in helping, please record timings for RSA130 for poly select, sieving, and LA; then do the same for the params file I would send along.

I sent settings for 95 to 120 digits in already, and my 120 file was tested to be no faster than CADO's (even though it was faster on my rig, consisting of mostly core2 Xeon cores). Before Real Life got in the way, I was building files for 125, 130, 135 digits.

VictordeHolland 2018-03-02 23:18

[QUOTE=henryzz;481326]How does that compare with the pure c version of the ggnfs siever?[/QUOTE]
I don't have a ggnfs lasieve4 for ARM.

But if you want to compare yourself, this was the poly:
[code]n: 1522605027922533360535618378132637429718068114961380688657908494580122963258952897654000350692006139
Y0: -767853202003767126726051
Y1: 984954475843
c0: -106889640063652703788276460
c1: -490217649256458074859
c2: 4687825801522500
c3: 2755578274
c4: 4380
skew: 375657.611
type: gnfs
# MurphyE (Bf=1.052e+06,Bg=9.191e+05,area=2.206e+12) = 1.13e-08
# f(x) = 4380*x^4+2755578274*x^3+4687825801522500*x^2-490217649256458074859*x-106889640063652703788276460
# g(x) = 984954475843*x-767853202003767126726051
[/code]And CADO used these arguments for the sieving step
[code]
-I 11
-lim0 919082
-lim1 1051872
-lpb0 24
-lpb1 25
-mfb0 49
-mfb1 50
-ncurves0 11
-ncurves1 16
[/code]The Q range sieved was 1051872 - 2040000.

EdH 2018-03-03 02:51

[QUOTE=VBCurtis;481407]If you do get around to running RSA130, consider running it twice; I have a draft of new parameters for C130, and would like an independent tester to confirm my settings are faster than default. If you're interested in helping, please record timings for RSA130 for poly select, sieving, and LA; then do the same for the params file I would send along.

I sent settings for 95 to 120 digits in already, and my 120 file was tested to be no faster than CADO's (even though it was faster on my rig, consisting of mostly core2 Xeon cores). Before Real Life got in the way, I was building files for 125, 130, 135 digits.[/QUOTE]I've run RSA-130, but I would need to run it twice more to do a valid test. For this run, I added about 5 machines along the way. For a true comparison, I need to use the same machines for the entire run. Here is today's run with my mix of machines running either Debian or Ubuntu in various versions:
[code]
Info:Polynomial Selection (size optimized): Aggregate statistics:
Info:Polynomial Selection (size optimized): potential collisions: 38210.8
Info:Polynomial Selection (size optimized): raw lognorm (nr/min/av/max/std): 38857/38.650/46.136/49.480/0.866
Info:Polynomial Selection (size optimized): optimized lognorm (nr/min/av/max/std): 38857/37.180/41.522/46.960/1.141
Info:Polynomial Selection (size optimized): Total time: 8294.25
Info:Polynomial Selection (root optimized): Aggregate statistics:
Info:Polynomial Selection (root optimized): Total time: 3615.37
Info:Polynomial Selection (root optimized): Rootsieve time: 3613.85
Info:Generate Factor Base: Total cpu/real time for makefb: 51.71/10.7454
Info:Generate Free Relations: Total cpu/real time for freerel: 308.57/40.5473
Info:Lattice Sieving: Aggregate statistics:
Info:Lattice Sieving: Total number of relations: 19266332
Info:Lattice Sieving: Average J: 7782.82 for 77739 special-q, max bucket fill: 0.628863
Info:Lattice Sieving: Total CPU time: 321572s
Info:Filtering - Duplicate Removal, splitting pass: Total cpu/real time for dup1: 50.69/114.322
Info:Filtering - Duplicate Removal, splitting pass: Aggregate statistics:
Info:Filtering - Duplicate Removal, splitting pass: CPU time for dup1: 114.2s
Info:Filtering - Duplicate Removal, removal pass: Total cpu/real time for dup2: 265.52/215.066
Info:Filtering - Singleton removal: Total cpu/real time for purge: 177.43/140.715
Info:Filtering - Merging: Total cpu/real time for merge: 905.7/886.818
Info:Filtering - Merging: Total cpu/real time for replay: 88.94/91.873
Info:Linear Algebra: Total cpu/real time for bwc: 76793.8/0.00016737
Info:Linear Algebra: Aggregate statistics:
Info:Linear Algebra: Krylov: WCT time 6213.28
Info:Linear Algebra: Lingen CPU time 511.46, WCT time 77.63
Info:Linear Algebra: Mksol: WCT time 3433.77
Info:Quadratic Characters: Total cpu/real time for characters: 63.27/17.5241
Info:Square Root: Total cpu/real time for sqrt: 2893.31/390.914
Info:HTTP server: Shutting down HTTP server
Info:Complete Factorization: Total cpu/elapsed time for entire factorization: 415081/16661.7
[/code]Will this be enough timing info, or will you need more?

I will do another run tomorrow with a set machine list.

I am allowing cado-nfs to choose the params.c130 file and set all the defaults, except for the few (.ssl, .whitelist, .port) options I've added to the command line call. Will I be replacing the params.c130 file in my parameters/factor folder?

VBCurtis 2018-03-03 04:11

It's up to you whether to replace it, or just run the factorization from a file. I'll get you the info Saturday, hopefully!

EdH 2018-03-03 04:33

[QUOTE=VBCurtis;481434]It's up to you whether to replace it, or just run the factorization from a file. I'll get you the info Saturday, hopefully![/QUOTE]
No hurry. I'm tied up for part of Saturday. I may run the first one tomorrow in the morning if I get a chance. Then I can run yours whenever. I would prefer to replace the current .c130 with your new version and use the same command line. That way I'm sure all is the same for the two runs except the parameters.

EdH 2018-03-04 01:47

OK, I have a set of machines to use for the comparison. I did have a glitch - one of my core2 Quads either failed to start one process (2 threads) or dropped it early in the run. I will watch that machine for the next test and stop one process if two are running*. All else seemed to run well, finishing at just over 4 hours:
[code]
Info:Polynomial Selection (size optimized): Aggregate statistics:
Info:Polynomial Selection (size optimized): potential collisions: 38210.8
Info:Polynomial Selection (size optimized): raw lognorm (nr/min/av/max/std): 38857/38.650/46.136/49.480/0.866
Info:Polynomial Selection (size optimized): optimized lognorm (nr/min/av/max/std): 38857/37.180/41.522/46.960/1.141
Info:Polynomial Selection (size optimized): Total time: 8820.95
Info:Polynomial Selection (root optimized): Aggregate statistics:
Info:Polynomial Selection (root optimized): Total time: 3256.54
Info:Polynomial Selection (root optimized): Rootsieve time: 3255.06
Info:Generate Factor Base: Total cpu/real time for makefb: 51.91/10.6228
Info:Generate Free Relations: Total cpu/real time for freerel: 309.8/40.0664
Info:Lattice Sieving: Aggregate statistics:
Info:Lattice Sieving: Total number of relations: 19363171
Info:Lattice Sieving: Average J: 7781.26 for 78385 special-q, max bucket fill: 0.623386
Info:Lattice Sieving: Total CPU time: 324016s
Info:Filtering - Duplicate Removal, splitting pass: Total cpu/real time for dup1: 51.54/108.384
Info:Filtering - Duplicate Removal, splitting pass: Aggregate statistics:
Info:Filtering - Duplicate Removal, splitting pass: CPU time for dup1: 108.3s
Info:Filtering - Duplicate Removal, removal pass: Total cpu/real time for dup2: 267.44/211.993
Info:Filtering - Singleton removal: Total cpu/real time for purge: 176.36/123.832
Info:Filtering - Merging: Total cpu/real time for merge: 525.71/447.307
Info:Filtering - Merging: Total cpu/real time for replay: 53.04/43.8131
Info:Linear Algebra: Total cpu/real time for bwc: 69859.5/0.000160933
Info:Linear Algebra: Aggregate statistics:
Info:Linear Algebra: Krylov: WCT time 5636.19
Info:Linear Algebra: Lingen CPU time 474.24, WCT time 72.75
Info:Linear Algebra: Mksol: WCT time 3116.54
Info:Quadratic Characters: Total cpu/real time for characters: 61.83/17.0987
Info:Square Root: Total cpu/real time for sqrt: 2834.79/384.939
Info:HTTP server: Shutting down HTTP server
Info:Complete Factorization: Total cpu/elapsed time for entire factorization: 410286/14660.9
[/code]*I am leaning toward running this as is again to see what difference one process would actually make. I might do that tomorrow AM.

EdH 2018-03-04 04:32

I added more machines and ran RSA-110 this evening. I didn't have enough time to run RSA-130 again tonight. My CADO-NFS setup factored RSA-110 in just under 17 minutes and all my machines acted as they should have (including the one that didn't earlier):
[code]
Info:Complete Factorization: Total cpu/elapsed time for entire factorization: 41008.1/1005.69
[/code]

VBCurtis 2018-03-04 06:53

1 Attachment(s)
Ed-
Attached is my draft params.c130 (renamed as a txt for forum compatibility). The timings I'd like from you, for default vs this run:
CPU time for poly select (combined size and root, or each individually)
CPU time for sieve
CPU time for LA
CPU and Wall clock time for entire factorization

The CADO developer to whom I've been sending these param files wishes for both CPU time and wall clock time to be improved in order to consider the file an improvement on the default.

EdH 2018-03-04 16:31

[QUOTE=VBCurtis;481524]Ed-
Attached is my draft params.c130 (renamed as a txt for forum compatibility). The timings I'd like from you, for default vs this run:
CPU time for poly select (combined size and root, or each individually)
CPU time for sieve
CPU time for LA
CPU and Wall clock time for entire factorization

The CADO developer to whom I've been sending these param files wishes for both CPU time and wall clock time to be improved in order to consider the file an improvement on the default.[/QUOTE]
OK, I have it. I'm just about to make a run with the original and then, if things look good, I'll run the draft. I should have these by late today, but I'll need to be sure all the machines ran correctly.

EdH 2018-03-05 02:03

There does seem to be a "bit" of improvement according to my machines.:smile:

If I have your request right...
Here is the timing comparison based on your list (default c130):
[code]
Info:Polynomial Selection (size optimized): Total time: 8979.84
Info:Polynomial Selection (root optimized): Total time: 5150.85
Info:Lattice Sieving: Total CPU time: 316354s
Info:Linear Algebra: Total cpu/real time for bwc: 76931/0.000181913
Info:Complete Factorization: Total cpu/elapsed time for entire factorization: 412019/15992.9
[/code]Here is the timing comparison based on your list (draft c130):
[code]
Info:Polynomial Selection (size optimized): Total time: 17322.1
Info:Polynomial Selection (root optimized): Total time: 5028.87
Info:Lattice Sieving: Total CPU time: 241079s
Info:Linear Algebra: Total cpu/real time for bwc: 29136.9/0.000170231
Info:Complete Factorization: Total cpu/elapsed time for entire factorization: 295891/8302.44
[/code]Here is the final portion of the default c130 run:
[code]
Info:Polynomial Selection (size optimized): Aggregate statistics:
Info:Polynomial Selection (size optimized): potential collisions: 38210.8
Info:Polynomial Selection (size optimized): raw lognorm (nr/min/av/max/std): 38857/38.650/46.136/49.480/0.866
Info:Polynomial Selection (size optimized): optimized lognorm (nr/min/av/max/std): 38857/37.180/41.522/46.960/1.141
Info:Polynomial Selection (size optimized): Total time: 8979.84
Info:Polynomial Selection (root optimized): Aggregate statistics:
Info:Polynomial Selection (root optimized): Total time: 5150.85
Info:Polynomial Selection (root optimized): Rootsieve time: 5148.76
Info:Generate Factor Base: Total cpu/real time for makefb: 52.04/10.6654
Info:Generate Free Relations: Total cpu/real time for freerel: 309.37/40.0536
Info:Lattice Sieving: Aggregate statistics:
Info:Lattice Sieving: Total number of relations: 19248288
Info:Lattice Sieving: Average J: 7776.6 for 78228 special-q, max bucket fill: 0.61197
Info:Lattice Sieving: Total CPU time: 316354s
Info:Filtering - Duplicate Removal, splitting pass: Total cpu/real time for dup1: 50.85/104.564
Info:Filtering - Duplicate Removal, splitting pass: Aggregate statistics:
Info:Filtering - Duplicate Removal, splitting pass: CPU time for dup1: 104.4s
Info:Filtering - Duplicate Removal, removal pass: Total cpu/real time for dup2: 265.42/211.441
Info:Filtering - Singleton removal: Total cpu/real time for purge: 182.69/141.73
Info:Filtering - Merging: Total cpu/real time for merge: 742.06/679.881
Info:Filtering - Merging: Total cpu/real time for replay: 60.65/50.098
Info:Linear Algebra: Total cpu/real time for bwc: 76931/0.000181913
Info:Linear Algebra: Aggregate statistics:
Info:Linear Algebra: Krylov: WCT time 6212.15
Info:Linear Algebra: Lingen CPU time 515.09, WCT time 78.28
Info:Linear Algebra: Mksol: WCT time 3423.47
Info:Quadratic Characters: Total cpu/real time for characters: 63.17/17.4738
Info:Square Root: Total cpu/real time for sqrt: 2877.11/390.897
Info:HTTP server: Shutting down HTTP server
Info:Complete Factorization: Total cpu/elapsed time for entire factorization: 412019/15992.9
Info:root: Cleaning up computation data in /tmp/cado.v39r93v9
39685999459597454290161126162883786067576449112810064832555157243 45534498646735972188403686897274408864356301263205069600999044599
[/code]Here is the final portion of the draft c130 run:
[code]
Info:Polynomial Selection (size optimized): Aggregate statistics:
Info:Polynomial Selection (size optimized): potential collisions: 38122.5
Info:Polynomial Selection (size optimized): raw lognorm (nr/min/av/max/std): 38617/37.230/46.866/55.040/1.375
Info:Polynomial Selection (size optimized): optimized lognorm (nr/min/av/max/std): 38617/37.100/41.238/48.200/0.988
Info:Polynomial Selection (size optimized): Total time: 17322.1
Info:Polynomial Selection (root optimized): Aggregate statistics:
Info:Polynomial Selection (root optimized): Total time: 5028.87
Info:Polynomial Selection (root optimized): Rootsieve time: 5026.89
Info:Generate Factor Base: Total cpu/real time for makefb: 11.77/2.32834
Info:Generate Free Relations: Total cpu/real time for freerel: 312.01/40.6757
Info:Lattice Sieving: Aggregate statistics:
Info:Lattice Sieving: Total number of relations: 25179555
Info:Lattice Sieving: Average J: 3795.47 for 242231 special-q, max bucket fill: 0.678855
Info:Lattice Sieving: Total CPU time: 241079s
Info:Filtering - Duplicate Removal, splitting pass: Total cpu/real time for dup1: 66.03/101.62
Info:Filtering - Duplicate Removal, splitting pass: Aggregate statistics:
Info:Filtering - Duplicate Removal, splitting pass: CPU time for dup1: 101.6s
Info:Filtering - Duplicate Removal, removal pass: Total cpu/real time for dup2: 290.99/75.7305
Info:Filtering - Singleton removal: Total cpu/real time for purge: 126.17/44.2392
Info:Filtering - Merging: Total cpu/real time for merge: 251.66/211.464
Info:Filtering - Merging: Total cpu/real time for replay: 33.98/27.0952
Info:Linear Algebra: Total cpu/real time for bwc: 29136.9/0.000170231
Info:Linear Algebra: Aggregate statistics:
Info:Linear Algebra: Krylov: WCT time 2312.63
Info:Linear Algebra: Lingen CPU time 323.98, WCT time 49.7
Info:Linear Algebra: Mksol: WCT time 1302.22
Info:Quadratic Characters: Total cpu/real time for characters: 45.25/12.3093
Info:Square Root: Total cpu/real time for sqrt: 2186.27/294.068
Info:HTTP server: Shutting down HTTP server
Info:Complete Factorization: Total cpu/elapsed time for entire factorization: 295891/8302.44
Info:root: Cleaning up computation data in /tmp/cado.qyip4xoh
45534498646735972188403686897274408864356301263205069600999044599 39685999459597454290161126162883786067576449112810064832555157243
[/code]
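If my arithmetic is right, that's 412019 -> 295891 total CPU seconds (about a 28% reduction), 15992.9 -> 8302.44 wall clock (about 48% faster), and sieving CPU down from 316354s to 241079s (about 24%).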

Dubslow 2018-03-05 02:08

Wow, that's an incredible speedup. Nice work Curtis!

VBCurtis 2018-03-05 03:45

WooHoo!!!! I think I'll email this one to PaulZ for inclusion in the git version of CADO. I've been targeting 5% of total job time for poly select, and your run indicates I've overshot that. I'll reduce poly-select P a bit for future tests.

Factoring a C130 in 2 hr 20 min is impressive!! I mean, sure Ed's using a lot of cores, but factoring mid-sized numbers *fast* sure is nice.

I have a solid idea of C125 improvements as well. If you come across a suitably-sized candidate to factor, please let me know and I'll post the improved file for that, too.

As I factor numbers for Aliquot sequences, I keep tweaking parameters and testing new settings. When I get a record-fast result for a given size, I keep the same parameters and run another number to confirm the settings are quick (rather than merely stumbling onto a lucky poly one time). I've done this from 97 to 134ish digits, and I'm just too lazy to run default settings for time comparisons.

I noted previously in this thread that I'm finding time doubling every 5.8 digits or so; I'm using my record-for-each-size timings to construct that regression. If the regression is accurate, Ed can factor a C147 in under 24 hrs. :)
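(Worked through with Ed's numbers: his C130 took 8302s wall clock, so a C147 would be about 8302 * 2^((147-130)/5.8) = 8302 * 2^2.93, or roughly 63000 seconds, about 17.5 hours.)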

Edit: That BWC reduction is quite important, since that's the step that doesn't parallelize to multiple machines. I've had a secondary goal of reducing the matrix time, even if overall job length isn't reduced, because that reduces wall clock time for multi-machine users.

VBCurtis 2018-03-05 06:16

Turns out the C125 settings haven't been well-tested; I'll post a params.c125 file in a few days.

EdH 2018-03-05 15:53

[QUOTE=VBCurtis;481602]WooHoo!!!! I think I'll email this one to PaulZ for inclusion in the git version of CADO. I've been targeting 5% of total job time for poly select, and your run indicates I've overshot that. I'll reduce poly-select P a bit for future tests.[/QUOTE]For a while, in my scripted system, I was interrupting poly select as soon as a halfway decent score showed up. I never really studied whether that was good overall or not. But might your reduction attempt cause a late good poly to be missed?

[QUOTE=VBCurtis;481602]I noted previously in this thread that I'm finding time doubling every 5.8 digits or so; I'm using my record-for-each-size timings to construct that regression. If the regression is accurate, Ed can factor a C147 in under 24 hrs. :)[/QUOTE]In theory! In practice, I have several machines that go to sleep at night. Fortunately, they are mostly the "weaker" ones.

[QUOTE=VBCurtis;481602]Edit: That BWC reduction is quite important, since that's the step that doesn't parallelize to multiple machines. I've had a secondary goal of reducing the matrix time, even if overall job length isn't reduced, because that reduces wall clock time for multi-machine users.[/QUOTE]In the days of the Pentium, I ran msieve's LA via mpi with a gigabit switch, which helped a bit in memory and time. If CADO-NFS LA could be distributed without a massive amount of comm, that would be helpful. But, if it was easy, I'm sure it would have been implemented.

CRGreathouse 2018-03-05 16:23

[QUOTE=EdH;481620]If CADO-NFS LA could be distributed without a massive amount of comm, that would be helpful. But, if it was easy, I'm sure it would have been implemented.[/QUOTE]

My inexpert understanding is that it takes massive amounts of communication... infiniband or don't bother.

EdH 2018-03-05 17:55

[QUOTE=CRGreathouse;481621]My inexpert understanding is that it takes massive amounts of communication... infiniband or don't bother.[/QUOTE]
Yeah, as soon as I moved up from Pentium, gigabit switching showed a degradation rather than improvement. It did allow for a larger matrix, but I've improved the memory footprint now.

VBCurtis 2018-03-05 20:26

As we move toward matrices that stretch our available memory, it's possibly worth having in our arsenal the knowledge that we can halve memory use by MPI'ing across gigabit ethernet, even if we gain no speed. If there's a substantial speed penalty, I suppose it's not worthwhile.

I've been re-reading the factorization-announcement papers from the CADO group, and I think one of them mentioned that one of the three CADO matrix phases can be parallelized across gigabit; I'll have a look one of these evenings to see if I can find and quote that passage.

EdH 2018-03-06 15:09

My memory does not always serve me well, but IIRC, with Pentium 4s I got a decrease in time when I used mpi with two machines, but no further decrease if any more were added. I seem to remember posting somewhere that the time remained the same.

I tried to use gigabit directly between two static IP Core2 machines and got no decrease in time, but I don't recall if time increased with the Core2 machines. It seems like it might have.

I'm not actually up for that type of trial ATM, but maybe a little later, to see what would happen with my i7s.

EdH 2018-03-07 19:44

[QUOTE=Dubslow;481209]What edge may your scripts have over CADO?[/QUOTE]
I just reworked my scripts and ran RSA-130 on the same set of machines via msieve and factmsieve.py. It took ~1 hour 40 minutes.

I ran msieve polyselect with a 5 minute wall-time limit and 8 threads. I then ran factmsieve.py starting with the factor base file. This took 1:35:22 to log the factors starting from rsa130.fb.

Setting up the machines to run the scripts was probably another 5 minutes or a little extra.

Dubslow 2018-03-07 20:10

[QUOTE=EdH;481812]I just reworked my scripts and ran RSA-130 on the same set of machines via msieve and factmsieve.py. It took ~1 hour 40 minutes.

I ran msieve polyselect with a 5 minute wall-time limit and 8 threads. I then ran factmsieve.py starting with the factor base file. This took 1:35:22 to log the factors starting from rsa130.fb.

Setting up the machines to run the scripts was probably another 5 minutes or a little extra.[/QUOTE]

That's ~100 minutes vs ~138 for CADO with Curtis' improved C130 settings, am I understanding that right? So in total, CADO still has some ground to make up, even if some of its parts are winners?

VBCurtis 2018-03-07 20:22

[QUOTE=Dubslow;481815]That's ~100 minutes vs ~138 for CADO with Curtis' improved C130 settings, am I understanding that right? So in total, CADO still has some ground to make up, even if some of its parts are winners?[/QUOTE]

Yes, that is my experience as well. I looked into the data a little bit, and the biggest slowdown CADO faces is that for a given LP-bound size, CADO requires quite a lot more relations to build a matrix. That's where I got the idea to reduce CADO's target density; alas, it hasn't closed the gap far enough to justify using CADO for all production.

I do believe the sievers and poly select are up to msieve/GGNFS standards & speeds. So, there's hope that a future improvement to CADO filtering can close much of the remaining gap, or even surpass GGNFS.

EdH 2018-03-07 22:02

Of note, remember that I set the cut-off for poly select at 5 minutes with 8 threads. If I had let msieve run what it wanted, it would have easily used up the extra time:
[code]
... poly select deadline: 52159
... time limit set to 14.49 CPU-hours
[/code]

VBCurtis 2018-03-10 21:28

[QUOTE=VBCurtis;481642]I've been re-reading the factorization-announcement papers from the CADO group, and I think one of them mentioned that one of the three CADO matrix phases can be parallelized across gigabit; I'll have a look one of these evenings to see if I can find and quote that passage.[/QUOTE]

Here's that paper:
[url]https://hal.inria.fr/inria-00502899/file/grid.pdf[/url]

The first and third phases can be split to not-well-connected machines, at the cost of copying the matrix and some other files from machine to machine. The CADO docs make gentle reference to doing so, but I could not make out an actual way to trigger the CADO package to do so. Perhaps that's coming soon in 3.0!
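To see why two of the three phases split so readily: in the textbook block Wiedemann setup (an illustration, not CADO's actual bwc code), phase 1 (krylov) computes the sequence a_i = x^T M^i y, and each block of columns of y evolves independently under M, so a remote machine needs only a copy of M and its own block; phase 3 (mksol) evaluates a similar per-block sum, while phase 2 (lingen) is the serial step. A toy demonstration:
[code]
# Toy block Wiedemann phase 1: the sequence a_i = x^T M^i y computed on
# two column blocks of y separately matches the single-machine result.
import numpy as np

rng = np.random.default_rng(1)
N, m = 12, 2
M = rng.integers(0, 2, (N, N))   # stand-in for the sparse GF(2) matrix
x = rng.integers(0, 2, (N, m))
y = rng.integers(0, 2, (N, 4))

def krylov(M, x, y, steps):
    out, v = [], y.copy()
    for _ in range(steps):
        out.append(x.T @ v % 2)
        v = M @ v % 2
    return np.array(out)

whole = krylov(M, x, y, 8)
parts = np.concatenate([krylov(M, x, y[:, :2], 8),
                        krylov(M, x, y[:, 2:], 8)], axis=2)
print(np.array_equal(whole, parts))   # True
[/code]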

EdH 2018-04-17 03:01

@Curtis: I'm getting ready to do some more testing this week. I'm looking at RSA150, but I might just grab a C150 from one of the unreserved Aliquot sequences.

Anyway, have you done any parameter work for CADO-NFS for C150s?

I'll do a run with CADO-NFS and another with my gnfs scripts for comparison. If you have another set of CADO-NFS params, I can do a run with those as well. I hope to run them all this week, but I often procrastinate on these tests.

Ed

VBCurtis 2018-04-17 05:09

Ed-
I haven't yet, but I could try a draft based on what has worked up to 140 digits. I've been using my CADO resources on poly select for some big GNFS jobs the past month. I'm still working on my tax returns, but by Thursday or so I should be able to guess at some improved parameters.

Gotta start somewhere; we can start together! I have a hypothesis that CADO scales better than GGNFS/msieve, so I'm hoping to find that a C150 is closer than C130 between the two packages; if so I'll be quite encouraged to try future GNFS-160 class jobs with CADO, with the hope of exceeding GGNFS/msieve performance.

EdH 2018-04-17 14:13

Curtis,

I'll be doing some testing throughout the week to see what machines and configurations I can use. A minor annoyance is that about half of my machines go to sleep at night, so I can't use them for any benchmarking unless I can fit in at least all the work up to LA before they sleep.

I'm going to use a C150 from the db that has survived YAFU's battery of ECM:

[URL]http://www.factordb.com/index.php?id=1100000001092093149[/URL]

from Aliquot Sequence [URL="http://factordb.com/sequences.php?se=1&aq=281592&action=last20"]281592[/URL]

More later...

Ed

EdH 2018-04-19 14:23

C150 Hangs Up at 98% of the Polynomial Selection Phase
 
I seem to be having a bit of trouble. I've tried twice without success.

I'm trying to run the C150 described earlier and have factored it via my home brew scripts and msieve/ggnfs. It has three primes, if that is of any significance.

I've tried CADO-NFS twice with the same results.

I'm running 1 server (obviously) and 37 clients. All seems fine until I get about 98% of the way through the Polynomial Selection. Then the server stops:
[code]
...
Info:Polynomial Selection (size optimized): Marking workunit c150_polyselect1_394000-395000 as ok (97.5% => ETA Thu Apr 19 08:18:55 2018)
Info:Polynomial Selection (size optimized): Parsed 138 polynomials, added 0 to priority queue (has 100)
Info:Polynomial Selection (size optimized): Marking workunit c150_polyselect1_390000-391000 as ok (97.8% => ETA Thu Apr 19 08:18:56 2018)
Info:Polynomial Selection (size optimized): Parsed 146 polynomials, added 0 to priority queue (has 100)
Info:Polynomial Selection (size optimized): Marking workunit c150_polyselect1_397000-398000 as ok (98.0% => ETA Thu Apr 19 08:19:17 2018)
[/code]All my clients are at idle:
[code]
...
ERROR:root:Download failed, URL error: HTTP Error 404: No work available
ERROR:root:Waiting 10.0 seconds before retrying (I have been waiting since 7120.0 seconds)
ERROR:root:Download failed, URL error: HTTP Error 404: No work available
ERROR:root:Waiting 10.0 seconds before retrying (I have been waiting since 7130.0 seconds)
ERROR:root:Download failed, URL error: HTTP Error 404: No work available
ERROR:root:Waiting 10.0 seconds before retrying (I have been waiting since 7140.0 seconds)
ERROR:root:Download failed, URL error: HTTP Error 404: No work available
ERROR:root:Waiting 10.0 seconds before retrying (I have been waiting since 7150.0 seconds)
... and counting ...
[/code]Could a misbehaving client cause this, or is it solely server-borne trouble? I'll be turning my clients away from the project momentarily, and probably the server as well, but I'll keep whatever files I can from the server.

Any thoughts appreciated...

Dubslow 2018-04-19 17:35

You could always try attaching [c]pdb[/c] to it?

VBCurtis 2018-04-19 18:48

I once had a poly-select range that timed out repeatedly; msieve has a time-limit for any particular coeff, but it seems CADO does not. In my case, the offending range was under 1000, so I set admin=1e3 in the params file and all was well.

You could also try increasing the time-out limit from default (I think 3600 sec).

EdH 2018-04-20 01:29

[QUOTE=Dubslow;485715]You could always try attaching [c]pdb[/c] to it?[/QUOTE]My ignorance won't allow me to try this yet...

[QUOTE=VBCurtis;485719]I once had a poly-select range that timed out repeatedly; msieve has a time-limit for any particular coeff, but it seems CADO does not. In my case, the offending range was under 1000, so I set admin=1e3 in the params file and all was well.

You could also try increasing the time-out limit from default (I think 3600 sec).[/QUOTE]
I've added tasks.polyselect.admin = 1e3 to the params.c150 Polynomial selection section. I'll try a restart tomorrow morning. If it doesn't run tomorrow, I'll try RSA150 and see if it will get past polyselect.

Thanks...

EdH 2018-04-20 13:49

The admin = 1e3 appears to have done the trick. The process has moved into Lattice Sieving without any noticeable trouble.

Thanks!

EdH 2018-04-21 03:23

Preliminary results do not look very favorable. Using scripts to run msieve and ggnfs, it took my conglomeration of machines 14:43:33 to complete the factorization of the C150 described above, if my time math is correct. ( 10:53:46 to 01:45:00 )

I started the CADO-NFS run today at 08:39:34. Currently, the LA (krylov) stage is estimating 20:19:55 tomorrow, and with each printed line the ETA drifts later, sometimes by more than 10 minutes.

I'm not sure how much the parameters can be tweaked, but it would need to be a pretty huge change to get anywhere near the msieve/ggnfs package.

The only machine running now is the server doing LA. A quick check shows it's only using 5 of the 16 available GB, so memory shouldn't be an issue ATM. But, I don't remember the time comparisons for the rest of the steps after krylov, or how the memory use may change.

If the time holds, it's going to be way more than double that of the other setup.

VBCurtis 2018-04-21 05:39

My C130 test with improved parameters lists 62k thread-seconds for LA, while C140 lists 126K thread-seconds (2.3ghz dual-quad-xeon, core2 era). If C150 with good parameters doubles time again, 260K thread-seconds rounded generously for newer hardware would be 12ish hours on a quad-core for just the LA. That definitely does not compare favorably with msieve/ggnfs!

I was really hoping overall package performance would converge in the 150s or 160s...
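For the curious, the arithmetic behind that estimate; the 1.5x factor for newer hardware is an assumption chosen to match the post's generous rounding:
[code]
# LA extrapolation: one more size-step doubling, then thread-seconds
# to wall-clock hours.  The 1.5x hardware factor is an assumed value.
la_thread_sec = {130: 62_000, 140: 126_000}   # reported timings
c150 = la_thread_sec[140] * 2                 # ~252k thread-seconds
threads, speedup = 4, 1.5
print(f"~{c150 / threads / speedup / 3600:.0f} h on a quad-core")  # ~12 h
[/code]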

Dubslow 2018-04-21 06:28

But isn't CADO's polyselect+siever better though? Why can't we combine the best of each?

EdH 2018-04-21 12:44

[QUOTE=Dubslow;485864]But isn't CADO's polyselect+siever better though? Why can't we combine the best of each?[/QUOTE]
I was actually thinking this morning of trying to copy the relations over and use msieve for the LA on a different machine; README.msieve covers this in one of its sections. However, I don't want to "distract" the server from what it is doing.

Now that I'm able to distribute the msieve poly selection among my machines, I think msieve/ggnfs is the way to go for a LAN-based system. CADO-NFS would still have appeal if I were to add outside machines.

If further details are of interest:

170 cores of various flavors were used for the msieve/ggnfs run. A weak dual core laptop failed to run CADO-NFS. This should only be a very minor loss.

The msieve/ggnfs run started poly selection at 10:53:46.
Sieving started at 11:01:27.
Matrix solving began at 19:01:45
LA began at 19:01:46.
Square root started at 00:57:58
The three factors were logged at 01:45:00

The advantage I had originally considered for CADO-NFS was (and still is, to some degree) the ease of setting it up and the distributed poly select. But now I have the scripts and the ability to distribute poly select.

EdH 2018-04-22 02:00

Hey Curtis,

I'm not sure you can adjust the params enough to make up the difference. Now the server is running LA (mksol) and giving an ETA of 07:46:58.

It's looking well over three times as long for CADO-NFS.:sad:

Ed

VBCurtis 2018-04-22 02:59

Ed-
Even if you don't run this factorization again with my params, your default run's timing results will help me choose parameters for my own future use. I'll post them, of course, in case someone wishes to try them.

C150 is about where I started to tinker with factmsieve parameters too, so I think CADO will be even worse off against those.

EdH 2018-04-22 13:17

[QUOTE=VBCurtis;485914]Ed-
Even if you don't run this factorization again with my params, your default run's timing results will help me choose parameters for my own future use. I'll post them, of course, in case someone wishes to try them.

C150 is about where I started to tinker with factmsieve parameters too, so I think CADO will be even worse off against those.[/QUOTE]Hi Curtis,

I still expect to give your params a test run when you get them set. I just can't see running CADO-NFS as a norm at this point, but I have a script alternative for msieve/ggnfs.

And, the results are in:
[code]
Info:Polynomial Selection (size optimized): Aggregate statistics:
Info:Polynomial Selection (size optimized): potential collisions: 56795
Info:Polynomial Selection (size optimized): raw lognorm (nr/min/av/max/std): 57461/46.070/54.041/58.840/0.891
Info:Polynomial Selection (size optimized): optimized lognorm (nr/min/av/max/std): 57461/44.100/48.552/54.860/1.446
Info:Polynomial Selection (size optimized): Total time: 91122.7
Info:Polynomial Selection (root optimized): Aggregate statistics:
Info:Polynomial Selection (root optimized): Total time: 8478.5
Info:Polynomial Selection (root optimized): Rootsieve time: 8477.14
Info:Generate Factor Base: Total cpu/real time for makefb: 36.18/8.17465
Info:Generate Free Relations: Total cpu/real time for freerel: 624.23/82.0787
Info:Lattice Sieving: Aggregate statistics:
Info:Lattice Sieving: Total number of relations: 53833827
Info:Lattice Sieving: Average J: 3798.46 for 2342851 special-q, max bucket fill: 0.697496
Info:Lattice Sieving: Total CPU time: 6.52511e+06s
Info:Filtering - Duplicate Removal, splitting pass: Total cpu/real time for dup1: 165.89/578.793
Info:Filtering - Duplicate Removal, splitting pass: Aggregate statistics:
Info:Filtering - Duplicate Removal, splitting pass: CPU time for dup1: 577.6000000000001s
Info:Filtering - Duplicate Removal, removal pass: Total cpu/real time for dup2: 1273.34/1043.39
Info:Filtering - Singleton removal: Total cpu/real time for purge: 1078.85/1214.3
Info:Filtering - Merging: Total cpu/real time for merge: 3665.72/3278.51
Info:Filtering - Merging: Total cpu/real time for replay: 230.56/198.062
Info:Linear Algebra: Total cpu/real time for bwc: 945170/0.000172853
Info:Linear Algebra: Aggregate statistics:
Info:Linear Algebra: Krylov: WCT time 78981.07
Info:Linear Algebra: Lingen CPU time 2012.76, WCT time 306.77
Info:Linear Algebra: Mksol: WCT time 42773.62
Info:Quadratic Characters: Total cpu/real time for characters: 205.42/64.8022
Info:Square Root: Total cpu/real time for sqrt: 11575.4/1607.64
Info:HTTP server: Shutting down HTTP server
Info:Complete Factorization: Total cpu/elapsed time for entire factorization: 7.58874e+06/171384
Info:root: Cleaning up computation data in /tmp/cado.vzafj00h
35912223503197268109418424875344813700437442706880566137398291217213 22947545427314151445011966017377 584458412373341050558641690854880477452541557046361
[/code]It was finished in just less than two days - 47h 36m 24s.

Let me know when you get some changes made for the params...

Ed

VictordeHolland 2018-04-27 12:47

[URL]http://mersenneforum.org/showpost.php?p=486353&postcount=6[/URL]
[quote]GGNFS+Msieve vs. CADO-NFS C150
Poly select: 7:39 vs. 9:46 (99,601 cpu-sec / 170 cores = ~586 sec real time)
Sieving: 8:00:28 vs. 10:39:43 (6.52511e+06 cpu-sec / 170 cores = ~38,383 sec real time)
LA: 5:56:13 vs. 33:54:21 (78,981.07 + 306.77 + 42,773.62 = 122,061 sec)
SQR: 47:02 vs. 26:47 (1607.64 sec)[/quote]So the big difference is really in the LA phase??

henryzz 2018-04-27 13:18

[QUOTE=VictordeHolland;486355][URL]http://mersenneforum.org/showpost.php?p=486353&postcount=6[/URL]
So the big difference is really in the LA phase??[/QUOTE]

How many cores was the LA run on?

EdH 2018-04-27 13:52

[QUOTE=henryzz;486359]How many cores was the LA run on?[/QUOTE]
The same i7-2600 @ 3.4 GHz with 16GB ran both LAs on 8 threads (4c x 2t).

edit: Also note that I hard stopped the polyselect for the msieve/ggnfs run. I don't know what msieve would have preferred.

edit2: msieve wanted 193.43 CPU-hours. I think this would have far exceeded the time CADO-NFS used.

VBCurtis 2018-04-27 21:21

[QUOTE=EdH;485933]Hi Curtis,

[code]Info:Polynomial Selection (size optimized): Total time: 91122.7
Info:Polynomial Selection (root optimized): Rootsieve time: 8477.14
Info:Lattice Sieving: Total number of relations: 53833827
Info:Lattice Sieving: Total CPU time: 6.52511e+06s
Info:Linear Algebra: Total cpu/real time for bwc: 945170/0.000172853
Info:Complete Factorization: Total cpu/elapsed time for entire factorization: 7.58874e+06/171384
[/code]It was finished in just less than two days - 47h 36m 24s.
Let me know when you get some changes made for the params...
Ed[/QUOTE]
Ed-
Here are my best-guess params for C150:
[code]tasks.polyselect.degree = 5

tasks.polyselect.P = 600000
tasks.polyselect.admax = 25e4
tasks.polyselect.adrange = 5e2
tasks.polyselect.incr = 60
tasks.polyselect.nq = 15625
tasks.polyselect.nrkeep = 100
tasks.polyselect.ropteffort = 12

###########################################################################
# Sieve
###########################################################################

lim0 = 13000000
lim1 = 32000000
lpb0 = 29
lpb1 = 30
tasks.sieve.mfb0 = 58
tasks.sieve.mfb1 = 60
tasks.sieve.ncurves0 = 16
tasks.sieve.ncurves1 = 21
tasks.I = 14

tasks.sieve.qrange = 2000
tasks.sieve.qmin = 3000000

###########################################################################
# Filtering
###########################################################################

tasks.filter.purge.keep = 175
tasks.filter.maxlevel = 30
tasks.filter.target_density = 155.0[/code]
Major changes:
1. Ditched 3 large primes.
2. I = 14 rather than 13. I'm not sure this is correct, as I think with CADO the transition from 13 to 14 is right around 150. However, this means that if it's a mistake it should be a small one.
3. Reduced target density from 170 to 155.
4. Almost tripled poly-select effort. Your data showed 100k sec of poly select on a job that took ~7.6M seconds, meaning poly select was around 1.5% of total time. I've found best results at 4-5%, so I added a lot more in hopes that the overall time drops enough to put my selection near 5% (see the quick arithmetic after this list).
5. Set qmin = 3M. I think in the absence of this parameter CADO defaults to begin sieving at lim0, which was 16M. I expect you'll complete sieving before 16M, so almost no Q's will overlap your previous run!
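Point 4 is easy to check against the logged totals:
[code]
# Poly-select share of the default C150 run, from the stats Ed posted.
size_opt, root_opt = 91_122.7, 8_478.5   # seconds
total_cpu = 7.58874e6                    # seconds, whole factorization
print(f"{(size_opt + root_opt) / total_cpu:.1%}")   # ~1.3% of total
[/code]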

I look forward to your test results.

EdH 2018-04-28 03:31

Thanks Curtis,

I hope to get this started in the morning.

Ed

VBCurtis 2018-04-28 05:23

If you get a chance before your run, reduce lim1 from 32e6 to 28e6. It won't make much difference, but I believe that smaller lim's reduce matrix size and LA time. Since that's the one part stuck on a single machine, it's worth trading a bit of sieve efficiency to try for a smaller matrix.

EdH 2018-04-28 15:38

[QUOTE=VBCurtis;486441]If you get a chance before your run, reduce lim1 from 32e6 to 28e6. It won't make much difference, but I believe that smaller lim's reduce matrix size and LA time. Since that's the one part stuck on a single machine, it's worth trading a bit of sieve efficiency to try for a smaller matrix.[/QUOTE]
It looks like I got the chance. The process got hung up again, as before. Should I add "tasks.polyselect.admin = 1e3" again? I left it out when I rewrote params.c150.

Due to other things, it looks like I'm going to have to wait until Monday to retry...

VBCurtis 2018-04-28 15:49

Sure, that's probably wise for general use on medium-sized inputs (say, C130 or higher). Smaller numbers seem to be more likely to have a best-poly-score on a very small coefficient, so I'd not put that in for numbers below C130.
I set poly work size to 5e2 in part to try to avoid this problem; guess that wasn't enough.

EdH 2018-04-29 03:43

[QUOTE=VBCurtis;486467]Sure, that's probably wise for general use on medium-sized inputs (say, C130 or higher). Smaller numbers seem to be more likely to have a best-poly-score on a very small coefficient, so I'd not put that in for numbers below C130.
I set poly work size to 5e2 in part to try to avoid this problem; guess that wasn't enough.[/QUOTE]
The C130 work ran without the issue. I don't think I tried any C140 work. I've added it now for the C150, but it won't run until at least Monday.

EdH 2018-04-30 15:50

I started the run at pretty close to 10:00 local, but stopped it at a little prior to 11:30 because over a quarter of my machines dropped a client!

I restarted and will see if this happens again.

I have some changes to make to my scripts so I can "see" why temporary terminals fail. Currently, I'm using instances that close on their own when done. I might just add a sleep 1000000 line or log any stderr.

EdH 2018-05-01 14:20

Well done, Curtis,

The results are in and I would have to say you made quite a difference!

[code]
Total cpu/elapsed time for entire factorization: 7.58874e+06/171384  (default)
Total cpu/elapsed time for entire factorization: 3.71132e+06/78710.3 (modified)
[/code]More details:
[code]
Info:Polynomial Selection (size optimized): Aggregate statistics:
Info:Polynomial Selection (size optimized): potential collisions: 173101
Info:Polynomial Selection (size optimized): raw lognorm (nr/min/av/max/std): 174374/44.360/54.513/60.990/1.171
Info:Polynomial Selection (size optimized): optimized lognorm (nr/min/av/max/std): 174374/43.240/48.295/54.910/1.154
Info:Polynomial Selection (size optimized): Total time: 332234
Info:Polynomial Selection (root optimized): Aggregate statistics:
Info:Polynomial Selection (root optimized): Total time: 9896.75
Info:Polynomial Selection (root optimized): Rootsieve time: 9895.19
Info:Generate Factor Base: Total cpu/real time for makefb: 32.03/6.94067
Info:Generate Free Relations: Total cpu/real time for freerel: 1238.12/162.031
Info:Lattice Sieving: Aggregate statistics:
Info:Lattice Sieving: Total number of relations: 72135822
Info:Lattice Sieving: Average J: 7845.96 for 812343 special-q, max bucket fill: 0.635471
Info:Lattice Sieving: Total CPU time: 2.94595e+06s
Info:Filtering - Duplicate Removal, splitting pass: Total cpu/real time for dup1: 390.51/174.008
Info:Filtering - Duplicate Removal, splitting pass: Aggregate statistics:
Info:Filtering - Duplicate Removal, splitting pass: CPU time for dup1: 173.79999999999998s
Info:Filtering - Duplicate Removal, removal pass: Total cpu/real time for dup2: 1072.58/245.117
Info:Filtering - Singleton removal: Total cpu/real time for purge: 473.63/191.305
Info:Filtering - Merging: Total cpu/real time for merge: 1677.58/1476.49
Info:Filtering - Merging: Total cpu/real time for replay: 140.79/127.468
Info:Linear Algebra: Total cpu/real time for bwc: 408448/0.000201702
Info:Linear Algebra: Aggregate statistics:
Info:Linear Algebra: Krylov: WCT time 34125.2
Info:Linear Algebra: Lingen CPU time 1353.24, WCT time 207.16
Info:Linear Algebra: Mksol: WCT time 18367.97
Info:Quadratic Characters: Total cpu/real time for characters: 174.74/48.4327
Info:Square Root: Total cpu/real time for sqrt: 9588.38/1313.39
Info:HTTP server: Shutting down HTTP server
Info:Complete Factorization: Total cpu/elapsed time for entire factorization: 3.71132e+06/78710.3
Info:root: Cleaning up computation data in /tmp/cado.w2u_0xvq
22947545427314151445011966017377 35912223503197268109418424875344813700437442706880566137398291217213 584458412373341050558641690854880477452541557046361
[/code]Ed

VBCurtis 2018-05-01 16:04

My params scored.... half? Really? Half the time? That's nuts. That also means that whatever the CADO people are doing for their automated parameter-generation, it's lousy.

Note that my guess for poly select effort was off by a factor of two; you spent 10% of total job length on poly select rather than 5%. I'll cut admax in half to fix this.
The CADO matrix still took 15 hr, but for overall CPU-time did CADO beat GGNFS?

Edit: If I've got the math right, based on post #173 in this thread the CADO sieve phase took around 5 hr, beating GGNFS's 8 hr handily. Can you confirm? If so, CADO is the package of choice for C150, and I have much incentive to create params files for 135-160.
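The check is one line of arithmetic, using the sieve CPU total from the improved run and Ed's stated ~170 cores:
[code]
# Sieve wall-clock estimate: total CPU seconds spread over ~170 cores.
sieve_cpu, cores = 2.94595e6, 170
print(f"~{sieve_cpu / cores / 3600:.1f} h")   # ~4.8 h, vs ~8 h for GGNFS
[/code]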

CRGreathouse 2018-05-01 17:48

:bow:

EdH 2018-05-01 18:53

Here are the logs for the msieve/ggnfs run. Keep in mind I hard stopped the poly select at five minutes.

msieve.log:
[code]
Tue Apr 17 10:53:46 2018 Msieve v. 1.54 (SVN 1015)
Tue Apr 17 10:53:46 2018 random seeds: 97bb614a 866c9fbb
Tue Apr 17 10:53:46 2018 factoring 481650646493457195451617145540517069613323533182564220176531860824741008875219974531986243531278506390059994303957879242329686004440770045179594064661 (150 digits)
Tue Apr 17 10:53:46 2018 searching for 15-digit factors
Tue Apr 17 10:53:47 2018 commencing number field sieve (150-digit input)
Tue Apr 17 10:53:47 2018 commencing number field sieve polynomial selection
Tue Apr 17 10:53:47 2018 polynomial degree: 5
Tue Apr 17 10:53:47 2018 max stage 1 norm: 9.39e+22
Tue Apr 17 10:53:47 2018 max stage 2 norm: 1.32e+21
Tue Apr 17 10:53:47 2018 min E-value: 4.54e-12
Tue Apr 17 10:53:47 2018 poly select deadline: 300
Tue Apr 17 10:53:47 2018 time limit set to 0.08 CPU-hours
Tue Apr 17 10:53:47 2018 expecting poly E from 4.62e-12 to > 5.31e-12
Tue Apr 17 10:53:47 2018 searching leading coefficients from 1 to 3000
Tue Apr 17 11:00:27 2018 polynomial selection complete
Tue Apr 17 11:00:27 2018 elapsed time 00:06:41
[/code]comp.log:
[code]
Tue Apr 17 11:01:27 2018 -> factmsieve.py (v0.86)
Tue Apr 17 11:01:27 2018 -> This is client 1 of 40
Tue Apr 17 11:01:27 2018 -> Running on 4 Cores with 2 hyper-threads per Core
Tue Apr 17 11:01:27 2018 -> Working with NAME = comp
Tue Apr 17 11:01:27 2018 -> Selected lattice siever: gnfs-lasieve4I14e
Tue Apr 17 11:01:27 2018 -> Creating param file to detect parameter changes...
Tue Apr 17 11:01:27 2018 -> Running setup ...
Tue Apr 17 11:01:27 2018 -> Estimated minimum relations needed: 4.55551e+07
Tue Apr 17 11:01:27 2018 -> cleaning up before a restart
Tue Apr 17 11:01:27 2018 -> Running lattice siever ...
Tue Apr 17 11:01:27 2018 -> entering sieving loop
...
Tue Apr 17 18:59:10 2018 commencing in-memory singleton removal
Tue Apr 17 18:59:11 2018 begin with 15132165 relations and 15833872 unique ideals
Tue Apr 17 18:59:20 2018 reduce to 14929590 relations and 13892562 ideals in 9 passes
Tue Apr 17 18:59:20 2018 max relations containing the same ideal: 103
Tue Apr 17 18:59:25 2018 removing 1913461 relations and 1513461 ideals in 400000 cliques
Tue Apr 17 18:59:26 2018 commencing in-memory singleton removal
Tue Apr 17 18:59:27 2018 begin with 13016129 relations and 13892562 unique ideals
Tue Apr 17 18:59:35 2018 reduce to 12829019 relations and 12187402 ideals in 10 passes
Tue Apr 17 18:59:35 2018 max relations containing the same ideal: 92
Tue Apr 17 18:59:40 2018 removing 1706382 relations and 1331989 ideals in 374393 cliques
Tue Apr 17 18:59:41 2018 commencing in-memory singleton removal
Tue Apr 17 18:59:41 2018 begin with 11122637 relations and 12187402 unique ideals
Tue Apr 17 18:59:48 2018 reduce to 10953600 relations and 10681967 ideals in 9 passes
Tue Apr 17 18:59:48 2018 max relations containing the same ideal: 82
Tue Apr 17 18:59:54 2018 relations with 0 large ideals: 755
Tue Apr 17 18:59:54 2018 relations with 1 large ideals: 2391
Tue Apr 17 18:59:54 2018 relations with 2 large ideals: 36537
Tue Apr 17 18:59:54 2018 relations with 3 large ideals: 262731
Tue Apr 17 18:59:54 2018 relations with 4 large ideals: 1017374
Tue Apr 17 18:59:54 2018 relations with 5 large ideals: 2276452
Tue Apr 17 18:59:54 2018 relations with 6 large ideals: 3092854
Tue Apr 17 18:59:54 2018 relations with 7+ large ideals: 4264506
Tue Apr 17 18:59:54 2018 commencing 2-way merge
Tue Apr 17 18:59:59 2018 reduce to 6708869 relation sets and 6437236 unique ideals
Tue Apr 17 18:59:59 2018 commencing full merge
Tue Apr 17 19:01:15 2018 memory use: 774.4 MB
Tue Apr 17 19:01:16 2018 found 3603757 cycles, need 3563436
Tue Apr 17 19:01:17 2018 weight of 3563436 cycles is about 249642142 (70.06/cycle)
Tue Apr 17 19:01:17 2018 distribution of cycle lengths:
Tue Apr 17 19:01:17 2018 1 relations: 454531
Tue Apr 17 19:01:17 2018 2 relations: 450624
Tue Apr 17 19:01:17 2018 3 relations: 450191
Tue Apr 17 19:01:17 2018 4 relations: 405986
Tue Apr 17 19:01:17 2018 5 relations: 364225
Tue Apr 17 19:01:17 2018 6 relations: 314802
Tue Apr 17 19:01:17 2018 7 relations: 269466
Tue Apr 17 19:01:17 2018 8 relations: 223623
Tue Apr 17 19:01:17 2018 9 relations: 177255
Tue Apr 17 19:01:17 2018 10+ relations: 452733
Tue Apr 17 19:01:17 2018 heaviest cycle: 18 relations
Tue Apr 17 19:01:18 2018 commencing cycle optimization
Tue Apr 17 19:01:22 2018 start with 18655987 relations
Tue Apr 17 19:01:39 2018 pruned 422846 relations
Tue Apr 17 19:01:40 2018 memory use: 636.2 MB
Tue Apr 17 19:01:40 2018 distribution of cycle lengths:
Tue Apr 17 19:01:40 2018 1 relations: 454531
Tue Apr 17 19:01:40 2018 2 relations: 460781
Tue Apr 17 19:01:40 2018 3 relations: 465658
Tue Apr 17 19:01:40 2018 4 relations: 414553
Tue Apr 17 19:01:40 2018 5 relations: 372861
Tue Apr 17 19:01:40 2018 6 relations: 317946
Tue Apr 17 19:01:40 2018 7 relations: 271268
Tue Apr 17 19:01:40 2018 8 relations: 221444
Tue Apr 17 19:01:40 2018 9 relations: 173339
Tue Apr 17 19:01:40 2018 10+ relations: 411055
Tue Apr 17 19:01:40 2018 heaviest cycle: 18 relations
Tue Apr 17 19:01:43 2018 RelProcTime: 966
Tue Apr 17 19:01:45 2018 elapsed time 00:16:09
Tue Apr 17 19:01:45 2018 LatSieveTime: 3030.97
Tue Apr 17 19:01:45 2018 -> Running matrix solving step ...
Tue Apr 17 19:01:45 2018 -> ./msieve -s ../factorMain/factorWork/comp.dat -l ../factorMain/factorWork/comp.log -i ../factorMain/factorWork/comp.ini -nf ../factorMain/factorWork/comp.fb -t 8 -nc2
Tue Apr 17 19:01:45 2018
Tue Apr 17 19:01:45 2018
Tue Apr 17 19:01:45 2018 Msieve v. 1.54 (SVN 1015)
Tue Apr 17 19:01:45 2018 random seeds: c6e44713 6025a0d7
Tue Apr 17 19:01:45 2018 factoring 481650646493457195451617145540517069613323533182564220176531860824741008875219974531986243531278506390059994303957879242329686004440770045179594064661 (150 digits)
Tue Apr 17 19:01:45 2018 searching for 15-digit factors
Tue Apr 17 19:01:46 2018 commencing number field sieve (150-digit input)
Tue Apr 17 19:01:46 2018 R0: -83721706899140253437921287641
Tue Apr 17 19:01:46 2018 R1: 2993340772548997
Tue Apr 17 19:01:46 2018 A0: -66005598344673508475341101414281312
Tue Apr 17 19:01:46 2018 A1: 756102870683322094138799460536
Tue Apr 17 19:01:46 2018 A2: 355141757301006322882440
Tue Apr 17 19:01:46 2018 A3: -1818204891139602416
Tue Apr 17 19:01:46 2018 A4: -2130702262759
Tue Apr 17 19:01:46 2018 A5: 117096
Tue Apr 17 19:01:46 2018 skew 1071014.76, size 1.235e-14, alpha -6.383, combined = 4.891e-12 rroots = 3
Tue Apr 17 19:01:46 2018
Tue Apr 17 19:01:46 2018 commencing linear algebra
Tue Apr 17 19:01:46 2018 read 3563436 cycles
Tue Apr 17 19:01:50 2018 cycles contain 10710077 unique relations
Tue Apr 17 19:02:32 2018 LatSieveTime: 2316.42
Tue Apr 17 19:02:51 2018 read 10710077 relations
Tue Apr 17 19:02:55 2018 LatSieveTime: 5806.55
Tue Apr 17 19:03:02 2018 using 20 quadratic characters above 4294917295
Tue Apr 17 19:03:48 2018 building initial matrix
Tue Apr 17 19:04:44 2018 LatSieveTime: 1987.28
Tue Apr 17 19:05:12 2018 memory use: 1469.6 MB
Tue Apr 17 19:05:13 2018 read 3563436 cycles
Tue Apr 17 19:05:13 2018 matrix is 3563257 x 3563436 (1082.4 MB) with weight 337682605 (94.76/col)
Tue Apr 17 19:05:13 2018 sparse part has weight 240989372 (67.63/col)
Tue Apr 17 19:05:43 2018 filtering completed in 2 passes
Tue Apr 17 19:05:43 2018 matrix is 3560664 x 3560843 (1082.2 MB) with weight 337585252 (94.80/col)
Tue Apr 17 19:05:43 2018 sparse part has weight 240962460 (67.67/col)
Tue Apr 17 19:05:59 2018 matrix starts at (0, 0)
Tue Apr 17 19:05:59 2018 matrix is 3560664 x 3560843 (1082.2 MB) with weight 337585252 (94.80/col)
Tue Apr 17 19:05:59 2018 sparse part has weight 240962460 (67.67/col)
Tue Apr 17 19:05:59 2018 saving the first 48 matrix rows for later
Tue Apr 17 19:06:00 2018 matrix includes 64 packed rows
Tue Apr 17 19:06:00 2018 matrix is 3560616 x 3560843 (1038.7 MB) with weight 267536437 (75.13/col)
Tue Apr 17 19:06:00 2018 sparse part has weight 236679147 (66.47/col)
Tue Apr 17 19:06:00 2018 using block size 8192 and superblock size 786432 for processor cache size 8192 kB
Tue Apr 17 19:06:10 2018 commencing Lanczos iteration (8 threads)
Tue Apr 17 19:06:10 2018 memory use: 831.4 MB
Tue Apr 17 19:06:20 2018 linear algebra at 0.0%, ETA 5h51m
Tue Apr 17 19:06:23 2018 checkpointing every 610000 dimensions
Tue Apr 17 19:07:51 2018 LatSieveTime: 2024.64
Tue Apr 17 15:11:09 2018 LatSieveTime: 4297.73
Tue Apr 17 19:11:58 2018 LatSieveTime: 4107.64
Tue Apr 17 19:12:07 2018 LatSieveTime: 4236.04
Tue Apr 17 19:12:30 2018 LatSieveTime: 5995.5
Tue Apr 17 19:13:59 2018 LatSieveTime: 2024.09
Tue Apr 17 19:16:49 2018 LatSieveTime: 2279.4
Sun Feb 18 10:19:55 2018 LatSieveTime: 4227.26
Tue Apr 17 19:25:11 2018 LatSieveTime: 2189.92
Tue Apr 17 19:25:46 2018 LatSieveTime: 4227.41
Tue Apr 17 19:26:46 2018 LatSieveTime: 3295.35
Tue Apr 17 19:27:04 2018 LatSieveTime: 4347.59
Tue Apr 17 15:28:31 2018 LatSieveTime: 2630.73
Tue Apr 17 19:30:00 2018 LatSieveTime: 7810.49
Tue Apr 17 19:31:54 2018 LatSieveTime: 1967.34
Tue Apr 17 19:34:23 2018 LatSieveTime: 3455.62
Tue Apr 17 19:34:39 2018 LatSieveTime: 2718.09
Tue Apr 17 19:42:35 2018 LatSieveTime: 3320.02
Tue Apr 17 19:36:38 2018 LatSieveTime: 8180.81
Tue Apr 17 19:43:00 2018 LatSieveTime: 10666.7
Tue Apr 17 19:45:35 2018 LatSieveTime: 5277.45
Tue Apr 17 19:47:36 2018 LatSieveTime: 6248.6
Tue Apr 17 19:52:47 2018 LatSieveTime: 16278.5
Tue Apr 17 19:56:33 2018 LatSieveTime: 8311.56
Tue Apr 17 19:57:51 2018 LatSieveTime: 4493.17
Tue Apr 17 20:03:34 2018 LatSieveTime: 4837.61
Tue Apr 17 20:14:07 2018 LatSieveTime: 11559.8
Tue Apr 17 20:18:42 2018 LatSieveTime: 6825.77
Tue Apr 17 20:23:41 2018 LatSieveTime: 6695.49
Tue Apr 17 20:29:42 2018 LatSieveTime: 17437.6
Tue Apr 17 19:47:09 2018 LatSieveTime: 11764.6
Tue Apr 17 21:20:43 2018 LatSieveTime: 12584
Tue Apr 17 21:35:48 2018 LatSieveTime: 12749.7
Tue Apr 17 21:57:29 2018 LatSieveTime: 13336.7
Tue Apr 17 23:12:40 2018 LatSieveTime: 22402.1
Wed Apr 18 00:57:55 2018 lanczos halted after 56316 iterations (dim = 3560616)
Wed Apr 18 00:57:58 2018 recovered 34 nontrivial dependencies
Wed Apr 18 00:57:58 2018 BLanczosTime: 21372
Wed Apr 18 00:57:58 2018 elapsed time 05:56:13
Wed Apr 18 00:57:58 2018 -> Running square root step ...
Wed Apr 18 00:57:58 2018 -> ./msieve -s ../factorMain/factorWork/comp.dat -l ../factorMain/factorWork/comp.log -i ../factorMain/factorWork/comp.ini -nf ../factorMain/factorWork/comp.fb -t 8 -nc3
Wed Apr 18 00:57:58 2018
Wed Apr 18 00:57:58 2018
Wed Apr 18 00:57:58 2018 Msieve v. 1.54 (SVN 1015)
Wed Apr 18 00:57:58 2018 random seeds: 0dc252cd 548d7aec
Wed Apr 18 00:57:58 2018 factoring 481650646493457195451617145540517069613323533182564220176531860824741008875219974531986243531278506390059994303957879242329686004440770045179594064661 (150 digits)
Wed Apr 18 00:57:59 2018 searching for 15-digit factors
Wed Apr 18 00:57:59 2018 commencing number field sieve (150-digit input)
Wed Apr 18 00:57:59 2018 R0: -83721706899140253437921287641
Wed Apr 18 00:57:59 2018 R1: 2993340772548997
Wed Apr 18 00:57:59 2018 A0: -66005598344673508475341101414281312
Wed Apr 18 00:57:59 2018 A1: 756102870683322094138799460536
Wed Apr 18 00:57:59 2018 A2: 355141757301006322882440
Wed Apr 18 00:57:59 2018 A3: -1818204891139602416
Wed Apr 18 00:57:59 2018 A4: -2130702262759
Wed Apr 18 00:57:59 2018 A5: 117096
Wed Apr 18 00:57:59 2018 skew 1071014.76, size 1.235e-14, alpha -6.383, combined = 4.891e-12 rroots = 3
Wed Apr 18 00:57:59 2018
Wed Apr 18 00:57:59 2018 commencing square root phase
Wed Apr 18 00:57:59 2018 reading relations for dependency 1
Wed Apr 18 00:58:00 2018 read 1780260 cycles
Wed Apr 18 00:58:02 2018 cycles contain 5353580 unique relations
Wed Apr 18 00:58:33 2018 read 5353580 relations
Wed Apr 18 00:58:55 2018 multiplying 5353580 relations
Wed Apr 18 01:03:51 2018 multiply complete, coefficients have about 288.86 million bits
Wed Apr 18 01:03:52 2018 initial square root is modulo 152563
Wed Apr 18 01:09:42 2018 found factor: 584458412373341050558641690854880477452541557046361
Wed Apr 18 01:09:42 2018 reading relations for dependency 2
Wed Apr 18 01:09:43 2018 read 1780316 cycles
Wed Apr 18 01:09:45 2018 cycles contain 5356386 unique relations
Wed Apr 18 01:10:16 2018 read 5356386 relations
Wed Apr 18 01:10:38 2018 multiplying 5356386 relations
Wed Apr 18 01:15:33 2018 multiply complete, coefficients have about 289.01 million bits
Wed Apr 18 01:15:34 2018 initial square root is modulo 153529
Wed Apr 18 01:21:26 2018 found factor: 584458412373341050558641690854880477452541557046361
Wed Apr 18 01:21:26 2018 reading relations for dependency 3
Wed Apr 18 01:21:26 2018 read 1777883 cycles
Wed Apr 18 01:21:28 2018 cycles contain 5350910 unique relations
Wed Apr 18 01:22:00 2018 read 5350910 relations
Wed Apr 18 01:22:21 2018 multiplying 5350910 relations
Wed Apr 18 01:27:18 2018 multiply complete, coefficients have about 288.72 million bits
Wed Apr 18 01:27:20 2018 initial square root is modulo 151579
Wed Apr 18 01:33:12 2018 found factor: 584458412373341050558641690854880477452541557046361
Wed Apr 18 01:33:12 2018 reading relations for dependency 4
Wed Apr 18 01:33:12 2018 read 1779216 cycles
Wed Apr 18 01:33:14 2018 cycles contain 5349988 unique relations
Wed Apr 18 01:33:46 2018 read 5349988 relations
Wed Apr 18 01:34:07 2018 multiplying 5349988 relations
Wed Apr 18 01:39:05 2018 multiply complete, coefficients have about 288.67 million bits
Wed Apr 18 01:39:06 2018 initial square root is modulo 151337
Wed Apr 18 01:45:00 2018 sqrtTime: 2821
Wed Apr 18 01:45:00 2018 p32 factor: 22947545427314151445011966017377
Wed Apr 18 01:45:00 2018 p51 factor: 584458412373341050558641690854880477452541557046361
Wed Apr 18 01:45:00 2018 p68 factor: 35912223503197268109418424875344813700437442706880566137398291217213
Wed Apr 18 01:45:00 2018 elapsed time 00:47:02
Wed Apr 18 01:45:00 2018 -> Computing time scale for this machine...
Wed Apr 18 01:45:00 2018 -> procrels -speedtest> PIPE
Wed Apr 18 01:45:04 2018 -> Factorization summary written to g150-comp.txt
[/code]Note that, although it doesn't affect the CADO-NFS vs. msieve/ggnfs results, I'm a little concerned about my initial CADO-NFS run with default params. I might have missed it if any clients dropped out, since I wasn't watching for that at the time.

VBCurtis 2018-05-01 19:25

[QUOTE=EdH;486740]Note, that, although it doesn't affect the cado-nfs vs. msieve/ggnfs results, I'm just a little concerned about my initial cado-nfs run with default params. I might have missed it, if any clients dropped out, since I wasn't looking for it at that time.[/QUOTE]

This would only affect wall-clock time, not the CPU-time that we're mainly using for comparison. Most of the wall-clock time is eaten by the matrix, so a few clients disappearing during the sieve phase shouldn't matter much.

I set CADO to use 29/30-bit large primes, and your run used 72M raw relations to build the matrix. That doesn't sound massively higher than my recollection of msieve at this size, so perhaps my claim that CADO filtering is a hindrance is mistaken.
I'll dig up some C150 tasks and give this GGNFS vs CADO comparison a shot myself sometime soon.

EdH 2018-05-04 03:18

Curtis,

I've added another machine to my mix and hope to rework some of my scripts in the next few days. I'm trying to see if there is a way to fit a 160 into the daytime uptime of several of the machines. I think you came up with a doubling of effort at something over 5 digits with CADO-NFS. I calculated the poly select and sieving at just over 7 hours for the C150, which only allows for one doubling, if that, within my limited window.

How do you think your C150 params would run for a C155, if I try to see how it would compare to the C150, realizing an added machine would skew any direct comparison?

Ed

VBCurtis 2018-05-04 04:06

Ed-
I'll have a C155 best-guess file for you in a couple hours. I won't change much from the C150, but I'll try to make it a bit better than the existing C150 params would be.

If you still have the log from the good C150 run, could you dig through it for the last Q sieved? That info gives me an idea of whether the alim/rlim choices need to be inflated a bit for a larger factorization. I'm hoping the final Q is around 15-18M.
If I get the lim's set correctly for C155, you should manage to come in under double the time for C150. I realized that I can make poly select a bit more efficient for your mega-multi-process setup; I'll put that into the params file too. It's a simple change: poly select was previously split into segments of 500, with CADO searching every multiple of 60. If we change that to 480, each process will always search exactly 8 coefficients, instead of some receiving 9 to split across two threads. Fewer single-threaded stragglers will run, so wall-clock time will improve a bit for the same CPU time.
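The adrange choice is easy to sanity-check: with incr = 60, a window of 500 holds sometimes 8 and sometimes 9 admissible coefficients, while 480 = 8 * 60 always holds exactly 8:
[code]
# Count leading coefficients (multiples of incr) per poly-select workunit.
def coeffs_per_wu(adrange, incr, start):
    return sum(1 for a in range(start, start + adrange) if a % incr == 0)

for adrange in (500, 480):
    counts = {coeffs_per_wu(adrange, 60, s) for s in range(0, 6000, adrange)}
    print(adrange, sorted(counts))   # 500 -> [8, 9], 480 -> [8]
[/code]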

wombatman 2018-05-04 04:08

Curtis, would you be willing to go ahead and take a stab at a C165 parameter file (or suggest what you think I could change to improve on the default)? I have a C164 I want to tackle with CADO, but given that you were able to cut the time on the C150 in half, I think it would be more than worthwhile to try and do the same for the C164.

VBCurtis 2018-05-04 04:17

Sure! I think I got lucky on the C150, as it's easier to extrapolate on slightly-smaller jobs, but I'll be happy to give it a shot. Having you guys test my settings makes refining parameters a whole lot less of a grind for me.
I'll post both after I finish the Winnipeg/Nashville game on my DVR, likely by 11PM PDT.

VBCurtis 2018-05-04 05:41

Ed-
Here's my C155 draft params:
[code]###########################################################################
# Polynomial selection
###########################################################################

tasks.polyselect.degree = 5

tasks.polyselect.P = 650000
tasks.polyselect.admin = 1020
tasks.polyselect.admax = 22e4
tasks.polyselect.adrange = 480
tasks.polyselect.incr = 60
tasks.polyselect.nq = 15625
tasks.polyselect.nrkeep = 100
tasks.polyselect.ropteffort = 13

###########################################################################
# Sieve
###########################################################################

lim0 = 18000000
lim1 = 33000000
lpb0 = 30
lpb1 = 31
tasks.sieve.mfb0 = 60
tasks.sieve.mfb1 = 62
tasks.sieve.ncurves0 = 17
tasks.sieve.ncurves1 = 25
tasks.I = 14

tasks.sieve.qrange = 2000
tasks.sieve.qmin = 4500000

###########################################################################
# Filtering
###########################################################################

tasks.filter.purge.keep = 175
tasks.filter.maxlevel = 30
tasks.filter.target_density = 155.0[/code]

This should spend about the same time on poly select as the C150 did. Sieving will hopefully be just under twice as long, and no idea what the matrix will do.
If you get a chance on the console output to note the last Q sieved, I would appreciate that info in addition to the stats you previously reported. If yield is similar to an old G155 GGNFS run I found a log for, it'll be around 28M. I expect yield to be better than GGNFS, so hopefully your max-Q will be below 25M.
Good luck!
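If digging through the log by hand is a chore, the last Q can be pulled out programmatically; a sketch, assuming the workunit names follow the c150_sieving_15994000-15996000 pattern quoted earlier (the log filename here is hypothetical):
[code]
# Scan a CADO-NFS server log for the highest special-q range handed out.
import re

high = 0
with open("c155.server.log") as f:   # hypothetical filename
    for line in f:
        m = re.search(r"_sieving_(\d+)-(\d+)", line)
        if m:
            high = max(high, int(m.group(2)))
print("sieving reached q =", high)
[/code]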

VBCurtis 2018-05-04 05:56

wombatman-
Here's my guess at c165 params:
[code]###########################################################################
# Polynomial selection
###########################################################################

tasks.polyselect.degree = 5

tasks.polyselect.P = 1200000
tasks.polyselect.admin = 1500
tasks.polyselect.admax = 6e5
tasks.polyselect.adrange = 960
tasks.polyselect.incr = 60
tasks.polyselect.nq = 15625
tasks.polyselect.nrkeep = 100
tasks.polyselect.ropteffort = 15

###########################################################################
# Sieve
###########################################################################

tasks.I = 14
tasks.sieve.qmin = 7000000
lim0 = 38000000
lim1 = 60000000
lpb0 = 31
lpb1 = 32
tasks.sieve.mfb0 = 62
tasks.sieve.mfb1 = 64
tasks.sieve.ncurves0 = 18
tasks.sieve.ncurves1 = 25
tasks.sieve.qrange = 2000

###########################################################################
# Filtering
###########################################################################

tasks.filter.purge.keep = 175
tasks.filter.maxlevel = 32
tasks.filter.target_density = 160.0
[/code]
Note that bwc.interval is just a "how often we update the screen" setting; I like it at 500 so I get more frequent screen output.
The poly select should take ~1.4M cpu-seconds, which is 4x longer than Ed's C155 file. I did a CADO poly select for C166 a month or two ago, and spent 1.55M cpu-seconds; I adjusted params a bit lower for your C164.
If you could report the CPU and wall-clock times for size-opt, root-opt, sieve, LA as Ed did, that would help me out for future optimization. If you happen to see the maximum Q sieved, that helps refine alim/rlim selection.

EdH 2018-05-04 13:19

Thanks Curtis,

It still might be a "couple" of days before I get the scripts where I want them, and I lost the drive on one of my i7s. Hopefully, just a minor inconvenience.

I was sure I kept the temporary directory, but it isn't there. However, I do have the last part of the terminal output. The last assignment says:
[code]
Sending workunit c150_sieving_15994000-15996000 to client
[/code]The thread default for all the clients is two. If the machines aren't doing anything but CADO-NFS, would it be more efficient to customize the thread counts to the available processors, possibly one client using all of a machine's threads? I could easily do this within my scripts.

I might be overthinking this, but what about the difference in machine capability? Let's say a Core2 Quad 2.4GHz (4 proc) vs. an i7 3.4GHz with HT (8 proc). Would it be better to load up a single client on both machines, or maybe give the Core2 a single loaded client and the i7 four two-thread clients, for a total of five more-balanced instances?

How much value does bogomips have?

Ed
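A sketch of the bookkeeping behind the balanced-client question, with the machine list and the 2-thread choice as illustrative assumptions rather than a recommendation:
[code]
# Split each machine into equal 2-threaded clients; example numbers only.
machines = {"core2quad": 4, "i7-ht": 8}   # schedulable threads per box
tpc = 2                                   # threads per client
for name, cpus in machines.items():
    print(f"{name}: {cpus // tpc} clients x {tpc} threads")
# core2quad: 2 clients x 2 threads; i7-ht: 4 clients x 2 threads
[/code]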

