mersenneforum.org C305_142_139
 Register FAQ Search Today's Posts Mark Forums Read

 2020-12-19, 03:03 #1 swellman     Jun 2012 296110 Posts C305_142_139 Ryan Propper is currently attempting to factor C305_142_139 via SNFS. This is the only composite in the XYYXF project (1 < y < x < 151) with no known factors. This job ultimately required 3 LP on the rational side run as a 34-bit job as sieved by CADO (using las). The invocation of las was (partially) as follows: Code: ./las -poly ./c215.poly -q0 65210000 -q1 65220000 -I 16 -lim0 500000000 -lim1 500000000 -lpb0 34 -lpb1 34 -mfb0 67 -mfb1 98 -ncurves0 26 -ncurves1 21 -fb1 ./c215.roots.gz With the c215.poly: Code: n: 20328474162529416647311982959537843499999146827155841098525615602167851195804786454820231727377603285082682529078490663751950024881559689820074466693727520628346302576066843023544023886449021575509433349099599424248755063838567414089807242454201325599982322964639626679563755005084965722261867838430023049 # 142^139+139^142, difficulty: 308.59, anorm: 3.31e+39, rnorm: 6.77e+56 # scaled difficulty: 311.48, suggest sieving rational side # size = 5.823e-16, alpha = 0.000, combined = 4.242e-16, rroots = 0 type: snfs size: 308 skew: 11.8319 c6: 1 c0: 2743582 Y1: -31814999504641997296916177121902819369397243609088 Y0: 2706170815394782263161480417494333876776014302110241 There some false starts and bumps along the way, as well as some sage advice from several members of the forum. Sieving ultimately took Ryan ~10 days. Now in the home stretch, switching back to msieve for filtering and further post processing on a 64-core Xeon with TD=130 results in a ~116M matrix: Code: Thu Dec 17 05:17:14 2020 reduce to 400969081 relations and 396476986 ideals in 6 passes Thu Dec 17 05:17:14 2020 max relations containing the same ideal: 150 Thu Dec 17 05:18:55 2020 relations with 0 large ideals: 144572 Thu Dec 17 05:18:55 2020 relations with 1 large ideals: 56456 Thu Dec 17 05:18:55 2020 relations with 2 large ideals: 667332 Thu Dec 17 05:18:55 2020 relations with 3 large ideals: 4638677 Thu Dec 17 05:18:55 2020 relations with 4 large ideals: 19434196 Thu Dec 17 05:18:55 2020 relations with 5 large ideals: 51810443 Thu Dec 17 05:18:55 2020 relations with 6 large ideals: 90047701 Thu Dec 17 05:18:55 2020 relations with 7+ large ideals: 234169704 Thu Dec 17 05:18:55 2020 commencing 2-way merge Thu Dec 17 05:24:03 2020 reduce to 261402996 relation sets and 256910901 unique ideals Thu Dec 17 05:24:03 2020 commencing full merge Thu Dec 17 07:05:52 2020 memory use: 31315.8 MB Thu Dec 17 07:06:18 2020 found 116397412 cycles, need 116255101 Thu Dec 17 07:07:06 2020 weight of 116255101 cycles is about 15113350070 (130.00/cycle) Thu Dec 17 07:07:06 2020 distribution of cycle lengths: Thu Dec 17 07:07:06 2020 1 relations: 6616255 Thu Dec 17 07:07:06 2020 2 relations: 8297414 Thu Dec 17 07:07:06 2020 3 relations: 9424454 Thu Dec 17 07:07:06 2020 4 relations: 9281085 Thu Dec 17 07:07:06 2020 5 relations: 9047337 Thu Dec 17 07:07:06 2020 6 relations: 8604374 Thu Dec 17 07:07:06 2020 7 relations: 7990075 Thu Dec 17 07:07:06 2020 8 relations: 7464905 Thu Dec 17 07:07:06 2020 9 relations: 6850961 Thu Dec 17 07:07:06 2020 10+ relations: 42678241 Thu Dec 17 07:07:06 2020 heaviest cycle: 28 relations Thu Dec 17 07:07:35 2020 commencing cycle optimization Thu Dec 17 07:11:22 2020 start with 980341833 relations ... commencing linear algebra read 113102881 cycles cycles contain 384567336 unique relations read 384567336 relations using 20 quadratic characters above 4294917295 building initial matrix memory use: 50442.4 MB read 113102881 cycles matrix is 113102703 x 113102881 (56946.4 MB) with weight 16093062952 (142.29/col) sparse part has weight 13797140491 (121.99/col) filtering completed in 2 passes matrix is 113100600 x 113100774 (56946.3 MB) with weight 16093003071 (142.29/col) sparse part has weight 13797122852 (121.99/col) matrix starts at (0, 0) matrix is 113100600 x 113100774 (56946.3 MB) with weight 16093003071 (142.29/col) sparse part has weight 13797122852 (121.99/col) saving the first 240 matrix rows for later matrix includes 256 packed rows matrix is 113100360 x 113100774 (53664.6 MB) with weight 12790337696 (113.09/col) sparse part has weight 12258232914 (108.38/col) using block size 8192 and superblock size 608256 for processor cache size 25344 kB commencing Lanczos iteration (64 threads) memory use: 74766.4 MB linear algebra at 0.0%, ETA 2547h50m100774 dimensions (0.0%, ETA 2547h50m) checkpointing every 50000 dimensions100774 dimensions (0.0%, ETA 2480h38m) linear algebra completed 161067 of 113100774 dimensions (0.1%, ETA 2241h52m) Anyone have access to a MPI cluster willing to help bring this job home? Last fiddled with by swellman on 2020-12-19 at 03:05
 2020-12-19, 03:36 #2 VBCurtis     "Curtis" Feb 2005 Riverside, CA 111058 Posts I'd sieve more until I got that matrix under 100M dimensions, but that won't change the solve time by more than 25%. With Ryan's resources, sieving is cheap compared to 12 weeks on the fancy Xeon... maybe one day of sieving to save 10-14 days of matrix?
2020-12-20, 04:58   #3
ryanp

Jun 2012
Boulder, CO

5·53 Posts

Quote:
 Originally Posted by VBCurtis I'd sieve more until I got that matrix under 100M dimensions, but that won't change the solve time by more than 25%. With Ryan's resources, sieving is cheap compared to 12 weeks on the fancy Xeon... maybe one day of sieving to save 10-14 days of matrix?
I can certainly try to sieve a bit more, but I'm also finding the river of uniques drying up; I'm getting a higher percentage of dupes found. Perhaps I should try with "-I 17" instead of "-I 16"?

2020-12-20, 05:13   #4
charybdis

Apr 2020

193 Posts

Quote:
 Originally Posted by ryanp I can certainly try to sieve a bit more, but I'm also finding the river of uniques drying up; I'm getting a higher percentage of dupes found. Perhaps I should try with "-I 17" instead of "-I 16"?
Looking at your command line, you've been sieving on the algebraic side, so going back to smaller q and sieving on the rational side (add -sqside 0 to your command line) should get you some more relations.
Moving up to -I 17 or -A 32 would help you get the most out of the improved yield for small q; -A 32 is a 2^16 x 2^16 sieve region so basically halfway between -I 16 and -I 17. Be warned that -I 17 will use something like 36GB of memory; -A 32 will be more like 18GB.

 2020-12-20, 05:45 #5 VBCurtis     "Curtis" Feb 2005 Riverside, CA 3·1,559 Posts I concur- I'd go with side 0 and -A 32 over changing to I=17. Maybe at Q=200M? Duplicate ratio will still be very high, but I think it'll be more efficient than going higher on -a side.
2020-12-20, 18:05   #6
ryanp

Jun 2012
Boulder, CO

10916 Posts

Quote:
 Originally Posted by VBCurtis I concur- I'd go with side 0 and -A 32 over changing to I=17. Maybe at Q=200M? Duplicate ratio will still be very high, but I think it'll be more efficient than going higher on -a side.
Hm, I already had -sqside 0 in the command, not sure why it wasn't included above. Re-running some more sieving with "-A 32".

Here's a sample command now:

Code:
./las -poly ./c215.poly -q0 186490000 -q1 186500000 -A 32 \
-lim0 500000000 -lim1 500000000 -lpb0 34 -lpb1 34 \
-mfb0 67 -mfb1 98 -ncurves0 26 -ncurves1 21 \
-sqside 0 -fb1 ./c215.roots.gz -out out.186490000.c215.gz -t 4 -stats-stderr

2020-12-20, 18:12   #7
charybdis

Apr 2020

19310 Posts

Quote:
 Originally Posted by ryanp Hm, I already had -sqside 0 in the command, not sure why it wasn't included above. Re-running some more sieving with "-A 32". Here's a sample command now: Code: ./las -poly ./c215.poly -q0 186490000 -q1 186500000 -A 32 \ -lim0 500000000 -lim1 500000000 -lpb0 34 -lpb1 34 \ -mfb0 67 -mfb1 98 -ncurves0 26 -ncurves1 21 \ -sqside 0 -fb1 ./c215.roots.gz -out out.186490000.c215.gz -t 4 -stats-stderr
This is fine if you're running special-q below your initial starting point. Otherwise, you'll be wasting lots of effort, so you should switch to the algebraic side by removing -sqside 0 from the command line. Curtis's tests on this number showed that there wasn't much to choose between the two sides.

 2020-12-25, 17:33 #8 ryanp     Jun 2012 Boulder, CO 5·53 Posts Update: with ~1.34B uniques, I've managed to form a somewhat smaller matrix using target_density=140: Code: Thu Dec 24 13:17:39 2020 commencing in-memory singleton removal Thu Dec 24 13:18:42 2020 begin with 1341300152 relations and 1130994057 unique ideals Thu Dec 24 13:32:22 2020 reduce to 858598906 relations and 600312276 ideals in 16 passes Thu Dec 24 13:32:22 2020 max relations containing the same ideal: 33 Thu Dec 24 13:33:07 2020 reading ideals above 720000 Thu Dec 24 13:33:08 2020 commencing singleton removal, initial pass Thu Dec 24 15:14:35 2020 memory use: 21024.0 MB Thu Dec 24 15:14:36 2020 reading all ideals from disk Thu Dec 24 15:15:27 2020 memory use: 40880.6 MB Thu Dec 24 15:17:33 2020 keeping 736080120 ideals with weight <= 200, target excess is 4817030 Thu Dec 24 15:19:59 2020 commencing in-memory singleton removal Thu Dec 24 15:21:40 2020 begin with 858614950 relations and 736080120 unique ideals Thu Dec 24 15:35:04 2020 reduce to 858474141 relations and 735936294 ideals in 10 passes ... Fri Dec 25 04:35:04 2020 reduce to 229895200 relation sets and 224306331 unique ideals Fri Dec 25 04:35:04 2020 commencing full merge Fri Dec 25 05:52:10 2020 memory use: 27271.7 MB Fri Dec 25 05:52:32 2020 found 99814363 cycles, need 99574531 Fri Dec 25 05:53:12 2020 weight of 99574531 cycles is about 13941235773 (140.01/cycle) ... Fri Dec 25 06:29:10 2020 heaviest cycle: 28 relations Fri Dec 25 06:33:03 2020 RelProcTime: 79459 Fri Dec 25 06:33:43 2020 elapsed time 22:05:00 Now in LA, waiting to get an estimate for time to completion...
 2020-12-31, 14:08 #9 swellman     Jun 2012 32·7·47 Posts “matrix needs more columns than rows; try adding 2-3% more relations” Ryan tells me he keeps getting the above error message from msieve when it attempts to build the matrix. I believe this is the old “large dataset bug” biting him here. At last check Ryan is trying different TDs and various revs of msieve on an attempt to get past the bug. Does anyone remember if this big data bug was ever eliminated, or techniques developed to avoid it? Is using CADO for LA a practical option with a job of this size?
 2020-12-31, 14:57 #10 charybdis   Apr 2020 193 Posts I've spent a fair amount of time browsing old threads here, and from what I recall, at least some cases of the bug were caused by the GGNFS siever occasionally producing relations with composite factors. Msieve now checks for this, but there is one caveat: a quick read of the msieve code suggests it only checks factors below 2^32 for primality, so a larger composite factor could potentially still cause problems. However, I doubt the CADO siever has such a bug; I've never seen "skipped x relations with composite factors" on a CADO-sieved job. The bug didn't appear in the filtering run with fewer relations, so it ought to be possible to binary-search using "filter_maxrels" to discover exactly where the bug first appears. If it's due to a corrupt relation that somehow hasn't been caught, this would find it. Otherwise I've got no clue. CADO linear algebra would presumably work but would be slower and more memory-intensive than msieve. Last fiddled with by charybdis on 2020-12-31 at 14:58
 2020-12-31, 16:33 #11 EdH     "Ed Hall" Dec 2009 Adirondack Mtns 2×23×79 Posts What about trying remdups4 before filtering? Would that pick out composite relations?