20201219, 03:03  #1 
Jun 2012
2961_{10} Posts 
C305_142_139
Ryan Propper is currently attempting to factor C305_142_139 via SNFS. This is the only composite in the XYYXF project (1 < y < x < 151) with no known factors.
This job ultimately required 3 LP on the rational side run as a 34bit job as sieved by CADO (using las). The invocation of las was (partially) as follows: Code:
./las poly ./c215.poly q0 65210000 q1 65220000 I 16 lim0 500000000 lim1 500000000 lpb0 34 lpb1 34 mfb0 67 mfb1 98 ncurves0 26 ncurves1 21 fb1 ./c215.roots.gz Code:
n: 20328474162529416647311982959537843499999146827155841098525615602167851195804786454820231727377603285082682529078490663751950024881559689820074466693727520628346302576066843023544023886449021575509433349099599424248755063838567414089807242454201325599982322964639626679563755005084965722261867838430023049 # 142^139+139^142, difficulty: 308.59, anorm: 3.31e+39, rnorm: 6.77e+56 # scaled difficulty: 311.48, suggest sieving rational side # size = 5.823e16, alpha = 0.000, combined = 4.242e16, rroots = 0 type: snfs size: 308 skew: 11.8319 c6: 1 c0: 2743582 Y1: 31814999504641997296916177121902819369397243609088 Y0: 2706170815394782263161480417494333876776014302110241 Now in the home stretch, switching back to msieve for filtering and further post processing on a 64core Xeon with TD=130 results in a ~116M matrix: Code:
Thu Dec 17 05:17:14 2020 reduce to 400969081 relations and 396476986 ideals in 6 passes Thu Dec 17 05:17:14 2020 max relations containing the same ideal: 150 Thu Dec 17 05:18:55 2020 relations with 0 large ideals: 144572 Thu Dec 17 05:18:55 2020 relations with 1 large ideals: 56456 Thu Dec 17 05:18:55 2020 relations with 2 large ideals: 667332 Thu Dec 17 05:18:55 2020 relations with 3 large ideals: 4638677 Thu Dec 17 05:18:55 2020 relations with 4 large ideals: 19434196 Thu Dec 17 05:18:55 2020 relations with 5 large ideals: 51810443 Thu Dec 17 05:18:55 2020 relations with 6 large ideals: 90047701 Thu Dec 17 05:18:55 2020 relations with 7+ large ideals: 234169704 Thu Dec 17 05:18:55 2020 commencing 2way merge Thu Dec 17 05:24:03 2020 reduce to 261402996 relation sets and 256910901 unique ideals Thu Dec 17 05:24:03 2020 commencing full merge Thu Dec 17 07:05:52 2020 memory use: 31315.8 MB Thu Dec 17 07:06:18 2020 found 116397412 cycles, need 116255101 Thu Dec 17 07:07:06 2020 weight of 116255101 cycles is about 15113350070 (130.00/cycle) Thu Dec 17 07:07:06 2020 distribution of cycle lengths: Thu Dec 17 07:07:06 2020 1 relations: 6616255 Thu Dec 17 07:07:06 2020 2 relations: 8297414 Thu Dec 17 07:07:06 2020 3 relations: 9424454 Thu Dec 17 07:07:06 2020 4 relations: 9281085 Thu Dec 17 07:07:06 2020 5 relations: 9047337 Thu Dec 17 07:07:06 2020 6 relations: 8604374 Thu Dec 17 07:07:06 2020 7 relations: 7990075 Thu Dec 17 07:07:06 2020 8 relations: 7464905 Thu Dec 17 07:07:06 2020 9 relations: 6850961 Thu Dec 17 07:07:06 2020 10+ relations: 42678241 Thu Dec 17 07:07:06 2020 heaviest cycle: 28 relations Thu Dec 17 07:07:35 2020 commencing cycle optimization Thu Dec 17 07:11:22 2020 start with 980341833 relations ... commencing linear algebra read 113102881 cycles cycles contain 384567336 unique relations read 384567336 relations using 20 quadratic characters above 4294917295 building initial matrix memory use: 50442.4 MB read 113102881 cycles matrix is 113102703 x 113102881 (56946.4 MB) with weight 16093062952 (142.29/col) sparse part has weight 13797140491 (121.99/col) filtering completed in 2 passes matrix is 113100600 x 113100774 (56946.3 MB) with weight 16093003071 (142.29/col) sparse part has weight 13797122852 (121.99/col) matrix starts at (0, 0) matrix is 113100600 x 113100774 (56946.3 MB) with weight 16093003071 (142.29/col) sparse part has weight 13797122852 (121.99/col) saving the first 240 matrix rows for later matrix includes 256 packed rows matrix is 113100360 x 113100774 (53664.6 MB) with weight 12790337696 (113.09/col) sparse part has weight 12258232914 (108.38/col) using block size 8192 and superblock size 608256 for processor cache size 25344 kB commencing Lanczos iteration (64 threads) memory use: 74766.4 MB linear algebra at 0.0%, ETA 2547h50m100774 dimensions (0.0%, ETA 2547h50m) checkpointing every 50000 dimensions100774 dimensions (0.0%, ETA 2480h38m) linear algebra completed 161067 of 113100774 dimensions (0.1%, ETA 2241h52m) Last fiddled with by swellman on 20201219 at 03:05 
20201219, 03:36  #2 
"Curtis"
Feb 2005
Riverside, CA
11105_{8} Posts 
I'd sieve more until I got that matrix under 100M dimensions, but that won't change the solve time by more than 25%. With Ryan's resources, sieving is cheap compared to 12 weeks on the fancy Xeon... maybe one day of sieving to save 1014 days of matrix?

20201220, 04:58  #3 
Jun 2012
Boulder, CO
5·53 Posts 
I can certainly try to sieve a bit more, but I'm also finding the river of uniques drying up; I'm getting a higher percentage of dupes found. Perhaps I should try with "I 17" instead of "I 16"?

20201220, 05:13  #4  
Apr 2020
193 Posts 
Quote:
Moving up to I 17 or A 32 would help you get the most out of the improved yield for small q; A 32 is a 2^16 x 2^16 sieve region so basically halfway between I 16 and I 17. Be warned that I 17 will use something like 36GB of memory; A 32 will be more like 18GB. 

20201220, 05:45  #5 
"Curtis"
Feb 2005
Riverside, CA
3·1,559 Posts 
I concur I'd go with side 0 and A 32 over changing to I=17. Maybe at Q=200M?
Duplicate ratio will still be very high, but I think it'll be more efficient than going higher on a side. 
20201220, 18:05  #6  
Jun 2012
Boulder, CO
109_{16} Posts 
Quote:
Here's a sample command now: Code:
./las poly ./c215.poly q0 186490000 q1 186500000 A 32 \ lim0 500000000 lim1 500000000 lpb0 34 lpb1 34 \ mfb0 67 mfb1 98 ncurves0 26 ncurves1 21 \ sqside 0 fb1 ./c215.roots.gz out out.186490000.c215.gz t 4 statsstderr 

20201220, 18:12  #7  
Apr 2020
193_{10} Posts 
Quote:


20201225, 17:33  #8 
Jun 2012
Boulder, CO
5·53 Posts 
Update: with ~1.34B uniques, I've managed to form a somewhat smaller matrix using target_density=140:
Code:
Thu Dec 24 13:17:39 2020 commencing inmemory singleton removal Thu Dec 24 13:18:42 2020 begin with 1341300152 relations and 1130994057 unique ideals Thu Dec 24 13:32:22 2020 reduce to 858598906 relations and 600312276 ideals in 16 passes Thu Dec 24 13:32:22 2020 max relations containing the same ideal: 33 Thu Dec 24 13:33:07 2020 reading ideals above 720000 Thu Dec 24 13:33:08 2020 commencing singleton removal, initial pass Thu Dec 24 15:14:35 2020 memory use: 21024.0 MB Thu Dec 24 15:14:36 2020 reading all ideals from disk Thu Dec 24 15:15:27 2020 memory use: 40880.6 MB Thu Dec 24 15:17:33 2020 keeping 736080120 ideals with weight <= 200, target excess is 4817030 Thu Dec 24 15:19:59 2020 commencing inmemory singleton removal Thu Dec 24 15:21:40 2020 begin with 858614950 relations and 736080120 unique ideals Thu Dec 24 15:35:04 2020 reduce to 858474141 relations and 735936294 ideals in 10 passes ... Fri Dec 25 04:35:04 2020 reduce to 229895200 relation sets and 224306331 unique ideals Fri Dec 25 04:35:04 2020 commencing full merge Fri Dec 25 05:52:10 2020 memory use: 27271.7 MB Fri Dec 25 05:52:32 2020 found 99814363 cycles, need 99574531 Fri Dec 25 05:53:12 2020 weight of 99574531 cycles is about 13941235773 (140.01/cycle) ... Fri Dec 25 06:29:10 2020 heaviest cycle: 28 relations Fri Dec 25 06:33:03 2020 RelProcTime: 79459 Fri Dec 25 06:33:43 2020 elapsed time 22:05:00 
20201231, 14:08  #9 
Jun 2012
3^{2}·7·47 Posts 
“matrix needs more columns than rows; try adding 23% more relations”
Ryan tells me he keeps getting the above error message from msieve when it attempts to build the matrix. I believe this is the old “large dataset bug” biting him here. At last check Ryan is trying different TDs and various revs of msieve on an attempt to get past the bug. Does anyone remember if this big data bug was ever eliminated, or techniques developed to avoid it? Is using CADO for LA a practical option with a job of this size? 
20201231, 14:57  #10 
Apr 2020
193 Posts 
I've spent a fair amount of time browsing old threads here, and from what I recall, at least some cases of the bug were caused by the GGNFS siever occasionally producing relations with composite factors. Msieve now checks for this, but there is one caveat: a quick read of the msieve code suggests it only checks factors below 2^32 for primality, so a larger composite factor could potentially still cause problems. However, I doubt the CADO siever has such a bug; I've never seen "skipped x relations with composite factors" on a CADOsieved job.
The bug didn't appear in the filtering run with fewer relations, so it ought to be possible to binarysearch using "filter_maxrels" to discover exactly where the bug first appears. If it's due to a corrupt relation that somehow hasn't been caught, this would find it. Otherwise I've got no clue. CADO linear algebra would presumably work but would be slower and more memoryintensive than msieve. Last fiddled with by charybdis on 20201231 at 14:58 
20201231, 16:33  #11 
"Ed Hall"
Dec 2009
Adirondack Mtns
2×23×79 Posts 
What about trying remdups4 before filtering? Would that pick out composite relations?
