mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > XYYXF Project

Reply
 
Thread Tools
Old 2020-12-19, 03:03   #1
swellman
 
swellman's Avatar
 
Jun 2012

296110 Posts
Default C305_142_139

Ryan Propper is currently attempting to factor C305_142_139 via SNFS. This is the only composite in the XYYXF project (1 < y < x < 151) with no known factors.

This job ultimately required 3 LP on the rational side run as a 34-bit job as sieved by CADO (using las).

The invocation of las was (partially) as follows:
Code:
./las -poly ./c215.poly -q0 65210000 -q1 65220000 -I 16 -lim0 500000000 -lim1 500000000 -lpb0 34 -lpb1 34 -mfb0 67 -mfb1 98 -ncurves0 26 -ncurves1 21 -fb1 ./c215.roots.gz
With the c215.poly:
Code:
n: 20328474162529416647311982959537843499999146827155841098525615602167851195804786454820231727377603285082682529078490663751950024881559689820074466693727520628346302576066843023544023886449021575509433349099599424248755063838567414089807242454201325599982322964639626679563755005084965722261867838430023049
# 142^139+139^142, difficulty: 308.59, anorm: 3.31e+39, rnorm: 6.77e+56
# scaled difficulty: 311.48, suggest sieving rational side
# size = 5.823e-16, alpha = 0.000, combined = 4.242e-16, rroots = 0
type: snfs
size: 308
skew: 11.8319
c6: 1
c0: 2743582
Y1: -31814999504641997296916177121902819369397243609088
Y0: 2706170815394782263161480417494333876776014302110241
There some false starts and bumps along the way, as well as some sage advice from several members of the forum. Sieving ultimately took Ryan ~10 days.

Now in the home stretch, switching back to msieve for filtering and further post processing on a 64-core Xeon with TD=130 results in a ~116M matrix:
Code:
Thu Dec 17 05:17:14 2020  reduce to 400969081 relations and 396476986 ideals in 6 passes
Thu Dec 17 05:17:14 2020  max relations containing the same ideal: 150
Thu Dec 17 05:18:55 2020  relations with 0 large ideals: 144572
Thu Dec 17 05:18:55 2020  relations with 1 large ideals: 56456
Thu Dec 17 05:18:55 2020  relations with 2 large ideals: 667332
Thu Dec 17 05:18:55 2020  relations with 3 large ideals: 4638677
Thu Dec 17 05:18:55 2020  relations with 4 large ideals: 19434196
Thu Dec 17 05:18:55 2020  relations with 5 large ideals: 51810443
Thu Dec 17 05:18:55 2020  relations with 6 large ideals: 90047701
Thu Dec 17 05:18:55 2020  relations with 7+ large ideals: 234169704
Thu Dec 17 05:18:55 2020  commencing 2-way merge
Thu Dec 17 05:24:03 2020  reduce to 261402996 relation sets and 256910901 unique ideals
Thu Dec 17 05:24:03 2020  commencing full merge
Thu Dec 17 07:05:52 2020  memory use: 31315.8 MB
Thu Dec 17 07:06:18 2020  found 116397412 cycles, need 116255101
Thu Dec 17 07:07:06 2020  weight of 116255101 cycles is about 15113350070 (130.00/cycle)
Thu Dec 17 07:07:06 2020  distribution of cycle lengths:
Thu Dec 17 07:07:06 2020  1 relations: 6616255
Thu Dec 17 07:07:06 2020  2 relations: 8297414
Thu Dec 17 07:07:06 2020  3 relations: 9424454
Thu Dec 17 07:07:06 2020  4 relations: 9281085
Thu Dec 17 07:07:06 2020  5 relations: 9047337
Thu Dec 17 07:07:06 2020  6 relations: 8604374
Thu Dec 17 07:07:06 2020  7 relations: 7990075
Thu Dec 17 07:07:06 2020  8 relations: 7464905
Thu Dec 17 07:07:06 2020  9 relations: 6850961
Thu Dec 17 07:07:06 2020  10+ relations: 42678241
Thu Dec 17 07:07:06 2020  heaviest cycle: 28 relations
Thu Dec 17 07:07:35 2020  commencing cycle optimization
Thu Dec 17 07:11:22 2020  start with 980341833 relations

...

commencing linear algebra
read 113102881 cycles
cycles contain 384567336 unique relations
read 384567336 relations
using 20 quadratic characters above 4294917295
building initial matrix
memory use: 50442.4 MB
read 113102881 cycles
matrix is 113102703 x 113102881 (56946.4 MB) with weight 16093062952 (142.29/col)
sparse part has weight 13797140491 (121.99/col)
filtering completed in 2 passes
matrix is 113100600 x 113100774 (56946.3 MB) with weight 16093003071 (142.29/col)
sparse part has weight 13797122852 (121.99/col)
matrix starts at (0, 0)
matrix is 113100600 x 113100774 (56946.3 MB) with weight 16093003071 (142.29/col)
sparse part has weight 13797122852 (121.99/col)
saving the first 240 matrix rows for later
matrix includes 256 packed rows
matrix is 113100360 x 113100774 (53664.6 MB) with weight 12790337696 (113.09/col)
sparse part has weight 12258232914 (108.38/col)
using block size 8192 and superblock size 608256 for processor cache size 25344 kB
commencing Lanczos iteration (64 threads)
memory use: 74766.4 MB
linear algebra at 0.0%, ETA 2547h50m100774 dimensions (0.0%, ETA 2547h50m)    
checkpointing every 50000 dimensions100774 dimensions (0.0%, ETA 2480h38m)    
linear algebra completed 161067 of 113100774 dimensions (0.1%, ETA 2241h52m)
Anyone have access to a MPI cluster willing to help bring this job home?

Last fiddled with by swellman on 2020-12-19 at 03:05
swellman is offline   Reply With Quote
Old 2020-12-19, 03:36   #2
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

111058 Posts
Default

I'd sieve more until I got that matrix under 100M dimensions, but that won't change the solve time by more than 25%. With Ryan's resources, sieving is cheap compared to 12 weeks on the fancy Xeon... maybe one day of sieving to save 10-14 days of matrix?
VBCurtis is offline   Reply With Quote
Old 2020-12-20, 04:58   #3
ryanp
 
ryanp's Avatar
 
Jun 2012
Boulder, CO

5·53 Posts
Default

Quote:
Originally Posted by VBCurtis View Post
I'd sieve more until I got that matrix under 100M dimensions, but that won't change the solve time by more than 25%. With Ryan's resources, sieving is cheap compared to 12 weeks on the fancy Xeon... maybe one day of sieving to save 10-14 days of matrix?
I can certainly try to sieve a bit more, but I'm also finding the river of uniques drying up; I'm getting a higher percentage of dupes found. Perhaps I should try with "-I 17" instead of "-I 16"?
ryanp is offline   Reply With Quote
Old 2020-12-20, 05:13   #4
charybdis
 
Apr 2020

193 Posts
Default

Quote:
Originally Posted by ryanp View Post
I can certainly try to sieve a bit more, but I'm also finding the river of uniques drying up; I'm getting a higher percentage of dupes found. Perhaps I should try with "-I 17" instead of "-I 16"?
Looking at your command line, you've been sieving on the algebraic side, so going back to smaller q and sieving on the rational side (add -sqside 0 to your command line) should get you some more relations.
Moving up to -I 17 or -A 32 would help you get the most out of the improved yield for small q; -A 32 is a 2^16 x 2^16 sieve region so basically halfway between -I 16 and -I 17. Be warned that -I 17 will use something like 36GB of memory; -A 32 will be more like 18GB.
charybdis is offline   Reply With Quote
Old 2020-12-20, 05:45   #5
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

3·1,559 Posts
Default

I concur- I'd go with side 0 and -A 32 over changing to I=17. Maybe at Q=200M?
Duplicate ratio will still be very high, but I think it'll be more efficient than going higher on -a side.
VBCurtis is offline   Reply With Quote
Old 2020-12-20, 18:05   #6
ryanp
 
ryanp's Avatar
 
Jun 2012
Boulder, CO

10916 Posts
Default

Quote:
Originally Posted by VBCurtis View Post
I concur- I'd go with side 0 and -A 32 over changing to I=17. Maybe at Q=200M?
Duplicate ratio will still be very high, but I think it'll be more efficient than going higher on -a side.
Hm, I already had -sqside 0 in the command, not sure why it wasn't included above. Re-running some more sieving with "-A 32".

Here's a sample command now:

Code:
./las -poly ./c215.poly -q0 186490000 -q1 186500000 -A 32 \
  -lim0 500000000 -lim1 500000000 -lpb0 34 -lpb1 34 \
  -mfb0 67 -mfb1 98 -ncurves0 26 -ncurves1 21 \
  -sqside 0 -fb1 ./c215.roots.gz -out out.186490000.c215.gz -t 4 -stats-stderr
ryanp is offline   Reply With Quote
Old 2020-12-20, 18:12   #7
charybdis
 
Apr 2020

19310 Posts
Default

Quote:
Originally Posted by ryanp View Post
Hm, I already had -sqside 0 in the command, not sure why it wasn't included above. Re-running some more sieving with "-A 32".

Here's a sample command now:

Code:
./las -poly ./c215.poly -q0 186490000 -q1 186500000 -A 32 \
  -lim0 500000000 -lim1 500000000 -lpb0 34 -lpb1 34 \
  -mfb0 67 -mfb1 98 -ncurves0 26 -ncurves1 21 \
  -sqside 0 -fb1 ./c215.roots.gz -out out.186490000.c215.gz -t 4 -stats-stderr
This is fine if you're running special-q below your initial starting point. Otherwise, you'll be wasting lots of effort, so you should switch to the algebraic side by removing -sqside 0 from the command line. Curtis's tests on this number showed that there wasn't much to choose between the two sides.
charybdis is offline   Reply With Quote
Old 2020-12-25, 17:33   #8
ryanp
 
ryanp's Avatar
 
Jun 2012
Boulder, CO

5·53 Posts
Default

Update: with ~1.34B uniques, I've managed to form a somewhat smaller matrix using target_density=140:

Code:
Thu Dec 24 13:17:39 2020  commencing in-memory singleton removal
Thu Dec 24 13:18:42 2020  begin with 1341300152 relations and 1130994057 unique ideals
Thu Dec 24 13:32:22 2020  reduce to 858598906 relations and 600312276 ideals in 16 passes
Thu Dec 24 13:32:22 2020  max relations containing the same ideal: 33
Thu Dec 24 13:33:07 2020  reading ideals above 720000
Thu Dec 24 13:33:08 2020  commencing singleton removal, initial pass
Thu Dec 24 15:14:35 2020  memory use: 21024.0 MB
Thu Dec 24 15:14:36 2020  reading all ideals from disk
Thu Dec 24 15:15:27 2020  memory use: 40880.6 MB
Thu Dec 24 15:17:33 2020  keeping 736080120 ideals with weight <= 200, target excess is 4817030
Thu Dec 24 15:19:59 2020  commencing in-memory singleton removal
Thu Dec 24 15:21:40 2020  begin with 858614950 relations and 736080120 unique ideals
Thu Dec 24 15:35:04 2020  reduce to 858474141 relations and 735936294 ideals in 10 passes
...
Fri Dec 25 04:35:04 2020  reduce to 229895200 relation sets and 224306331 unique ideals
Fri Dec 25 04:35:04 2020  commencing full merge
Fri Dec 25 05:52:10 2020  memory use: 27271.7 MB
Fri Dec 25 05:52:32 2020  found 99814363 cycles, need 99574531
Fri Dec 25 05:53:12 2020  weight of 99574531 cycles is about 13941235773 (140.01/cycle)
...
Fri Dec 25 06:29:10 2020  heaviest cycle: 28 relations
Fri Dec 25 06:33:03 2020  RelProcTime: 79459
Fri Dec 25 06:33:43 2020  elapsed time 22:05:00
Now in LA, waiting to get an estimate for time to completion...
ryanp is offline   Reply With Quote
Old 2020-12-31, 14:08   #9
swellman
 
swellman's Avatar
 
Jun 2012

32·7·47 Posts
Default

“matrix needs more columns than rows; try adding 2-3% more relations”

Ryan tells me he keeps getting the above error message from msieve when it attempts to build the matrix. I believe this is the old “large dataset bug” biting him here. At last check Ryan is trying different TDs and various revs of msieve on an attempt to get past the bug.

Does anyone remember if this big data bug was ever eliminated, or techniques developed to avoid it?

Is using CADO for LA a practical option with a job of this size?
swellman is offline   Reply With Quote
Old 2020-12-31, 14:57   #10
charybdis
 
Apr 2020

193 Posts
Default

I've spent a fair amount of time browsing old threads here, and from what I recall, at least some cases of the bug were caused by the GGNFS siever occasionally producing relations with composite factors. Msieve now checks for this, but there is one caveat: a quick read of the msieve code suggests it only checks factors below 2^32 for primality, so a larger composite factor could potentially still cause problems. However, I doubt the CADO siever has such a bug; I've never seen "skipped x relations with composite factors" on a CADO-sieved job.

The bug didn't appear in the filtering run with fewer relations, so it ought to be possible to binary-search using "filter_maxrels" to discover exactly where the bug first appears. If it's due to a corrupt relation that somehow hasn't been caught, this would find it. Otherwise I've got no clue.

CADO linear algebra would presumably work but would be slower and more memory-intensive than msieve.

Last fiddled with by charybdis on 2020-12-31 at 14:58
charybdis is offline   Reply With Quote
Old 2020-12-31, 16:33   #11
EdH
 
EdH's Avatar
 
"Ed Hall"
Dec 2009
Adirondack Mtns

2×23×79 Posts
Default

What about trying remdups4 before filtering? Would that pick out composite relations?
EdH is offline   Reply With Quote
Reply

Thread Tools


All times are UTC. The time now is 10:26.

Sat Mar 6 10:26:29 UTC 2021 up 93 days, 6:37, 0 users, load averages: 1.73, 1.52, 1.57

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.