mersenneforum.org Poly select and test-sieving for RSA232
 Register FAQ Search Today's Posts Mark Forums Read

 2019-04-03, 00:57 #1 VBCurtis     "Curtis" Feb 2005 Riverside, CA 2×41×61 Posts Poly select and test-sieving for RSA232 The path for mere mortals to contribute to a factorization of RSA-896 or RSA-1024 may include getting an empirical sense for how our current tools scale for projects of 230+ digits. As a start on this path, I spent about two core-weeks on CADO poly select for RSA-232: Code: n: 1009881397871923546909564894309468582818233821955573955141120516205831021338528545374366109757154363664913380084917065169921701524733294389270280234380960909804976440540711201965410747553824948672771374075011577182305398340606162079 skew: 2344396.759 type: gnfs c0: -4444911653278229819370022568140110402793683027936 c1: -1601077621210143696221661846379560448537934 c2: 10503357361068234616583254671616966819 c3: -106292952091953896748493562554 c4: -2012141349873927583036795 c5: -21775742789901120 c6: -43338240 Y0: -48749394190388072676687368853974603055 Y1: 1395619071010682498582953 # MurphyE (Bf=5.498e+11,Bg=6.872e+10,area=8.590e+16) = 2.92e-08 # found by revision 78c3a74 I ran this through msieve: size 5.322e-17, alpha -10.267, combined = 5.391e-17 rroots = 6 On our best-polys table, an increase of 16 digits corresponds to an order-of-magnitude decrease in Murphy-E. ~4.7e-14 is the record for 184 digits, ~5.3e-15 is the record for 200 digits, and ~4.6e-16 (deg 6) is the record for 216 digits. So, this score at 5.4e-17 is a decent baseline polynomial to do some parameter-choice work. I tried to feed the polynomial from the RSA-768 factorization paper into msieve to compare scores, but it core-dumped and I couldn't figure out my mistake. I'll report some test-sieve results using 16f next.
 2019-04-03, 02:59 #2 axn     Jun 2003 515110 Posts Code: searching for 15-digit factors commencing number field sieve (232-digit input) R0: -1291187456580021223163547791574810881 R1: 34661003550492501851445829 A0: -277565266791543881995216199713801103343120 A1: -18185779352088594356726018862434803054 A2: 6525437261935989397109667371894785 A3: -46477854471727854271772677450 A4: -5006815697800138351796828 A5: 1276509360768321888 A6: 265482057982680 skew 44000.00, size 7.072e-017, alpha -7.298, combined = 7.024e-017 rroots = 6
 2019-04-03, 04:23 #3 VBCurtis     "Curtis" Feb 2005 Riverside, CA 10011100010102 Posts Thank you! It should be noted that axn posted the poly for RSA-768; its score is 30% better after 20 core-years of search (~eleven years ago) than my RSA-232 after 2 core-weeks. Last fiddled with by VBCurtis on 2019-04-03 at 04:24
 2019-04-04, 18:42 #4 VBCurtis     "Curtis" Feb 2005 Riverside, CA 2×41×61 Posts RSA768 was factored with the equivalent of I=16 on CADO (same sieve region as NFS@home's 16fV5 siever). I=17 may be faster, but has a memory footprint too large for most users (I failed to get a single instance to start on a 24GB system a few months ago). Alas, I haven't yet figured out how to test-sieve within CADO; so instead I tried a little test with the 16f siever. "top" reported 3.5GB memory use for rlim=980M, alim=1640M. I also tested rlim=alim=1070M, but memory use only dropped to 2.9GB while yield was more than 10% worse and sec/rel was little changed. My test-sieve used mfbr=71, lpbr=37, mfba=110, lpba=40. These settings are a bit tighter than those used by the group that factored RSA-768: They used smaller lim's to save memory but 40LP on both sides, mfbr=110, mfba=140 (4 large primes!). I initially tried 107/39 for a-side, but 110/40 had approximately twice the yield. Code: #37-40/71-110, lambda 2.7/3.7: dQ=500 #Q=100M 7736 rels, 0.296 sec/rel #Q=600M 5041 rels, 0.441 sec/rel #Q=1100M 5154 rels, 0.483 sec/rel #Q=1600M 3088 rels, 0.571 sec/rel #Q=2100M 1741 rels, 0.634 sec/rel #Q=2600M 2818 rels, 0.648 sec/rel #Q=3100M 2683 rels, 0.674 sec/rel #Q=3600M 3516 rels, 0.696 sec/rel #Q=4100M 1574 rels, 0.775 sec/rel The team that cracked RSA-768 used 40LP on both sides, 3 rational large primes and 4 algebraic large primes. They sieved 64G raw relations, and observed that this was substantial oversieving (the estimated by a factor of 2). So, moving down to 37LP and 2 large primes on r side, with 3 large primes on a side, suggests less than half as many relations would be required; say, somewhere between 25G and 30G relations. Yield is over 10 from 100M to 1200M, good for ~12G relations. Yield is over 5 for the rest of the Q-range, say 1200M to 3800M for ~15G relations. So, as a proof-of-concept, the 16f siever appears sufficient to crack RSA-232. If I had hundreds of cores at my disposal, I'd aim the large-memory cores to CADO with I = 17 to run Q=50M to 200M, and for medium-memory cores I'd use 16f with 3.5GB/core. Once I solve CADO test-sieving, I'll report I=16 memory use and yield. Sec/rel is measured on a Haswell i7-5820, a 6-core 3.3ghz CPU with 6 other tasks running. At half a second per relation, we're looking at 14 gigacoreseconds, or 450 core-years. This is massively less than the 2000 CPU-years that RSA-768 took, but those were measured on c. 2008 2.2ghz AMD processors. However, my CPU is not 4 times faster than theirs were, which suggests my test-sieving is flawed in some way. I don't see any problem with yields with just 2RLP/3ALP, while the RSA-768 team sieved Q=450M-11G using 3RLP/4ALP; again, this suggests my test-sieving is flawed. Then again, they used rlim=200M & alim =1100M to fit into 2GB/core, which may have impacted yield and efficiency more than I realize. Their matrix was ~190M by 190M, and better param choice + fewer large primes should lead to a yet-smaller matrix for RSA232. If anyone is interested in taking another baby step toward such a factorization, they are invited to find a better polynomial. Poly select has improved quite a lot in 10 years, so we should be able to beat the RSA-768 score by at least 5%. If we do improve on my poly by 20% or more, a more detailed test-sieve could demonstrate that we need fewer than 400 core-years for sieving, a level possible within this forum.
 2019-04-04, 18:51 #5 VBCurtis     "Curtis" Feb 2005 Riverside, CA 2×41×61 Posts In case anyone is silly enough to try some polyselect, here is what I ran in CADO for my small search: Code: tasks.polyselect.degree = 6 tasks.polyselect.P = 40000000 tasks.polyselect.admax = 1e5 tasks.polyselect.adrange = 120 tasks.polyselect.incr = 60 tasks.polyselect.nq = 7776 # this is 6^5 tasks.polyselect.nrkeep = 36 tasks.wutimeout = 36000 # required for rootsieve in degree 6 I'll have a GPU free this weekend, so I'll do a little msieve-GPU search too; say, 200-400k in degree 6. Last fiddled with by VBCurtis on 2019-04-04 at 18:51
 2019-04-04, 20:39 #6 pinhodecarlos     "Carlos Pinho" Oct 2011 Milton Keynes, UK 3·1,657 Posts I’ll help sieving and I can use up to 4GB/thread.
2019-04-06, 19:13   #7
Max0526

"Max"
Jun 2016
Toronto

38A16 Posts
RSA-768

Quote:
 Originally Posted by VBCurtis ... I tried to feed the polynomial from the RSA-768 factorization paper into msieve to compare scores, but it core-dumped and I couldn't figure out my mistake.
Usually because msieve doesn't properly process minuses copied from TEX/PDF documents ( − ). They should be manually changed into regular ones ( - ).

2019-04-06, 21:10   #8
RichD

Sep 2008
Kansas

D7916 Posts

Quote:
 Originally Posted by VBCurtis ... a level possible within this forum.
Thinking ahead, perhaps Greg can host a "hidden queue" for us Mersennians to work at our leisure. But who on earth could handle the post-processing?...

2019-04-06, 22:36   #9
Max0526

"Max"
Jun 2016
Toronto

2·3·151 Posts

Quote:
 Originally Posted by RichD Thinking ahead, perhaps Greg can host a "hidden queue" for us Mersennians to work at our leisure. But who on earth could handle the post-processing?...
People who develop and host CADO-NFS could do that for sure. They are always at least a step ahead.

2019-04-06, 23:15   #10
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

2×41×61 Posts

Quote:
 Originally Posted by RichD Thinking ahead, perhaps Greg can host a "hidden queue" for us Mersennians to work at our leisure.
Greg has in the past expressed disinterest in projects using LP>33 because of runaway storage needs. Indeed, even if we give up 10+% efficiency to drop LP to something like 35/37 we'll still need on the order of 8G raw relations, which roughly matches the relations count in the entire 14e and 15e queues!
I don't mind buying an extra disk for my workstation and running an internet-facing CADO server on that disk, which would allow us to collect relations at our leisure. Do 15G relations fit into 1GB? Do I need double the disk space (e.g. 2TB dedicated disk) of the expected relation set to handle things like uniqueing, compression, filtering?

Also, thanks to Max for pointing out the minus-sign problem with copy-paste! I did exactly as he suspected.

Last fiddled with by VBCurtis on 2019-04-06 at 23:16

2019-04-06, 23:22   #11
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

2·41·61 Posts

Quote:
 Originally Posted by RichD But who on earth could handle the post-processing?...
I think this particular candidate falls into a no-man's-land, where the lack of possible record factorization acts against a possible XSEDE proposal yet it's too big for any single machine we have access to. I'm willing to spend the \$200 to upgrade my 20-core xeon to 128GB, which is likely enough to solve a matrix from a GNFS-215 or GNFS-220 project, but insufficient for RSA-232. Perhaps a pair of infiniband cards would allow me to solve RSA-232 personally in under a year, but that sounds more like a hope than a plan!

However, if we went after RSA-240, we might be able to get XSEDE to grant cluster time for the matrix. That should take on the order of 1000 core-years of sieving, way too much for forumites alone!
In the interests of gaining experience with big jobs, I've volunteered to postprocess a C206 for swellman that Greg is sieving on 16f; fivemack/Greg are doing the same for the C205 from Aliq 276. If we're serious as a group about doing some quixotic team-sieve, I suggest we find a GNFS in 210 to 218 range to team-factor, so that we can collectively iron out details of data management etc. If my time estimates are correct, we can knock one out with 50 to 100 core-years of sieve, and less than a machine-year of postprocessing; if they're not, I'd rather learn I'm wrong by 50 years than 500!

There are cloud-service machines with big memory that could be rented for this matrix, which would well and truly fit a publication of this factorization using "publicly available tools!"

If that goes well, we can decide whether to attack RSA-232 or RSA-240? In the meantime, there's no harm in polyselecting for either one. I have msieve aimed at RSA232 right now, and I'll start another thread for RSA-240 poly select sometime this month.

 Similar Threads Thread Thread Starter Forum Replies Last Post amphoria YAFU 22 2016-09-17 09:47 VBCurtis Msieve 0 2016-04-11 21:33 jux YAFU 5 2016-01-02 01:01 EdH Factoring 10 2013-10-14 20:00 nstaab1 Lounge 15 2013-03-06 13:48

All times are UTC. The time now is 20:07.

Mon Oct 25 20:07:27 UTC 2021 up 94 days, 14:36, 0 users, load averages: 2.12, 2.10, 2.14