-   Operation Kibibit (
-   -   Poly select and test-sieving for RSA232 (

VBCurtis 2019-04-03 00:57

Poly select and test-sieving for RSA232
The path for mere mortals to contribute to a factorization of RSA-896 or RSA-1024 may include getting an empirical sense for how our current tools scale for projects of 230+ digits.
As a start on this path, I spent about two core-weeks on CADO poly select for RSA-232:
[code]n: 1009881397871923546909564894309468582818233821955573955141120516205831021338528545374366109757154363664913380084917065169921701524733294389270280234380960909804976440540711201965410747553824948672771374075011577182305398340606162079
skew: 2344396.759
type: gnfs
c0: -4444911653278229819370022568140110402793683027936
c1: -1601077621210143696221661846379560448537934
c2: 10503357361068234616583254671616966819
c3: -106292952091953896748493562554
c4: -2012141349873927583036795
c5: -21775742789901120
c6: -43338240
Y0: -48749394190388072676687368853974603055
Y1: 1395619071010682498582953
# MurphyE (Bf=5.498e+11,Bg=6.872e+10,area=8.590e+16) = 2.92e-08
# found by revision 78c3a74[/code]
I ran this through msieve: size 5.322e-17, alpha -10.267, combined = 5.391e-17 rroots = 6
On our best-polys table, an increase of 16 digits corresponds to an order-of-magnitude decrease in Murphy-E. ~4.7e-14 is the record for 184 digits, ~5.3e-15 is the record for 200 digits, and ~4.6e-16 (deg 6) is the record for 216 digits. So, this score at 5.4e-17 is a decent baseline polynomial to do some parameter-choice work. I tried to feed the polynomial from the RSA-768 factorization paper into msieve to compare scores, but it core-dumped and I couldn't figure out my mistake.
I'll report some test-sieve results using 16f next.

axn 2019-04-03 02:59

[CODE]searching for 15-digit factors
commencing number field sieve (232-digit input)
R0: -1291187456580021223163547791574810881
R1: 34661003550492501851445829
A0: -277565266791543881995216199713801103343120
A1: -18185779352088594356726018862434803054
A2: 6525437261935989397109667371894785
A3: -46477854471727854271772677450
A4: -5006815697800138351796828
A5: 1276509360768321888
A6: 265482057982680
skew 44000.00, size 7.072e-017, alpha -7.298, combined = 7.024e-017 rroots = 6[/CODE]

VBCurtis 2019-04-03 04:23

Thank you!
It should be noted that axn posted the poly for RSA-768; its score is 30% better after 20 core-years of search (~eleven years ago) than my RSA-232 after 2 core-weeks.

VBCurtis 2019-04-04 18:42

RSA768 was factored with the equivalent of I=16 on CADO (same sieve region as NFS@home's 16fV5 siever). I=17 may be faster, but has a memory footprint too large for most users (I failed to get a single instance to start on a 24GB system a few months ago). Alas, I haven't yet figured out how to test-sieve within CADO; so instead I tried a little test with the 16f siever. "top" reported 3.5GB memory use for rlim=980M, alim=1640M. I also tested rlim=alim=1070M, but memory use only dropped to 2.9GB while yield was more than 10% worse and sec/rel was little changed.
My test-sieve used mfbr=71, lpbr=37, mfba=110, lpba=40.
These settings are a bit tighter than those used by the group that factored RSA-768: They used smaller lim's to save memory but 40LP on both sides, mfbr=110, mfba=140 (4 large primes!). I initially tried 107/39 for a-side, but 110/40 had approximately twice the yield.
[code]#37-40/71-110, lambda 2.7/3.7: dQ=500
#Q=100M 7736 rels, 0.296 sec/rel
#Q=600M 5041 rels, 0.441 sec/rel
#Q=1100M 5154 rels, 0.483 sec/rel
#Q=1600M 3088 rels, 0.571 sec/rel
#Q=2100M 1741 rels, 0.634 sec/rel
#Q=2600M 2818 rels, 0.648 sec/rel
#Q=3100M 2683 rels, 0.674 sec/rel
#Q=3600M 3516 rels, 0.696 sec/rel
#Q=4100M 1574 rels, 0.775 sec/rel[/code]
The team that cracked RSA-768 used 40LP on both sides, 3 rational large primes and 4 algebraic large primes. They sieved 64G raw relations, and observed that this was substantial oversieving (the estimated by a factor of 2). So, moving down to 37LP and 2 large primes on r side, with 3 large primes on a side, suggests less than half as many relations would be required; say, somewhere between 25G and 30G relations.
Yield is over 10 from 100M to 1200M, good for ~12G relations.
Yield is over 5 for the rest of the Q-range, say 1200M to 3800M for ~15G relations.
So, as a proof-of-concept, the 16f siever appears sufficient to crack RSA-232. If I had hundreds of cores at my disposal, I'd aim the large-memory cores to CADO with I = 17 to run Q=50M to 200M, and for medium-memory cores I'd use 16f with 3.5GB/core. Once I solve CADO test-sieving, I'll report I=16 memory use and yield.
Sec/rel is measured on a Haswell i7-5820, a 6-core 3.3ghz CPU with 6 other tasks running. At half a second per relation, we're looking at 14 gigacoreseconds, or 450 core-years. This is massively less than the 2000 CPU-years that RSA-768 took, but those were measured on c. 2008 2.2ghz AMD processors. However, my CPU is not 4 times faster than theirs were, which suggests my test-sieving is flawed in some way. I don't see any problem with yields with just 2RLP/3ALP, while the RSA-768 team sieved Q=450M-11G using 3RLP/4ALP; again, this suggests my test-sieving is flawed. Then again, they used rlim=200M & alim =1100M to fit into 2GB/core, which may have impacted yield and efficiency more than I realize. Their matrix was ~190M by 190M, and better param choice + fewer large primes should lead to a yet-smaller matrix for RSA232.
If anyone is interested in taking another baby step toward such a factorization, they are invited to find a better polynomial. Poly select has improved quite a lot in 10 years, so we should be able to beat the RSA-768 score by at least 5%. If we do improve on my poly by 20% or more, a more detailed test-sieve could demonstrate that we need fewer than 400 core-years for sieving, a level possible within this forum.

VBCurtis 2019-04-04 18:51

In case anyone is silly enough to try some polyselect, here is what I ran in CADO for my small search:
[code] = 6
tasks.polyselect.P = 40000000
tasks.polyselect.admax = 1e5
tasks.polyselect.adrange = 120
tasks.polyselect.incr = 60
tasks.polyselect.nq = 7776 # this is 6^5
tasks.polyselect.nrkeep = 36
tasks.wutimeout = 36000 # required for rootsieve in degree 6[/code]

I'll have a GPU free this weekend, so I'll do a little msieve-GPU search too; say, 200-400k in degree 6.

pinhodecarlos 2019-04-04 20:39

I’ll help sieving and I can use up to 4GB/thread.

Max0526 2019-04-06 19:13

I tried to feed the polynomial from the RSA-768 factorization paper into msieve to compare scores, but it core-dumped and I couldn't figure out my mistake.[/QUOTE]
Usually because msieve doesn't properly process minuses copied from TEX/PDF documents ( − ). They should be manually changed into regular ones ( - ).

RichD 2019-04-06 21:10

[QUOTE=VBCurtis;512689]... a level possible within this forum.[/QUOTE]

Thinking ahead, perhaps Greg can host a "hidden queue" for us Mersennians to work at our leisure. But who on earth could handle the post-processing?...

Max0526 2019-04-06 22:36

[QUOTE=RichD;512905]Thinking ahead, perhaps Greg can host a "hidden queue" for us Mersennians to work at our leisure. But who on earth could handle the post-processing?...[/QUOTE]
People who develop and host CADO-NFS could do that for sure. They are always at least a step ahead.

VBCurtis 2019-04-06 23:15

[QUOTE=RichD;512905]Thinking ahead, perhaps Greg can host a "hidden queue" for us Mersennians to work at our leisure.[/QUOTE]

Greg has in the past expressed disinterest in projects using LP>33 because of runaway storage needs. Indeed, even if we give up 10+% efficiency to drop LP to something like 35/37 we'll still need on the order of 8G raw relations, which roughly matches the relations count in the entire 14e and 15e queues!
I don't mind buying an extra disk for my workstation and running an internet-facing CADO server on that disk, which would allow us to collect relations at our leisure. Do 15G relations fit into 1GB? Do I need double the disk space (e.g. 2TB dedicated disk) of the expected relation set to handle things like uniqueing, compression, filtering?

Also, thanks to Max for pointing out the minus-sign problem with copy-paste! I did exactly as he suspected.

VBCurtis 2019-04-06 23:22

[QUOTE=RichD;512905]But who on earth could handle the post-processing?...[/QUOTE]
I think this particular candidate falls into a no-man's-land, where the lack of possible record factorization acts against a possible XSEDE proposal yet it's too big for any single machine we have access to. I'm willing to spend the $200 to upgrade my 20-core xeon to 128GB, which is likely enough to solve a matrix from a GNFS-215 or GNFS-220 project, but insufficient for RSA-232. Perhaps a pair of infiniband cards would allow me to solve RSA-232 personally in under a year, but that sounds more like a hope than a plan!

However, if we went after RSA-240, we might be able to get XSEDE to grant cluster time for the matrix. That should take on the order of 1000 core-years of sieving, way too much for forumites alone!
In the interests of gaining experience with big jobs, I've volunteered to postprocess a C206 for swellman that Greg is sieving on 16f; fivemack/Greg are doing the same for the C205 from Aliq 276. If we're serious as a group about doing some quixotic team-sieve, I suggest we find a GNFS in 210 to 218 range to team-factor, so that we can collectively iron out details of data management etc. If my time estimates are correct, we can knock one out with 50 to 100 core-years of sieve, and less than a machine-year of postprocessing; if they're not, I'd rather learn I'm wrong by 50 years than 500!

There are cloud-service machines with big memory that could be rented for this matrix, which would well and truly fit a publication of this factorization using "publicly available tools!"

If that goes well, we can decide whether to attack RSA-232 or RSA-240? In the meantime, there's no harm in polyselecting for either one. I have msieve aimed at RSA232 right now, and I'll start another thread for RSA-240 poly select sometime this month.

All times are UTC. The time now is 08:23.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.