20180424, 17:08  #23  
"Curtis"
Feb 2005
Riverside, CA
2×23×109 Posts 
Quote:
I had in mind that largememory folks would use I=17 on the smallish Q values, while others would use I = 16 over a much larger range (say, Q=2G to Q=12G). Since this is SNFS and a poly is obvious, I may test yields this weekend for the sake of this thread and my curiosity. 

20180425, 09:42  #24 
(loop (#_fork))
Feb 2006
Cambridge, England
3·19·113 Posts 
Because I am a curious person:
With 33bit large primes (3 on rational side), 16e siever, and the polynomial 8x^71, the yield is 29 relations in 2031 seconds for Q=4e8 .. 4e8+1e3 Which is not zero, but is obviously not enough; suggests that a few thousand CPUyears might actually suffice. (the septic is better than the sextic here, though the rational side is still generally a good deal larger than the algebraic; I haven't tried degree8 algebraic) I will leave further investigation to VBCurtis this weekend Last fiddled with by fivemack on 20180425 at 09:42 
20180426, 06:22  #25 
"Curtis"
Feb 2005
Riverside, CA
11626_{8} Posts 
I got curious, too, so I tested the same poly as fivemack:
16f siever, prime Q only, lambda = 3.8, alim=rlim=536M, 39bit LP: mfbr/a = 112 (3 large primes on each side) Q=400M 4062 relations 1.64 sec/rel Q=4000M 1757 relations 2.15 sec/rel At an average 2.5 sec/relation, 40G relations would take 100 gigathreadseconds, or roughly 3100 CPUyears. Sieve performance at large Q is unknown in CADO, and a factor of two or three is fairly likely. Still, 10k CPUyears for sieving should be a generous estimate. Changing mfba to 78 and alambda to 2.8 cut time to 1.0 sec/rel, but yield was 2000 or so at Q=400M. Over the 16f siever's Qrange of 50M to 4.3G (I believe Q is limited to 32 bits), an average yield that may be close to 3.0 means we can get roughly 12e9 raw relations, while 40e9 are likely needed. CADO has been used for Q up to 10G (see https://eprint.iacr.org/2012/369.pdf, where RSA704 was factored with I=15!!!!). I'll test some more mbfr/mfba and alim/rlim settings this weekend, and report back. If anyone has advice for testsieving with the CADO siever, please let me know! Edit to add: 16f memory use was around 1.7GB for one thread. Last fiddled with by VBCurtis on 20180426 at 06:39 
20180426, 06:55  #26 
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
2^{2}×1,481 Posts 
It sounds to me like it would be necessary to use CADO for this factorization with either I >16 or very high Qs.
Did the max q only increase from 2^31 to 2^32 between lasieve4 and lasieve5? Is it higher when composite q are considered? 
20180426, 18:22  #27 
Sep 2009
2,179 Posts 
Does 16f support j 16 ? And if it does how much memory does it use and how much does it raise yield?
Chris 
20180427, 05:09  #28 
"Curtis"
Feb 2005
Riverside, CA
2×23×109 Posts 
I tried that in past experiments, and J 16 increases yield 40%, as expected. However, it is quite unstable; something like 25% of the testsieves crashed (dQ=1000 per test). One candidate had no crashes, another had many, so it's possible that it is stable with certain polys.
This poly did not crash in a 30min test, and memory use was the same as 16f without the J 16 flag. So, it's worth testing further. sec/rel for the short time I tested was slightly better than regular 16f. Edit: Naturally, I post after one test, and the next test with J 16 crashes immediately. Last fiddled with by VBCurtis on 20180427 at 05:15 
20180506, 03:04  #29 
"Curtis"
Feb 2005
Riverside, CA
2×23×109 Posts 
I did some more testsieving, enough to figure out that lambda isn't quite what I thought it was: it seems that lambda is the multiple of the lim (in bits) that a cofactor is allowed to be, rather than a multiple of the LP bound as I thought it was.
As evidence, I tested 39LP with MFB of 112, 113, 114 all with lambda = 3.8, and all were within 1% of each other for time and relation count was identical (114 was 4063 rels, 112 was 4062). When I bumped lambda to 4.0 with mfb=114, time per relation doubled while yield improved by 910%. I also tried 4LP on the r side, by setting mfbr to 125 and lambda to 4.2; yield increased a fair bit, but at 10 seconds per relation (cough) vs 2.53 for most of my other tests. I also tried mfbr at 135 or so, too high for the siever (a single special Q took over 30 minutes, with about 60 relations in the .dat when I killed the process, and lots of errors about cofactors). rlim=800m, alim=1072M, 39LP, lambda 3.9 on both sides, mfbr=113, mfba = 112 produced 2.52 sec/rel and 5103 relations for a 1kQ test at 400M. This is 25% better yield in 50% more sec/rel than my original estimate a few posts ago. I ran out of curiosity before testing a range of Q values. 
20180514, 01:09  #30 
Jan 2013
2^{3}·7 Posts 
Octic is not supported in cado.

20210828, 17:43  #31 
Apr 2010
Over the rainbow
5140_{8} Posts 
I don't know much about gpuecm but does this https://www.mersenneforum.org/showthread.php?t=27103
could help the ecm phase? 
20210828, 18:14  #32 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
5^{2}·383 Posts 
Irrelevant fodder moved to "I am experimenting with heuristic sieves."

20210828, 20:52  #33  
"Seth"
Apr 2019
171_{16} Posts 
Quote:
CGBN based ECM is possible helpful but only reduces overall ecm time by 1.52x so still many GPU/CPUs years left for t80. Using the advice from paul's params.html t80 is 265K curves @ B1=25e9 for t80. Assuming we want to do 30% (50% of a t80  20% already done). that's 80K curves. Using a tuned version for 1536 BITS my GPU (1080ti). processes 1792 curves @ B1=1e5 in 17.3 seconds. Using a tuned version for 1312 BITS takes 16.4 seconds. Using a tuned version for 1408 BITS takes 16.0 seconds. Using a tuned version for 1280 BITS takes 12.0 seconds. (CGBN seems to need 4 carry bits, but maybe I could squeeze it down to 2) Scaling is very linear so I project CGBN based ECM could theoretically finish the 80K curves in 80000/1792 * 25e9/1e5 * 12 seconds = 4.24 1080ti GPUyears! Stage 2 on my system uses B2=2.5e15, I timed 2e5e12 at 362 seconds and used 4.2x per 10 larger B2, so an estimated 26788 seconds. Naively this is 80,000 * 26788 = 68 CPU core years. This is a 16 CPU/GPU ratio so it's conceivable that a single machine could finish 30% of t80 in 4 years (or <2 years with a top end current gen GPU). I heard 7% chance of a factor. 30% of 7% is 2.1% chance of factoring. At ~400 watts * 4 years = $1600 or an amortized $75,000 to factor. Last fiddled with by SethTro on 20210828 at 20:53 

Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Do you know this method to factorize?  Godzilla  Miscellaneous Math  28  20171031 18:14 
mathematica7.0 can easily factorize 10^67+1111  aaa120  Factoring  14  20081207 13:14 