I've been working on tuning CADO parameters, and have found settings that run 15-30% faster than the 2.3.0 release defaults on numbers from c95 to c125. With these faster settings, I timed CADO 2.3.0 against factmsieve.py on RSA-120.
CADO took 73,000 CPU-seconds on a 6-core i7 using 12 threads for all stages. Wall-clock time was roughly 10,500 seconds (I neglected to set a timer, so that's accurate to within a couple of minutes). Dividing the two times shows that hyperthreading is worth approximately one extra core's worth of work, as CPU time is about 7x wall-clock time.
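To make the arithmetic explicit (numbers taken straight from the run above, so this is just a back-of-envelope sketch):

```python
# Rough estimate of the hyperthreading benefit from the RSA-120 run.
cpu_seconds = 73_000    # total CPU time reported by CADO
wall_seconds = 10_500   # approximate wall-clock time
physical_cores = 6

effective_parallelism = cpu_seconds / wall_seconds   # ~6.95
ht_gain = effective_parallelism - physical_cores     # extra "cores" from HT
print(round(effective_parallelism, 2), round(ht_gain, 2))
```

So the 12 hyperthreads behave like roughly 7 real cores here, i.e. HT buys about one core's worth of extra throughput.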
CADO spent 3,500 thread-seconds on poly select, so I allowed msieve the same time in a single-threaded process. I don't have an msieve-functional GPU at present. I then set factmsieve to use 12 threads for sieving and 6 threads for postprocessing; 80 minutes of sieving and 20 minutes of postprocessing later, the factorization was complete. If we imagine that poly select could conveniently be run 6-threaded, that's 110 minutes for GGNFS vs 175 minutes for CADO.
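The 110-minute figure works out like this (again just the arithmetic from the numbers above, with poly select hypothetically spread across 6 threads):

```python
# Back-of-envelope wall-clock totals for the comparison, in minutes.
poly_select_thread_seconds = 3_500
poly_select_min = poly_select_thread_seconds / 6 / 60  # if run 6-threaded
sieve_min = 80
postproc_min = 20

ggnfs_total = poly_select_min + sieve_min + postproc_min
cado_total = 10_500 / 60
print(round(ggnfs_total), round(cado_total))
```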
From 97 to 133 digits, a sample of best-result CADO times shows the software scaling at a doubling of total time every 5.9 digits. I haven't recorded a similar map of GGNFS times, but the rule of thumb has always been that 5 digits doubles sieve time, so perhaps CADO catches up in speed at higher difficulties.
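The scaling rule is easy to apply in either direction; here's a small sketch (the 5.9 figure is from my sample, everything else is just the formula):

```python
import math

# Observed scaling: total CADO time doubles every 5.9 digits.
DIGITS_PER_DOUBLING = 5.9

def time_ratio(d_from, d_to, digits_per_doubling=DIGITS_PER_DOUBLING):
    """Factor by which total time grows from d_from digits to d_to digits."""
    return 2 ** ((d_to - d_from) / digits_per_doubling)

def fit_digits_per_doubling(d1, t1, d2, t2):
    """Recover the doubling rate from two (digits, time) data points."""
    return (d2 - d1) / math.log2(t2 / t1)

# Over the sampled 97-133 digit range, total time grows roughly 69x.
print(round(time_ratio(97, 133), 1))
```

`fit_digits_per_doubling` is how you'd extract the same constant for GGNFS once a comparable map of timings exists.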
I am not confident that I've found strong parameters for c140 on CADO yet, but I may repeat the test on RSA-140 once I think I've made the best of CADO.
