Quote:
Originally Posted by jasonp
I realized last week that the arbitraryprecision library needed by msieve's algebraic square root code could be put to other uses, in particular it can be used to implement Bernstein's batch factoring algorithms to speed up the sieving stage. As a proofofconcept I modified msieve's line siever to use three rational and/or algebraic large primes, but to defer actually trying to factor bi and tricomposites until a few hundred thousand of them have accumulated. Once that happens, batch factoring isolates the ~2.5% of relations whose cofactors actually need processing.
As an example, I used a recent C135 factored by Hallstein using GNFS. His run used 28bit large primes, and my mods used the product of all primes < 2^26 to perform batch factoring. Sieve reports with remaining cofactors smaller than 2^81 get batched and submitted to Bernstein's algorithm, and bi and tricomposites that do not contain at least one factor below 2^26 are aborted.
The results are really encouraging. The current code finds twice as many relations and only takes 1.5x longer to do so, for a net speedup of 25%. This is because the vast majority of tricomposites need no explicit factoring at all, and only 1% of the tricomposites that need factoring actually need to be split into three primes (the input would need all three large primes < 2^26, which is extremely rare). The speedup approaches 50% as the norms increase with larger b values, when it becomes feasible to use three large primes on both sides. The extra time needed seems to be split evenly between the batch factoring and the factoring of a much larger number of sieve reports.
This isn't going to make a line siever competitive with a good lattice siever, but the same batch factoring techniques can be used with a lattice siever and could conceivably gain the same kinds of speedups for large jobs.
jasonp

I considered this at one time for my lattice siever. However, I concluded
that memory requirements were too great. Furthermore, my code spends
so little time splitting the cofactors, that the speed increase offered by
batch factoring did not seem worth it. Optimizing something that takes
less than 1% of the runtime is generally not productive,.
BTW, I split my cofactors with a 'tiny' QS implementation fine tuned
for 63bit cofactors. Extending this to say 93 bits would be easy, and the
increase in time would not be great.