Quote:
Originally Posted by Prime95
Have you played with improving PRAC?
I recently did and gained one or two percent. This may be because prime95 can store data FFTed which may change the costing functions. A double costs 10 transforms, an add costs 12 transforms. Also, prime95 is usually dealing with larger numbers and can afford to spend a little more time searching for the best chain.
Glad to share the C code for the changes.

I tried adjusting how much time is spent optimizing chains, without notable success. I'd like to see your code; thanks for sharing.
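To make the costing concrete, here is a rough sketch of how the quoted weights (10 transforms per double, 12 per add) feed into comparing chains. It is not PRAC and not prime95's code: it only costs the plain binary ladder against the simplest Lucas chain built by subtractive Euclid from a starting value r near k/phi. The function names and the `tries` window are made up for illustration.

```python
from math import gcd

DBL_COST = 10  # transforms per doubling, as quoted above for prime95
ADD_COST = 12  # transforms per differential addition, as quoted above
PHI = (1 + 5 ** 0.5) / 2


def ladder_cost(k):
    """Binary ladder: one add and one double per bit after the top bit."""
    return (k.bit_length() - 1) * (ADD_COST + DBL_COST)


def euclid_chain_cost(k, r):
    """Cost of the Lucas chain k, r, k-r, ... from subtractive Euclid.

    Returns None when r cannot start a chain for k.
    """
    if not (0 < r < k) or gcd(k, r) != 1:
        return None
    a, b = k, r
    cost = 0
    while a > 2:
        # a = b + (a - b); the difference needed is the next chain element
        a, b = max(b, a - b), min(b, a - b)
        cost += ADD_COST
    return cost + DBL_COST  # final step 2 = 1 + 1 is a doubling


def best_chain_cost(k, tries=2):
    """Try a few starting values around k/phi and keep the cheapest chain."""
    centre = round(k / PHI)
    costs = [euclid_chain_cost(k, centre + d) for d in range(-tries, tries + 1)]
    costs = [c for c in costs if c is not None]
    return min(costs) if costs else None


if __name__ == "__main__":
    for k in (5, 101, 1009):
        print(k, ladder_cost(k), best_chain_cost(k, tries=10))
```

Widening `tries` is the "spend more time searching" knob; changing the two weights shifts which chain wins, which is presumably why a different costing function can pay off.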
I'm also looking into gmp-ecm's different parameterizations. "param 1" seems to be the default for me in most cases. It looks like they choose small curve parameters so that the multiply by (A+2)/4 is by a single limb, which saves a full multiply in each doubling. They then skip PRAC entirely and use the Montgomery ladder, which has a larger percentage of doublings, on a premultiplied product of the B1 primes. Eliminating a multiply could save 10% or so. What I don't understand yet is how the constant can stay single limb if they are using Montgomery multiplies (REDC).
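A small sketch of the scheme as I read it, using the standard x-only Montgomery curve formulas. This is not gmp-ecm's code: the names, the choice a24 = 7, and the starting point are assumptions for illustration. The last helper shows one possible reading of the single-limb question, not checked against the gmp-ecm source: a residue kept in Montgomery form (x*R mod n) can be multiplied by a plain small integer d and the result is still in Montgomery form, so no REDC is needed for that step.

```python
from math import gcd


def xdbl(P, a24, n):
    """x-only doubling on B*y^2 = x^3 + A*x^2 + x, with a24 = (A+2)/4."""
    X, Z = P
    u = (X + Z) ** 2 % n
    v = (X - Z) ** 2 % n
    t = (u - v) % n  # equals 4*X*Z
    # a24 * t is the multiply that becomes cheap when a24 fits in one limb
    return (u * v % n, t * (v + a24 * t) % n)


def xadd(P, Q, D, n):
    """Differential addition: returns P+Q given D = P-Q."""
    (X1, Z1), (X2, Z2), (X0, Z0) = P, Q, D
    a = (X1 - Z1) * (X2 + Z2) % n
    b = (X1 + Z1) * (X2 - Z2) % n
    return (Z0 * (a + b) ** 2 % n, X0 * (a - b) ** 2 % n)


def ladder(k, P, a24, n):
    """Montgomery ladder for k >= 1: one xadd and one xdbl per bit."""
    R0, R1 = P, xdbl(P, a24, n)
    for bit in bin(k)[3:]:  # skip '0b' and the leading 1 bit
        if bit == "1":
            R0, R1 = xadd(R0, R1, P, n), xdbl(R1, a24, n)
        else:
            R0, R1 = xdbl(R0, a24, n), xadd(R0, R1, P, n)
    return R0


def batch_exponent(B1):
    """Product of the largest power of each prime not exceeding B1."""
    s = 1
    sieve = [True] * (B1 + 1)
    for p in range(2, B1 + 1):
        if sieve[p]:
            for m in range(p * p, B1 + 1, p):
                sieve[m] = False
            q = p
            while q * p <= B1:
                q *= p
            s *= q
    return s


def mont_mul_small(x_mont, d, n):
    """(x*R mod n) * d mod n equals (x*d)*R mod n: still Montgomery form."""
    return x_mont * d % n


if __name__ == "__main__":
    n = 1000003 * 998244353      # toy composite, illustration only
    s = batch_exponent(50)       # stage 1 exponent, computed once
    X, Z = ladder(s, (2, 1), 7, n)
    print(gcd(Z, n))             # a nontrivial gcd would reveal a factor
```

Because the exponent is one big premultiplied number, the ladder's fixed pattern of one add plus one double per bit is all that runs, and every double benefits from the small a24.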