Thanks, it worked. I tested it on an EC2 instance with 1 core / 2 threads (Xeon Cascade Lake) on a 402-bit prime number:
160 curves at B1=1,000,000:
GMP-ECM: 300.6 sec
AVX-ECM, 1 thread: 179.5 sec
AVX-ECM, 2 threads: 144.8 sec
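To make the comparison easier to read, here is a small sketch that turns the three timings above into seconds per curve and a speedup relative to GMP-ECM. The numbers are copied from this post; the helper names (`per_curve`, `speedup_over_gmp`) are only illustrative and are not part of either program.

```python
# Timings (seconds) for 160 curves at B1=1,000,000, copied from the post.
CURVES = 160
timings = {
    "GMP-ECM": 300.6,
    "AVX-ECM 1 thread": 179.5,
    "AVX-ECM 2 threads": 144.8,
}

def per_curve(seconds, curves=CURVES):
    """Average wall-clock seconds spent on one curve."""
    return seconds / curves

def speedup_over_gmp(name):
    """How many times faster than GMP-ECM the named run was."""
    return timings["GMP-ECM"] / timings[name]

for name, seconds in timings.items():
    print(f"{name:18s} {per_curve(seconds):.3f} s/curve  "
          f"{speedup_over_gmp(name):.2f}x vs GMP-ECM")
```

On these figures AVX-ECM comes out at roughly 1.67x (1 thread) and 2.08x (2 threads) the throughput of GMP-ECM on this machine.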
I created a 402-bit composite number with a 30-digit (97-bit) factor and ran it 10 times at B1=1,000,000 (the 35-digit level) to see how quickly GMP-ECM and AVX-ECM each found the factor:
Code:
GMP-ECM      AVX-ECM
curve 23     curves 0-15
curve 34     curves 16-31
curve 48     curves 32-47
curve 81     curves 48-63 (found twice)
curve 89     curves 64-79
curve 129    curves 144-159
curve 135    curves 144-159
curve 168    curves 160-175 (found twice)
curve 195    curves 384-399
curve 264    curves 416-431
So, based on the speeds from the first test, GMP-ECM found the factor 10 times in 2191 seconds, and AVX-ECM found the factor 10 (12) times in 1419 seconds.
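The totals can be reproduced with a short calculation. This is only a sketch under three assumptions that are not spelled out in the post: the GMP-ECM curve numbers count from 1 (so "curve 23" means 23 curves were run), an AVX-ECM batch ending at curve N means N+1 curves were run, and the AVX-ECM figure uses the 2-thread timing. With those assumptions the arithmetic matches the quoted 2191 and 1419 seconds.

```python
# Per-curve cost, derived from the 160-curve timings in the first test.
GMP_SEC_PER_CURVE = 300.6 / 160   # GMP-ECM
AVX_SEC_PER_CURVE = 144.8 / 160   # AVX-ECM, 2 threads (assumed)

# Curve at which GMP-ECM found the factor in each of the 10 runs
# (assumed 1-based, so the value is also the number of curves run).
gmp_found_at = [23, 34, 48, 81, 89, 129, 135, 168, 195, 264]

# Last curve index of the AVX-ECM batch that found the factor in each run
# (0-based, so curves run = index + 1).
avx_batch_end = [15, 31, 47, 63, 79, 159, 159, 175, 399, 431]

gmp_curves = sum(gmp_found_at)
avx_curves = sum(end + 1 for end in avx_batch_end)

gmp_seconds = gmp_curves * GMP_SEC_PER_CURVE
avx_seconds = avx_curves * AVX_SEC_PER_CURVE

print(f"GMP-ECM: {gmp_curves} curves, {gmp_seconds:.0f} s")   # 1166 curves, 2191 s
print(f"AVX-ECM: {avx_curves} curves, {avx_seconds:.0f} s")   # 1568 curves, 1419 s
```

AVX-ECM ran more curves in total because it works in batches of 16 here, but each curve was cheap enough that the overall time was still lower.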
I took the 12 sigma values for which AVX-ECM found the factor and tested them in GMP-ECM, and it found the factor with all 12 sigmas as well.