Hi,
Next on my todo list:
 more flexible input (e.g. the exponents are still hardcoded in the host code): DONE!
 presieving of factor candidates: 50%
 reduce the amount of data transferred from/to device: DONE!
 interleave host/device code: DONE!
Raw speed on my GTX 275: ~34 million factor candidates per second for a 26-bit exponent.
Presieving is supported in the code now; what is missing is a fast siever. The current code does its "sieving" by calculating the remainder of each factor candidate modulo the small primes, which isn't fast, of course.
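To make the remainder-based "sieving" concrete, here is a minimal sketch in Python (not the actual host/device code). It assumes the usual facts about factors of a Mersenne number 2^p - 1 with p an odd prime: every factor has the form q = 2*k*p + 1 and is congruent to 1 or 7 modulo 8. The function names and the small-prime list are made up for illustration, and the mod-8 check is an extra assumption of this sketch, so the fraction of candidates it removes need not match the figures quoted in this post.

```python
SMALL_PRIMES = (3, 5, 7, 11)  # illustrative choice: "primes <= 11"


def survives_presieve(q):
    """Return True if candidate q passes the cheap remainder checks."""
    # Factors of 2^p - 1 (p an odd prime) are always 1 or 7 mod 8.
    if q % 8 not in (1, 7):
        return False
    # One remainder per small prime: simple, but slow compared to a real sieve.
    return all(q == s or q % s != 0 for s in SMALL_PRIMES)


def candidates(p, k_start, k_count):
    """Yield (k, q) for the candidates q = 2*k*p + 1 that survive presieving."""
    for k in range(k_start, k_start + k_count):
        q = 2 * k * p + 1
        if survives_presieve(q):
            yield k, q


def find_factors(p, k_start, k_count):
    """Trial-factor 2^p - 1: q divides it exactly when 2^p mod q == 1."""
    return [q for _, q in candidates(p, k_start, k_count) if pow(2, p, q) == 1]


if __name__ == "__main__":
    # M11 = 2047 = 23 * 89
    print(find_factors(11, 1, 10))
```

A real siever would mark off multiples of each small prime in a bit array over k instead of computing one remainder per candidate and prime, which is where the missing speed would come from.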
So, presieving the factor candidates with primes <= 11 *lol*, I can factor M66xxxxxx to 60 bits in 80 seconds. This removes 2/3 of the factor candidates; with a good siever this should increase ;) (the Prime95 help notes that ~95% of the candidates are removed by the siever). If I can do the same, this will be a speedup of ~5-6 (~33% vs. ~5% remaining factor candidates) compared to the current version. :)
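The speedup estimate is just the ratio of the candidates that remain to be tested. A small sketch of that arithmetic, taking the ~33% and ~5% figures at face value and assuming (optimistically) that the sieving itself costs nothing:

```python
def sieve_speedup(remaining_now, remaining_after):
    """Ratio of candidates still to be tested before and after better sieving."""
    return remaining_now / remaining_after


if __name__ == "__main__":
    speedup = sieve_speedup(0.33, 0.05)
    print(round(speedup, 1))   # ratio of remaining candidates
    print(round(80 / speedup)) # rough projected seconds for the 80 s run above
```

With these inputs the ratio comes out around 6.6, which would bring the 80 second run down to roughly 12 seconds. In practice the gain would be smaller, because the siever's own runtime is not free.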
There is one known bug (introduced in the current version): if 2 (or more) factors are found in the same dataset, often only one factor is returned.
It is even possible that the returned factor is wrong (a mix-up of both factors) in this case. I haven't noticed a mix-up so far, but I'm sure it can occur. :(
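One common way to avoid losing results when several factors turn up in the same dataset is to give every finder its own result slot, reserved through a shared counter. The sketch below only simulates the idea sequentially in Python; it is an assumption about how such a buffer could look, not the current code and not a confirmed diagnosis of the bug. On a real device the read-and-increment of the counter would have to be a single atomic operation (CUDA offers atomicAdd for that). The function names and the capacity of 8 are hypothetical.

```python
def report_single_slot(found):
    """Every finder writes to the same slot, so only the last write survives."""
    slot = 0  # 0 means "no factor found"
    for factor in found:
        slot = factor
    return slot


def report_with_counter(found, capacity=8):
    """Each finder reserves its own slot through a shared counter in buf[0]."""
    buf = [0] * (1 + capacity)
    for factor in found:
        index = buf[0]   # on a GPU, this read and the increment below
        buf[0] += 1      # would need to be one atomic operation
        if index < capacity:
            buf[1 + index] = factor  # slot is private to this finder
    count = min(buf[0], capacity)
    return buf[1:1 + count]


if __name__ == "__main__":
    print(report_single_slot([23, 89]))   # one factor is lost
    print(report_with_counter([23, 89]))  # both factors are kept
```

Because each factor is written into a slot that nobody else touches, a multi-word factor also cannot end up as a mix of two different results.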
Last fiddled with by TheJudger on 2009-12-03 at 16:11
