2) The only other way we can speed things up is to use SPH with BSGS. If I write the code for the Chinese Remainder Theorem and some of the other things that the program will need, do you think we can make the program faster?
I think you understand the algorithms much better than I do, so wouldn't it make more sense for you to implement the main algorithm? I could then help speed up some of the lower-level routines if necessary. If I were to try to implement SPH, I would probably start the programming from scratch anyway, since the srsieve code has become a bit complicated.
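
To make the discussion concrete, here is a rough sketch of the BSGS step on its own, written in Python rather than C just to keep it short. It is not taken from srsieve; the function name `bsgs` and the optional `order` argument are my own assumptions. It assumes p is prime and b is not divisible by p, and it looks for the smallest n with b^n = a (mod p). An SPH version would call something like this once per prime-power factor of the group order, passing that smaller order, and then combine the results with the Chinese Remainder Theorem.

```python
from math import isqrt


def bsgs(b, a, p, order=None):
    """Return the smallest n >= 0 with pow(b, n, p) == a % p, or None.

    Illustrative sketch only. Assumes p is prime and b is not divisible
    by p. `order` is an upper bound on the order of b (defaults to p - 1).
    """
    if order is None:
        order = p - 1
    m = isqrt(order - 1) + 1  # ceil(sqrt(order)) for order >= 1

    # Baby steps: remember the smallest j for each value b^j, 0 <= j < m.
    table = {}
    x = 1
    for j in range(m):
        table.setdefault(x, j)
        x = x * b % p

    # Giant steps: multiply a by b^(-m) until we land in the table.
    factor = pow(b, -m, p)  # modular inverse power, needs Python 3.8+
    y = a % p
    for i in range(m):
        if y in table:
            return i * m + table[y]
        y = y * factor % p
    return None
```

The cost is about 2*sqrt(order) modular multiplications plus the table lookups, which is why shrinking `order` via SPH should help when p - 1 has only small factors.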

Probably the quickest way to get a boost in sieve productivity in the short term would be to figure out how to compile srsieve or jjsieve for 64-bit Windows machines.
