Thread: mtsieve enhancements View Single Post
 2020-05-27, 04:26 #23 Citrix     Jun 2003 1,571 Posts gcwsieve enhancements Code: // We're trying to determine if n*b^n (mod p) = +1/-1. This requires an two // operations, an exponentiation (b^n) and a multiplication (n). We can make a // change to eliminate one of those operations and convert the second one into addition which // should be faster. Doing this requires us to // compute the multiplicative inverse of b. The multiplicative inverse of // b (which I will call B) is the number such at b*B = 1 mod p. Here is how // the algorithm works. // // We start with our number: // n*b^n (mod p) = +1/-1 // We then generate B such that b*B=1 (mod p) // // To get then next term (which I will call N) where N>n , we need to // check the following // N=n+r // N*b^N(mod p) = +1/-1 // (n+r)*b^(n+r) (mod p) = +1/-1 // We multiply both sides by B^r // (n+r)*b^(n+r))B^r (mod p) = +B^r/-B^r // (n+r)*b^n (mod p) = +B^r/-B^r // n*b^n + r*b^n (mod p) = +B^r/-B^r // We already know n*b^n and b^n are from the previous step. // (Array 1) Now we pre-compute +-B^r for a range of r since we will be using it multiple times (1 to rmax = M*sqrt(nmax-nmin)) // (Array 2) for each n*b^n in the loop we pre-compute r*b^n for small consecutive values of r (1 to rdiffmax) – can be done by modular addition // We can calculate each successive candidate by just adding a term from Array 2 and checking it with corresponding term from Array 1. M for size of Array 1 can be adjusted for maximum efficiency depending on efficiency of the loops. // once we reach rmax then we multiply last term by b^rmax and b^n by b^rmax n*b^n + rmax*b^n (mod p) (n+rmax)* b^n (mod p) (n+rmax)* b^n*b^rmax(mod p) n'*b^n'(mod p) and also generate b^n'=b^n*b^rmax. (mod p) // We can now repeat the loop again starting at n'. Other suggestions Your current code takes into account n values left mod 2 but not for other higher primes. Code can be made faster if you only look at subsequences left for mod 6 or higher number. BestQ code from srsieve can be used here. Depending on size of subsequence being sieved we can adjust size of the loops for maximum efficiency. On a GPU if memory storage is a concern Array 1 can be reduced in size and Array 2 might not be worth saving. This will be still faster than current code.