I am posting a simpler explanation of the algorithm here - in case some can come up with a faster implementation

Code:

Take all the remaining candidates in a range Nmin to Nmax
Let b be the base and B be the multiplicative inverse such that b*B = 1 mod p.
We choose a C= sqrt(Nmax-Nmin)
We then divide all n values into C different groups
In each group G_r={n1,n2,n3...,} such that n1=n2=n3=r (mod C)
We want to sieve each of these C groups separately.
For G_r group
since n1=r (mod C) then n1=k1*C+r
since n2=r (mod C) then n2=k2*C+r
The first term in this group is n1*b^n1+-1
The second term in this group is n2*b^n2+-1
We multiply all these terms by their corresponding B^(k*C) value
The first term
n1*b^n1= +-1 (mod p)
n1*b^n1*B^(k1*C)= +-B^(k1*C) (mod p)
Since we know that n1=k1*C+r we get
n1*b^r=+-B^(k1*C) (mod p)
The second term
n2*b^n2= +-1 (mod p)
n2*b^n2*B^(k2*C)= +-B^(k2*C) (mod p)
Since we know that n2=k2*C+r we get
n2*b^r=+-B^(k2*C) (mod p)
Note that B^(k*C) values will be used in all the C groups and we can save on this work instead of repeatedly calculating it.
Also note n2*b^r - n1*b^r = (k2-k1)*C*b^r
If we know C*b^r (which we can calculate for each group) we can generate all these terms by addition only.
If n1*b^r=-B^(k1*C) (mod p) then n1*b^n1+1 is divisible by p
If n1*b^r=B^(k1*C) (mod) p then n1*b^n1-1 is divisible by p