20130118, 16:18  #1 
Oct 2012
2·41 Posts 
Quick smoothness test
I've started coding the MPQS from scratch, trying to modify my original code just didn't work.
I was wondering if there was a quicker way of testing if a number is smooth than I'm currently doing it. Here's my algorithm: 1. For each prime p in the factor base: 1.1 Calculate X mod p 1.2 While the result is 0: 1.2.1 X = X / p 1.2.2 Recalculate X mod p, update result 1.2.3 If X = 1, break out of loop I've profiled my code, 91.7% of the CPU time is spent on line 1.1, the division on line 1.2.1 accounts for only 0.2% of the CPU time. Are there any clever ways of doing this? How could I improve this algorithm? 
20130118, 16:56  #2 
"Ben"
Feb 2007
D4A_{16} Posts 
What is X? The full precision (ax + b)^2  N value?

20130118, 17:47  #3 
Oct 2012
2·41 Posts 

20130118, 18:13  #4 
"Ben"
Feb 2007
2×3^{5}×7 Posts 
One approach to make this step a lot faster is the following:
When you are done sieving, carry the final sieve step of each prime into the trial division stage. I.e., if the blocksize is B and prime p exits the sieve interval at B+k, take the value B+k into the trial division routine. Then, rather than checking if X % p is 0, you can check if (B+k  x) % p is 0, where x is the sieve index you are currently looking at (that is used to compute X = (ax + b)^2  N). This changes the operation from a multipleprecision division involving X to a single precision division that will be a lot faster. 
20130118, 19:30  #5 
Oct 2012
2·41 Posts 
As a small example, let's say the block size is 10, with the elements running 0  9. Starting at index 2, each value is divisible by 3, so the prime p exits the array at element 11.
B+k = 11 Then say I'm checking to see if the 5th element is divisible by 3 (which it should be!), I calculate (11  5) % 3 = 6 % 3 = 0. The 7th element shouldn't be, just to check: (11  7) % 3 = 4 % 3 = 1 Have I understood correctly, or just got lucky with the numbers I picked? 
20130118, 19:54  #6 
"Ben"
Feb 2007
2·3^{5}·7 Posts 
You got it.
Also handy is the fact that k is the starting offset of the prime in the next sieve block. So once you are done sieving and trial dividing block B1, subtract B from all of the values in your (B+k) array and the results are where you start sieving block B2. Last fiddled with by bsquared on 20130118 at 19:57 
20130118, 20:41  #7 
Oct 2012
2·41 Posts 
How much speed is to be gained from splitting into blocks of 32kb? In my old implementation larger arrays tended to be faster than smaller arrays, but my old implementation was pretty slow anyway.

20130118, 21:22  #8 
"Ben"
Feb 2007
6512_{8} Posts 
Difficult to say  it will be very implementation dependent. However, if you are frequently reading/modifying/writing a small data of data (i.e., a sieve block), it generally pays to keep that information close to the processor. You can't get much closer than L1 cache, and that's why 32kB is a popular choice for the size of the sieve block.

20130118, 22:36  #9 
Oct 2012
2·41 Posts 
I was thinking, I have two arrays (soln1 and soln2), which hold the start index for the sieving, instead, could I just compute (x  soln1) % p, where x is the sieve array index, and if the result is 0, do the division?
Is creating the blocks as simple as calculating 32768/sizeof(size_t), and using that many per block? 
20130118, 22:43  #10 
If I May
"Chris Halsall"
Sep 2002
Barbados
2^{4}·5·7·17 Posts 

20130118, 23:27  #11 
Oct 2012
1010010_{2} Posts 

Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Quick primality test?  guptadeva  Miscellaneous Math  12  20171228 00:09 
Inverse of Smoothness Probability  paul0  Math  6  20170725 16:41 
A quick question  Pegos  Information & Answers  6  20160811 14:39 
Quick & Dirty  storm5510  Programming  37  20090908 06:52 
Smoothness test  Washuu  Math  12  20050627 12:19 