mersenneforum.org  

Go Back   mersenneforum.org > Math Stuff > Computer Science & Computational Number Theory

Reply
 
Thread Tools
Old 2009-07-25, 04:11   #1
SPWorley
 
Jul 2009

31 Posts
Default Fast isPrime() for n < 2^32

I needed an isPrime() test for generating hash table sizes. If you google for efficient prime testing code, you usually end up at Chris Caldwell's great page.

I ended up making my own test that's pretty efficient for n<2^32. Instead of using 3 or 4 SPRP tests, I use only one test, but I choose the base of the test based on a hashed value of N itself. A 1KB pregenerated lookup table then gives the base needed for the single SPRP test to classify the primality any number < 2^32 properly.

Simple code is below. It can definitely be tweaked based on your CPU, language, or compiler, especially how many test divides you want to prescreen with.

I could probably speed this up slightly by replacing the test divides with an Euler GCD and then Bloom filter, but that's likely only a few percentage points speed boost. Of course I haven't tried it yet.

Code:
/*
 * Copyright 2009 Steve Worley < m a t h g e e k  @(my last name).com >
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL
 * WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED
 * WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE
 * AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR
 * CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS
 * OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT,
 * NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
 * CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */

/* isPrime() by Steve Worley

   Efficiently computes whether a 32 bit integer is prime or not.
   Strategy: Do first screening with a few trial divisions to identify
   easy composites. Next use a strong-probable-prime test. Normally
   several tests would be necessary but this code hashes the number
   into a bin which has a precomputed table listing the SPRP test base
   which is sufficient for the numbers in that bin.  Thus, you can
   prove (guaranteed 100%) whether your number is prime or not with a
   single SPRP test. More discussion at
   www.worley.com/mathgeek/isprime.html

This algorithm is deliberately unpatented. The BSD license above also
lets you use it even in commercial code.
*/

static const unsigned char
SPRPlookup[1024]={5,6,11,15,76,15,11,26,10,35,29,13,
22,6,14,3,21,24,30,13,6,11,12,6,11,11,5,11,19,3,13,
14,3,6,17,26,6,13,13,13,2,6,5,2,11,15,7,3,10,37,43,
18,33,5,37,13,5,3,37,14,26,5,6,15,19,45,11,24,2,6,
23,2,11,10,2,15,5,5,15,15,15,55,5,92,10,5,18,10,2,
66,22,21,2,42,13,5,17,3,2,6,2,3,2,13,3,14,14,42,6,7,
39,6,15,3,24,5,3,11,14,22,11,5,26,22,14,2,28,6,19,7,
7,10,2,45,40,11,31,2,11,17,33,12,23,2,14,10,13,15,
13,6,3,3,3,38,7,23,18,7,6,10,10,5,11,5,6,2,2,11,6,
12,21,5,43,6,11,15,21,55,38,13,2,2,17,10,13,13,13,7,
10,5,13,3,29,14,11,23,18,5,2,7,2,14,10,2,2,20,10,7,
18,7,6,5,3,10,15,57,18,10,26,12,13,6,17,6,3,3,10,13,
15,44,35,30,26,13,2,13,2,31,37,5,10,31,2,15,5,7,17,
33,17,2,21,23,7,5,17,6,40,17,11,56,30,30,10,3,15,11,
2,15,3,5,2,2,17,22,2,10,2,7,12,13,5,22,5,20,17,7,6,
18,18,6,19,6,5,21,24,2,10,2,2,11,19,42,2,5,17,7,2,42,
7,41,5,5,23,15,11,2,34,10,24,11,12,14,11,2,15,30,28,
19,6,10,12,5,6,45,7,13,11,12,15,13,11,13,2,23,2,33,
15,15,11,10,24,26,13,31,13,7,11,23,5,12,17,14,7,29,
29,15,22,19,14,11,15,19,2,15,2,28,13,20,14,15,60,5,
2,12,69,24,20,7,18,2,13,5,6,10,14,6,24,3,11,23,5,13,
88,13,14,19,3,5,3,21,2,11,6,31,42,18,73,2,7,3,2,3,23,
11,6,6,5,6,14,11,2,11,7,10,7,38,39,5,15,2,19,5,7,22,
13,7,11,24,10,20,14,3,2,10,15,3,13,15,29,11,10,3,6,5,
30,13,26,6,6,17,29,10,19,15,19,21,46,6,17,3,14,3,2,6,
17,26,34,2,11,5,29,19,3,2,19,7,11,14,29,15,7,33,14,21,
12,7,17,15,5,5,7,14,35,21,15,11,15,3,18,3,11,17,7,21,
20,3,15,11,10,3,24,47,10,5,10,5,3,42,6,2,10,21,3,2,2,
5,14,7,3,2,14,15,11,5,12,13,15,15,5,11,10,6,5,7,6,19,
5,2,6,15,5,6,3,2,24,10,10,11,14,11,7,13,3,7,3,18,20,
30,33,17,30,17,3,35,5,5,17,46,6,26,6,20,40,5,10,12,7,
17,5,12,2,6,5,10,5,7,7,7,3,17,11,10,34,10,15,2,6,3,11,
23,13,6,2,18,14,10,7,6,10,7,10,77,6,17,10,5,5,18,17,5,
15,2,11,3,10,13,28,21,20,23,6,7,14,11,3,6,18,5,5,10,6,
23,3,3,2,6,2,15,6,10,5,15,21,7,6,3,7,12,5,3,13,17,20,
21,15,7,2,14,40,20,5,11,7,7,10,10,31,11,13,21,43,3,14,
22,10,22,23,2,10,10,3,7,3,20,7,5,15,7,10,2,30,52,10,
56,14,14,2,2,3,7,5,5,10,3,10,7,15,13,3,89,21,11,5,14,
13,13,11,40,23,21,6,3,20,14,24,6,2,28,6,5,5,6,5,20,7,
13,29,26,31,5,11,3,24,5,7,5,29,38,3,19,13,6,2,22,10,7,
22,5,29,7,11,6,3,2,37,7,7,6,19,12,2,91,7,2,6,56,52,7,
11,10,31,2,6,11,78,34,15,2,53,14,22,31,22,41,2,23,6,
39,6,31,15,15,10,7,30,2,19,6,43,11,38,6,6,3,26,7,10,
11,5,2,11,5,22,22,3,15,2,5,6,24,10,17,7,10,11,2,23,13,
5,11,11,6,21,18,3,21,10,21,2,11,5,7,44,43,31,10,13,3,
2,10,3,7,2,47,13,12,2,13,5,11,2,5,7,11,19,5,10,17,22,
26,2,2,7,2,2,6,6,3,6,22,15,13,57,13,12,2,20,3,5,13,3,
13,15,6,15,2,21,11,10,11,15,2,22,23,10,6,44,7,2,10,5,
31,6,2,5,22,15,5,17,11,2,6,6,13,23,6,5,11,7,28,29,22,
2,7,11,6,35,24,2,3,2,17,13,3,6,2,14,5,11,11,7,23,13,11,
2,13,13,10,12,23,10,18,5,60,10,5,26,5,29,3,20,7,21,47};


/* Computes a^n mod m. Classic binary power ladder evaluation. 
It may be possible to optimize this based on your architecture, depending especially on its integer % speed, and/or support for native 64 bit ints. */
unsigned long powmod(unsigned long a, unsigned long n, unsigned long m)
{
  unsigned long long result=n&1 ? a : 1;
  unsigned long long aa=a; 
  n=n>>1;

  while (n) {
    // note 64 bit math used for intermediate answers before mod.
    aa=(aa*aa)%m; 
    if (n&1) result=(aa*result)%m;
    n=n>>1;
  }
  return (unsigned long)result;
}

static int isSPRP(unsigned long n, unsigned long a)
{
  unsigned long d=n-1;  
  unsigned long ad;
  long s=0;

  // break down n-1 into d*(2^s). Linux ffs() can be better
  while (0==(d&1)) {
    ++s; d>>=1;
  }

  ad=(powmod(a, d, n)); // (a^d) mod n
  
  if (ad==1) return 1; // 1 == a^d mod n 
  if (s>0 && ad==(n-1)) return 1; // -1 == a^d mod n (tests (a^d)^(2^0))
  for (unsigned long r=1; r<s; ++r) {
    // ad is currently ad^(2^(r-1)) so we square it to get ad^(2^r));
    ad=(((unsigned long long)ad)*ad)%n;
    if (ad==(n-1)) return 1; //tests (a^d)^(2^(r+1)))
  }    
  return 0; // false, composite
}

int isPrime(unsigned long v)
{
  /* this initial screening test is OPTIONAL, and on many platforms will
  give faster rejection of most composites without a full SPRP test. */
  if (0==v%2) return v==2;
  if (0==v%3) return v==3;
  if (0==v%5) return v==5;
  if (0==v%7) return v==7;
  if (v<121) return true;
  
  // Hash the value to find the lookup table index for a single SPRP test.
  return isSPRP(v, (unsigned long)SPRPlookup[(0x3AC69A35*v)>>22]);  
}

#include <stdio.h>
int main(int argc, char **argv)
{
  unsigned long i;

  for (i=2; i<300; ++i) 
    if (isPrime(i)) printf("%ld is prime.\n", i);
  return 0;
}
SPWorley is offline   Reply With Quote
Old 2011-04-16, 20:25   #2
zerothbase
 
Mar 2011

2×5 Posts
Default

Attached is a code to extend this principle to 1e15 (using William Galway's list) with only 2 miller-rabin tests.

It uses a SPRP-2, and a hash function to get a value from a table of 2048 "magic" values. Just for fun, the hash uses a very pi-like prime and a HHGTTG-shift.

Since it uses 64-bit numbers, powmod is required to get into 128 bit math for each multiplication to use the basic Right-to-left Binary Method. Thus I have two powmod methods: one that uses GMP, and one that uses a mulmod to do the multiplications. The code, as written, just makes use of the GMP method (since it far faster than using powmod2/mulmod) and the powmod2 is simply ignored. This is definitely the weakest part of the entire code. Anyone have a faster way to do 64-bit powmod in 64-bits with basic C code? I know the entire code would be sped up significantly if I just kept all the numbers in GMP/mpz_t's, but I was trying to isolate GMP as much as possible so it could be removed if I found a short/fast 64-bit powmod.

The code translates directly to Java as well (using BigInteger for powmod and long for ullint). The hash results are the same in Java as in C (even though Java uses signed 64-bit longs).

--Jim Sinclair
Attached Files
File Type: txt isprime1e15.c.txt (14.1 KB, 396 views)
zerothbase is offline   Reply With Quote
Old 2011-04-16, 21:01   #3
CRGreathouse
 
CRGreathouse's Avatar
 
Aug 2006

133468 Posts
Default

Have you thought of extending this to 2^64 with Feitsma's data?
http://gilchrist.ca/jeff/factoring/pseudoprimes.html
CRGreathouse is offline   Reply With Quote
Old 2011-04-16, 22:02   #4
WraithX
 
WraithX's Avatar
 
Mar 2006

23·59 Posts
Default

Quote:
Originally Posted by zerothbase View Post
Attached is a code to extend this principle to 1e15 (using William Galway's list) with only 2 miller-rabin tests.

It uses a SPRP-2, and a hash function to get a value from a table of 2048 "magic" values. Just for fun, the hash uses a very pi-like prime and a HHGTTG-shift.

Anyone have a faster way to do 64-bit powmod in 64-bits with basic C code? I know the entire code would be sped up significantly if I just kept all the numbers in GMP/mpz_t's, but I was trying to isolate GMP as much as possible so it could be removed if I found a short/fast 64-bit powmod.
--Jim Sinclair
I am very curious how you found the multiplier and the list of 2048 magic values. Can you share? I was working on Feitsma's database to try and create a 2-Miller-Rabin test just like you have created. But, I stuck with primes, and only the first couple of hundred at that.

Here is the code that I use to do mulmod64 and powmod64. I haven't actually tested it against a GMP implementation, but I think it should be faster since it just uses the native machine type. As long as you have a 64-bit C compiler that supports inline asm, this should work just fine for you.

Code:
u64_t mulmod64(u64_t a, u64_t b, u64_t c)
{
  u64_t d; /* to hold the result of a*b mod c */

  /* calculates a*b mod c, stores result in d */
  asm ("mov %1, %%rax;"   /* put a into rax */
       "mul %2;"          /* mul a*b -> rdx:rax */
       "div %3;"          /* (a*b)/c -> quot in rax remainder in rdx */
       "mov %%rdx, %0;"  /* store result in d */
       :"=r"(d)                 /* output */
       :"r"(a), "r"(b), "r"(c)  /* input */
       :"%rax", "%rdx"          /* clobbered registers */
      );
  return d;
}/* method mulmod64 */

/*================================================================================*/

/* powmod64 from Geoff Reynolds code at           */
/* http://www.geocities.com/g_w_reynolds/srsieve/ */
u64_t powmod64(u64_t b, u64_t n, u64_t p)
{
  u64_t a=1;
  goto tst;

 mul:
  a = mulmod64(a,b,p);

 sqr:
  b = mulmod64(b,b,p);
  n >>= 1;

 tst:
  if (n & 1)
    goto mul;
  if (n > 0)
    goto sqr;

  return a;
}/* method powmod64 */
WraithX is offline   Reply With Quote
Old 2011-04-17, 00:31   #5
zerothbase
 
Mar 2011

128 Posts
Default

Quote:
Originally Posted by CRGreathouse
Have you thought of extending this to 2^64 with Feitsma's data?
Ya, but I haven't seen a straight download link for it the whole thing. Am I missing it? Has Feitsma completed it yet? The statistics page shows red (incomplete) for the >10^17. After having looked at 1e15, I suspect that I'd need more miller-rabin tests though. Perhaps 4 or 5 bases? I'd have to play with it to know for sure.

Quote:
Originally Posted by WraithX
how you found the multiplier
Just made it up. I just took a few digits of Pi and found the next prime number after it (turned out to be the next number after 3141592653589793238 lol). Basically you can use pretty much any large prime number; it's related to Pseudorandom LCG. My choice doesn't follow all the advice on a perfect LCG, but it distributes the numbers well enough.

Quote:
Originally Posted by WraithX
and the list of 2048 magic values
Basically we know that 2 is a good enough witness for every number <1e15 other than those on the list of 419489 SPRPs.

First step was to create the hash function (just using whatever) and verify that it distributes the 419489 SPRP-2's fairly evenly regardless of what i use for bucket size (it does).

Step 2, I started with 64 buckets, and ran all the numbers in bucket 1 (a random bucket) through using 3, 4, 5... up to about a million as the second witness before I killed it. For each witness I kept track of the number of SPRPs that failed. The count of SPRPs that failed varied a lot but seemed to have a bell curve of some kind. With 64 buckets there was no witness less than 1 million that "solved" bucket 1, or got anywhere close. There were usually hundreds left in bucket 1 unsolved. So I needed to break the buckets down into smaller buckets.

Step 3, double the number of buckets (128...256...) and repeat step 2 until I could reasonably expect to get a solved bucket for every bucket. That is, when the average number of liars for a given witness was something like 5 to 10. With 1024 buckets it is close, but 2048 was guaranteed. If I played more with the hashing function and such, I could probably narrow it down, but I was being lazy and did this yesterday afternoon just messing around.

Step 4, for all 2048 buckets of SPRPs, find the least number that "solved" the bucket (0 liars left). This is your "magic" value for each bucket.

Step 5 verify that the code gets every answer correct up to the highest witness in the 2048 list (which is 1,095,558). The reason for this is that there is a small chance that a random number could hash to a value that is equal to itself or a multiple of itself. For instance, 1219 hashed to a bucket with 1219 in it; this would obviously fail. There were about 5 of these values that had to be replaced out of the 2048. I just did step 4 again and used a different value for that bucket (and reverified).

After this, we are good. We know that miller-rabin always returns the answer "prime" when a number is prime (assuming the number isn't tested with itself or a multiple). So all prime numbers <1e15 will return correctly. We know the composite SPRP-2 liars, and I verified them all with the hashed 2048 magic numbers.

Quote:
Originally Posted by WraithX
u64_t mulmod64(u64_t a, u64_t b, u64_t c)
I love it - works great. Probably just my environment, but I seem to get random floating point exceptions when I try to use the method generically for 33-bit to 64-bit numbers. I get them more frequently when I use numbers closer to 64-bit (on an exponential curve). So for numbers <1e15, I haven't seen an exception yet. The function is, of course blazing fast (about 30x faster than GMP in my env). I'm using Linux Mint 10 in Virtual Box 4.0.4.

I'd still like to see a C-only implementation of mulmod though for maximum portability. That is, without basically building an entire MP-library in place or using an Intel/AMD specific instruction. I guess I just suspect that there must be a number-theoretic way of doing this easily. I've tried reading some of web info on Montgomery reduction (don't know if that would even help), but it was beyond my Friday afternoon skills for number theory. Perhaps another day.
zerothbase is offline   Reply With Quote
Old 2011-04-17, 02:37   #6
CRGreathouse
 
CRGreathouse's Avatar
 
Aug 2006

133468 Posts
Default

Quote:
Originally Posted by zerothbase View Post
Ya, but I haven't seen a straight download link for it the whole thing. Am I missing it? Has Feitsma completed it yet? The statistics page shows red (incomplete) for the >10^17. After having looked at 1e15, I suspect that I'd need more miller-rabin tests though. Perhaps 4 or 5 bases? I'd have to play with it to know for sure.
Yes, I think Will Galway has the full list on his site -- presumably Jan doesn't have the bandwidth. I downloaded it some months ago. (If you can't otherwise locate it, I suppose I could get it to you, but it's a *lot* of data and if you can get it from Galway so much the better.)
CRGreathouse is offline   Reply With Quote
Old 2011-04-17, 02:59   #7
zerothbase
 
Mar 2011

2·5 Posts
Default

Found it (with some creative Googling) and downloading now. thanks! I'll take a look and play with it a little bit.
zerothbase is offline   Reply With Quote
Old 2011-04-19, 23:00   #8
zerothbase
 
Mar 2011

1010 Posts
Default Fast isPrime for n < 2^64

Fast isPrime for n < 2^64 attached.

Three static SPRP/Miller-Rabin tests w/ a hash to find the 4th test.

It seemed a little too easy to find this. I think with more computation time I could find a way to reduce either the number of tests to 3 total or the size of the hash (just a hunch). But it's a good starting point to get a 100% guaranteed test for 64-bit numbers (assuming the completeness of Feitsma's data).

Note: mulmod is Intel 64-bit assembly. Substitute as needed for cross platform/architecture.

Thanks WraithX for the mulmod implementation, and SPWorley for the inspiration.
Attached Files
File Type: txt isPrime64Bit.c.txt (11.8 KB, 411 views)
zerothbase is offline   Reply With Quote
Old 2011-04-20, 01:04   #9
CRGreathouse
 
CRGreathouse's Avatar
 
Aug 2006

2×3×977 Posts
Default

I would really like to see a version that reduces the tests to 3 -- or perhaps just 2 plus some trial division? The latter would need a really big hash...
CRGreathouse is offline   Reply With Quote
Old 2011-04-20, 05:18   #10
geoff
 
geoff's Avatar
 
Mar 2003
New Zealand

13·89 Posts
Default

Here is some SPRP code I put together for the AP26 project. You can find the full source at http://sites.google.com/site/geoffre.../programs/ap26, look in the PrimeQ_x86.h header.

The mulmod function only works up to 2^62 (or 2^52 on some machines), but it is easy to replace with another one that works up to 2^64 (There are much faster methods than the mul/div asm version). If you are doing multiple SPRP tests on the same number then it would probably be worthwhile writing a vector version of the function that tests all bases in parallel.

Code:
/* Returns r = a*b (mod p) in the range 0 <= r < p.
   Assumes 0 <= a,b < p < 2^LDBL_MULMOD_LIMIT.
   Assumes inv = 1.0L/p.
*/
static int64_t mulmod(int64_t a, int64_t b, int64_t p, long double inv)
{
  int64_t q, r;
  long double x, y;

  /* ab (mod p) = ab-pq, where q = floor(ab/p) */

  x = a;
  y = b;
  q = x*y*inv;
  r = a*b - q*p;

  /* Because the default FPU rounding mode is round-to-nearest the result
     could be out by +/-p. One of these branches could be eliminated by
     using a non-default rounding mode. (Round-to-zero is ideal I think).
  */
  if (r < 0)
    r += p;
  else if (r >= p)
    r -= p;

  return r;
}


/* Returns 0 only if N is composite.
   Otherwise N is a strong probable prime to base a.
   Assumes N odd, a < N < 2^LDBL_MULMOD_LIMIT.
 */
static int strong_prp(int64_t a, int64_t N)
{
  long double inv;
  int64_t r;
  uint64_t d;
  unsigned int s, t;

  /* If N is prime and N = d*2^t+1, where d is odd, then either
     1.  a^d = 1 (mod N), or
     2.  a^(d*2^s) = -1 (mod N) for some s in 0 <= s < t */

  inv = 1.0L/N;

#ifdef __GNUC__
  t = __builtin_ctzll(N-1);
  d = (uint64_t)N >> t;
#else
  for (d = (uint64_t)N >> 1, t = 1; !(d & 1); d >>= 1, t++)
    ;
#endif

  /* r <-- a^d mod N, assuming d odd */
  for (r = a, d >>= 1; d > 0; d >>= 1)
  {
    a = mulmod(a,a,N,inv);
    if (d & 1)
      r = mulmod(r,a,N,inv);
  }

  /* Clause 1. and s = 0 case for clause 2. */
  if (r == 1 || r == N-1)
    return 1;

  /* 0 < s < t cases for clause 2. */
  for (s = 1; s < t; s++)
    if ((r = mulmod(r,r,N,inv)) == N-1)
      return 1;

  return 0;
}

Last fiddled with by geoff on 2011-04-20 at 05:21 Reason: fix url
geoff is offline   Reply With Quote
Old 2011-04-20, 14:06   #11
R.D. Silverman
 
R.D. Silverman's Avatar
 
Nov 2003

2×3×17×73 Posts
Default

Quote:
Originally Posted by CRGreathouse View Post
I would really like to see a version that reduces the tests to 3 -- or perhaps just 2 plus some trial division? The latter would need a really big hash...
You really should consider just a SINGLE MR test followed by either a
Lucas test or a Frobenius-Grantham. There is no known pseudoprime
for a single MR followed by a single Lucas test. (Preda Mihailescu
and I looked for one without success).

If the objective is proved primes, one can just trial divide N-1 and N+1
to N^(1/6), and apply the Selfridge-Lehmer-Brillhart test that allows
one to prove primality if enough factors of N-1 and N+1 are known.
R.D. Silverman is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Do normal adults give themselves an allowance? (...to fast or not to fast - there is no question!) jasong jasong 35 2016-12-11 00:57
How to Check if Non-Mersenne Number isPrime? FloatingPoint Operation Billion Digits 39 2015-10-21 02:15
How fast is the dog? Andi47 Puzzles 20 2009-04-01 02:35
I wonder how fast this is... ixfd64 Hardware 1 2005-11-21 21:28
Fast way to square??? maheshexp Math 2 2004-05-29 01:54

All times are UTC. The time now is 03:39.

Sat Jul 4 03:39:28 UTC 2020 up 101 days, 1:12, 1 user, load averages: 1.25, 1.18, 1.24

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.