mersenneforum.org SQUFOF implementation
 User Name Remember Me? Password
 Register FAQ Search Today's Posts Mark Forums Read

 2010-04-11, 01:45 #2 bsquared     "Ben" Feb 2007 13·257 Posts Are you using multipliers? I didn't see that you were, but maybe I missed where they were implemented or maybe they are already in N when this routine is called. Racing several small multipliers (by keeping a small amount of state memory for each multipler and running smallish batches of iterations on each) is a good optimization. Msieve and yafu both have implementations which do this. The step 4 optimization I believe is called fast return. Search for papers by Steven McMath on this subject - he researched it extensively and also published some code, I think... yes, it's called "Parallel integer factorization using quadratic forms". I never got around to implementing fast return, so if you are successful I'd be very interested in seeing it.
 2010-04-11, 02:01 #3 alpertron     Aug 2002 Buenos Aires, Argentina 31×43 Posts I have not implemented multipliers in the code. First I wanted to be sure that the code worked. There were two errors in the algorithm (section 3.3 of the Gower and Wagstaff paper) that I corrected in the implementation. From what I read in http://hal.inria.fr/docs/00/45/16/04...draft_02_1.pdf it appears that the "fast return" is really faster when factoring numbers greater than 20 digits, but I think that in this case SIQS will be faster.
 2010-04-11, 11:57 #4 jasonp Tribal Bullet     Oct 2004 1101101000012 Posts What corrections did you have to make? It's been a long time, but the SQUFOF code in msieve was taken directly from Gower's description, and I didn't understand enough of the underlying math to make corrections. Also, it was quite helpful in C to handle a special case of q=1 before doing the divisions at the beginning of the even and odd cycles. It probably wouldn't help in Java though. Last fiddled with by jasonp on 2010-04-11 at 11:59
2010-04-11, 12:16   #5
R.D. Silverman

Nov 2003

26×113 Posts

Quote:

Code:
squfof(n,fac1,fac2)
int n[];
int *fac1, *fac2;

{   /* start of squfof: factor n as fac1*fac2  faster in FP?????*/

int temp[MPDIM], temp1[MPDIM];
register int iq,ll,l2,p,pnext,q,qlast,r,s,t,i;
int jter,iter;

if (SQUFOF) squfof_calls++;
#define   DEBUG_QSC   0
if (DEBUG_QSC) { (void) printf("trying qs on: "); write_num(n,numstr); fflush(stdout); }
jter = small_qs(n,fac1,fac2);
if (DEBUG_QSC)
{
(void) printf("qs returns %d\n",jter);
(void) printf("tried qs on "); write_num(n, numstr);
(void) printf("fac1, fac2 = %d %d, squfof returning %d\n", *fac1, *fac2, jter);
fflush(stdout);
}         /* -1 should be very rare; QS found no non-trivial GCD's   */
if (jter != -1) return(jter);

/*   Notes: could replace calls to the MP routines [on Pentium] by using*/
/*  __int64  for temp,temp1, convert n to __int64 etc. But the MP stuff */
/*  is not inside a loop. It is not worth doing                         */

qlast = 1;
mpsqrt(n,temp);

s = LAST(temp);
p = s;
mult(temp,temp,temp1);
subt(n,temp1,temp);                    /* temp = n - floor(sqrt(n))^2   */
if (SIZE(temp) <= SINGLE && LAST(temp) == 0)
{                                   /* Here n is a square            */
*fac1 = s;
*fac2 = s;
return(1);
}

q = LAST(temp);              /* q = excess of n over next smaller square */
ll = 1 + 2*(int)sqrt((double)(p+p));
l2 = ll/2;
qpoint = 0;

/*   In the loop below, we need to check if q is a square right before   */
/*  the end of the loop.  Is there a faster way? The current way is      */
/*   EXPENSIVE! (many branches and double prec sqrt)                     */

for (jter=0; jter < 800000; jter++)      /* I see no way to speed this   */
{                                     /*  main loop                   */
iq = (s + p)/q;
pnext = iq*q - p;
if (q <= ll)
{
if ((q & 1) == 0) enqu(q/2,&jter);
else if (q <= l2) enqu(q,&jter);
if (jter < 0)
{                        /* queue overflow: try pollard rho   */
if (Q_OVERFLOW) overflow_count++;    /* should be rare         */
return(pollardrho2(n,fac1,fac2));   /* also try squfof(3N)?   */
}
}
t = qlast + iq*(p - pnext);
qlast = q;
q = t;
p = pnext;                          /* check for square; even iter   */
if (jter & 1) continue;             /* jter is odd:omit square test  */
//   if (t = (q & 3) > 1) continue;    /* skip t = 2,3 mod 4            */
//   if (q & 7 == 5) continue;         /* skip q = 5 mod 8              */
//   if (t == 0 && (q & 15) > 5) continue;   /* skip 8, 12 mod 16       */
//   if (s3465[q % 3465] == 'n') continue;   /* skip QNR's mod 3465     */
r = (int)sqrt((double)q);                 /* r = floor(sqrt(q))      */
if (q != r*r) continue;
if (qpoint == 0) goto gotit;
for (i=0; i<qpoint-1; i+=2)      /* treat queue as list for simplicity*/
{
if (r == qqueue[i]) goto contin;
if (r == qqueue[i+1]) goto contin;
}
if (r == qqueue[qpoint-1]) continue;
goto gotit;
contin:;
}   /* end of main loop */

void sqfprep()
{
int i;
for (i=0; i<3465; i++) s3465[i] = 'n';
for (i=0; i<1733; i++) s3465[(i*i) % 3465] = 'r';
}

void enqu(q,iter)
int q;
int *iter;
{
qqueue[qpoint] = q;
if (++qpoint > 50) *iter = -1;
}

2010-04-11, 13:40   #6
alpertron

Aug 2002
Buenos Aires, Argentina

31·43 Posts

Quote:
 Originally Posted by jasonp What corrections did you have to make? It's been a long time, but the SQUFOF code in msieve was taken directly from Gower's description, and I didn't understand enough of the underlying math to make corrections. Also, it was quite helpful in C to handle a special case of q=1 before doing the divisions at the beginning of the even and odd cycles. It probably wouldn't help in Java though.
In Step 3 the paper says: $Q \leftarrow (N-P*P)/Q'$ , but it must be $Q \leftarrow (D-P*P)/Q'$

In Step 4a the paper says Set $q \leftarrow \lfloor (S+P)/Q'\rfloor$ but it must be $q \leftarrow \lfloor (S+P)/Q\rfloor$

The last one should be easy to catch because even the example provided does not work using the step 4a as written on the paper.

Last fiddled with by alpertron on 2010-04-11 at 14:39

 2010-04-11, 14:37 #7 alpertron     Aug 2002 Buenos Aires, Argentina 31·43 Posts Acording to the paper, the implementation of the queue as a list as written by Bob above is slower than the queue because some proper square forms could be skipped.
 2010-04-11, 16:49 #8 bsquared     "Ben" Feb 2007 13×257 Posts comparisons I went ahead and did some comparisons. Bob, I had to port your code so that it works outside your lattice siever, in my test routine. If anything it should be faster, since it uses native 64 bit integers rather than your multiple precision routines, when necessary. I also got rid of the calls to pollard rho, and your mini-qs, so that the comparison is between squfof routines only. Here is your modified code: Code: int qqueue[100]; int qpoint; void enqu(int q,int *iter) { qqueue[qpoint] = q; if (++qpoint > 100) *iter = -1; } int squfof_rds(int64 n,int *fac1,int *fac2) { /* start of squfof: factor n as fac1*fac2 faster in FP?????*/ int64 temp, temp1; register int iq,ll,l2,p,pnext,q,qlast,r,s,t,i; int jter,iter; qlast = 1; s = (int)sqrt(n); p = s; temp1 = s * s; temp = n - temp1; /* temp = n - floor(sqrt(n))^2 */ if (temp == 0) { /* Here n is a square */ *fac1 = s; *fac2 = s; return(1); } q = (int)temp; /* q = excess of n over next smaller square */ ll = 1 + 2*(int)sqrt((double)(p+p)); l2 = ll/2; qpoint = 0; /* In the loop below, we need to check if q is a square right before */ /* the end of the loop. Is there a faster way? The current way is */ /* EXPENSIVE! (many branches and double prec sqrt) */ for (jter=0; jter < 800000; jter++) /* I see no way to speed this */ { /* main loop */ iq = (s + p)/q; pnext = iq*q - p; if (q <= ll) { if ((q & 1) == 0) enqu(q/2,&jter); else if (q <= l2) enqu(q,&jter); if (jter < 0) { return 0; } } t = qlast + iq*(p - pnext); qlast = q; q = t; p = pnext; /* check for square; even iter */ if (jter & 1) continue; /* jter is odd:omit square test */ r = (int)sqrt((double)q); /* r = floor(sqrt(q)) */ if (q != r*r) continue; if (qpoint == 0) goto gotit; for (i=0; i
 2010-04-11, 17:12 #9 jasonp Tribal Bullet     Oct 2004 3·1,163 Posts Actually the pseudocode in Gower's thesis did not look familiar, I probably used McMath's papers as pointed to in the SQUFOF wiki page.
 2010-04-12, 02:14 #10 alpertron     Aug 2002 Buenos Aires, Argentina 31×43 Posts By unrolling both loops so there are two copies for each, the code is now 15% faster, according to my measurements.
2010-04-12, 15:22   #11
R.D. Silverman

Nov 2003

11100010000002 Posts

Quote:
 Originally Posted by bsquared I went ahead and did some comparisons. Bob, I had to port your code so that it works outside your lattice siever, in my test routine. If anything it should be faster, since it uses native 64 bit integers rather than your multiple precision routines, when necessary. I also got rid of the calls to pollard rho, and your mini-qs, so that the comparison is between squfof routines only. Here is your modified code: Code: int qqueue[100]; int qpoint; void enqu(int q,int *iter) { qqueue[qpoint] = q; if (++qpoint > 100) *iter = -1; } int squfof_rds(int64 n,int *fac1,int *fac2) { /* start of squfof: factor n as fac1*fac2 faster in FP?????*/ int64 temp, temp1; register int iq,ll,l2,p,pnext,q,qlast,r,s,t,i; int jter,iter; qlast = 1; s = (int)sqrt(n); p = s; temp1 = s * s; temp = n - temp1; /* temp = n - floor(sqrt(n))^2 */ if (temp == 0) { /* Here n is a square */ *fac1 = s; *fac2 = s; return(1); } q = (int)temp; /* q = excess of n over next smaller square */ ll = 1 + 2*(int)sqrt((double)(p+p)); l2 = ll/2; qpoint = 0; /* In the loop below, we need to check if q is a square right before */ /* the end of the loop. Is there a faster way? The current way is */ /* EXPENSIVE! (many branches and double prec sqrt) */ for (jter=0; jter < 800000; jter++) /* I see no way to speed this */ { /* main loop */ iq = (s + p)/q; pnext = iq*q - p; if (q <= ll) { if ((q & 1) == 0) enqu(q/2,&jter); else if (q <= l2) enqu(q,&jter); if (jter < 0) { return 0; } } t = qlast + iq*(p - pnext); qlast = q; q = t; p = pnext; /* check for square; even iter */ if (jter & 1) continue; /* jter is odd:omit square test */ r = (int)sqrt((double)q); /* r = floor(sqrt(q)) */ if (q != r*r) continue; if (qpoint == 0) goto gotit; for (i=0; i
I never bothered to do more than trivial optimizations to the squfof code.

(1) NFS spends so little time inside this routine, that optimizing it will not
have much effect.

(2) MPQS is a lot faster for numbers of this size.

 Similar Threads Thread Thread Starter Forum Replies Last Post Maximus Software 9 2012-01-30 22:41 rdotson Hardware 12 2006-03-26 22:58 Leith Miscellaneous Math 4 2005-01-18 23:14 flava Programming 12 2004-10-26 03:51 Chris Card Factoring 9 2004-09-24 14:19

All times are UTC. The time now is 16:40.

Mon Nov 23 16:40:08 UTC 2020 up 74 days, 13:51, 2 users, load averages: 1.73, 1.89, 1.81