mersenneforum.org Nice Result! M971
 Register FAQ Search Today's Posts Mark Forums Read

2004-09-21, 14:07   #34
geoff

Mar 2003
New Zealand

13·89 Posts

Quote:
 Originally Posted by garo The maple program output is approximately 5% lower than the the estimates posted by geoff and 5% higher than the estimates posted by Alex post Brent-Suyama extension.
Some of the earlier numbers I used were too low because I used p=45 instead of p=45.5 in the maple program. For numbers where I can combine stage one with mprime and split stage two with gmp-ecm I have settled on these limits:

45 digits: B1=11e6, B2=4e9, reported as 1.33 standard curves.
50 digits: B1=44e6, B2=26e9, reported as 1.5 standard curves.

(I have rounded the reporting rate down to a nice integer ratio so I believe that I am being a little conservative in the work I report). These are based on an average cofactor size of about 200 digits, doing stage one on a Celeron and stage two by running two instances of gmp-ecm 5.0.3 on a hyperthreaded P4. I wouldn't be surprised if the optimal B2 was quite a bit smaller when mprime can combine four numbers into a single stage one step, especially with a larger cofactor size, but in any case I think it would be worthwhile to spend at least as long doing stage two as mprime does.

At the moment I am working on numbers from the 2+ tables where I don't get the free stage one for the associated 2- mate, so I use these 45 digit curves:

Medium (c200) cofactor: B1=11e6, B2=8e9, reported as 1.5 standard curves.
Small (c160) cofactor: B1=11e6, B2=11e9, reported as 1.66 standard curves.

In most cases the stage two time works out at 60-80% of the stage one time. (after dividing the stage one time by the number of cofactors it splits into for stage two). This is similar to what I would get if I ran both stages of the curve using gmp-ecm with default settings.

 2004-09-21, 15:13 #35 garo     Aug 2002 Termonfeckin, IE 24×173 Posts Thanks Geoff. Your post really helps. Just what I was looking for. For all practical purposes, all one needs to know is the optimal B2 (optimizing against expected time per factor found) and then how many curves to report it as. Getting into the details requires reading the Wagstaff/Silverman paper and parts of a dissertation which I will do this weekend. But that is mostly academic. Putting on an engineering hat, I wouldn't save that much more time by going into the details and computing the numbers of curves required for a given B2 with a finer precision than we already have.
2004-09-21, 16:06   #36
geoff

Mar 2003
New Zealand

13·89 Posts

Quote:
 Originally Posted by garo Getting into the details requires reading the Wagstaff/Silverman paper and parts of a dissertation which I will do this weekend.
I will be interested to hear what you work out. I don't know enough to get very far into the theory, especially with anything to do with probability where my intuition is so unreliable :-)

I have recently tried the ks-multiply patch with gmp-ecm 5.0.3 from this thread http://www.mersenneforum.org/showthread.php?t=2923, which I gather will be in the new gmp-ecm when it is released. The stage two improvement is quite impressive, especially with large B2, but the memory usage is much higher which will have a big effect on the strategy for choosing B2.

Here are a few sample curve times (Machine: P4 Northwood at 2.9GHz, OS: Debian Linux, SMP kernel 2.6, libgmp: gmp 4.1.3 configured for P4 architecture.)
Code:
                               unpatched      patched
size  B2 range    polynomial  memory time   memory time  improvement
--------------------------------------------------------------------
c151  44e6-180e9  dickson 30  150MB  706s   520MB  479s   47%
44e6-44e9   dickson 30   74MB  275s   258MB  212s   30%
c163  11e6-26e9   dickson 12   59MB  190s   212MB  146s   30%
11e6-11e9   dickson 12   38MB  108s   131MB   89s   21%
c213  11e6-26e9   dickson 12   74MB  247s   272MB  196s   26%
11e6-11e9   dickson 12   48MB  143s   167MB  121s   18%

 2004-09-21, 17:10 #37 garo     Aug 2002 Termonfeckin, IE 24×173 Posts Thanks geoff for the pointer to the patch. However, it does not work as it complains about the BITS_PER_MP_LIMB constant. Also, the file gmp-impl.h is referred to but is not included in the patch. Here is what I get ks-multiply.c:22:23: gmp-impl.h: No such file or directory ks-multiply.c: In function kronecker_schonhage': ks-multiply.c:70: BITS_PER_MP_LIMB' undeclared (first use in this function) ks-multiply.c:70: (Each undeclared identifier is reported only once ks-multiply.c:70: for each function it appears in.)
2004-09-21, 17:31   #38
geoff

Mar 2003
New Zealand

13·89 Posts

Quote:
 Originally Posted by garo ks-multiply.c:22:23: gmp-impl.h: No such file or directory
gmp-impl.h is part of libgmp, some distributions (e.g. Debian) don't install it. With gmp-impl.h installed, the only compiler warning was about KS_MULT_THRESHOLD being redefined, but I assume that that is an optimisation issue only. I tested the patched version on a few known curves and it refound every expected factor, but I am not sure whether to start using it or to wait for the official release.

(BTW the times in the previous post were for stage two only).

 2004-09-22, 09:03 #39 garo     Aug 2002 Termonfeckin, IE 53208 Posts I copied all the .h files from the gmp directory to the ecm directory and it compiled straight away. And boy it speeded up. Stage 2 with the gmp-ecm default of 25.7e9 went about 20% faster on my P4. Alex, any comments on whether this patched version can be used? I'll stick with the regular till I hear from you.
 2004-09-22, 09:31 #40 akruppa     "Nancy" Aug 2002 Alexandria 2,467 Posts I didn't get involved much with the KS code, Dave and Paul did that. They made some changes to the code, but I think they were mostly speedups and reducing memory allocation. Perhaps Dave can comment? Alex
 2004-09-22, 11:36 #41 dave_dm   May 2004 24×5 Posts gmp-impl.h wasn't included as it declares some cpu-dependent constants and macros. But apart from BITS_PER_MP_LIMB, I don't think which gmp-impl.h you use affects the ks code much. This file isn't needed in the current version of ks as Paul Z doesn't like including it - instead all of the relevant bits are copied into ecm-gmp.h. Wrt memory usage - the original ks memory allocation didn't work as it intended and ended up mallocing all over the place. This has been fixed. It still uses more than the old multiplication, maybe twice as much, maybe less, but certainly not the 3-4x factor Geoff's table shows. (I haven't done any analysis on the current code myself). Since the ks patch, the default B2 and the default k values obviously need to be updated - I think Paul Z is handling/has handled this. Dave
 2004-09-22, 11:42 #42 dave_dm   May 2004 24·5 Posts garo: the patched version can certainly be used, as long as you don't mind the excessive memory allocation. Run test.ecm to check it works.
 2004-09-22, 12:19 #43 garo     Aug 2002 Termonfeckin, IE 24·173 Posts Thanks. For the moment then, I'll just copy the BITS_PER_MP_LIMB variable to ecm.h and get rid of all the other gmp include files. And run test.ecm before using it. PS: As you may have seen above, the default k-value was not always the fastest on my P4 and my Athlon for several values of B2.
2004-09-23, 07:53   #44
garo

Aug 2002
Termonfeckin, IE

276810 Posts

So I compiled the version with the ks-multiply-patch and ran it on range of numbers. It seems to work fine and runs faster than plain GMP-ECM5.0.3 for higher B2 values.

However, I've run into a problem with a specific save file from mprime. I'm attaching it below, but when I run it with k=4 and B2=4e10 it does a segmentation fault at the stage of computing 1/F. The plain ECM-5.0.3 does nto crash for the same values.

With the same parameters and a different residue file it does not crash. With same residue and B2 bound and k=5 it does nto crash. And with same residue, k=4 and B2=4e9 it does not crash.

Here is the residue file attached. I hope someone, particularly dave can have a look at it.

Oh and here is the stackdump:
Code:
Exception: STATUS_STACK_OVERFLOW at eip=00420023
eax=00148490 ebx=00000019 ecx=00032800 edx=00000000 esi=000D0D24 edi=000D0D0B
ebp=0022E838 esp=0022E7F8 program=F:\cygwin\home\ecm\ecm.exe, pid 488, thread main
cs=001B ds=0023 es=0023 fs=0038 gs=0000 ss=0023
Stack trace:
Frame     Function  Args
0022E838  00420023  (14BB53CC, 141FAD18, 000D0D0B, 0FB71124)
0022E898  0040FDFA  (0A129BC8, 1155D064, 0A051500, 00003DE0)
0022E8E8  0040537B  (1152E9F0, 0A051500, 00007BBF, 0A129BC8)
0022E9C8  0040B40D  (0022F000, 0022EA30, 0022EA60, 00000000)
0022EAD8  00403025  (0022F000, 0022F040, 0022F030, 0022F010)
0022F068  00406EAE  (00000003, 61782CBC, 0A0500A8, 0022F0C0)
0022F0A8  61005F34  (0022F0C0, 77F56BB3, 77DD0000, 77F5BCD4)
0022FF88  6100614B  (00000000, 00000000, 00000000, 00000000)
End of stack trace
Attached Files
 save8.txt (1.2 KB, 153 views)

Last fiddled with by garo on 2004-09-23 at 08:01

 Similar Threads Thread Thread Starter Forum Replies Last Post schickel FactorDB 29 2012-07-18 17:03 Dubslow Forum Feedback 0 2012-05-02 02:13 fivemack Factoring 84 2011-04-26 10:22 philmoore Lounge 5 2004-09-14 20:18 Xyzzy Lounge 4 2003-06-28 13:37

All times are UTC. The time now is 14:37.

Sat Jan 28 14:37:28 UTC 2023 up 163 days, 12:06, 0 users, load averages: 0.99, 1.15, 1.07