#1 |
May 2009
Russia, Moscow
2·3·461 Posts |
I compared two GMP-ECM 6.3 builds under Linux, one compiled with GMP 5.0.1 and the other with GMP 4.1.4.
I got several strange results. Overall GMP 5.0.1 is better by 5-15%, but with B1=11e6 in some ranges (tested 100-300 digits) 4.1.4 was better. Some examples follow. Code:
1. C121 from near-repdigits

GMP-ECM 6.3 [configured with GMP 4.1.4 and --enable-asm-redc] [ECM]
Input number is 1800485013924273616277080302416213714297702488568072032612888194660755338496630976045963259724803581322873645120627538429 (121 digits)
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=334640802
Step 1 took 36869ms
Step 2 took 19737ms

GMP-ECM 6.3 [configured with GMP 5.0.1 and --enable-asm-redc] [ECM]
Input number is 1800485013924273616277080302416213714297702488568072032612888194660755338496630976045963259724803581322873645120627538429 (121 digits)
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=2340904304
Step 1 took 35097ms
Step 2 took 33626ms

GMP 5.0.1 is significantly slower again on step 2.

2. C156 from aliquot seq 283752:i7004

GMP-ECM 6.3 [configured with GMP 4.1.4 and --enable-asm-redc] [ECM]
Input number is 150334450606011724019777200211010468220565590046299234402254345532711750018652367487259651931850319063498312781804011647293058067263942651704486104870980321 (156 digits)
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=4153245810
Step 1 took 55526ms
Step 2 took 26975ms

GMP-ECM 6.3 [configured with GMP 5.0.1 and --enable-asm-redc] [ECM]
Input number is 150334450606011724019777200211010468220565590046299234402254345532711750018652367487259651931850319063498312781804011647293058067263942651704486104870980321 (156 digits)
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=2955949299
Step 1 took 57614ms
Step 2 took 39257ms

Again step 2 with GMP 5.0.1 is much slower.

3. C209 from near-repdigits

GMP-ECM 6.3 [configured with GMP 4.1.4 and --enable-asm-redc] [ECM]
Input number is 99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999899999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999 (209 digits)
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=2560444052
Step 1 took 75055ms
Step 2 took 36402ms

GMP-ECM 6.3 [configured with GMP 5.0.1 and --enable-asm-redc] [ECM]
Input number is 99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999899999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999 (209 digits)
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=3908589128
Step 1 took 76103ms
Step 2 took 46634ms

Step 2 with GMP 5.0.1 is slower by 10sec.

With B1=3e6 all is OK - 5.0.1 is slightly better than 4.1.4

1. C121
Step 1 took 9562ms
Step 2 took 4803ms
vs.
Step 1 took 10009ms
Step 2 took 6219ms

2. C156
Step 1 took 15440ms
Step 2 took 6315ms
vs.
Step 1 took 15102ms
Step 2 took 8532ms

3. C209
Step 1 took 20846ms
Step 2 took 8188ms
vs.
Step 1 took 20306ms
Step 2 took 11598ms

Compile options: --enable-openmp --with-gmp=/usr/local/ --enable-shellcmd --enable-sse2 --enable-asm-redc
Test system: Xeon E5620 2.40GHz, CentOS 5.5 x86_64 on 2.6.18 kernel

Last fiddled with by unconnected on 2011-03-31 at 12:23
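To put numbers on "slower on step 2", here is a minimal Python sketch that parses the `Step N took ...ms` lines and reports the relative change between the two builds. The timings are copied from the B1=11e6 runs in this post; the helper names `parse_steps` and `relative_change` are made up for illustration and are not part of GMP-ECM.

```python
import re

# Matches the timing lines GMP-ECM prints, e.g. "Step 2 took 19737ms".
STEP_RE = re.compile(r"Step (\d) took (\d+)ms")


def parse_steps(log: str) -> dict[int, int]:
    """Return {step_number: milliseconds} found in one run's output."""
    return {int(step): int(ms) for step, ms in STEP_RE.findall(log)}


def relative_change(baseline_ms: int, other_ms: int) -> float:
    """Percentage by which other_ms exceeds baseline_ms (negative means faster)."""
    return 100.0 * (other_ms - baseline_ms) / baseline_ms


# (GMP 4.1.4 run, GMP 5.0.1 run) for each composite, B1=11e6, from the post.
runs = {
    "C121": ("Step 1 took 36869ms Step 2 took 19737ms",
             "Step 1 took 35097ms Step 2 took 33626ms"),
    "C156": ("Step 1 took 55526ms Step 2 took 26975ms",
             "Step 1 took 57614ms Step 2 took 39257ms"),
    "C209": ("Step 1 took 75055ms Step 2 took 36402ms",
             "Step 1 took 76103ms Step 2 took 46634ms"),
}

for name, (gmp414_log, gmp501_log) in runs.items():
    old, new = parse_steps(gmp414_log), parse_steps(gmp501_log)
    for step in (1, 2):
        change = relative_change(old[step], new[step])
        print(f"{name} step {step}: GMP 5.0.1 vs 4.1.4 {change:+.1f}%")
```

A positive percentage means the GMP 5.0.1 build took longer than the GMP 4.1.4 build for that step.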
#2 |
Sep 2008
Krefeld, Germany
2×5×23 Posts |
That's exactly what I figured out some time ago. Especially on step 2, GMP 4.x is a lot faster, and I have no idea why.
The fastest combination for my Phenom II 1090T is GMP 4.3.2 combined with GMP-ECM 6.3, all compiled with -march=barcelona and, of course, linked statically. For large numbers (> ~400 digits), linking against gwnum gave a huge speedup. Table attached: All times in ms, measured on a Phenom II, 3.6 GHz, Linux kernel 2.6.35, 64-bit.
Last fiddled with by Syd on 2011-04-01 at 02:10
#3 | |
May 2009
Russia, Moscow
2×3×461 Posts |
Quote:
I decided to recompile the binaries from scratch, and I have some more questions. Why is ecm-params.h.athlon64 used instead of ecm-params.h.core2? Why are SSE2 instructions not used in the NTT code? Code:
config.status: linking ecm-params.h.athlon64 to ecm-params.h
config.status: linking mul_fft-params.h.athlon64 to mul_fft-params.h
config.status: executing depfiles commands
config.status: executing libtool commands
configure: Configuration:
configure: Build for host type x86_64-unknown-linux-gnu
configure: CC=gcc -std=gnu99, CFLAGS=-W -Wall -Wundef -O2 -pedantic -m64 -mtune=core2 -march=core2
configure: Linking GMP with /usr/local//lib/libgmp.a
configure: Using asm redc code from directory x86_64
configure: Not using SSE2 instructions in NTT code
#4 |
Tribal Bullet
Oct 2004
5·709 Posts |
#5 | |
Einyen
Dec 2003
Denmark
CF0₁₆ Posts |
How are you compiling GMP 4.3.2 for 64-bit?
I get this error: Quote:
./configure CC=gcc CFLAGS="-O2 -pedantic -m64 -std=gnu99 -mtune=core2 -march=core2" ABI=64 --build=x86_64-w64-mingw32
I also tried just:
./configure ABI=64
and variations. I read on the GMP website: "Gcc 4.3.2 miscompiles GMP on 64-bit machines", but I'm using gcc 4.6.0.
Last fiddled with by ATH on 2011-04-02 at 16:49
#6 |
Einyen
Dec 2003
Denmark
2⁴×3²×23 Posts |
Here is my 32-bit test of GMP 4.3.2 vs 5.0.1 and MPIR: gmp4test.html
I can't see the effect you describe. On a Core 2 the GMP 4.3.2 binary is a lot slower than both GMP 5.0.1 and MPIR 2.3.0/2.2.1. On a Pentium 4 it's only slightly slower than GMP 5.0.1 and faster than MPIR. If you have a link to GMP 4.1.4, I'm willing to test it.
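For anyone rerunning the comparison, one way to remove a variable is to run each build on the same number with the same B1 and the same sigma (the runs at the top of the thread used a different sigma per build). Below is a minimal Python sketch of such a harness. It assumes the `ecm` binary reads the number from standard input and accepts `-sigma` followed by B1 on the command line; the helper names and the binary paths in the usage note are hypothetical.

```python
import subprocess


def ecm_command(binary: str, b1: str, sigma: int) -> list[str]:
    """Build the argument list for one curve with a fixed sigma."""
    return [binary, "-sigma", str(sigma), b1]


def run_ecm(binary: str, number: str, b1: str, sigma: int) -> str:
    """Run one curve and return the program's standard output.

    The return code is deliberately not checked: ecm uses its exit
    status to report results, so non-zero is not necessarily an error.
    """
    result = subprocess.run(
        ecm_command(binary, b1, sigma),
        input=number + "\n",   # the number to factor goes in on stdin
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Usage would look like `run_ecm("./ecm-gmp414", n, "11e6", 334640802)` followed by `run_ecm("./ecm-gmp501", n, "11e6", 334640802)`, where `n` is the composite as a decimal string and the two paths stand for wherever the differently linked binaries were built. The `Step 1 took` / `Step 2 took` lines in the two outputs can then be compared directly.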