mersenneforum.org  

Old 2021-09-05, 07:01   #78
SethTro
"Seth"
Apr 2019

2·191 Posts

Quote:
Originally Posted by chris2be8
That fails:
[code]
And 'git pull' does nothing:
Code:
chris@4core:~/CGBN> git pull
Already up to date.
Unless I'm not using it correctly.

You can ignore this, but for completeness' sake: you can probably clone my copy of CGBN with `git clone -b cgbn_swap https://github.com/sethtroisi/CGBN.git`

The top entry from `git log` should be

Code:
commit 1595e543801bcbffd2c36cbf978baff843c09876 (HEAD -> gpu_integration, origin/gpu_integration)
Author: Seth Troisi <sethtroisi@google.com>
Date:   Sat Sep 4 20:26:30 2021 -0700

    reverted the cgbn_swap change till that is accepted
If so, you should be able to build. If it's not, try `git fetch`, then `git pull origin gpu_integration`.
Old 2021-09-05, 15:44   #79
chris2be8
Sep 2009

2196₁₀ Posts

I'm still stuck. I re-downloaded everything from scratch and re-ran autoreconf -si, ./configure and make, but make still fails:
Code:
...
libtool: link: ( cd ".libs" && rm -f "libecm.la" && ln -s "../libecm.la" "libecm.la" )
/bin/sh ./libtool  --tag=CC   --mode=link gcc-9  -g -I/usr/local/cuda/include -g -O2 -DWITH_GPU -R /usr/local/cuda/lib64  -o ecm ecm-auxi.o ecm-b1_ainc.o ecm-candi.o ecm-eval.o ecm-main.o ecm-resume.o ecm-addlaws.o ecm-torsions.o ecm-getprime_r.o aprtcle/ecm-mpz_aprcl.o ecm-memusage.o libecm.la -lgmp -lrt -lm -lm -lm -lm -lm   
libtool: link: gcc-9 -g -I/usr/local/cuda/include -g -O2 -DWITH_GPU -o ecm ecm-auxi.o ecm-b1_ainc.o ecm-candi.o ecm-eval.o ecm-main.o ecm-resume.o ecm-addlaws.o ecm-torsions.o ecm-getprime_r.o aprtcle/ecm-mpz_aprcl.o ecm-memusage.o  ./.libs/libecm.a -L/usr/local/cuda/lib64 -lcudart -lgmp -lrt -lm -Wl,-rpath -Wl,/usr/local/cuda/lib64
/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: ./.libs/libecm.a(cgbn_stage1.o): in function `cgbn_ecm_stage1':
tmpxft_00007e39_00000000-6_cgbn_stage1.cudafe1.cpp:(.text+0x8b3): undefined reference to `operator delete(void*)'
/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: tmpxft_00007e39_00000000-6_cgbn_stage1.cudafe1.cpp:(.text+0x196e): undefined reference to `operator delete(void*)'
/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: ./.libs/libecm.a(cgbn_stage1.o): in function `void std::vector<unsigned int, std::allocator<unsigned int> >::_M_realloc_insert<unsigned int>(__gnu_cxx::__normal_iterator<unsigned int*, std::vector<unsigned int, std::allocator<unsigned int> > >, unsigned int&&)':
tmpxft_00007e39_00000000-6_cgbn_stage1.cudafe1.cpp:(.text._ZNSt6vectorIjSaIjEE17_M_realloc_insertIJjEEEvN9__gnu_cxx17__normal_iteratorIPjS1_EEDpOT_[_ZNSt6vectorIjSaIjEE17_M_realloc_insertIJjEEEvN9__gnu_cxx17__normal_iteratorIPjS1_EEDpOT_]+0x50): undefined reference to `operator new(unsigned long)'
/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: tmpxft_00007e39_00000000-6_cgbn_stage1.cudafe1.cpp:(.text._ZNSt6vectorIjSaIjEE17_M_realloc_insertIJjEEEvN9__gnu_cxx17__normal_iteratorIPjS1_EEDpOT_[_ZNSt6vectorIjSaIjEE17_M_realloc_insertIJjEEEvN9__gnu_cxx17__normal_iteratorIPjS1_EEDpOT_]+0xc8): undefined reference to `operator delete(void*)'
/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: ./.libs/libecm.a(cgbn_stage1.o):(.data.rel.local.DW.ref.__gxx_personality_v0[DW.ref.__gxx_personality_v0]+0x0): undefined reference to `__gxx_personality_v0'
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:973: ecm] Error 1
make[2]: Leaving directory '/home/chris/ecm-cgbn/gmp-ecm'
make[1]: *** [Makefile:1903: all-recursive] Error 1
make[1]: Leaving directory '/home/chris/ecm-cgbn/gmp-ecm'
make: *** [Makefile:783: all] Error 2
Any ideas?
Old 2021-09-05, 15:59   #80
paulunderwood
Sep 2002
Database er0rr

3,929 Posts

Did you install the libstdc++ dev package with YaST?
Old 2021-09-05, 18:38   #81
chris2be8
Sep 2009

2²×3²×61 Posts

Success!

The vital bit of info came from putting "__gxx_personality_v0" into DuckDuckGo. That told me it's provided by libstdc++, which is the g++ runtime. After installing gcc9-g++ and its runtime libstdc++6-devel-gcc9, everything works.
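
As a rough illustration of that diagnosis, the sketch below scans linker output for `undefined reference` lines and flags symbols that usually come from the C++ runtime. The symbol list and the function names are assumptions made for this example, and the sample text is shortened from the error log earlier in the thread.

```python
import re

# Symbols that, when undefined at link time, usually point to a missing
# C++ runtime (libstdc++). This short list is an illustrative assumption.
CXX_RUNTIME_HINTS = ("operator new", "operator delete", "__gxx_personality_v0")


def undefined_symbols(linker_output):
    """Return the distinct symbols named in 'undefined reference to' lines."""
    found = re.findall(r"undefined reference to `([^']+)'", linker_output)
    distinct = []
    for symbol in found:
        if symbol not in distinct:
            distinct.append(symbol)
    return distinct


def looks_like_missing_cxx_runtime(linker_output):
    """True if any undefined symbol matches one of the C++ runtime hints."""
    return any(
        hint in symbol
        for symbol in undefined_symbols(linker_output)
        for hint in CXX_RUNTIME_HINTS
    )


# Shortened sample lines modelled on the failing link step.
sample = """\
ld: tmpxft.cpp:(.text+0x8b3): undefined reference to `operator delete(void*)'
ld: tmpxft.cpp:(.text+0x50): undefined reference to `operator new(unsigned long)'
ld: libecm.a(cgbn_stage1.o): undefined reference to `__gxx_personality_v0'
"""

print(undefined_symbols(sample))
print(looks_like_missing_cxx_runtime(sample))
```

A match only suggests where to look; the actual fix here was installing the g++ packages named above.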

This has been an educational experience. Next step is to benchmark cgbn on my GPU.
Old 2021-09-06, 16:02   #82
chris2be8
Sep 2009

2196₁₀ Posts

Benchmark results:
Code:
chris@4core:~/ecm-cgbn/gmp-ecm> date;echo "(2^499-1)/20959" | ./ecm -gpu -gpucurves 3584 -sigma 3:1000 20000 0;date
Sun  5 Sep 19:42:42 BST 2021
GMP-ECM 7.0.5-dev [configured with GMP 5.1.3, --enable-asm-redc, --enable-gpu, --enable-assert] [ECM]
Input number is (2^499-1)/20959 (146 digits)
Using B1=20000, B2=0, sigma=3:1000-3:4583 (3584 curves)
GPU: Using device code targeted for architecture compile_52
GPU: Ptx version is 52
GPU: maxThreadsPerBlock = 1024
GPU: numRegsPerThread = 31 sharedMemPerBlock = 24576 bytes
GPU: Block: 32x32x1 Grid: 112x1x1 (3584 parallel curves)
Computing 3584 Step 1 took 190ms of CPU time / 20427ms of GPU time
Sun  5 Sep 19:43:03 BST 2021

chris@4core:~/ecm-cgbn/gmp-ecm> date;echo "(2^499-1)/20959" | ./ecm -gpu -cgbn -gpucurves 3584 -sigma 3:1000 20000 0;date
Sun  5 Sep 19:43:29 BST 2021
GMP-ECM 7.0.5-dev [configured with GMP 5.1.3, --enable-asm-redc, --enable-gpu, --enable-assert] [ECM]
Input number is (2^499-1)/20959 (146 digits)
Using B1=20000, B2=0, sigma=3:1000-3:4583 (3584 curves)
GPU: Using device code targeted for architecture compile_52
GPU: Ptx version is 52
GPU: maxThreadsPerBlock = 640
GPU: numRegsPerThread = 93 sharedMemPerBlock = 0 bytes
Computing 3584 Step 1 took 30ms of CPU time / 3644ms of GPU time
Sun  5 Sep 19:43:33 BST 2021

chris@4core:~/ecm-cgbn/gmp-ecm> date;echo "(2^997-1)" | ./ecm -gpu -sigma 3:1000 20000 0;date
Sun  5 Sep 19:44:25 BST 2021
GMP-ECM 7.0.5-dev [configured with GMP 5.1.3, --enable-asm-redc, --enable-gpu, --enable-assert] [ECM]
Input number is (2^997-1) (301 digits)
Using B1=20000, B2=0, sigma=3:1000-3:1831 (832 curves)
GPU: Using device code targeted for architecture compile_52
GPU: Ptx version is 52
GPU: maxThreadsPerBlock = 1024
GPU: numRegsPerThread = 31 sharedMemPerBlock = 24576 bytes
GPU: Block: 32x32x1 Grid: 26x1x1 (832 parallel curves)
Computing 832 Step 1 took 188ms of CPU time / 4552ms of GPU time
Sun  5 Sep 19:44:30 BST 2021

chris@4core:~/ecm-cgbn/gmp-ecm> date;echo "(2^997-1)" | ./ecm -gpu -cgbn -sigma 3:1000 20000 0;date
Sun  5 Sep 19:44:41 BST 2021
GMP-ECM 7.0.5-dev [configured with GMP 5.1.3, --enable-asm-redc, --enable-gpu, --enable-assert] [ECM]
Input number is (2^997-1) (301 digits)
Using B1=20000, B2=0, sigma=3:1000-3:1831 (832 curves)
GPU: Using device code targeted for architecture compile_52
GPU: Ptx version is 52
GPU: maxThreadsPerBlock = 640
GPU: numRegsPerThread = 93 sharedMemPerBlock = 0 bytes
Computing 832 Step 1 took 8ms of CPU time / 1995ms of GPU time
Sun  5 Sep 19:44:44 BST 2021
So about 5.6 times faster for (2^499-1)/20959 and about 2.3 times faster for 2^997-1. But these are all small cases.

But my overall throughput won't increase much because my CPU can't do stage 2 as fast as the GPU can do stage 1 now. But that's not your fault. And any speedup is nice. Thanks.
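
For reference, the speedups can be worked out directly from the GPU times reported in the benchmark output above. This is only the arithmetic; the dictionary layout and function name are choices made for this sketch.

```python
# GPU stage 1 times in milliseconds, copied from the benchmark output above.
timings_ms = {
    "(2^499-1)/20959, 3584 curves": {"default": 20427, "cgbn": 3644},
    "2^997-1, 832 curves": {"default": 4552, "cgbn": 1995},
}


def speedup(default_ms, cgbn_ms):
    """Ratio of the default GPU kernel time to the CGBN kernel time."""
    return default_ms / cgbn_ms


for name, t in timings_ms.items():
    print(f"{name}: {speedup(t['default'], t['cgbn']):.2f}x")
# (2^499-1)/20959, 3584 curves: 5.61x
# 2^997-1, 832 curves: 2.28x
```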


Other lessons learnt:
autoreconf -si creates symlinks to missing files while autoreconf -i copies them. Using -si saves space, but if you upgrade to a new version of automake you can get dangling symlinks:
Code:
lrwxrwxrwx 1 chris users 32 Nov 12  2015 INSTALL -> /usr/share/automake-1.13/INSTALL
lrwxrwxrwx 1 chris users 35 Nov 12  2015 ltmain.sh -> /usr/share/libtool/config/ltmain.sh
They needed updating to:
Code:
lrwxrwxrwx 1 chris users 32 Sep  4 19:20 INSTALL -> /usr/share/automake-1.15/INSTALL
lrwxrwxrwx 1 chris users 38 Sep  4 19:20 ltmain.sh -> /usr/share/libtool/build-aux/ltmain.sh
Not a common issue though.
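
A quick way to spot this situation is to look for symlinks whose targets no longer exist. The sketch below is a generic check, not part of GMP-ECM; the function name and the temporary demo directory are made up for illustration.

```python
import os
import tempfile


def dangling_symlinks(root):
    """Return paths under root that are symlinks whose target does not exist."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() looks at the link itself; exists() follows the link.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)


# Demo in a throwaway directory: one valid link and one dangling link.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "real.txt"), "w").close()
    os.symlink(os.path.join(tmp, "real.txt"), os.path.join(tmp, "good"))
    os.symlink(os.path.join(tmp, "missing-target"), os.path.join(tmp, "INSTALL"))
    broken = [os.path.basename(p) for p in dangling_symlinks(tmp)]
    print(broken)
```

Running `dangling_symlinks(".")` in the source directory before ./configure would list links like the stale INSTALL and ltmain.sh entries shown above.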


And suggestions for the install process:
INSTALL-ecm should tell users to run autoreconf -i (or -si) before running ./configure (which is created by autoreconf -i).

./configure compiles several small programs and runs them to check things. If the compile fails, it should put out a message saying the compile failed, not one saying it found different levels of runtime library etc. If the compile normally produces no output, then letting any output it does produce go to the screen would be informative (e.g. when it can't find -lstdc++).

Chris
Old 2021-09-07, 06:25   #83
SethTro
"Seth"
Apr 2019

382₁₀ Posts

Quote:
Originally Posted by chris2be8
Success!

I'm glad we finally got here!

A 2.2x speedup for the 1024-bit case is almost exactly what everyone else is seeing (except bsquared, maybe because of a newer card?).

You can often improve overall throughput by adjusting to 1.2*B1 and 1/2*B2 (and checking that expected curves stays roughly the same). This can especially help if Stage 1 time < Stage 2 time / cores.
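
A minimal sketch of that rule of thumb follows. It reads the condition as comparing the GPU time for one batch of stage 1 curves with the time the CPU cores need to run stage 2 on the same batch, which is an interpretation. The stage 2 time per curve, the core count, and the B2 value are hypothetical placeholders, and the expected-curve check mentioned above is not modelled here.

```python
def cpu_is_bottleneck(stage1_batch_s, stage2_per_curve_s, curves, cores):
    """True if the GPU finishes stage 1 on a batch before the CPU cores
    can finish stage 2 on the same batch."""
    stage2_batch_s = stage2_per_curve_s * curves / cores
    return stage1_batch_s < stage2_batch_s


def rebalanced_bounds(b1, b2):
    """Apply the suggested adjustment: 1.2 * B1 and B2 / 2 (integer math)."""
    return b1 * 12 // 10, b2 // 2


# 3.644 s for 3584 curves is the CGBN stage 1 time reported earlier.
# The stage 2 time per curve and the core count are hypothetical.
if cpu_is_bottleneck(stage1_batch_s=3.644, stage2_per_curve_s=0.05,
                     curves=3584, cores=4):
    # B2 = 1_000_000 is a placeholder, not a value from this thread.
    print(rebalanced_bounds(20000, 1_000_000))
```

Shifting work toward stage 1 only pays off if the expected number of curves for the target factor size stays about the same, so that check still has to be done separately.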

I'll reflect on your notes and see if I can improve the documentation / configure script.
Old 2021-09-07, 15:41   #84
chris2be8
Sep 2009

2²·3²·61 Posts

Quote:
Originally Posted by SethTro
I'll reflect on your notes and see if I can improve the documentation / configure script.

How about updating INSTALL-ecm like this:
Code:
diff -u INSTALL-ecm INSTALL-ecm.new
--- INSTALL-ecm	2021-09-05 12:13:55.613439408 +0100
+++ INSTALL-ecm.new	2021-09-07 16:37:42.903291304 +0100
@@ -19,6 +19,7 @@
 
 1) check your configuration with:
 
+   $ autoreconf -i
    $ ./configure
 
    The configure script accepts several options (see ./configure --help).
That's a minimum change to get new users started.
Old 2021-09-07, 18:08   #85
WraithX
Mar 2006

486₁₀ Posts

Quote:
Originally Posted by chris2be8
How about updating INSTALL-ecm like this:
Code:
diff -u INSTALL-ecm INSTALL-ecm.new
--- INSTALL-ecm	2021-09-05 12:13:55.613439408 +0100
+++ INSTALL-ecm.new	2021-09-07 16:37:42.903291304 +0100
@@ -19,6 +19,7 @@
 
 1) check your configuration with:
 
+   $ autoreconf -i
    $ ./configure
 
    The configure script accepts several options (see ./configure --help).
That's a minimum change to get new users started.
That document describes what users should do when they have downloaded an official release. When building an official release, you do not need to run autoreconf -i. You only need to run autoreconf -i when you download a development version with git or svn. I don't think adding autoreconf -i to this document is a good idea.

Looking at the various documents, I see that README.dev has the advice of running autoreconf -i.
Old 2021-09-08, 15:30   #86
chris2be8
Sep 2009

2²·3²·61 Posts

How about having INSTALL-ecm tell users to run autoreconf -i if they don't have a ./configure in the directory?
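
The suggested rule is simple enough to sketch: run autoreconf -i only when no configure script is present. The helper below is hypothetical and only builds the list of commands; it does not run anything.

```python
import os
import tempfile


def bootstrap_commands(source_dir):
    """Return the commands to run before 'make', as argument lists."""
    commands = []
    if not os.path.isfile(os.path.join(source_dir, "configure")):
        # Development checkouts come without a generated configure script.
        commands.append(["autoreconf", "-i"])
    commands.append(["./configure"])
    return commands


# Demo: an empty directory stands in for a git checkout, then a configure
# file is added to mimic an unpacked official release.
with tempfile.TemporaryDirectory() as tmp:
    dev_checkout = bootstrap_commands(tmp)
    open(os.path.join(tmp, "configure"), "w").close()
    release_tree = bootstrap_commands(tmp)

print(dev_checkout)
print(release_tree)
```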

And if people get an official release, would the files that autoreconf -i would otherwise create be correct for their OS etc?
Old 2021-09-08, 15:41   #87
EdH
"Ed Hall"
Dec 2009
Adirondack Mtns

3×37² Posts

@Chris: Did you get your sm_30 card working or just the higher arch one?
Old 2021-09-09, 15:40   #88
chris2be8
Sep 2009

2²·3²·61 Posts

Just the higher arch one (sm_52). Sorry.

PS. Does CGBN increase the maximum size of number that can be handled? I'd try it, but I'm tied up catching up with ECM work I delayed while I was getting ecm-cgbn working.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.