mersenneforum.org  

2018-08-10, 13:23   #1
storm5510
Random Account
 
Aug 2009
Not U. + S.A.

5454₈ Posts
CUDA ECM

Has anyone ever given this any thought? Another member told me that running ECM curves is like throwing darts at a dartboard. Perhaps the throwing could be sped up.

2018-08-10, 13:41   #2
bsquared
 
"Ben"
Feb 2007

3·5·251 Posts

Yes

In fact there is a thread already with 450+ posts on this topic. Enjoy!

2018-08-10, 14:42   #3
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2^4×3×163 Posts

As I understand it, there are two issues with ECM on CUDA in the GIMPS context:

1) The existing code handles numbers of thousands of bits, not millions. See also the attachment at http://www.mersenneforum.org/showpos...91&postcount=2 (page 2, near the bottom).
2) ECM is only effective / worthwhile for exponents up to around 20 million, so it is not relevant to current GIMPS wavefront exponents for first-time or double-check assignments, which are >80 million and >40 million respectively. PrimeNet will issue ECM assignments, but they are for possibly finding factors of exponents <20 million.

If that's wrong, or things have changed somehow, please comment with sources (links).
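
To put the first point in numbers: 2^p − 1 is exactly p bits long, so a wavefront exponent means tens of millions of bits, against code built for something on the order of a thousand. A minimal shell sketch of that comparison; the function names and the 1000-bit limit are illustrative assumptions, not values read from the ECM source:

```shell
#!/bin/sh
# Illustrative only: GPU_ECM_MAX_BITS is an assumed round figure, not a
# constant taken from the GMP-ECM source.
GPU_ECM_MAX_BITS=1000

# 2^p - 1 is p ones in binary, so its bit length is simply p.
mersenne_bits() {
    echo "$1"
}

# Print "fits" if M_p is within the assumed limit, "too large" otherwise.
fits_gpu_ecm() {
    if [ "$(mersenne_bits "$1")" -le "$GPU_ECM_MAX_BITS" ]; then
        echo "fits"
    else
        echo "too large"
    fi
}

fits_gpu_ecm 521        # a small Mersenne exponent
fits_gpu_ecm 80000000   # roughly the first-time wavefront mentioned above
```

The gap is about five orders of magnitude, so this is not something a recompile with a larger constant would bridge.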

Last fiddled with by kriesel on 2018-08-10 at 14:44

2018-08-10, 15:52   #4
GP2
 
Sep 2003

2·5·7·37 Posts

As I understand it, if your goal is finding Mersenne primes, and therefore also identifying and eliminating Mersenne composites, then ECM is never effective.

ECM is only useful if you feel like finding some largish factors of Mersenne numbers that can't be found any other way. But for determining the status of an exponent, you'd spend less time doing TF, P−1 and LL. So by the time ECM is applied to an exponent, it's just for fun, and the status of that exponent has already been settled.

2018-08-10, 16:14   #5
bsquared
 
"Ben"
Feb 2007

3·5·251 Posts

Ah, I missed that this post was in the GIMPS forum. No, the existing CUDA ECM code is limited to about a thousand bits and won't work on candidate Mersenne numbers.

2018-08-15, 20:19   #6
chris2be8
 
Sep 2009

2^5×7×11 Posts

Hello,

I've got another system with a GTX 460 in it and I'm trying to get ecm-gpu running on it. But I'm having problems.

The system runs Devuan Linux with CUDA V8.0.44 (the latest version in their repository) installed. But it has gcc 6.3.0 which nvcc (the CUDA compiler) does not support.

nvcc 8.0 will work with clang 3.8 so I've installed that (version 3.8.1-24) and I'm trying to compile ecm-gpu with it. But I can't work out how to get the configure script to use clang.
Code:
chris@rigel:~/ecm-gpu/trunk$ ./configure --enable-gpu=sm21 CC=clang
...
checking that CUDA Toolkit version and runtime version are the same... (8.0/8.0) yes
checking for nvcc... /usr/bin/nvcc
checking for compatibility between gcc and nvcc... no
configure: error: gcc version is not compatible with nvcc

config.log is attached.

Is this supposed to be possible? If not I'll have to try to install gcc 5.x which nvcc should support.
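
The compatibility test that configure reports can be approximated by hand. This is a rough sketch, assuming (from the post above) that CUDA 8.0 accepts gcc up to major version 5; the function name and the MAX_GCC_MAJOR variable are made up for illustration:

```shell
#!/bin/sh
# Assumption: CUDA 8.0 accepts gcc major versions up to 5, per the post above
# ("gcc 5.x which nvcc should support"). Adjust for other toolkits.
MAX_GCC_MAJOR=5

# Succeeds (exit 0) when the given gcc version string is at or below the limit.
gcc_ok_for_nvcc() {
    major=${1%%.*}          # "6.3.0" -> "6"
    [ "$major" -le "$MAX_GCC_MAJOR" ]
}

# Check the compiler on PATH, if there is one.
if command -v gcc >/dev/null 2>&1; then
    ver=$(gcc -dumpversion)
    if gcc_ok_for_nvcc "$ver"; then
        echo "gcc $ver: should be accepted by nvcc"
    else
        echo "gcc $ver: newer than nvcc expects; use an older gcc or clang"
    fi
fi
```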

I'm having similar problems with msieve gpu support. That will compile everything with clang but fails at run time:
Code:
Msieve v. 1.54 (SVN 1022)
Wed Aug 15 15:16:01 2018
random seeds: 6a24db06 bcc6d56f
factoring 1522605027922533360535618378132637429718068114961380688657908494580122963258952897654000350692006139 (100 digits)
no P-1/P+1/ECM available, skipping
commencing number field sieve (100-digit input)
commencing number field sieve polynomial selection
polynomial degree: 4
max stage 1 norm: 1.36e+17
max stage 2 norm: 3.19e+15
min E-value: 9.14e-09
poly select deadline: 1286
time limit set to 0.36 CPU-hours
expecting poly E from 1.49e-08 to > 1.71e-08
searching leading coefficients from 1 to 8000
using GPU 0 (GeForce GTX 460)
selected card has CUDA arch 2.1
cannot load library '/home/chris/msieve-svn1022.cuda/trunk/cub/sort_engine.so': /home/chris/msieve-svn1022.cuda/trunk/cub/sort_engine.so: undefined symbol: _ZNSt8ios_base4InitD1Ev
error: failed to load GPU sorting engine from "/home/chris/msieve-svn1022.cuda/trunk/cub/sort_engine.so"

I tried with an older version of msieve that used b40c for sorting, no luck:
Code:
Msieve v. 1.52 (SVN 956)
Wed Aug 15 19:36:11 2018
random seeds: 33923afe 8cb05b12
factoring 1522605027922533360535618378132637429718068114961380688657908494580122963258952897654000350692006139 (100 digits)
no P-1/P+1/ECM available, skipping
commencing number field sieve (100-digit input)
commencing number field sieve polynomial selection
polynomial degree: 4
max stage 1 norm: 1.36e+17
max stage 2 norm: 3.19e+15
min E-value: 9.14e-09
poly select deadline: 1286
time limit set to 0.36 CPU-hours
expecting poly E from 1.49e-08 to > 1.71e-08
searching leading coefficients from 1 to 8000
using GPU 0 (GeForce GTX 460)
selected card has CUDA arch 2.1
cannot load library '/home/chris/msieve-svn.cuda/trunk/b40c/sort_engine_sm20.so': /home/chris/msieve-svn.cuda/trunk/b40c/sort_engine_sm20.so: undefined symbol: _ZTVN10__cxxabiv117__class_type_infoE
error: failed to load GPU sorting engine

Has anyone any ideas how to get things going?
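
One way to read the two "undefined symbol" errors is to demangle the names. They come out as std::ios_base::Init::~Init() and the vtable for __cxxabiv1::__class_type_info, both of which normally live in the C++ runtime (libstdc++). That hints, though it does not prove, that the sort engine library was built without being linked against the C++ runtime. A small sketch, where demangle is a hypothetical helper that falls back to printing the raw name if c++filt is not installed:

```shell
#!/bin/sh
# Demangle a C++ symbol with c++filt (binutils) when available; otherwise
# print the mangled name unchanged so the script still runs.
demangle() {
    if command -v c++filt >/dev/null 2>&1; then
        echo "$1" | c++filt
    else
        echo "$1"
    fi
}

demangle _ZNSt8ios_base4InitD1Ev
demangle _ZTVN10__cxxabiv117__class_type_infoE

# To see whether the plugin records a dependency on the C++ runtime
# (path taken from the log above; uncomment on the affected machine):
# ldd /home/chris/msieve-svn1022.cuda/trunk/cub/sort_engine.so | grep 'libstdc++'
```

If the ldd check shows no libstdc++ entry, adding -lstdc++ to the link step for the sort engine would be the first thing to try; that is a guess from the symbol names, not a tested fix.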

Chris

2018-08-15, 20:53   #7
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013
https://pedan.tech/

C70₁₆ Posts

You could always compile gcc and install it under /usr/local/.

Or you could start a virtual machine with a supported distro and do your compiles in there. nvcc works without an Nvidia card installed.

2018-08-16, 11:52   #8
Gimarel
 
Apr 2010

407₈ Posts

Try to append something like NVCCFLAGS="-ccbin /usr/bin/clang-3.8" to configure.

2018-08-16, 15:51   #9
chris2be8
 
Sep 2009

2^5×7×11 Posts

I've managed to get ./configure --enable-gpu=sm21 --with-cuda-compiler=clang-3.8 to work. But I had to update configure line 15477 from:
NVCCFLAGS=" --compiler-bindir $cuda_compiler NVCCFLAGS"
to
NVCCFLAGS=" --compiler-bindir $cuda_compiler $NVCCFLAGS"

I think that line originally comes from acinclude.m4.
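
The one-character fix can be scripted so it survives a re-run of autoreconf. A minimal sketch with sed; fix_nvccflags is a hypothetical helper name, and it prints the corrected text to stdout rather than editing the file in place:

```shell
#!/bin/sh
# Print a corrected copy of a configure script to stdout: the literal word
# NVCCFLAGS after $cuda_compiler becomes the variable reference $NVCCFLAGS.
# The input file is not modified.
fix_nvccflags() {
    sed 's/\$cuda_compiler NVCCFLAGS"/$cuda_compiler $NVCCFLAGS"/' "$1"
}

# Demonstration on a one-line stand-in for configure (hypothetical temp file).
tmp=$(mktemp)
printf '%s\n' 'NVCCFLAGS=" --compiler-bindir $cuda_compiler NVCCFLAGS"' > "$tmp"
fix_nvccflags "$tmp"
rm -f "$tmp"
```

Matching on the text rather than on line 15477 keeps the sketch usable if the line number shifts between revisions.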

Then ./configure worked.

But when I tried to run make:
Code:
...
/bin/bash ./libtool  --tag=CC   --mode=link x86_64-linux-gnu-gcc  -g -W -Wall -Wundef -pedantic -g -O2 -fdebug-prefix-map=/build/gmp-OYLkie/gmp-6.1.2+dfsg=. -fstack-protector-strong -Wformat -Werror=format-security -O3 -DWITH_GPU -DECM_GPU_CURVES_BY_BLOCK=16   -o ecm ecm-auxi.o ecm-b1_ainc.o ecm-candi.o ecm-eval.o ecm-main.o ecm-resume.o ecm-addlaws.o ecm-torsions.o ecm-getprime_r.o aprtcle/ecm-mpz_aprcl.o ecm-memusage.o libecm.la -lgmp -lrt -lm -lm -lm -lm -lm  
libtool: link: x86_64-linux-gnu-gcc -g -W -Wall -Wundef -pedantic -g -O2 -fdebug-prefix-map=/build/gmp-OYLkie/gmp-6.1.2+dfsg=. -fstack-protector-strong -Wformat -Werror=format-security -O3 -DWITH_GPU -DECM_GPU_CURVES_BY_BLOCK=16 -o ecm ecm-auxi.o ecm-b1_ainc.o ecm-candi.o ecm-eval.o ecm-main.o ecm-resume.o ecm-addlaws.o ecm-torsions.o ecm-getprime_r.o aprtcle/ecm-mpz_aprcl.o ecm-memusage.o  ./.libs/libecm.a -lcudart -lstdc++ -lgmp -lrt -lm
/usr/bin/ld: ./.libs/libecm.a(cudakernel.o): relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
Makefile:965: recipe for target 'ecm' failed
make[2]: *** [ecm] Error 1
make[2]: Leaving directory '/home/chris/ecm-gpu/trunk'
Makefile:1896: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/chris/ecm-gpu/trunk'
Makefile:776: recipe for target 'all' failed
make: *** [all] Error 2

Is it possible to link together code some of which is compiled with gcc and some with clang?

I'll try compiling all of gmp-ecm with clang next. But that's clutching at straws.
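
The ld message itself is about position independence rather than about mixing compilers: cudakernel.o was built without -fPIC and is being pulled into a link that requires position-independent code. That would fit a system gcc that builds position-independent executables by default, which is an assumption about this Devuan install. A small sketch that spots this pattern in a build log; diagnose_link_error is a hypothetical helper name:

```shell
#!/bin/sh
# Scan a build log for the "recompile with -fPIC" relocation error and print
# a one-line reading of it. Reads the file named by $1.
diagnose_link_error() {
    if grep -q 'relocation R_X86_64_32 .*recompile with -fPIC' "$1"; then
        echo "non-PIC object in a position-independent link"
    else
        echo "no PIC relocation error found"
    fi
}

# Demonstration on the ld message quoted above (hypothetical temp file).
log=$(mktemp)
cat > "$log" <<'EOF'
/usr/bin/ld: ./.libs/libecm.a(cudakernel.o): relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
EOF
diagnose_link_error "$log"
rm -f "$log"
```

If that reading is right, two things that might help are adding "-Xcompiler -fPIC" to NVCCFLAGS so nvcc's host compiler emits position-independent code, or linking the final binary with -no-pie. Neither was tried in this thread.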

Chris

2018-08-16, 18:23   #10
chris2be8
 
Sep 2009

2^5×7×11 Posts

Starting from a fresh download, running autoreconf -si, fixing configure as above and compiling it all with clang:
./configure --enable-gpu=sm21 --with-cuda-compiler=clang CC=clang | tee -a configure.out
make

It seems to work:
Code:
$ make check | tee -a make.check.out
Making check in x86_64
make[1]: Entering directory '/home/chris/ecm-3031/trunk/x86_64'
make[1]: Nothing to be done for 'check'.
make[1]: Leaving directory '/home/chris/ecm-3031/trunk/x86_64'
make[1]: Entering directory '/home/chris/ecm-3031/trunk'
make  ecm \
  test.pp1 test.pm1 test.ecm test.gpuecm
make[2]: Entering directory '/home/chris/ecm-3031/trunk'
make[2]: Nothing to be done for 'test.pp1'.
make[2]: Nothing to be done for 'test.pm1'.
make[2]: Nothing to be done for 'test.ecm'.
make[2]: Nothing to be done for 'test.gpuecm'.
make[2]: Leaving directory '/home/chris/ecm-3031/trunk'
make  check-TESTS
make[2]: Entering directory '/home/chris/ecm-3031/trunk'
make[3]: Entering directory '/home/chris/ecm-3031/trunk'
PASS: test.pp1
PASS: test.pm1
PASS: test.ecm
PASS: test.gpuecm
============================================================================
Testsuite summary for ecm 7.0.5-dev
============================================================================
# TOTAL: 4
# PASS:  4
# SKIP:  0
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0
============================================================================
make[3]: Leaving directory '/home/chris/ecm-3031/trunk'
make[2]: Leaving directory '/home/chris/ecm-3031/trunk'
make[1]: Leaving directory '/home/chris/ecm-3031/trunk'

Now I just need to check how well the GPU works.

Chris

2018-08-17, 16:45   #11
chris2be8
 
Sep 2009

9A0₁₆ Posts

The GPU works reasonably well for its age/price:
Code:
GMP-ECM 7.0.5-dev [configured with GMP 6.1.2, --enable-asm-redc, --enable-gpu, --enable-assert] [ECM]
Input number is 1522605027922533360535618378132637429718068114961380688657908494580122963258952897654000350692006139 (100 digits)
Using B1=250000, B2=1, sigma=3:3499317761-3:3499317872 (112 curves)
GPU: Block: 32x16x1 Grid: 7x1x1 (112 parallel curves)
Computing 112 Step 1 took 2812ms of CPU time / 36610ms of GPU time
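
For a rough per-curve figure, the numbers in that output can be combined as below. The sketch assumes the "16" in the block dimensions is the number of curves per block, which matches -DECM_GPU_CURVES_BY_BLOCK=16 in the earlier build log; the variable names are illustrative:

```shell
#!/bin/sh
# Figures copied from the output above.
blocks=7            # Grid: 7x1x1
curves_per_block=16 # assumed meaning of the "16" in Block: 32x16x1
gpu_ms=36610        # GPU time for stage 1

curves=$(( blocks * curves_per_block ))
per_curve_ms=$(( gpu_ms / curves ))   # integer division, rounds down

echo "curves: $curves"
echo "GPU ms per curve: $per_curve_ms"
```

That is about a third of a second of GPU time per stage 1 curve at B1=250000 for this 100-digit input.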

But I still can't get msieve to work on the GPU, even after recompiling it all with clang.

@jasonp, can you suggest anything?

Chris
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
