After looking carefully at the assembly code produced by CUDA 5.0, I found a bug in CUDA 5.0 and reported it to the Nvidia people. They are looking into it. In the meantime, [U][B]I recommend not using CUDA 5.0 to compile the GPU version of GMP-ECM.[/B][/U] (If you manage to run make check successfully with a version compiled with CUDA 5.0, I'd be interested to hear about it.)
Cyril |
I noticed that [url]https://gforge.inria.fr/scm/viewvc.php/trunk/?root=ecm[/url] hosts the ecm /trunk corresponding to version 6.4.2 of the source code, while [url]http://www.loria.fr/~zimmerma/records/ecmnet.html[/url] still lists 6.2.2 as the stable version. [URL="https://gforge.inria.fr/frs/?group_id=135&release_id=7362#ecm-_6.4.3b-title-content"]This[/URL] page offers the source code of version 6.4.3 [U]without[/U] GPU extensions, and there are other precompiled executables scattered across the forum.
What should I do if I need the complete 6.4.3 source code [U]with[/U] the GPU extensions? Luigi |
The latest release is 6.4.3, but it does not contain any GPU code. For now, the GPU version of GMP-ECM exists only in the development version. If you want to try it, you should check out the svn repository. Be aware that, being a development version, it should be used only for testing and not for important computations.
Cyril |
[QUOTE=Cyril;324284]The latest release is 6.4.3, but it does not contain any GPU code. For now, the GPU version of GMP-ECM exists only in the development version. If you want to try it, you should check out the svn repository. Be aware that, being a development version, it should be used only for testing and not for important computations.
Cyril[/QUOTE] Thank you Cyril, I understand. Luigi |
[QUOTE=Cyril;324247]After looking carefully at the assembly code produced by CUDA 5.0, I found a bug in CUDA 5.0 and reported it to the Nvidia people. They are looking into it. In the meantime, [U][B]I recommend not using CUDA 5.0 to compile the GPU version of GMP-ECM.[/B][/U] (If you manage to run make check successfully with a version compiled with CUDA 5.0, I'd be interested to hear about it.)
Cyril[/QUOTE] As of revision 2342, CUDA 5.0 can be used to compile GMP-ECM for GPU. The "bug" was on my side: I was using the carry flag inside assembly statements that were not protected by __volatile__. This did not cause any problem with CUDA 4.2, but it produced incorrect code when compiled with CUDA 5.0. Cyril |
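For readers unfamiliar with the issue Cyril describes, here is a minimal sketch of a carry chain written as inline PTX in CUDA. It is not taken from the GMP-ECM sources: the names add64_kernel and run_add64 are hypothetical, and the two-limb addition is chosen only because it is the smallest case where the carry flag matters. The first statement (add.cc.u32) sets the carry flag and the second (addc.u32) consumes it. Marking both volatile asks the compiler not to delete, move, or reorder them, which is presumably the protection that was missing before revision 2342.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Adds two 64-bit values stored as pairs of 32-bit limbs.
// out[0] receives the low limb of the sum, out[1] the high limb.
__global__ void add64_kernel(unsigned int *out,
                             unsigned int a_lo, unsigned int a_hi,
                             unsigned int b_lo, unsigned int b_hi)
{
    unsigned int lo, hi;
    // add.cc.u32 writes the carry flag; "volatile" keeps the statement
    // from being removed or moved relative to the one that reads the flag.
    asm volatile ("add.cc.u32 %0, %1, %2;" : "=r"(lo) : "r"(a_lo), "r"(b_lo));
    // addc.u32 adds the carry left behind by the previous instruction.
    asm volatile ("addc.u32 %0, %1, %2;" : "=r"(hi) : "r"(a_hi), "r"(b_hi));
    out[0] = lo;
    out[1] = hi;
}

// Hypothetical host helper: runs the kernel on one thread and copies
// the two result limbs back. Returns false on any CUDA error.
bool run_add64(unsigned int a_lo, unsigned int a_hi,
               unsigned int b_lo, unsigned int b_hi,
               unsigned int result[2])
{
    unsigned int *d_out = nullptr;
    if (cudaMalloc(&d_out, 2 * sizeof(unsigned int)) != cudaSuccess)
        return false;
    add64_kernel<<<1, 1>>>(d_out, a_lo, a_hi, b_lo, b_hi);
    // cudaMemcpy waits for the kernel and also reports launch failures.
    cudaError_t err = cudaMemcpy(result, d_out, 2 * sizeof(unsigned int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess;
}

int main()
{
    unsigned int r[2];
    // 0x00000000FFFFFFFF + 1: the low limb wraps to 0 and the carry
    // must reach the high limb.
    if (!run_add64(0xFFFFFFFFu, 0u, 1u, 0u, r)) {
        fprintf(stderr, "CUDA error\n");
        return 1;
    }
    printf("hi=%u lo=%u\n", r[1], r[0]);
    return 0;
}
```

With a correct carry chain the high limb comes out as 1 and the low limb as 0. A more conservative variant, if in doubt about a given toolchain, is to put the whole chain into a single asm statement so the compiler never sees a gap between the instruction that sets the flag and the one that reads it.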
[QUOTE=xilman;322498]A very long standing problem with the GPU version on my machine is that the second test in "make check" fails by finding the input number. I returned to the issue today.
I've not fixed the bug but have characterized it better and probably have a work-around. The failure appears to be in how stage2 is set up after running stage1 on the GPU. If only stage1 is run and a save file created, that file can be used successfully to complete the factorization. The proposed work-around should now be obvious. When I better understand the stage1 to stage2 conversion routines I'll try to fix the bug properly. My best guess is that something is not being initialised properly from a default zero value. Another trivial bug was also fixed in the latest SVN --- 2310 --- which prevented compilation under CUDA. Cyril has been informed about these developments.[/QUOTE] Can you try running make check with the GPU code enabled at SVN revision 2396? Does the error still happen? Cyril |
[QUOTE=Cyril;329114]Can you try running make check with the GPU code enabled at SVN revision 2396? Does the error still happen?
Cyril[/QUOTE] I just did it, with SVN 2478. I'm afraid the bug is still there :sad: It works fine when a factor is found in Step 1:
[code]
luigi@luigi-ubuntu:~/luigi/CUDA/gpu-ecm/trunk$ echo 2432902008176640001 | ./ecm -gpu -v 1000
GMP-ECM 7.0-dev [configured with MPIR 2.5.1, --enable-asm-redc, --enable-gpu, --enable-assert] [ECM]
Running on luigi-ubuntu
Input number is 2432902008176640001 (19 digits)
Using MODMULN [mulredc:0, sqrredc:0]
Computing batch product (of 1438 bits) of primes below B1=1000 took 0ms
GPU: compiled for a NVIDIA GPU with compute capability 2.0.
GPU: will use device 0: GeForce GTX 580, compute capability 2.0, 16 MPs.
GPU: Selection and initialization of the device took 116ms
Using B1=1000, B2=51606, sigma=3:2200948026-3:2200948537 (512 curves)
dF=32, k=6, d=240, d2=7, i0=-2
Expected number of curves to find a factor of n digits:
35 40 45 50 55 60 65 70 75 80
1.3e+11 Inf Inf Inf Inf Inf Inf Inf Inf Inf
Computing 512 Step 1 took 72ms of CPU time / 2528ms of GPU time
Throughput: 202.555 curves by second (on average 4.94ms by Step 1)
********** Factor found in step 1: 20639383
Found probable prime factor of 8 digits: 20639383
Probable prime cofactor 117876683047 has 12 digits
********** Factor found in step 1: 117876683047
Found input number N
[/code]
But when I lower the B1 parameter, I get this:
[code]
luigi@luigi-ubuntu:~/luigi/CUDA/gpu-ecm/trunk$ echo 2432902008176640001 | ./ecm -gpu 20
GMP-ECM 7.0-dev [configured with MPIR 2.5.1, --enable-asm-redc, --enable-gpu, --enable-assert] [ECM]
Input number is 2432902008176640001 (19 digits)
Using B1=20, B2=210, sigma=3:2107982176-3:2107982687 (512 curves)
Computing 512 Step 1 took 72ms of CPU time / 157ms of GPU time
********** Factor found in step 2: 2432902008176640001
Found input number N
[/code]
Luigi |
If anyone is still looking for Windows binaries of the GPU version, I posted some at [url]http://www.mersenneforum.org/showpost.php?p=338530&postcount=9[/url].
|
[QUOTE=mklasson;338532]If anyone is still looking for Windows binaries of the GPU version, I posted some at [URL]http://www.mersenneforum.org/showpost.php?p=338530&postcount=9[/URL].[/QUOTE]
Thanks for the upload! I'm having trouble. It sometimes finds the input number in stage 2, but it never finds a proper factor there. Only rarely does it find a factor in stage 1. It doesn't seem to be using stage 2 correctly. What am I doing wrong?
[CODE]
gpu_ecm.exe -n -v -gpu -gpucurves 32 -one -c 32 11000 <factorme.txt >> log.txt
pause
[/CODE]
Finding the input number in stage 2:
[CODE]
GMP-ECM 7.0-dev [configured with MPIR 2.6.0, --enable-gpu] [ECM]
Input number is (3^467-1)/2 (223 digits)
Using MODMULN [mulredc:0, sqrredc:1]
Computing batch product (of zu bits) of primes below B1=0 took 0ms
GPU: compiled for a NVIDIA GPU with compute capability 2.0.
GPU: will use device 0: GeForce GT 440, compute capability 2.1, 2 MPs.
GPU: Selection and initialization of the device took 0ms
Using B1=11000, B2=1873422, sigma=3:3900211286-3:3900211317 (32 curves)
dF=256, k=3, d=2310, d2=13, i0=-8
Expected number of curves to find a factor of n digits:
35 40 45 50 55 60 65 70 75 80
4298501 2.8e+008 2.2e+010 Inf Inf Inf Inf Inf Inf Inf
Computing 32 Step 1 took 93ms of CPU time / 2143ms of GPU time
Throughput: 14.934 curves by second (on average 66.96ms by Step 1)
Using 27 small primes for NTT
Estimated memory usage: 1800K
Initializing tables of differences for F took 0ms
Computing roots of F took 0ms
Building F from its roots took 16ms
Computing 1/F took 0ms
Initializing table of differences for G took 0ms
Computing roots of G took 0ms
Building G from its roots took 0ms
Computing roots of G took 0ms
Building G from its roots took 15ms
Computing G * H took 0ms
Reducing G * H mod F took 16ms
Computing roots of G took 0ms
Building G from its roots took 0ms
Computing G * H took 0ms
Reducing G * H mod F took 16ms
Computing polyeval(F,G) took 15ms
Computing product of all F(g_i) took 0ms
Step 2 took 78ms
********** Factor found in step 2: 3270362983146927377028671682671960437107912062834199545091118347495738861506779345734946890846481108479144446587334081489280966903453000172689482880397175599299632714862220456046073976859568442978416930175676229727557533293
Found input number N
[/CODE]
Finding a factor in stage 1:
[CODE]
GMP-ECM 7.0-dev [configured with MPIR 2.6.0, --enable-gpu] [ECM]
Input number is (3^467-1)/2 (223 digits)
Using MODMULN [mulredc:0, sqrredc:1]
Computing batch product (of zu bits) of primes below B1=0 took 0ms
GPU: compiled for a NVIDIA GPU with compute capability 2.0.
GPU: will use device 0: GeForce GT 440, compute capability 2.1, 2 MPs.
GPU: Selection and initialization of the device took 0ms
Using B1=11000, B2=1873422, sigma=3:2535707131-3:2535707162 (32 curves)
dF=256, k=3, d=2310, d2=13, i0=-8
Expected number of curves to find a factor of n digits:
35 40 45 50 55 60 65 70 75 80
4298501 2.8e+008 2.2e+010 Inf Inf Inf Inf Inf Inf Inf
Computing 32 Step 1 took 140ms of CPU time / 2143ms of GPU time
Throughput: 14.933 curves by second (on average 66.97ms by Step 1)
********** Factor found in step 1: 27836167022857
Found probable prime factor of 14 digits: 27836167022857
Probable prime cofactor ((3^467-1)/2)/27836167022857 has 210 digits
[/CODE] |
Oh, right, I noticed some full Ns in stage 2 as well, but figured I was just unlucky...
You might be right that there's some problem though. Alas, I have no idea if it's specific to my build, or how to fix it if it is. |
[QUOTE=mklasson;338546]Oh, right, I noticed some full Ns in stage 2 as well, but figured I was just unlucky...
You might be right that there's some problem though. Alas, I have no idea if it's specific to my build, or how to fix it if it is.[/QUOTE] It is definitely not specific to your build. I ran into the same "feature" with my Linux build and reported it earlier in this thread. Luigi |