mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GMP-ECM (https://www.mersenneforum.org/forumdisplay.php?f=55)
-   -   ECM for CUDA GPUs in latest GMP-ECM ? (https://www.mersenneforum.org/showthread.php?t=16480)

Ralf Recker 2012-02-15 19:52

The usage is simple (type ./gpu_ecm -h for help). Try:

./gpu_ecm -n 1 11000 < in

B1 is the last parameter. Of course, gpu_ecm will run more than one curve anyway ;)

BTW: Prime numbers don't factor very well :smile:

ET_ 2012-02-15 20:04

[QUOTE=jasonp;289470]That directory was not the trunk, [url="https://gforge.inria.fr/scm/viewvc.php/trunk/?root=ecm"]this[/url] is, complete with lots of readme files.[/QUOTE]

Thank you Jason, I skimmed around there when I noticed I had some problem...

It wasn't at all a remark about your pointer. :smile:

Luigi

ET_ 2012-02-15 20:11

[QUOTE=Ralf Recker;289473]The usage is simple (type ./gpu_ecm -h for help). Try:

./gpu_ecm -n 1 11000 < in

B1 is the last parameter. Of course, gpu_ecm will run more than one curve anyway ;)

BTW: Prime numbers don't factor very well :smile:[/QUOTE]

Then we definitely have different versions :sad:

When I try that command I get an "Error in call function: wrong number of arguments." message.
The "usage" reports "./gpu_ecm N B1 [ -s firstsigma ] [ -n number of curves ] [ -d device ]".

Thank you anyway, I'm doing my own tests to get acquainted with the new wonderful program. :smile:

Luigi

Ralf Recker 2012-02-15 20:15

[QUOTE=ET_;289479]Then we definitely have different versions :sad:

When I try that command I get an "Error in call function: wrong number of arguments." message.
The "usage" reports "./gpu_ecm N B1 [ -s firstsigma ] [ -n number of curves ] [ -d device ]".

Thank you anyway, I'm doing my own tests to get acquainted with the new wonderful program. :smile:

Luigi[/QUOTE]
I used the program in the gpu_ecm subdirectory, not the one in the gpu_ecm_cc13 subdirectory.

The command line for the other version would be:

./gpu_ecm 65798732165875434667 11000

but like I said: I would try to factor another number ;)
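Since ECM can only split composites, a quick probable-prime check on the input can save a wasted run. Below is a minimal Miller-Rabin sketch in Python. The function name and the fixed list of bases are illustrative choices only and are not part of gpu_ecm or GMP-ECM.

```python
def is_probable_prime(n, bases=(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)):
    """Miller-Rabin test: False means composite, True means probably prime."""
    if n < 2:
        return False
    # Trial division by the bases also handles the case where n is a base.
    for p in bases:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


n = 65798732165875434667  # the example input from the command line above
print(n, "is probably prime" if is_probable_prime(n) else "is composite")
```

If the check reports a probable prime, there is no point in spending GPU time on that input.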

ET_ 2012-02-15 20:21

[QUOTE=Ralf Recker;289480]I used the program in the gpu_ecm subdirectory, not the one in the gpu_ecm_cc13 subdirectory.[/QUOTE]

I guessed it when I noticed that your work was done on cc=2.0 and cc=2.1, but for some reason I thought that the repository was updated with the same code, apart from the cc details...

Thank you all, I (think I) understood how the program works. I have been a bit naive in thinking that an alpha version should maintain the same user interface as the trunk. The problem was on my side, between the monitor and the chair... Next time I will turn on my gray cells before writing.

(and please excuse me for the multiple postings)

Luigi

Ralf Recker 2012-02-15 20:46

[QUOTE=ET_;289481]I guessed it when I noticed that your work was done on cc=2.0 and cc=2.1, but for some reason I thought that the repository was updated with the same code, apart from the cc details...[/QUOTE]
You should be able to compile and run both versions on your CC 1.3 capable card (you can type make cc=2 in the gpu_ecm subdirectory if you want a Fermi build).

debrouxl 2012-02-15 21:18

I've switched to CC 2.0 compilation as well, and the default number of curves has risen from 32 to 64 - same change as xilman above.

I haven't yet seen a mention of non-power of 2 NB_DIGITS in this thread... therefore, I tried it, even if I have no idea whether it should work :smile:
Well, at least, it does not seem to fail horribly:
* the resulting executable doesn't crash;
* the size of the executable is between the size of the 512-bit version and the size of the 1024-bit version;
* on both a C211 and a C148, the 768-bit version is faster than the 1024-bit-arithmetic version:

[code]$ echo 7666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666663 | ./gpu_ecm_24 -vv -save 76663_210_ecm24_3e6 3000000
#Compiled for a NVIDIA GPU with compute capability 2.0.
#Will use device 0 : GeForce GT 540M, compute capability 2.1, 2 MPs.
#s has 4328086 bits
Precomputation of s took 0.252s
Input number is 7666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666663 (212 digits)
Using B1=3000000, firstinvd=435701810, with 64 curves
...
gpu_ecm took : 1444.690s (0.000+1444.686+0.004)
Throughput : 0.044

$ echo 7666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666663 | ./gpu_ecm_32 -vv -save 76663_210_ecm32_3e6 3000000
...
gpu_ecm took : 1814.801s (0.000+1814.797+0.004)
Throughput : 0.035[/code]

[code]for i in 16 24 32; do echo 3068628376360794912078530386432442844396649484227245118385713667577336042284107359110543525586164007547649873239035755922916136752709773803297694127 | "./gpu_ecm_$i" -vv -save "80009_213_ecm${i}_3e6" 3000000; done
...
gpu_ecm took : 865.578s (0.000+865.574+0.004)
Throughput : 0.074
...
gpu_ecm took : 1707.302s (0.000+1707.298+0.004)
Throughput : 0.037
...
gpu_ecm took : 2044.451s (0.000+2044.447+0.004)
Throughput : 0.031
[/code]


Comparison against CPU GMP-ECM running on 1 hyperthread of a SandyBridge i7, whose other 7 hyperthreads are used to the max as well:
[code]$ echo 7666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666663 | ecm -c 1 3e6
GMP-ECM 6.5-dev [configured with GMP 5.0.90, --enable-asm-redc, --enable-assert] [ECM]
Input number is 7666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666663 (211 digits)
Using B1=3000000, B2=5706890290, polynomial Dickson(6), sigma=1718921992
Step 1 took 34590ms
Step 2 took 11536ms[/code]

[code]$ echo 3068628376360794912078530386432442844396649484227245118385713667577336042284107359110543525586164007547649873239035755922916136752709773803297694127 | ecm -c 1 3e6
GMP-ECM 6.5-dev [configured with GMP 5.0.90, --enable-asm-redc, --enable-assert] [ECM]
Input number is 3068628376360794912078530386432442844396649484227245118385713667577336042284107359110543525586164007547649873239035755922916136752709773803297694127 (148 digits)
Using B1=3000000, B2=5706890290, polynomial Dickson(6), sigma=3766168691

Step 1 took 21521ms
Step 2 took 8016ms[/code]

For composites of those sizes, the GT 540M can beat one hyperthread of an i7-2670QM if the CPU is busy, but not if the CPU is idle.
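To make the comparison above easier to follow, here is a small Python sketch that turns the logged times into per-curve figures. It assumes that the "Throughput" line is curves per second (64 / 1444.690 is about 0.044, which matches the log), that the C148 runs also used 64 curves, and that the GPU time should be set against the CPU "Step 1" time only. The helper names are made up for this illustration.

```python
def throughput(curves, seconds):
    """Curves completed per second of wall-clock time."""
    return curves / seconds


def seconds_per_curve(curves, seconds):
    """Average wall-clock time spent on one curve."""
    return seconds / curves


CURVES = 64  # from "with 64 curves"; assumed for the C148 runs as well

# (input, arithmetic width in bits) -> "gpu_ecm took" time in seconds
gpu_runs = {
    ("C211", 768): 1444.690,
    ("C211", 1024): 1814.801,
    ("C148", 512): 865.578,
    ("C148", 768): 1707.302,
    ("C148", 1024): 2044.451,
}
# "Step 1 took" times from the GMP-ECM runs, in seconds
cpu_step1 = {"C211": 34.590, "C148": 21.521}

for (name, bits), secs in gpu_runs.items():
    per_curve = seconds_per_curve(CURVES, secs)
    print(f"{name} {bits}-bit: {throughput(CURVES, secs):.3f} curves/s, "
          f"{per_curve:.1f} s/curve vs {cpu_step1[name]:.1f} s CPU step 1")
```

Under these assumptions, the GPU comes out ahead of the busy hyperthread whenever its seconds-per-curve figure is below the CPU step 1 time.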

frmky 2012-02-15 23:08

Looking at the source, does NB_DIGITS really need to be a power of two? I haven't thought carefully about the memory access patterns, but it doesn't seem to need to be. And does it really need to be a compile-time constant? If the answer to both is no, then the code can adjust NB_DIGITS to the minimum needed for a particular number. Without doing any profiling, for numbers much smaller than the max allowed for a particular NB_DIGITS, I suspect a lot of time is spent spinning in the comparison function.
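If NB_DIGITS could be chosen per input, as suggested here, picking it would come down to counting machine words. A rough Python sketch follows, assuming 32-bit digits (inferred from the gpu_ecm_24 build being called the 768-bit version earlier in the thread) and ignoring any headroom bits the real arithmetic may need. min_nb_digits is a hypothetical helper, not a function from the gpu_ecm source.

```python
def min_nb_digits(n, word_bits=32):
    """Smallest number of word_bits-sized digits that can hold n (at least 1)."""
    bits = n.bit_length()
    # Ceiling division without floating point.
    return max(1, -(-bits // word_bits))


# The C148 from the timing runs above.
c148 = int(
    "3068628376360794912078530386432442844396649484227245118385713667577336042"
    "284107359110543525586164007547649873239035755922916136752709773803297694127"
)
print(c148.bit_length(), "bits ->", min_nb_digits(c148), "digits of 32 bits")
```

Whether the kernels would actually run correctly with the value computed this way is a separate question that only testing can answer.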

firejuggler 2012-02-24 21:56

I must have done something wrong... (Ubuntu 10.04 in VirtualBox, can't install the NVIDIA driver)
[code]
./gpu_ecm 155 11000 -s 11 -n 5
#Compiled for a NVIDIA GPU with compute capability 1.3.
#Will use device -1 : �@, compute capability 18927808.0 (you should compile the program for this compute capability to be more efficient), 0 MPs.
#gpu_ecm launched with :
N=155
B1=11000
curves=5
firstsigma=11

#Begin GPU computation...
#All kernels launched, waiting for results...
#All kernels finished, analysing results...
#Looking for factors for the curves with sigma=11
xfin=4
zfin=1
xunif=4
#No factors found. You shoud try with a bigger B1.
#Looking for factors for the curves with sigma=12
xfin=132
zfin=1
xunif=132
#No factors found. You shoud try with a bigger B1.
#Looking for factors for the curves with sigma=13
xfin=153
zfin=1
xunif=153
#No factors found. You shoud try with a bigger B1.
#Looking for factors for the curves with sigma=14
xfin=1
zfin=1
xunif=1
#No factors found. You shoud try with a bigger B1.
#Looking for factors for the curves with sigma=15
xfin=80
zfin=1
xunif=80
#No factors found. You should try with a smaller B1
#Results : No factor found

#Temps gpu : 0.050 init&copy=0.000 computation=0.050
[/code]

How come zfin is always 1?

Cyril 2012-02-24 23:13

Hey

As said above, the more recent GPU-ECM program is in the gpu_ecm subdirectory (and NOT gpu_ecm_cc13, even for cards of compute capability 1.3).

The NB_DIGITS stuff is still highly experimental and will, most of the time, either crash or return wrong results. For now, only the 1024-bit arithmetic (the default case) is working. But any feedback from experiments with NB_DIGITS is welcome.

Cyril

firejuggler 2012-02-25 04:14

I searched the web a little more and found that VirtualBox can't handle CUDA. So I installed a proper Ubuntu, and now it seems to work.
Another problem I found is that a found factor can be repeated... multiple times.

A short excerpt from running ./gpu_ecm -n 100 -vv 1000 < c50 >test.txt:
[code]
#Compiled for a NVIDIA GPU with compute capability 2.0.
#Will use device 0 : GeForce GTX 560, compute capability 2.1, 7 MPs.
#s has 1438 bits
Precomputation of s took 0.000s
Input number is 35969183562720316973971642318240294003662279400539 (50 digits)
Using B1=1000, firstinvd=3505000919, with 96 curves
8+64*d=2290718356455159082183201030488769570180030093771
8+64*d=7718354218174420648608739771694329402521941399016
#Begin GPU computation...
Block: 32x32x1 Grid: 3x1x1

#Looking for factors for the curves with (d*2^32) mod N = 3505000919
xfin=29784982820960298529336575761344388540459270300965
zfin=34069908628801445601848672255228786699573682666789
********** Factor found in step 1: 78069
Found composite factor of 6 digits: 78069
Composite cofactor 460735805027864030203687024532660774490031631 has 45 digits
Factor found with (d*2^32) mod N = 3505000919
xunif=29784982820960298529336575761344388540459270300965
********** Factor found in step 1: 78069
Found composite factor of 6 digits: 78069
Composite cofactor 460735805027864030203687024532660774490031631 has 45 digits
Factor found with (d*2^32) mod N = 3505000920
********** Factor found in step 1: 3
Found probable prime factor of 1 digits: 3
Composite cofactor 11989727854240105657990547439413431334554093133513 has 50 digits
Factor found with (d*2^32) mod N = 3505000921
********** Factor found in step 1: 159
Found composite factor of 3 digits: 159
Composite cofactor 226221280268681238830010329045536440274605530821 has 48 digits

[snip]
Factor found with (d*2^32) mod N = 3505001007
********** Factor found in step 1: 78069
Found composite factor of 6 digits: 78069
Composite cofactor 460735805027864030203687024532660774490031631 has 45 digits
Factor found with (d*2^32) mod N = 3505001008
********** Factor found in step 1: 78069
Found composite factor of 6 digits: 78069
Composite cofactor 460735805027864030203687024532660774490031631 has 45 digits
Factor found with (d*2^32) mod N = 3505001009
********** Factor found in step 1: 78069
Found composite factor of 6 digits: 78069
Composite cofactor 460735805027864030203687024532660774490031631 has 45 digits
Factor found with (d*2^32) mod N = 3505001010
********** Factor found in step 1: 78069
Found composite factor of 6 digits: 78069
Composite cofactor 460735805027864030203687024532660774490031631 has 45 digits
Factor found with (d*2^32) mod N = 3505001011
********** Factor found in step 1: 78069
Found composite factor of 6 digits: 78069
Composite cofactor 460735805027864030203687024532660774490031631 has 45 digits
Factor found with (d*2^32) mod N = 3505001012
********** Factor found in step 1: 1473
Found composite factor of 4 digits: 1473
Composite cofactor 24418997666476793600795412300231021047971676443 has 47 digits
Factor found with (d*2^32) mod N = 3505001013

#Looking for factors for the curves with (d*2^32) mod N = 3505001014
xfin=27705706019293510342398168717956449375083997353432
zfin=7141519035842313136767879128655301632201227912190
********** Factor found in step 1: 159
Found composite factor of 3 digits: 159
Composite cofactor 226221280268681238830010329045536440274605530821 has 48 digits
Factor found with (d*2^32) mod N = 3505001014
xunif=27705706019293510342398168717956449375083997353432
gpu_ecm took : 0.800s (0.000+0.800+0.000)
Throughput : 120.000
[/code]

Another question: is sigma random?
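On the repeated factors in the excerpt: the reported values 3, 159, 1473 and 78069 share common divisors, so one way to tidy up such output is to deduplicate it and split the values against each other with gcds. The Python sketch below does that. It is only a post-processing idea, not something gpu_ecm does itself, and coprime_base is a name made up for this example. It returns pairwise coprime pieces, which are not guaranteed to be prime in general.

```python
from math import gcd


def coprime_base(values):
    """Split the given integers into pairwise coprime pieces using gcds."""
    base = sorted({v for v in values if v > 1})
    changed = True
    while changed:
        changed = False
        for i in range(len(base)):
            for j in range(i + 1, len(base)):
                g = gcd(base[i], base[j])
                if g > 1:
                    # Replace the pair (a, b) by g, a/g and b/g, dropping 1s.
                    pieces = {g, base[i] // g, base[j] // g}
                    rest = {x for k, x in enumerate(base) if k not in (i, j)}
                    base = sorted(rest | {x for x in pieces if x > 1})
                    changed = True
                    break
            if changed:
                break
    return base


# Factors reported in the excerpt above, repeats included.
reported = [78069, 78069, 3, 159, 78069, 1473, 159]
print(coprime_base(reported))
```

For this particular list the pieces are 3, 53 and 491, and 78069 = 3 * 53 * 491, so the many "different" composite factors in the log are combinations of the same three small primes.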

