mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GMP-ECM (https://www.mersenneforum.org/forumdisplay.php?f=55)
-   -   ECM for CUDA GPUs in latest GMP-ECM ? (https://www.mersenneforum.org/showthread.php?t=16480)

Karl M Johnson 2012-01-27 21:27

ECM for CUDA GPUs in latest GMP-ECM ?
 
Greetings.

While working on gmp-ecm, I discovered a "gpu" folder in the latest svn.
Can somebody shed some light on it?
Is it a fully working gmp-ecm implementation?
Or is it only partially GPU-accelerated?
Any info will do!

xilman 2012-01-27 21:47

[QUOTE=Karl M Johnson;287461]Greetings.

While working on gmp-ecm, I discovered a "gpu" folder inside latest svn.
Can somebody shed some light on it?
Is it a fully-working gmp-ecm implementation ?
Or partially gpu-accelerated ?
Any info will do![/QUOTE]I'm playing with it but have not yet found a new factor. My current activity is to work out what's going on and then to see whether I can contribute to the effort. So far I've found a few areas where I may be able to help Cyril with development.

So far, it is Stage 1 only (as one should expect) and each bit in the product of prime powers up to B1 requires a separate kernel call. It would be easy enough to make the entire sequence of elliptic curve arithmetic operations a single kernel, but that is not obviously a good idea (think about it).
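To make the "one kernel call per bit" remark concrete, here is a rough Python sketch, not taken from the GPU-ECM sources. It builds the stage 1 multiplier as the product of the largest prime powers not exceeding B1 and then applies one step per bit, counting the steps. The names `stage1_exponent`, `run_stage1` and `make_powmod_step` are hypothetical, and the step used here is plain modular exponentiation standing in for the elliptic curve arithmetic.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # Clear every multiple of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def stage1_exponent(b1):
    """Product of the largest power of each prime that is still <= b1."""
    s = 1
    for p in primes_up_to(b1):
        pk = p
        while pk * p <= b1:
            pk *= p
        s *= pk
    return s


def make_powmod_step(base, modulus):
    """Stand-in for one curve step: square, then multiply if the bit is set."""
    def step(acc, bit):
        acc = acc * acc % modulus
        return acc * base % modulus if bit else acc
    return step


def run_stage1(b1, step, state):
    """Call step(state, bit) once per bit of the exponent, most significant first."""
    launches = 0
    for bit in bin(stage1_exponent(b1))[2:]:
        state = step(state, int(bit))
        launches += 1  # in the scheme described above, one kernel call each
    return state, launches


result, launches = run_stage1(1000, make_powmod_step(3, 1000003), 1)
print("steps for B1=1000:", launches)
```

The bit length of the multiplier grows roughly like 1.44 × B1, so under this model a run at B1=110M would need on the order of 150 million such steps.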

The code currently uses fixed kilobit arithmetic and so is limited to factoring integers somewhat under that size. One area where I may be able to help is adding flexibility in that regard. Another is to improve the underlying arithmetic primitives. A third is to reduce the (presently extortionate IMO) amount of CPU time used by busy-waiting for the kernels to complete.
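As a quick illustration of what a fixed kilobit width means in decimal digits, the following Python sketch checks whether a candidate fits. The helper `fits_fixed_width` and its `headroom_bits` parameter are hypothetical; the post only says "somewhat under" a kilobit, so the real margin is not known here and defaults to zero.

```python
WIDTH_BITS = 1024  # "kilobit" arithmetic, as described in the post


def fits_fixed_width(n, width_bits=WIDTH_BITS, headroom_bits=0):
    """True if n fits in width_bits minus an assumed safety margin."""
    return n.bit_length() <= width_bits - headroom_bits


# 2**1024 is a 309-digit number, so inputs must stay below roughly 308 digits.
print("digits in 2**1024:", len(str(2 ** WIDTH_BITS)))

for digits in (300, 310):
    n = 10 ** (digits - 1)  # smallest number with that many digits
    print(digits, "digits,", n.bit_length(), "bits, fits:", fits_fixed_width(n))
```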


Paul

Karl M Johnson 2012-01-28 06:55

Oh, alright.
It can't substitute for gmp-ecm yet.

Is Stage 2 on GPUs impractical (except on Teslas) because it requires a lot of RAM?

xilman 2012-02-10 15:35

[QUOTE=Karl M Johnson;287497]Oh, alright.
It can't substitute for gmp-ecm yet.[/QUOTE]Actually it can. I just found my first factor with gpu-ecm :surprised:[code]

Resuming ECM residue saved by pcl@anubis.home.brnikat.com with GPU-ECM 0.1
Input number is 27199999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999 (275 digits)
Using B1=110000000-0, B2=110000000-776278396540, polynomial Dickson(30), A=112777948379516601562499999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999998
Step 1 took 0ms
Step 2 took 176954ms
********** Factor found in step 2: 7219650603145651593481420276356225303436099
Found probable prime factor of 43 digits: 7219650603145651593481420276356225303436099
Composite cofactor 3767495339476507669490528975999036413199084363632138583182977316467718682572689799651214383863597597569613063674073098348731470273762390945944125050829513627232967008803673015815350396060271580465332444072090424114938478394067646101 has 232 digits[/code]
A total of 1792 curves were run at B1=110M and the factor was found on the 1349th stage 2 run. I'm running the remainder in case another factor can be found.

Each stage 1 took 70 seconds on a GTX-460. The latest ECM takes 679 seconds per stage 1 on a single core of a 1090T clocked at 3264MHz, so the GPU version is close to 10 times faster (679/70 ≈ 9.7) in this situation.
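For anyone who wants to check the numbers, here is a small Python sketch that tests whether the reported p43 divides the input and recomputes the stage 1 ratio from the two timings. It assumes the input in the log is 271 followed by 272 nines, i.e. 272·10^272 − 1, as the 275-digit count suggests; the helper name `split_factor` is made up for this example.

```python
def split_factor(n, p):
    """Return the cofactor n // p if p divides n exactly, otherwise None."""
    q, r = divmod(n, p)
    return q if r == 0 else None


# Assumed closed form of the 275-digit input shown in the log.
N = 272 * 10 ** 272 - 1
P43 = 7219650603145651593481420276356225303436099

cofactor = split_factor(N, P43)
print("digits in N:", len(str(N)))
print("p43 divides N:", cofactor is not None)
if cofactor is not None:
    print("digits in cofactor:", len(str(cofactor)))

# Stage 1 timings quoted above: CPU seconds divided by GPU seconds.
speedup = 679 / 70
print(f"stage 1 speedup: {speedup:.1f}x")
```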

Paul

Brain 2012-02-10 15:46

GPU ECM 0.1
 
Could anybody provide a (link to a) (Windows 64) binary for GPU-ECM 0.1?

xilman 2012-02-10 16:08

[QUOTE=xilman;288906]Actually it can. I just found my first factor with gpu-ecm[/QUOTE]Surprise is no longer adequate, and I'm forced to resort to astonishment :shock:[code]

Using B1=110000000-0, B2=110000000-776278396540, polynomial Dickson(30), A=113233923912048339843749999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999998
Step 1 took 0ms
Step 2 took 176408ms
********** Factor found in step 2: 2315784840580190375316972295830305082761
Found probable prime factor of 40 digits: 2315784840580190375316972295830305082761
Composite cofactor 11745478044145667127387304121681137574169838448099828222193156280305822325912582394902125137121744003039884215185197839759773264126893037178852537820600462333360764644042755609744438099938488940900666235577910910744169710591419476281159 has 236 digits[/code] took another 18 curves.

If I was unlucky not to find the p43 earlier than this, given the amount of ECM work performed, I was especially unlucky not to find the p40. The candidate number, GW(10,272), had no previously known factors despite a complete t40 having been run through the ECMNET server here and clients around the world.

The cofactor is now c193 so it's worth seeing whether the remaining curves will find anything.
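The "unlucky" remark can be put in rough numbers with a short Python sketch. It assumes the usual convention that a complete t-level corresponds to the expected number of curves needed to find a factor of that size, and it models the misses with a simple exponential; the function name `miss_probability` is hypothetical, and no actual curve counts from this run are used.

```python
from math import exp


def miss_probability(curves_run, expected_curves):
    """Approximate chance that ECM has NOT yet found a factor of the given size."""
    return exp(-curves_run / expected_curves)


# Work is expressed in multiples of one complete t-level (expected_curves = 1).
for multiple in (1, 2, 3):
    print(f"{multiple} x t-level: miss chance ~ {miss_probability(multiple, 1):.3f}")
```

Under this model a factor survives one complete t-level a bit more than a third of the time, so missing a p40 after a t40 is unlucky but not extraordinary; surviving several multiples of that work is much rarer.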


Paul

xilman 2012-02-10 16:23

[QUOTE=Brain;288909]Could anybody provide a (link to a) (Windows 64) binary for GPU-ECM 0.1?[/QUOTE]Not me.

pinhodecarlos 2012-02-10 16:25

[QUOTE=xilman;288913]Not me.[/QUOTE]

And for linux?:bow:

xilman 2012-02-10 16:56

[QUOTE=pinhodecarlos;288914]And for linux?:bow:[/QUOTE]If you have Linux you can build from the SVN sources as easily as I can.

The process really is very straightforward and you'll end up with something which doesn't carry the risk of the Linux equivalent of DLL-hell.

If you really are not lazy(*) enough to build your own, I could make available the binary I use. No guarantees that it will work, or even run, on any other Linux system. It almost certainly won't work optimally unless you have exactly the same environment as me.

Paul

* Sometimes it's much better to do some work ahead of time to remove the need to do much more work later. That's true laziness.

Karl M Johnson 2012-02-10 21:30

Windows binary wanted:smile:

R.D. Silverman 2012-02-11 00:07

[QUOTE=xilman;288906]Actually it can. I just found my first factor with gpu-ecm :surprised:[/QUOTE]

Awesome.

Is the code specific to a particular GPU? How portable is it?


All times are UTC.