mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   Sieving k * 2^n +- c with Nvidia GPU's for fixed k (https://www.mersenneforum.org/showthread.php?t=21506)

diep 2016-08-15 21:42

Sieving k * 2^n +- c with Nvidia GPU's for fixed k
 
Status update.

Basically it comes down to running the BSGS (baby-step giant-step) algorithm on the GPU to crack the discrete logarithm.
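
For readers unfamiliar with the approach: a prime p divides k * 2^n - 1 exactly when 2^n = k^-1 (mod p), so sieving for a fixed k reduces to a discrete logarithm with base 2. Below is a minimal CPU sketch in Python of textbook baby-step giant-step under that reduction. It is not the GPU kernel discussed in this thread, and the function names `bsgs` and `first_n_divisible` are made up for illustration.

```python
from math import isqrt

def bsgs(base, target, p, bound):
    """Smallest x in [0, bound) with base**x == target (mod p), or None.

    Assumes p is prime and base is not a multiple of p.
    """
    if bound <= 0:
        return None
    target %= p
    m = isqrt(bound - 1) + 1          # ceil(sqrt(bound)), so m*m >= bound

    # Baby steps: remember the smallest j for each value base^j, 0 <= j < m.
    baby = {}
    value = 1 % p
    for j in range(m):
        baby.setdefault(value, j)
        value = value * base % p

    # Giant steps: multiply the target by base^(-m) until it hits the table.
    factor = pow(base, -m, p)         # modular inverse power, Python 3.8+
    gamma = target
    for i in range(m):
        j = baby.get(gamma)
        if j is not None:
            x = i * m + j
            return x if x < bound else None
        gamma = gamma * factor % p
    return None

def first_n_divisible(k, p, n_max):
    """Smallest n in [0, n_max) such that p divides k * 2^n - 1, or None.

    Assumes p is an odd prime that does not divide k.
    """
    k_inverse = pow(k, -1, p)         # solve 2^n == k^-1 (mod p)
    return bsgs(2, k_inverse, p, n_max)
```

For example, `first_n_divisible(3, 11, 100)` gives 2, since 3 * 2^2 - 1 = 11. A real sieve would run this for a long list of primes p and cross off every n in range that it finds.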

I have spent the past month and a little bit more developing on an Nvidia GTX580.

Though I am currently testing it for k * 2^n - 1, it should work with minor changes for similar formulas with a single k.

It keeps speeding up, though getting it faster goes step by step. Initially it was slower than newpgen.

Right now, for an n-range of 7 million, it's about 17x faster than newpgen on a single CPU core here. For smaller n-ranges the advantage grows linearly: for an n-range of about 4 million it'll be about 30x faster than newpgen. This is on a GTX580.
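
The two figures quoted above are consistent with a speedup that is inversely proportional to the n-range. The sketch below just does that arithmetic. The function name and the proportionality model are assumptions made for illustration, not measurements beyond the two numbers in the post.

```python
def estimated_speedup(n_range, ref_range=7_000_000, ref_speedup=17.0):
    """Extrapolate the speedup over newpgen, assuming it scales as 1 / n_range.

    The reference point (17x at an n-range of 7 million) is the figure from the post.
    """
    return ref_speedup * ref_range / n_range

# 17 * 7 / 4 = 29.75, which matches the "about 30x" quoted for 4 million.
print(estimated_speedup(4_000_000))
```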

I am trying to speed it up, and am basically busy cutting down on cache usage. I hope to post code within a few weeks, maybe sooner a version that works a little.

On a remote GTX980 in the States I tested a little, yet it will require a totally new kernel. Those pictures they draw of the Kepler on homepages online are marketing pictures. Right now it is considerably slower than on the GTX580, yet a special kernel using 128 streamcores in a single kernel, instead of 32, should speed it up by nearly a factor of 4, though for now a factor of 2 would be nice...

Fermi, Maxwell and every GPU generation will require its own kernel.

Right now there is only the Fermi kernel. That means it of course runs on all those GPUs, yet it doesn't benefit from the Maxwell architecture right now. That will come!

Fermi (4xx and 5xx series) has 32 streamcores in a single multiprocessor,
the 6xx series has 192 streamcores in a single multiprocessor (big problem),
and Maxwell has 128 streamcores in a single multiprocessor. Where Fermi and Maxwell have a similar L1 datacache, the 6xx series has a weirdo design of its own.
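
To make the "nearly factor 4" estimate concrete, here is a small Python sketch that tabulates the per-multiprocessor streamcore counts quoted above and relates them to the 32-thread warp size of Nvidia GPUs. The names are hypothetical, and the counting is deliberately simplified: real throughput also depends on registers, cache and latency hiding, none of which is modelled here.

```python
# Streamcores per multiprocessor, as quoted in the post above.
CORES_PER_MULTIPROCESSOR = {"Fermi": 32, "6xx series": 192, "Maxwell": 128}

WARP_SIZE = 32  # threads per warp on Nvidia GPUs

def warps_per_multiprocessor(arch):
    """Warps needed to give every streamcore of one multiprocessor a thread."""
    return CORES_PER_MULTIPROCESSOR[arch] // WARP_SIZE

def core_ratio(new_arch, old_arch):
    """Ratio of streamcores per multiprocessor between two generations."""
    return CORES_PER_MULTIPROCESSOR[new_arch] / CORES_PER_MULTIPROCESSOR[old_arch]

for arch in CORES_PER_MULTIPROCESSOR:
    print(arch, warps_per_multiprocessor(arch))

# 128 / 32 = 4: the upper bound behind the "nearly factor 4" hope for Maxwell.
print(core_ratio("Maxwell", "Fermi"))
```

A kernel written around one warp per multiprocessor fits Fermi but leaves three quarters of a Maxwell multiprocessor idle under this simple count, which is one way to read why each generation needs its own kernel.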

Using: primesieve and intrinsics from TheJudger.

To be continued. What's a good spot to upload working source code so that everyone can download it?

Regards,
Vincent Diepeveen

diep 2016-08-29 23:27

Looking for someone with a Kepler-series GPU to benchmark.

The new kernel is regrettably not faster than the old kernel on my GTX580, yet it is now nearly a whopping 3x faster on a GTX980, where the old kernel was slower.

frmky 2016-08-31 21:03

I can try it on a K20c...

diep 2016-08-31 22:06

[QUOTE=frmky;441247]I can try it on a K20c...[/QUOTE]

Cool, give me an email address. I didn't figure out yet how to attach files here. It's 'nearly' in production state now.

Or send me a mail at diep at s4all dot nl, then I'll return the source.
Consider it to be under GPL3. Spread the word.

Joe O 2016-09-23 16:26

I can run it on a GTX 750 TI if you send it to me.

diep 2016-09-23 19:19

Oops my email address is: diep @ xs4all . nl

Forgot to write the x here before - apologies for that...

