|
|
#1 |
|
Sep 2006
The Netherlands
3·269 Posts |
Status update.
It basically comes down to running the BSGS (baby-step giant-step) algorithm on the GPU to crack the discrete logarithm. I have spent the past month and a bit developing on an Nvidia GTX580. I am currently testing it for k * 2^n - 1, but it should work with minor changes for similar formulas with a single k.

It keeps speeding up, though getting it faster goes step by step. Initially it was slower than newpgen. Right now, for an n-range of 7 million, it is about 17x faster than newpgen on a single CPU core here. For smaller n-ranges it is linearly faster: for an n-range of about 4 million it will be about 30x faster than newpgen. This is on a GTX580. I am trying to speed it up further; basically I am busy reducing cache usage. I hope to post code that works a little within a few weeks, maybe sooner.

On a remote GTX980 in the States I tested a little, yet it will require a totally new kernel. Those pictures they draw of the Kepler on homepages online are marketing pictures. Right now it is considerably slower than on the GTX580, yet with a special kernel using 128 streamcores in a single kernel instead of 32 it should speed up by nearly a factor of 4, though for now a factor of 2 would be nice...

Fermi, Maxwell and every other GPU generation will require its own kernel. Right now there is only the Fermi kernel. That means it does of course run on all those GPUs, yet it doesn't benefit from the Maxwell architecture right now. Will come! Fermi (4xx and 5xx series) has 32 streamcores in a single multiprocessor, the 6xx series has 192 streamcores in a single multiprocessor (big problem) and Maxwell has 128 streamcores in a single multiprocessor. Where Fermi and Maxwell have a similar L1 datacache, the 6xx series has a weirdo design of its own.

Using: primesieve and intrinsics from TheJudger.

To be continued. What's a good spot to upload working source code to so everyone can download it?

Regards, Vincent Diepeveen |
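
For readers unfamiliar with the idea, here is a minimal CPU-side sketch in Python of how BSGS can be used to sieve k * 2^n - 1: for an odd prime p not dividing k, p divides k * 2^n - 1 exactly when 2^n ≡ k^(-1) (mod p), which is a discrete logarithm problem. This is not the GPU kernel described above; the function name `bsgs_first_n` and its parameters are hypothetical and chosen only for illustration, and it assumes Python 3.8+ for modular inverses via `pow`.

```python
from math import isqrt

def bsgs_first_n(k, p, n_min, n_max):
    """Return the smallest n in [n_min, n_max) with p | k*2^n - 1, else None.

    Assumes p is an odd prime (hypothetical helper, illustration only).
    """
    span = n_max - n_min
    if span <= 0 or k % p == 0:
        return None  # empty range, or k*2^n - 1 == -1 (mod p) for every n

    m = isqrt(span - 1) + 1  # ceil(sqrt(span)), so m*m covers the range

    # Baby steps: remember the smallest j < m for each value 2^j mod p.
    baby = {}
    value = 1
    for j in range(m):
        baby.setdefault(value, j)
        value = value * 2 % p

    # We want 2^(n_min + i*m + j) == k^-1, i.e. 2^j == k^-1 * 2^-(n_min + i*m).
    inv2 = pow(2, -1, p)
    gamma = pow(k, -1, p) * pow(inv2, n_min, p) % p
    giant = pow(inv2, m, p)  # multiplying by this advances i by one

    # Giant steps: the first hit is the smallest solution at or above n_min.
    for i in range(m):
        j = baby.get(gamma)
        if j is not None:
            n = n_min + i * m + j
            return n if n < n_max else None
        gamma = gamma * giant % p
    return None
```

For example, `bsgs_first_n(3, 11, 1, 100)` finds n = 2, since 3 * 2^2 - 1 = 11. A sieve would call something like this once per prime and then strike out that n, plus every further n spaced by the multiplicative order of 2 mod p, from the candidate list. The table of baby steps is the memory cost that trades against the number of giant steps.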
|
|
|
|
|
#2 |
|
Sep 2006
The Netherlands
807₁₀ Posts |
Looking for someone with a Kepler series GPU to benchmark.
The new kernel is regrettably not faster than the old kernel on my GTX580, yet it is now nearly a whopping 3x faster on a GTX980, where the old kernel was slower. |
|
|
|
|
|
#3 |
|
Jul 2003
So Cal
A67₁₆ Posts |
I can try it on a K20c...
|
|
|
|
|
|
#4 |
|
Sep 2006
The Netherlands
3×269 Posts |
|
|
|
|
|
|
#5 |
|
Aug 2002
525₁₀ Posts |
I can run it on a GTX 750 Ti if you send it to me.
|
|
|
|
|
|
#6 |
|
Sep 2006
The Netherlands
3×269 Posts |
Oops my email address is: diep @ xs4all . nl
Forgot to write the x here before - apologies for that... |
|
|
|