mersenneforum.org  

mersenneforum.org > Prime Search Projects > Riesel Prime Search
2013-09-25, 17:54   #12
diep
Sep 2006
The Netherlands
1010100010₂ Posts

Holy smoke!

Even if I had the time to get something going on the Teslas here, I wouldn't get to those ranges anytime soon :)

Toying with the alphabet now, especially (un)abcd :)

2013-09-25, 23:43   #13
diep
Sep 2006
The Netherlands
2A2₁₆ Posts

Moved it from sieving to testing.

Using sllr64 right now on the CPU hardware here (Xeon L5420);
it tested as the fastest option on these CPUs.

I remember Jean Penne was busy with some GPGPU software. How has that progressed lately? Does Riesel Prime Search already have a public version of it?

Got some Teslas here. They are idle now :)

Last fiddled with by diep on 2013-09-25 at 23:44

2013-09-26, 04:04   #14
VBCurtis
"Curtis"
Feb 2005
Riverside, CA
2²·7·149 Posts

CUDA-LLR is available, and in my experience stable. It only uses power-of-2 FFT sizes, and speed improves with larger exponents. The main FFT jump we care about is just over 3M for k=69, so your Teslas would be most useful in the upper 2M range, or over 5M (relative to CPU workers, that is).

Check the hardware/GPU computing forum. I didn't see the thread when I glanced, but I've been running the program for over a year and even found a prime for k=5 with it in the 3-megabit range.
-Curtis
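
To illustrate why a power-of-2-only FFT makes the GPU most attractive near the top of each range, here is a minimal Python sketch. The helper name pow2_fft_length and the value of 12 bits per FFT word are assumptions made purely for illustration; the real limit in CUDA-LLR depends on k, the FFT length, and the program's round-off margins.

```python
import math

def pow2_fft_length(n_bits: int, bits_per_word: float) -> int:
    """Smallest power-of-two FFT length whose words can hold n_bits."""
    words_needed = math.ceil(n_bits / bits_per_word)
    # Round words_needed up to the next power of two.
    return 1 << (words_needed - 1).bit_length()

# Hypothetical value for illustration only, not CUDA-LLR's actual limit.
ASSUMED_BITS_PER_WORD = 12.0

for n in (2_000_000, 2_900_000, 3_100_000, 3_200_000, 5_000_000):
    print(n, pow2_fft_length(n, ASSUMED_BITS_PER_WORD))
```

Under that assumed word size, every exponent from about 1.57M to 3.14M bits pays for the same 262144-point transform, and the length doubles to 524288 just above that. A test near the top of a range therefore gets the most bits out of each transform, which is why the upper 2M range (or the upper part of the next range) suits a power-of-2 implementation best.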

2013-09-27, 01:19   #15
diep
Sep 2006
The Netherlands
2·337 Posts

Quote:
Originally posted by VBCurtis
CUDA-LLR is available, and in my experience stable. It only uses power-of-2 FFT sizes, and speed improves with larger exponents. The main FFT jump we care about is just over 3M for k=69, so your Teslas would be most useful in the upper 2M range, or over 5M (relative to CPU workers, that is).

Check the hardware/GPU computing forum. I didn't see the thread when I glanced, but I've been running the program for over a year and even found a prime for k=5 with it in the 3-megabit range.
-Curtis

Thx Curtis, I downloaded it. Will try to get it to work!

Is the power-of-2 restriction the only 'disadvantage' compared to the SSE2 IBDWT I have running currently?
I seem to remember that my own FFT implementation, which also used power-of-2 sizes, had a few other disadvantages (to put it politely) :)

The Teslas I have here are 0.5 Tflop in theory. Of course that is always 2x more than the card can do in terms of instructions, since the figure assumes you can use multiply-add, and I'm not sure whether this FFT can. Looking forward to benchmarking it with this code!

Note that on Nvidia it would be possible to run a different code stream on each SIMD. I don't know whether it can still deliver 0.5 Tflop doing that, but if it can, it should be easier to get rid of the power-of-2 FFT size? Maybe?

Last fiddled with by diep on 2013-09-27 at 01:26

2013-09-27, 02:55   #16
VBCurtis
"Curtis"
Feb 2005
Riverside, CA
2²·7·149 Posts

I don't recall what msft (user name, not company) said about the limitations of his code. I believe he stopped development shortly after he got it working, in favor of an OpenCL version for the other half of the GPUniverse.

I happen to have plenty of work available near 3M, so I haven't considered alternatives.

2014-01-05, 15:03   #17
diep
Sep 2006
The Netherlands
1242₈ Posts

Hi,

I found a prime; maybe someone wants to verify that it is prime.
How do I report it properly?

69 * 2^2649939 - 1 was found prime here!

Thanks,
Vincent

diep@xs4all.nl in case I don't respond quickly on the forum.

2014-01-05, 15:56   #18
Kosmaj
Nov 2003
2·1,811 Posts

Hi diep,

Congratulations!
To report it, please create a new prover's code that includes RPS, Psieve, Srsieve, and the software you used to prove it prime, such as LLR.

Thanks!

2014-01-05, 21:11   #19
diep
Sep 2006
The Netherlands
674₁₀ Posts

Tried all that; let me know if it worked out OK. Thanks!

Paul Underwood verified it with pfgw in the meantime and confirms it.

2014-01-05, 21:24   #20
pinhodecarlos
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
3³×13² Posts

Quote:
Originally posted by diep
Tried all that; let me know if it worked out OK. Thanks!

Paul Underwood verified it with pfgw in the meantime and confirms it.

It's correct.
http://primes.utm.edu/primes/page.php?id=116841

Last fiddled with by pinhodecarlos on 2014-01-05 at 21:25

2014-01-05, 22:20   #21
diep
Sep 2006
The Netherlands
1010100010₂ Posts

Thanks for verifying!

2014-01-14, 17:21   #22
diep
Sep 2006
The Netherlands
2·337 Posts

On the L5420 Xeon machines I have here at home, I saw a pretty big jump in testing time when moving up from roughly 2.74 Mbit to 2.76 Mbit.

Testing times increased from roughly 6123 seconds to 7689 seconds.

Each CPU has 12 MB of L2 cache,
so to speak 3 MB per core.
It seems the transform is causing it, not the hardware.

I'm not sure about the internal transform size.

If it stores 2.75 Mbit and we assume 18 bits per double, then it would require
an array of size 2.75 Mbit * 64 / (18 * 8 bits per byte) = 2.75 * 8 / 18 = 1.2 MB

Even double that would easily fit in L2.
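
The estimate above can be checked with a few lines of Python. This is only a sketch: the 18 bits per double is the assumption made in the post, the function name fft_array_bytes is hypothetical, and any extra working memory of the real transform (padding, twiddle tables) is not modelled.

```python
import math

def fft_array_bytes(n_bits, bits_per_double=18, bytes_per_double=8):
    """Bytes needed when n_bits are spread over doubles carrying bits_per_double each."""
    doubles = math.ceil(n_bits / bits_per_double)
    return doubles * bytes_per_double

# 12 MB of L2 per CPU divided over 4 cores, as described in the post.
L2_PER_CORE = 12_000_000 // 4

size = fft_array_bytes(2_750_000)
print(f"array size: {size / 1e6:.2f} MB")
print("double that fits in 3 MB:", 2 * size < L2_PER_CORE)
```

With these numbers the array comes to about 1.22 MB, so even twice that stays below 3 MB, consistent with the conclusion that cache size alone does not explain the jump.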

At what Mbit level can I expect the next big jump like that?

Is it at double this size, at 5.5 Mbit?

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.