mersenneforum.org  

Old 2013-05-16, 00:30   #771
KyleAskine
 
Oct 2011
Maryland

122₁₆ Posts

I am going to start it tonight, and will let everyone know at around 5am tomorrow morning (EDT).
Old 2013-05-16, 09:18   #772
KyleAskine
 
Oct 2011
Maryland

2·5·29 Posts

Cayman (6950 with BIOS changed to 6970):

Selftest statistics
number of tests 318416
successful tests 318416

selftest PASSED!
Old 2013-05-16, 09:31   #773
Bdot
 
Nov 2010
Germany

597₁₀ Posts

Quote:
Originally Posted by KyleAskine
Cayman (6950 with BIOS changed to 6970):

Selftest statistics
number of tests 318416
successful tests 318416

selftest PASSED!

Very nice, thanks!

I will do some "paperwork" and packaging for releasing v0.13. Until then, 0.13pre5 should be stable enough for "productive work".
Old 2013-05-16, 10:23   #774
KyleAskine
 
Oct 2011
Maryland

122₁₆ Posts

Also, just so you know, it passed on my 6570 as well (but it took 15 hours(!!))!


Selftest statistics
number of tests 318416
successful tests 318416

selftest PASSED!

Last fiddled with by KyleAskine on 2013-05-16 at 10:24
Old 2013-05-16, 15:06   #775
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2168₁₀ Posts

Passed on my 7770.

Selftest statistics
number of tests 287358
successful tests 287358

selftest PASSED!

EDIT: Strange, did I miss some tests?

Last fiddled with by kracker on 2013-05-16 at 15:06
Old 2013-05-16, 18:41   #776
Bdot
 
Nov 2010
Germany

3×199 Posts

Quote:
Originally Posted by kracker
Passed on my 7770.

Selftest statistics
number of tests 287358
successful tests 287358

selftest PASSED!

EDIT: Strange, did I miss some tests?

Thanks also for your results. No, you did not miss anything. You tested with GPU sieving, Kyle tested with CPU sieving. There are 3 less relevant kernels that I did not enable for GPU sieving. One is used for TF from 2^58 to 2^60, which is the only range where my Montgomery implementation shines. The others are older kernels that are too slow to be selected in any real TF. I should probably remove them.

BTW, I successfully tested both GPU and CPU sieving for both HD5770 and HD7850.

On the 5770, I found GPUSievePrimes of 60k-70k to be the optimum. The 7850 seems to peak at ~110k. This also shows that the GPU sieve on VLIW5 is not very efficient yet (only 40% occupation on average). But this is something for the next mfakto version(s).
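
For readers who have not looked at TF before, here is a minimal Python sketch of what trial factoring a Mersenne number over a bit range involves, and why the number of sieve primes is a tuning trade-off. It is an illustration only, not mfakto's kernels: the names `small_primes`, `trial_factor` and `sieve_primes` are made up for this example, and `sieve_primes` is only assumed to play a role similar to GPUSievePrimes. It relies on two standard facts: any factor q of 2^p - 1 (p an odd prime) has the form q = 2kp + 1, and q ≡ ±1 (mod 8).

```python
def small_primes(count):
    """Return the first `count` odd primes (3, 5, 7, ...) by trial division."""
    primes = []
    n = 3
    while len(primes) < count:
        if all(n % p for p in primes if p * p <= n):
            primes.append(n)
        n += 2
    return primes


def trial_factor(p, bit_lo, bit_hi, sieve_primes=10):
    """Look for factors q of 2**p - 1 with 2**bit_lo <= q < 2**bit_hi.

    Returns (factors, tested), where `tested` counts the candidates that
    survived sieving and needed the expensive modular exponentiation.
    `sieve_primes` is a hypothetical knob: how many small odd primes are
    used to discard composite candidates before testing.
    """
    sieve = small_primes(sieve_primes)
    k_lo = max(1, -(-((1 << bit_lo) - 1) // (2 * p)))  # ceil((2^lo - 1) / 2p)
    k_hi = ((1 << bit_hi) - 2) // (2 * p)              # largest k with q < 2^hi
    factors, tested = [], 0
    for k in range(k_lo, k_hi + 1):
        q = 2 * k * p + 1
        if q % 8 not in (1, 7):
            continue  # factors of Mersenne numbers are +-1 mod 8
        if any(q % s == 0 and q != s for s in sieve):
            continue  # sieved out: q has a small prime divisor
        tested += 1
        if pow(2, p, q) == 1:  # q divides 2**p - 1
            factors.append(q)
    return factors, tested


if __name__ == "__main__":
    # M11 = 2047 = 23 * 89, searched from 2^4 to 2^7
    print(trial_factor(11, 4, 7, sieve_primes=0))   # ([23, 89], 3)
    print(trial_factor(11, 4, 7, sieve_primes=10))  # ([23, 89], 2)
```

More sieve primes mean fewer candidates reach the exponentiation step, but the sieve itself costs time, so the best value depends on how fast each part runs on a given device. That is presumably why the optimum differs between the cards mentioned above.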
Old 2013-05-16, 18:47   #777
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³·271 Posts

Quote:
Originally Posted by Bdot
Thanks also for your results. No, you did not miss anything. You tested with GPU sieving, Kyle tested with CPU sieving. There are 3 less relevant kernels that I did not enable for GPU sieving. One is used for TF from 2^58 to 2^60, which is the only range where my Montgomery implementation shines. The others are older kernels that are too slow to be selected in any real TF. I should probably remove them.

BTW, I successfully tested both GPU and CPU sieving for both HD5770 and HD7850.

On the 5770, I found GPUSievePrimes of 60k-70k to be the optimum. The 7850 seems to peak at ~110k. This also shows that the GPU sieve on VLIW5 is not very efficient yet (only 40% occupation on average). But this is something for the next mfakto version(s).

On my APU, ~60k seems to be best; I have yet to tinker with my 7770... Are you planning more improvements, or is it benchmark time?
Old 2013-05-16, 19:07   #778
Bdot
 
Nov 2010
Germany

3×199 Posts

Quote:
Originally Posted by kracker
On my APU, ~60k seems to be best; I have yet to tinker with my 7770... Are you planning more improvements, or is it benchmark time?

I found a small, cheap improvement for VLIW5 (e.g. your APU), but this is the last change I will make to the TF part. I'm now checking a few of the other features and have already fixed the wait time display when CPU sieving.

You can start tweaking your 7770 and send the benchmark to James; the last build is just around the corner and will look the same for GCN.
Old 2013-05-16, 20:15   #779
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

878₁₆ Posts

Quote:
Originally Posted by Bdot
I found a small, cheap improvement for VLIW5 (e.g. your APU), but this is the last change I will make to the TF part. I'm now checking a few of the other features and have already fixed the wait time display when CPU sieving.

You can start tweaking your 7770 and send the benchmark to James; the last build is just around the corner and will look the same for GCN.

Looks like 10k is best for my 7770. Although, do I need to set it to 82485 for the benchmark? James's benchmark form says: "(set GPUSievePrimes=82485)"
Old 2013-05-16, 21:04   #780
Bdot
 
Nov 2010
Germany

3×199 Posts

Quote:
Originally Posted by kracker
Looks like 10k is best for my 7770. Although, do I need to set it to 82485 for the benchmark? James's benchmark form says: "(set GPUSievePrimes=82485)"

10k or 110k? If it's really 10k, then something is wrong ... Which TF test are you doing, what are GPUSieveSize and GPUSieveProcessSize, and what's the reported GHz-days/day? I think you should be getting around 140-145 GHz-days/day for M62... 2^70 to 2^73, or 125-130 for M62... 2^73 to 2^82. Yes, GCN has a 10% performance step right there.

And for the reporting to James ... I don't see why you should not use the fastest possible value. A fixed value would cripple devices that have their optimal point somewhere else ... Maybe James can comment on that.

Last fiddled with by Bdot on 2013-05-16 at 21:18 Reason: James' benchmark
Old 2013-05-16, 21:11   #781
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³×271 Posts

Quote:
Originally Posted by Bdot
10k or 110k? If it's really 10k, then something is wrong ... Which TF test are you doing, what are GPUSieveSize and GPUSieveProcessSize, and what's the reported GHz-days/day? I think you should be getting around 140-145 GHz-days/day for M62... 2^70 to 2^73, or 125-130 for M62... 2^73 to 2^82. Yes, GCN has a 10% performance step right there.

100k. Sorry.

And yes, at ~M62, I'm getting 148 GHz days.

EDIT: When I go to the 332M range 73 to 74 bits, it drops to ~120 GHz...

Last fiddled with by kracker on 2013-05-16 at 21:14

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.