mersenneforum.org  

Old 2013-09-23, 14:26   #166
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³×271 Posts

Quote:
Originally Posted by LaurV View Post
it would be a 4THzD difference, which might take time (talking about DC, not about TF where thousands of GHzD/D are possible), even if I work full power and you are sleeping...
That's what I mean. I'm not planning on stopping anytime soon! Also, I should be in the top 2 in DC there soon.

Quote:
Originally Posted by LaurV View Post
But I just looked to my other stats, which are the GIMPS' Lifetime, and thinking that you may be talking about this:
Attachment 10294
which should be a trifle, just few days of full throttle to put you far behind...

Still thinking about it... But for me now it seems to be more important to do some P-1, because I will be soon pushed out of Lifetime's Top100, where I am trying to stay, once I was able to reach it... hehe. So, my cards will do some P-1 for the time being. Let you get some more advance, hehe ... you know what the problem with roosters is?
Fine, so be it! I was really curious about your power and whether it was really true...
Quote:
Joking apart, I really like the new clLucas! I certainly have to play more with it! I think you guys did a wonderful job! kotgw and kudos!
I didn't do anything; msft did everything.
Also, a 32-bit build should be up in a few days at /clLucas.
Old 2013-09-23, 14:42   #167
kracker

"Mr. Meeseeks"
Jan 2012
California, USA

2³×271 Posts

@Robish
+1. I would suggest doing a few DCs just to make sure.

Quote:
(quote from the web, I spent some time to search for this old joke, I can never tell it properly in English, but I use to tell it to every young engineer who come to work for the company, when he hits the upper threshold of the door's frame with his head... , buddy, you are the young rooster here!)
Challenge accepted.
Old 2013-09-23, 23:55   #168
kracker

"Mr. Meeseeks"
Jan 2012
California, USA

2³×271 Posts

The x86 (32-bit) build is up.
Old 2013-09-24, 10:44   #169
Robish

"Rob Gahan"
Aug 2013
Ireland

2²×3² Posts

Quote:
Originally Posted by Bdot View Post
For me (rather for a 7850), -threads 64 was fastest. Slightly behind was 128, and a lot slower: 256.
Thanks, mine's a 7870, so they should behave similarly. I'll give it a shot. Thanks, Bdot.
Old 2013-09-24, 15:19   #170
kracker

"Mr. Meeseeks"
Jan 2012
California, USA

100001111000₂ Posts

Thanks to Xyzzy, clLucas is hosted here:
http://www.mersenneforum.org/cllucas/
Old 2013-09-24, 21:56   #171
Robish

"Rob Gahan"
Aug 2013
Ireland

2²·3² Posts
-aggressive

Quote:
Originally Posted by kracker View Post
@Robish: Try -aggressive if that does anything for you.

(6 hours till my second DC completes )
Hey Kracker

-aggressive = roughly 11% saving, i.e. the ETA drops from 284 to 252 hours


Cool. Nice one. Thanks. ;-)

clLucas_x64_1.01 62868347 -f 4194304 -threads 256
Platform :Advanced Micro Devices, Inc.
Device 0 : Pitcairn


start M62868347 fft length = 4194304
Iteration 10000 M( 62868347 )C, 0x2fead152a6afa7d8, n = 4194304, clLucas v1.01 err = 0.002441 (2:42 real, 16.2905 ms/iter, ETA 284:24:15)
^C caught. Writing checkpoint.

clLucas_x64_1.01 62868347 -f 4194304 -threads 256 -aggressive
Platform :Advanced Micro Devices, Inc.
Device 0 : Pitcairn


start M62868347 fft length = 4194304
Iteration 10000 M( 62868347 )C, 0x2fead152a6afa7d8, n = 4194304, clLucas v1.01 err = 0.002441 (2:25 real, 14.4776 ms/iter, ETA 252:45:15)
^C caught. Writing checkpoint.
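
For anyone who wants to check the saving from the two status lines rather than eyeball it, here is a minimal Python sketch. The two strings are the status lines from the runs above with the console line wrap rejoined; the regular expressions only assume the `ms/iter` and `ETA h:mm:ss` fields look the way they do in this log, so other clLucas versions may need adjusting. The names `parse_status`, `BASELINE` and `AGGRESSIVE` are made up for this example.

```python
import re

# Status lines from the two runs above (wrapped console output rejoined).
BASELINE = ("Iteration 10000 M( 62868347 )C, 0x2fead152a6afa7d8, n = 4194304, "
            "clLucas v1.01 err = 0.002441 (2:42 real, 16.2905 ms/iter, ETA 284:24:15)")
AGGRESSIVE = ("Iteration 10000 M( 62868347 )C, 0x2fead152a6afa7d8, n = 4194304, "
              "clLucas v1.01 err = 0.002441 (2:25 real, 14.4776 ms/iter, ETA 252:45:15)")


def parse_status(line):
    """Return (ms_per_iter, eta_hours) from one status line in the format above."""
    ms = float(re.search(r"([\d.]+) ms/iter", line).group(1))
    h, m, s = map(int, re.search(r"ETA (\d+):(\d+):(\d+)", line).groups())
    return ms, h + m / 60 + s / 3600


base_ms, base_eta = parse_status(BASELINE)
aggr_ms, aggr_eta = parse_status(AGGRESSIVE)

# Relative saving per iteration and on the projected total run time.
saving = 1 - aggr_ms / base_ms
eta_saving = 1 - aggr_eta / base_eta

print(f"per-iteration saving: {saving:.1%}")
print(f"ETA saving: {eta_saving:.1%} ({base_eta - aggr_eta:.2f} hours)")
```

Both ratios should agree closely, since the ETA is just the remaining iteration count times the per-iteration time.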
Old 2013-09-24, 23:39   #172
Robish

"Rob Gahan"
Aug 2013
Ireland

2²·3² Posts
VCL

Quote:
Originally Posted by msft View Post
Hi,
Thank you observations.

Hi msft

Have you any thoughts on distributing the code over several GPUs?
OK, I may be jumping ahead, but I've been following VCL (www.mosix.com) for about a year now. It allows massive computation power over TCP/IP but makes the local PC think all the GPUs are local within the PC.

Epixoip's video and guides

http://www.youtube.com/watch?v=3axK5P8xw-E

http://hashcat.net/wiki/doku.php?id=linux_server_howto

http://hashcat.net/wiki/doku.php?id=vcl_cluster_howto



Food for thought?
Old 2013-09-25, 03:09   #173
VBCurtis

"Curtis"
Feb 2005
Riverside, CA

1304₁₆ Posts

Quote:
Originally Posted by Robish View Post
Hi msft

Have you any thoughts on distributing the code over several GPUs?
OK, I may be jumping ahead, but I've been following VCL (www.mosix.com) for about a year now. It allows massive computation power over TCP/IP but makes the local PC think all the GPUs are local within the PC.

Epixoip's video and guides

http://www.youtube.com/watch?v=3axK5P8xw-E

http://hashcat.net/wiki/doku.php?id=linux_server_howto

http://hashcat.net/wiki/doku.php?id=vcl_cluster_howto



Food for thought?
How would this have any advantage over running one test on each GPU? If we're after maximum workrate, why pay the networking cost?
Old 2013-09-25, 10:35   #174
Karl M Johnson
 
Mar 2010

3·137 Posts

Multi-GPU support should be added first, then.
It's not an easy undertaking, even if a single thread controls all the GPUs.
It makes more sense to add multi-GPU support to CUDALucas/clLucas, since it would ideally cut the LL test time by a factor equal to the number of GPUs in the system (assuming identical GPUs, compared with a single GPU).
Once that is done, merging several rigs and assigning a single exponent to them could be considered.
Old 2013-09-25, 14:08   #175
Robish

"Rob Gahan"
Aug 2013
Ireland

24₁₆ Posts

Quote:
Originally Posted by VBCurtis View Post
How would this have any advantage over running one test on each GPU? If we're after maximum workrate, why pay the networking cost?
I'm not an expert in the area by any means, but oclHashcat divides the "work" and spreads the load evenly over all GPUs, i.e. the job gets done quicker and there is only one queue to maintain.

Add VCL and you get up to 128 GPU cores all working together in unison, i.e. a supercomputer, potentially completing LL tasks in minutes or seconds instead of days.
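
One thing worth keeping in mind when comparing with password cracking: hashcat's candidates are independent of each other, while the Lucas-Lehmer test is a chain of squarings where each step needs the previous result. A minimal Python sketch of the textbook test (plain big-integer arithmetic, nothing like the FFT code in clLucas; the function name `lucas_lehmer` is just for this example) shows the dependency:

```python
def lucas_lehmer(p):
    """Return True if the Mersenne number 2**p - 1 is prime (p must be an odd prime)."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        # Each step needs the previous value of s, so the p - 2 squarings
        # cannot run side by side; only the work inside one squaring can be split.
        s = (s * s - 2) % m
    return s == 0


# Small odd prime exponents whose Mersenne numbers pass the test.
print([p for p in (3, 5, 7, 11, 13, 17, 19, 23, 29, 31) if lucas_lehmer(p)])
```

So extra GPUs can only help by making each individual squaring faster, which is where the communication cost between cards or machines comes in.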
Old 2013-09-25, 14:20   #176
Robish

"Rob Gahan"
Aug 2013
Ireland

2²×3² Posts

Quote:
Originally Posted by Robish View Post
I'm not an expert in the area by any means, but oclHashcat divides the "work" and spreads the load evenly over all GPUs, i.e. the job gets done quicker and there is only one queue to maintain.

Add VCL and you get up to 128 GPU cores all working together in unison, i.e. a supercomputer, potentially completing LL tasks in minutes or seconds instead of days.
In addition, a single HD 7990 has a single-precision compute power of 8 teraflops and 2 teraflops in double precision.
If someone had the money to build a 64 x 7990 (2 GPUs each) cluster with VCL, they could achieve 128 teraflops (double precision) of compute power (almost equal to that of all current GIMPS volunteers combined)....

Last fiddled with by Robish on 2013-09-25 at 14:22
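
The cluster arithmetic above, spelled out as a small Python sketch. The per-card figures are the ones quoted in this post; the totals are theoretical peaks obtained by simple multiplication and say nothing about what an LL test would actually sustain over a network.

```python
CARDS = 64
GPUS_PER_CARD = 2
SP_TFLOPS_PER_CARD = 8.0   # single precision per HD 7990, as quoted above
DP_TFLOPS_PER_CARD = 2.0   # double precision per HD 7990, as quoted above

total_gpus = CARDS * GPUS_PER_CARD       # GPUs that VCL would have to present as local
peak_sp = CARDS * SP_TFLOPS_PER_CARD     # theoretical single-precision peak, in TFLOPS
peak_dp = CARDS * DP_TFLOPS_PER_CARD     # theoretical double-precision peak, in TFLOPS

print(f"{total_gpus} GPUs, {peak_sp:.0f} TFLOPS SP peak, {peak_dp:.0f} TFLOPS DP peak")
```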

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.