mersenneforum.org  

Old 2010-08-27, 23:50   #287
MooMoo2
 
Aug 2010

3²×71 Posts

Quote:
Originally Posted by nucleon
Has there been any work done on the competition?

On ATI's cards - any work done on them?

Probably not. It doesn't really matter anyway, since an average dual-core CPU should be able to outperform either an Nvidia card or an ATI card:

http://www.mersenneforum.org/showpos...82&postcount=1

Quote:
Q: So I can look for prime numbers on a GPU now?

Indeed you can.

Q: So how fast does it go?

It's a work in progress, but with a top-of-the-line card the current speed seems to be around what one core of a high-end PC can achieve.
Old 2010-08-28, 00:17   #288
TheJudger
 
"Oliver"
Mar 2005
Germany

1111₁₀ Posts

Quote:
Originally Posted by MooMoo2
Probably not. It doesn't really matter anyway, since an average dual-core CPU should be able to outperform either an Nvidia card or an ATI card:

This might be a bit outdated. msft made some nice improvements!

http://www.mersenneforum.org/showpos...&postcount=206

A Core i7 980X (top-end 6-core) takes around 7.5 ms per iteration for a 2M FFT using all 6 cores.

Oliver
Old 2010-08-28, 10:06   #289
nucleon
 
Mar 2003
Melbourne

5×103 Posts

GTX 480 with the ver R code: 4.39 ms for a 2M FFT.

As TheJudger says, the FAQ was written before the Fermi architecture. A top-end GTX 480 is streaks ahead in raw numbers, in output per initial cost, and in output per ongoing cost.

The second-fastest computer on the planet is using GPGPUs based on Fermi:

www.top500.org

I'm sorry, but the time has come where we can no longer dismiss GPGPUs.

-- Craig
Old 2010-08-28, 17:49   #290
The Carnivore
 
Jun 2010

2·127 Posts

Can someone tell me what the FFT length requirements are for CPUs vs GPUs? For example, if you wanted to check M24036583, you'd need an FFT length of 1280K if you were using a CPU. What FFT length would you need if you were using a GPU? 1280K? 2048K? 4096K?

I remember reading somewhere that starting a prime searching project without GPU support was like going to war without a piano. Some GPUs may match an overclocked quad or hex core if you compare equal FFT lengths, but they won't be of much use if they need FFT lengths that are much longer than the ones CPUs need.

BTW, my friend has a high-end GPU, and he says it adds nearly 500 watts to his system at full load. Yuck.

Last fiddled with by The Carnivore on 2010-08-28 at 17:50
Old 2010-08-28, 18:11   #291
frmky
 
Jul 2003
So Cal

83A₁₆ Posts

Quote:
Originally Posted by The Carnivore
For example, if you wanted to check M24036583, you'd need an FFT length of 1280K if you were using a CPU. What FFT length would you need if you were using a GPU? 1280K? 2048K? 4096K?

BTW, my friend has a high-end GPU, and he says it adds nearly 500 watts to his system at full load. Yuck.

Round up to the next power of 2. 1280K > 1024K, so you have to use a 2048K FFT. But remember that a GTX 480 runs at under 4.5 ms/iteration using the 2048K FFT. Compare that with your CPU's speed at 1280K.

A PCIe card is limited to drawing 300W at full load by spec, but the cards are coming close to that limit.
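
The round-up rule above can be sketched in a few lines of Python. This is only an illustration, assuming (as stated in this thread) that the CUDA code handles power-of-2 FFT lengths only. The names `next_power_of_two` and `gpu_fft_length_k` are made up for this example and do not come from any LL testing program.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be positive")
    # (n - 1).bit_length() is the number of bits needed to hold n - 1,
    # so shifting 1 by that amount gives the first power of two >= n.
    return 1 << (n - 1).bit_length()


def gpu_fft_length_k(cpu_fft_length_k: int) -> int:
    """FFT length (in K) a power-of-2-only GPU code would need.

    Hypothetical helper: it takes the length a CPU client would pick
    and rounds it up to the next power of two.
    """
    return next_power_of_two(cpu_fft_length_k)


if __name__ == "__main__":
    for cpu_k in (1024, 1280, 2048, 2560):
        print(f"CPU {cpu_k}K -> GPU {gpu_fft_length_k(cpu_k)}K")
```

For M24036583, the CPU's 1280K FFT becomes a 2048K FFT on the GPU, while a length that is already a power of 2 stays the same.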
Old 2010-08-28, 18:37   #292
The Carnivore
 
Jun 2010

2·127 Posts

Quote:
Originally Posted by frmky
Round up to the next power of 2. 1280K > 1024K, so you have to use a 2048K FFT. But remember that a GTX 480 runs at under 4.5 ms/iteration using the 2048K FFT. Compare that with your CPU's speed at 1280K.

My Core i7 920 overclocked to 3.2 GHz gets me about 20 ms/iteration. But that's with only one core running. With all cores running one LL test each, the total output matches the GTX 480.

Surpassing the GTX 480 should be possible if you get an i7 980X or use more aggressive overclocks.
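
As a rough check on that comparison, the arithmetic can be written out. This is a sketch under stated assumptions: the i7 920 has four cores, each core is assumed to keep its single-core speed when all four run at once (optimistic, since the cores share memory bandwidth), and the GPU figure is the "under 4.5 ms" quoted earlier in the thread. The function name is hypothetical.

```python
def effective_ms_per_iteration(ms_per_iteration: float, parallel_tests: int) -> float:
    """Average time per LL iteration across all tests running in parallel."""
    if parallel_tests < 1:
        raise ValueError("need at least one test")
    return ms_per_iteration / parallel_tests


# Figures quoted in this thread (assumptions, not fresh measurements).
CPU_MS_PER_ITER_ONE_CORE = 20.0  # Core i7 920 @ 3.2 GHz, 1280K FFT
CPU_CORES = 4                    # assumes no slowdown with all cores busy
GPU_MS_PER_ITER = 4.5            # GTX 480, 2048K FFT (upper bound quoted above)

if __name__ == "__main__":
    cpu = effective_ms_per_iteration(CPU_MS_PER_ITER_ONE_CORE, CPU_CORES)
    print(f"CPU, {CPU_CORES} tests in parallel: {cpu:.1f} ms per iteration overall")
    print(f"GPU, 1 test: {GPU_MS_PER_ITER:.1f} ms per iteration")
```

Four tests at 20 ms each work out to one iteration every 5 ms overall, which is in the same range as the GPU's single test at under 4.5 ms.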
Old 2010-08-28, 18:50   #293
Mini-Geek
Account Deleted
 
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17×251 Posts

Quote:
Originally Posted by The Carnivore
My Core i7 920 overclocked to 3.2 GHz gets me about 20 ms/iteration. But that's with only one core running. With all cores running one LL test each, the total output matches the GTX 480.

But keep in mind that the GPU needs to go to the next higher power of 2. Going from 1280K to 2048K is approximately the worst-case scenario, since 1280K is just a step above 1024K, so the GPU has to take a huge performance hit (compared to a 1024K number) to run this number, while the CPU only has to take a small performance hit (compared to a 1024K number). Of course, the best case would be where they both use a power-of-2 FFT length. So in a worst-case (for the GPU) scenario, you still need all cores of your i7 working together to match its output! In a best-case scenario, it's closer to twice as fast as your CPU.
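
To put a number on that worst case, here is a small sketch that compares the FFT length a power-of-2-only GPU code must use with the length actually required. The sample lengths and the helper names are assumptions for illustration, and the length ratio is only a rough proxy for run time.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def padding_ratio(required_k: int) -> float:
    """How much longer the rounded-up FFT is than the required one.

    1.0 means no padding at all (the required length is a power of two).
    """
    return next_power_of_two(required_k) / required_k


if __name__ == "__main__":
    # Sample lengths in K, chosen only for illustration.
    for k in (1024, 1280, 1536, 1792, 2048):
        print(f"{k}K needs {next_power_of_two(k)}K on the GPU, "
              f"ratio {padding_ratio(k):.2f}")
```

For 1280K the GPU runs an FFT 1.6 times as long as the CPU's, while for 1024K or 2048K the two lengths are identical.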

Last fiddled with by Mini-Geek on 2010-08-28 at 18:54
Old 2010-08-28, 20:48   #294
ET_
Banned
 
"Luigi"
Aug 2002
Team Italia

3²·5·107 Posts

Quote:
Originally Posted by Mini-Geek
But keep in mind that the GPU needs to go to the next higher power of 2. Going from 1280K to 2048K is approximately the worst-case scenario, since 1280K is just a step above 1024K, so the GPU has to take a huge performance hit (compared to a 1024K number) to run this number, while the CPU only has to take a small performance hit (compared to a 1024K number). Of course, the best case would be where they both use a power-of-2 FFT length. So in a worst-case (for the GPU) scenario, you still need all cores of your i7 working together to match its output! In a best-case scenario, it's closer to twice as fast as your CPU.

I'd add that with his computer AND the GPU, he could run 2 jobs in parallel...

Luigi
Old 2010-08-29, 21:51   #295
sanzo
 
Aug 2010

2·3 Posts

Hello everyone,

I'm a new member of this forum, but I've been a GIMPS member for a year.

I have two graphics cards:
- an ATI Radeon HD 5870
- a GeForce 210

I found this discussion because I want to run the Lucas-Lehmer test on a GPU.

I'd like to know whether there is an OpenCL program that can run on ATI cards, or whether there is only a CUDA version (for my 210 card).

Thanks all
sanzo
Old 2010-08-30, 08:24   #296
henryzz
Just call me Henry
 
"David"
Sep 2007
Cambridge (GMT/BST)

5880₁₀ Posts

Quote:
Originally Posted by sanzo
Hello everyone,

I'm a new member of this forum, but I've been a GIMPS member for a year.

I have two graphics cards:
- an ATI Radeon HD 5870
- a GeForce 210

I found this discussion because I want to run the Lucas-Lehmer test on a GPU.

I'd like to know whether there is an OpenCL program that can run on ATI cards, or whether there is only a CUDA version (for my 210 card).

Thanks all
sanzo

Only CUDA so far.
Old 2010-08-30, 09:07   #297
sanzo
 
Aug 2010

2×3 Posts

Thanks henryzz,

Can I test this program on my GeForce 210?
I have Windows 7 64-bit. How can I try it? Is there a precompiled Windows version?

sanzo

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.