mersenneforum.org  

Go Back   mersenneforum.org > Great Internet Mersenne Prime Search > Hardware > GPU Computing

Old 2012-11-26, 19:39   #540
flashjh
 

Quote:
Originally Posted by diep View Post
Your 580 will totally destroy any AMD videocard with 1 gpu inside.
Unless someone invests in a dual Xeon system, I don't see why anyone would want to TF on more than one video card in the same system anyway (you could run it, but you wouldn't be able to max out both GPUs). An i7 3960x with 12 hyperthreads could maybe do it, but I wouldn't want to spend the money to see. A better use for two high-end GPUs in a system is TF on one and CUDALucas on the other (if it's nVidia).

Quote:
Note there is a trick posted on a website which, if it works, would speed up TF quite a bit for the smaller kernels on AMD video cards.

It would make a 23×23-bit multiply (a 46-bit product) take 3 cycles, using an FMA trick.

The claim is that it works, but I need to see proof of that first.

I'm going to try it later this week; it's so cold in this office now that winter is setting in that I have to run some GPUs to keep me warm :)
Are you modifying the mfakto code yourself? Let us know how it works out!
Old 2012-11-26, 19:41   #541
diep
 

Quote:
Originally Posted by flashjh View Post
Unless someone invests in a dual Xeon system, I don't see why anyone would want to TF on more than one video card in the same system anyway (you could run it, but you wouldn't be able to max out both GPUs). An i7 3960x with 12 hyperthreads could maybe do it, but I wouldn't want to spend the money to see. A better use for two high-end GPUs in a system is TF on one and CUDALucas on the other (if it's nVidia).



Are you modifying the mfakto code yourself? Let us know how it works out!
The GTX 590 has 2 GPUs on one card, and the Radeon HD 6990 also has 2 GPUs on one card; that's what I was referring to.

So both of those should be faster than the GTX 580.
Old 2012-11-26, 19:42   #542
kracker
 

Quote:
Originally Posted by flashjh View Post
Are you modifying the mfakto code yourself? Let us know how it works out!
Indeed! Keep us posted

Oh, and the reason I got a 7770 instead of a GTX 550 was that it was slightly faster, and the 7770 used less power too (this was before Kepler, which pushed compute performance down).

EDIT: well, not just slightly faster. 7770 vs 550 *Ti*: http://www.anandtech.com/bench/Product/536?vs=541

Old 2012-11-26, 19:46   #543
diep
 
diep's Avatar
 
Sep 2006
The Netherlands

10110110012 Posts
Default

Quote:
Originally Posted by flashjh View Post
Unless someone invests in a dual Xeon system, I don't see why anyone would want to TF on more than one video card in the same system anyway (you could run it, but you wouldn't be able to max out both GPUs). An i7 3960x with 12 hyperthreads could maybe do it, but I wouldn't want to spend the money to see. A better use for two high-end GPUs in a system is TF on one and CUDALucas on the other (if it's nVidia).



Are you modifying the mfakto code yourself? Let us know how it works out!
I use it for Wagstaff, so it's similar code, but I write it myself, obviously. For starters, we are at smaller bit levels, so getting to 64 bits already takes half an hour on a single CPU core.

So a small kernel makes sense. I want to write 2 kernels though, including one to get above 70 bits.

If the 23-bit trick works I will definitely report it, of course.

I wrote my own C code for producing factor candidates. For Wagstaff I use a bigger prime base than TheJudger does, for example, and I have different sieving tricks (whether that's faster is unknown).

Wagstaff is slightly different from Mersenne there, and I run on different CPUs with different sorts of caches here.

So far I have only written the TF code in C; in OpenCL I have only toyed and experimented.

Yet my plan from some months ago was to have something going by the end of November, so I only have a few days now to keep me warm in this office, as the gas price is too high, you know. Better to burn electricity :)

I will keep you updated. This isn't rocket science with equipment out of the 1950s, this is the real thing :)

Old 2012-11-26, 19:56   #544
flashjh
 

Quote:
Originally Posted by diep View Post
The GTX590 has 2 gpu's on 1 card and the Radeon HD6990 has 2 gpu's at 1 card as well, that's what i referred to.

So both those should be faster than the GTX580
No, read my post here. The truth is the 590 is also clocked down due to power issues, so a 590 is not good for CUDA work. I thought it would be, and I was really disappointed!
Old 2012-11-26, 20:02   #545
kracker
 

Quote:
Originally Posted by flashjh View Post
No, read my post here. Truth is the 590 is also down clocked due to power issues, so a 590 is not good for CUDA work. I thought it would be and I was really disappointed!
Indeed, plus you lose some performance with SLI; they are better run separately, and as you said, the 590's GPUs are clocked lower.
Old 2012-11-26, 20:02   #546
diep
 

Quote:
Originally Posted by flashjh View Post
No, read my post here. Truth is the 590 is also down clocked due to power issues, so a 590 is not good for CUDA work. I thought it would be and I was really disappointed!
That's not a problem with the GTX 590 but with the software you ran on the CPU; and maybe you didn't have a PCIe 2.0 motherboard?

Blaming the GTX 590 for software limitations is not very fair, I'd say.

What bandwidth did you get when you benchmarked transfers from CPU to GPU on the GTX 590?

I've got 8-core Xeon L5420 machines here (2 CPUs per machine). They're $150 on eBay.
Note I have motherboards that are PCIe 2.0 (Seaburg chipset).

Under full load (without the Teslas or the 6970) the machines eat 170 watts.

Old 2012-11-26, 20:07   #547
kracker
 

What? The 590 IS clocked down, due to heat issues I believe, so it WILL be slower than two 580s; plus, SLI and mfaktX don't work optimally together.
Old 2012-11-26, 20:10   #548
diep
 

Quote:
Originally Posted by kracker View Post
What? the 590 IS clocked down, I believe heat issues, so it WILL be slower than 2 580's, plus SLI and mfaktx doesn't work optimally.
I said a single GTX 580 is faster than any other card with 1 GPU.

So it is logical that 2x 580 is faster than a single GTX 590 :)

Also, the power usage of 2x 580 is stupidly high when running GPGPU.
What is it, 1000+ watts including the machine's cores?
Old 2012-11-26, 20:15   #549
flashjh
 

Quote:
Originally Posted by diep View Post
That's not a problem of the GTX590 but from the software you ran on the cpu and maybe you didn't have a pci-e 2.0 motherboard?

What bandwidth did you get when you benchmarked from CPU to GPU at the GTX590?

I've got 8 core Xeon L5420 machine here (2 cpu's a machine). They're $150 on ebay.
Note i have motherboards that are pci-e 2.0 (seaburg chipset).

Under full load (without the Tesla's nor the 6970) the machines eat 170 watt.
I tried mfaktc and CUDALucas. No matter what CPU/MB combo I threw at it, it was impossible to max out both cards on the 590. I use all PCIe 3.0/2.0 boards paired with either 3770K or 2700K CPUs. One 3770K I clocked at 4.5 GHz and it still didn't help the 590. Since CUDALucas does not use much CPU, I thought I could get both onboard 580s to max out, but they just fought over resources (memory, maybe). In the end, the 590 just could not perform like even one 580 for TF or CuLu. I'm sure it's awesome for gaming though.

These are the systems the 590 was tested in:

i7-3770K @ 4.5 GHz, MSI Z77A-G41, 16 GB, GTX 580
i7-3770K stock, ASUS P8H67-M PRO/CSM, 16 GB, GTX 580
i7-2700K stock, Biostar TP67XE, 16 GB, GTX 580
i7-2700K stock, ECS H61H2-M2, 8 GB, GTX 580

Quote:
Blaming the GTX590 of software limitations is not very fair i'd say.
True, but I'm not gaming. The only thing I buy GPUs for is TF or CUDALucas; if a card is not good for that, I don't want it. That is actually why I started with the question here: I want to know if it's time to upgrade to newer cards... it doesn't look like it.

Old 2012-11-26, 20:19   #550
diep
 

Quote:
Originally Posted by flashjh View Post
Tried mfaktc and CUDALucas. No matter what CPU/MB combo I threw at it, it was impossible to max out both cards on the 590. I use all PCI-e 3.0/2.0 boards paired with either 3770K or 2700K CPUs. One 3770 it clocked at 4.5GHz and it still didn't help the 590. Since CUDALucas does not use much CPU, I thought I could get both onboard 580s to max out, but they just fought over resources (memory maybe). In the end, the 590 just could not perform like even one 580 for TF for CuLu. I'm sure it's awesome or gaming though.

These are the systems the 590 was tested in:

i7-3770k @ 4.5, MSI Z77A-G41, 16Gb, GTX 580
i7 3770k stock, ASUS P8H67-M PRO/CSM, 16Gb, GTX 580
i7 2700k stock, Biostar TP67xe, 16Gb, GTX 580
i7 2700k stock, ECS H61H2-M2, 8Gb, GTX 580



True, but I'm not gaming. The only thing I buy GPUs for is TF or CUDALucas. If it's not good for that, I don't want it. That is actually why I started with the question here: I want to know if it's time to upgrade to newer cards... doesn't look like it.
Oh boy, where to start.

I'm busy with trial factoring here.

CUDALucas is something *totally* different, man.

Trial factoring uses integers, and the 590 is fast for integers.
CUDALucas is an FFT. Of course that's no good on the 590, as the FFTs need shared RAM and the consumer cards are lobotomized for double precision.

For double-precision number crunching, get Teslas from Nvidia!

I've got a bunch of Tesla C2075s here; they are fast for floating point!

Are you mixing up trial factoring with the DWT?

With TF you need fast CPUs, plenty of CPU cores, and good software to resupply the card.
That thing can do, what is it, 800 million candidates/s or so, that GTX 590?

How the hell do you generate enough factor candidates with a quad-core 2700K chip?

Over here I've got a cluster of 8 nodes of dual-socket L5420 Xeons,
64 cores in total, and another box with 16 AMD 8356 cores.

Is that enough to supply the video card for TF? :)






Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.