mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Hardware (https://www.mersenneforum.org/forumdisplay.php?f=9)
-   -   GTX 780 TI (https://www.mersenneforum.org/showthread.php?t=18855)

Manpowre 2013-11-14 10:31

[QUOTE=TheMawn;359236]My understanding was the drivers for the GTX did not allow for full use of the DP hardware whereas the drivers for the Quadro did.[/QUOTE]

The problem with DP is that the section of the GPU that is enabled to do DP cannot run at the higher clock speeds we now see with the 780 Ti.

I can easily overclock my Titans in SP mode, but when I do the same for DP, they either start to report wrong residues or CUDALucas simply crashes. The heat generated inside the GPU by DP work is an issue.

Nvidia simply took the full GK110 chip to higher clock speeds, but had to ditch something, and crippling the DP part also made the card cheaper.

Did anyone test the GTX-to-Quadro mod that enables DP on the 680 boards, with residue checks and proper CUDA testing? At least I didn't see that. I just read that people were able to use the boards with Quadro drivers.

But I am surprised that Nvidia didn't give us 1/3 DP with the 780 Ti's DP switch, which just clocks the GPU back down to enable DP. I guess that would make Titan owners angry!... but a better price indeed!

jasonp 2013-11-14 12:30

The thread that ixfd64 refers to was first posted on the Nvidia user forums. My understanding is that there isn't anything you can do to turn on the full DP performance of a consumer-grade Nvidia GPU; it's disabled on the actual GPU chip. The extra features those threads talk about refer to things like (as the thread describes) three-monitor support.

Basic stuff, like the number of cores or what those cores can do, is apparently determined by blowing fuses on the die ('hard-straps'). You don't want that to be otherwise, as you don't know whether the extra cores are turned off for marketing reasons or because the board is not stable with them turned on.

The more advanced stuff, like direct PCIe data transfers from the GPU to another card, is controlled by the driver, which chooses not to allow them based on what the GPU board reports about itself. It's this latter class of stuff that can be turned on through various hardware mods.

LaurV 2013-11-15 05:39

In the discussion that followed about hacks and clocks, some of my questions went unnoticed.

Say I want to buy a new card, the one which performs best with cudaLucas. The main question is whether I should buy a Titan or a 780 Ti.

So, is [URL="http://www.mersenne.ca/cudalucas.php?model=508"]this result in James' site[/URL] real? Any of you having 780Ti can confirm it?

How many reports are behind the benchmark? One? Two? More? After reading a lot around, I am skeptical of the Ti's DP performance, and in fact I suspect the report is fake. I haven't held a monster card like that in my hands yet, but I am planning to look for one (to hold in my hands, at least :D).

The people from anandtech and tech-blah-blah are more interested in gaming, and they are not very knowledgeable either. For example, they all claim that the Titan and the 780 are the same die, but they are not: both are Kepler architecture, but the Titan, as I mentioned before, is Tesla-like, the same die as the K20(X), and it gets 1 DP block for every 3 SP blocks, like Teslas do, while the 780 is GeForce-like, with 1 DP block in 8, like all the 6xx and 7xx have when the clock is not divided. For the 780 Ti, because there are a lot of cores enabled and the chip couldn't dissipate the heat properly internally, they divided the DP clock by 3; that is where the 1/24 DP performance comes from. So I have strong reasons to believe that there are two different GK110 chips.

Anyhow, theoretically, the Ti [B][U]can't[/U][/B] achieve that cudaLucas performance, unless some "unblocking tricks" which I don't know about yet were applied to let the card unleash its 1/8 DP. In that case, the additional 200 cores may justify the additional GHzDays surplus, although it is hard to believe that 8% more cores multiplied by a 2-3% faster clock can compensate for 166% more clock ticks! (i.e. 1/3 DP for the Titan against a fictional 1/8 DP for the Ti, assuming someone unlocked the clock; otherwise it is 1/24, which would mean 700% more clock ticks!). No matter how you arrange the puzzle, there is only one way the Ti gets 13% more DP throughput in cudaLucas (from 61 to 69): only if it could perform 1/3 DP. In that case, 2880/2688 CUDA cores, multiplied by the 928/876 clock boost in turbo mode, converted to percent, is [B][U]exactly[/U][/B] the 13% performance boost, or 69/61; see James's table.
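LaurV's 13% back-of-the-envelope check can be reproduced in a few lines (a sketch; the core counts and boost clocks are the figures quoted in the post, and 69 vs. 61 GHzDays/day are the numbers attributed to James's table):

```python
# Ratio of 780 Ti to Titan resources, per the figures quoted above.
cores_ratio = 2880 / 2688   # 780 Ti has ~7% more CUDA cores than the Titan
clock_ratio = 928 / 876     # boost clocks in MHz, ~6% faster on the Ti

combined = cores_ratio * clock_ratio
print(f"cores x clock speedup: {combined:.3f}")   # ~1.135, i.e. ~13%

# Benchmark ratio from James's table (GHzDays/day): 69 vs 61.
bench = 69 / 61
print(f"benchmark ratio:       {bench:.3f}")      # ~1.131
```

The two ratios agree to within half a percent, which is exactly LaurV's point: the gap is explained by cores and clock alone only if both cards run DP at the same rate.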

If the DP benchmarks were really that good for the Ti, there would be no reason for nVidia to keep the Titan's price at $1000 while selling the Ti for $700. There must be a glitch. Both cards are non-ECC, and the 3GB of additional memory does not justify the difference. Or even if it does, nobody would buy Titans anymore, because everything you can do in 6 gigs, you can also do in 3 gigs (except maybe LL testing in the OBD range, hehe).

So, there [U][B]must[/B][/U] be something I am missing... Can Ti cards be "hacked" into doing 1/3 DP? Or is the benchmark fake? I could not confirm this boost from the web, and, for example, Folding@home, which uses intensive DP calculation in some of its benchmarks, shows Titans performing 2-3 times better than the 780 Ti. Not 20-30% better, but 2-3 TIMES better, i.e. double or triple the output.

axn 2013-11-15 06:17

[QUOTE=LaurV;359345]In the discussion that followed about hacks and clocks, some of my questions remained unnoticed.

Say, I want to buy a new card, the one which performs best with cudaLucas. The main question is if I should buy a Titan or a 780Ti.

So, is [URL="http://www.mersenne.ca/cudalucas.php?model=508"]this result in James' site[/URL] real? Any of you having 780Ti can confirm it? [/QUOTE]

Could be. I have reason to suspect that CUDALucas performance is bottlenecked by the memory interface. In this case the 780, the Ti, and the Titan all have 384-bit buses ([url]http://en.wikipedia.org/wiki/Comparison_of_Nvidia_graphics_processing_units#GeForce_700_Series[/url]), but the Ti has higher effective bandwidth than both the 780 and the Titan. So I'd expect the 780 to perform roughly the same as the Titan, with the Ti performing better than both. Incidentally, the only other GPU with a 384-bit bus is the 580!
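If the memory interface really is the bottleneck, a rough relative-performance estimate is just the ratio of bandwidths (a sketch; the GB/s figures are the theoretical peaks from the Wikipedia page, and a purely bandwidth-bound workload is an assumption, not an established fact about CUDALucas):

```python
# Theoretical peak memory bandwidth in GB/s (Wikipedia figures).
bandwidth = {"GTX 780": 288, "GTX 780 Ti": 336, "GTX Titan": 288}

# Under a pure bandwidth-bound model, throughput scales with bandwidth.
baseline = bandwidth["GTX Titan"]
for card, bw in bandwidth.items():
    print(f"{card:10s}: {bw / baseline:.2f}x Titan throughput")
```

Under this model the Ti's 336 GB/s would put it roughly 17% ahead of the Titan, in the same ballpark as the 69-vs-61 benchmark gap discussed above, with the 780 tied with the Titan.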

Please don't pay any attention to the ridiculous difference in theoretical peak DP GFLOPS (1300 for the Titan, 210 for the Ti).

LaurV 2013-11-15 06:40

Well, you may remember I sold a Tesla C2070 at the end of 2011, whose interface was also 384 bits, but the performance was lousy... I didn't aim for stability, so I bought two 580's and some beer with the money :D

But you may be right about memory bottleneck... I have to digest this better.

axn 2013-11-15 09:37

[CODE]GPU                     Bus(bit)  B/w(GB/s)  SP(GFLOPS)  DP(GFLOPS)
----------------------  --------  ---------  ----------  ----------
GeForce GTX 480              384        177        1345      168(?)
GeForce GTX 560 Ti OEM       320        152        1031      129
GeForce GTX 560 Ti 448       320        152        1312      164
GeForce GTX 570              320        152        1405      176
GeForce GTX 580              384        192        1581      198
GeForce GTX 670              256        192        2460      102
GeForce GTX 680              256        192        3090      129
GeForce GTX 760              256        192        2257       94
GeForce GTX 760 Ti           256        192        2460      103
GeForce GTX 770              256        224        3213      134
GeForce GTX 780              384        288        3977      166
GeForce GTX 780 Ti           384        336        5046      210
GeForce GTX Titan            384        288        4500     1300
[/CODE]
I picked the highest-bandwidth GPUs (excluding the dual-GPU 590/690) from the Wikipedia page. The key columns are B/w and DP.
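The three DP tiers discussed in this thread (1/8 for Fermi, 1/24 for consumer Kepler, ~1/3 for the Titan) fall straight out of the table above by dividing DP by SP (a sketch; the GFLOPS values are the table's, and the nearest-ratio labelling is my own reading):

```python
from fractions import Fraction

# (SP GFLOPS, DP GFLOPS) from the table above.
cards = {
    "GTX 580":    (1581, 198),   # Fermi: 1/8 DP rate
    "GTX 780 Ti": (5046, 210),   # consumer Kepler: 1/24 DP rate
    "GTX Titan":  (4500, 1300),  # uncapped GK110: ~1/3 DP rate
}

for name, (sp, dp) in cards.items():
    ratio = dp / sp
    # Snap to the nearest of the three advertised DP rates.
    nearest = min((Fraction(1, n) for n in (3, 8, 24)),
                  key=lambda f: abs(float(f) - ratio))
    print(f"{name:10s}: DP/SP = {ratio:.3f}  (~1/{nearest.denominator})")
```

The 580 lands almost exactly on 1/8 and the 780 Ti almost exactly on 1/24; the Titan's 0.289 is a bit under 1/3, presumably because the published SP and DP figures assume different clocks.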

Manpowre 2013-11-17 13:25

It is interesting to read that the 580 board has better DP performance than the 780, and is close to the 780 Ti.

And this table is the reason for Nvidia not to drop the price on the Titan. They know the only ones left buying Titans are CUDA developers using DP!

Stef42 2013-11-17 14:47

As a current GTX580 owner, it would not be useful to upgrade to a 780Ti I guess. Only Titan seems to be better by a good margin.

Manpowre 2013-11-17 15:32

[QUOTE=Stef42;359616]As a current GTX580 owner, it would not be useful to upgrade to a 780Ti I guess. Only Titan seems to be better by a good margin.[/QUOTE]

I bought 2 Titans when I started this project in May.. but digging into these numbers and understanding the CUDALucas and P-1 code more, I realized that the DP rate, whether 1/8, 1/24, or 1/3 of SP speed, is what matters.

Even though the 580/590 has only 500-some CUDA cores, it does DP at 1/8 of SP speed. That means that in DP mode, which we use for CUDALucas, it still competes 3 years later, and now we can buy these 580/590s for 50-100 USD each second hand. Two 580 boards compete with one Titan at 1000 USD. It is just a matter of hosting more boards on one machine and power supply.
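Manpowre's price argument can be put in rough numbers (a sketch; the prices are the second-hand and launch figures quoted in this thread, and the DP GFLOPS come from axn's table earlier in the thread):

```python
# (price in USD, DP GFLOPS): prices as quoted in the thread,
# DP throughput from the table posted earlier.
cards = {
    "GTX 580 (used)": (75, 198),     # midpoint of the 50-100 USD range
    "GTX Titan":      (1000, 1300),
}

for name, (price, dp) in cards.items():
    print(f"{name:14s}: {dp / price:.2f} DP GFLOPS per USD")
```

By this measure two used 580s (~396 DP GFLOPS for ~$150) beat one Titan per dollar spent, at the cost of power draw and slot space, which is exactly the trade-off flashjh raises.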

flashjh 2013-11-17 17:16

[QUOTE=Manpowre;359619]It is a matter of hosting more boards on one machine and power supply.[/QUOTE]If I had the $$ I would switch to Titans to save power. Right now I use 580s because of the price-vs-performance ratio. Power usage is quite high, though, and it does take a toll on the budget.

TheMawn 2013-11-17 18:11

I have no idea where you're finding $50 - $100 GTX 580's. I know there are many who would upgrade from a 580 to a 670 or a 570 to a 660 Ti in a heartbeat, but even $100 isn't going to get you very far.


All times are UTC. The time now is 01:06.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.