mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Hardware (https://www.mersenneforum.org/forumdisplay.php?f=9)
-   -   GTX 780 TI (https://www.mersenneforum.org/showthread.php?t=18855)

Manpowre 2013-11-11 10:53

GTX 780 TI
 
[url]http://anandtech.com/show/7492/the-geforce-gtx-780-ti-review[/url]

[url]http://www.tomshardware.com/reviews/geforce-gtx-780-ti-review-benchmarks,3663.html[/url]

2880 stream processors, but 1/24 the DP speed of SP. :(

Titan is still king for cudaLucas, but mfaktc will be faster.

Manpowre 2013-11-11 10:55

From Tomshardware.com:



[QUOTE]One might expect to see massive performance from Nvidia’s new offering here, but the GeForce GTX 780 Ti’s double-precision performance (1/24-rate) is much more limited than what you can achieve with GeForce GTX Titan (1/3-rate).

In many applications, this really doesn’t matter much, but the otherwise slower Titan is twice as fast in Blender. A look at a computational finance workload (Monte Carlo Price Options) shows a real-world double-to-single precision ratio of 1:25.8 for the GeForce GTX 780 Ti and 1:5.8 for the Titan. This is fairly close to the expected values. Clearly, you'll need to decide for yourself if lower compute performance is a problem before you spend $700 on a 780 Ti.[/QUOTE]

Manpowre 2013-11-11 11:01

The compute tests Anandtech did are very interesting; scroll down to the Folding@home double-precision results:

[url]http://anandtech.com/show/7492/the-geforce-gtx-780-ti-review/14[/url]

Even the 290X, with its 1/8 DP-to-SP ratio, is slower than the 780 Ti, while the Titan is almost twice as fast as one 290X.

LaurV 2013-11-11 13:45

Yeah, by the way, since I saw [URL="http://www.mersenne.ca/cudalucas.php?model=508"]that result on mersenne.ca[/URL], my question has been how real it is, and by how many benchmarks (different users/cards) it is backed up. The card has lousy DP performance, but it has about 200 more cores. Can the cudaLucas performance be real?

kracker 2013-11-11 15:30

Yep, I knew nVidia would kinda do something like this after 290X :razz:

Also, a DP ratio like "1/8" or "1/3" does not tell you the actual speed by itself; it just means DP throughput is that fraction of single precision.

The titan will be king in DP, but I think the 290X will be king in mfakto.

Manpowre 2013-11-11 16:33

[QUOTE=kracker;358981]Yep, I knew nVidia would kinda do something like this after 290X :razz:

Also, a DP ratio like "1/8" or "1/3" does not tell you the actual speed by itself; it just means DP throughput is that fraction of single precision.

The titan will be king in DP, but I think the 290X will be king in mfakto.[/QUOTE]

Absolutely, the 290X will most probably be the single-precision king!

ixfd64 2013-11-11 16:42

I wonder if it's possible to modify the card to increase the DP performance. There's a huge thread on hacking Nvidia cards at EEVBlog here: [url]http://www.eevblog.com/forum/chat/hacking-nvidia-cards-into-their-professional-counterparts[/url]

TheMawn 2013-11-11 18:04

If AMD were to adopt something like CUDA, or anything similar that would be useful for LL and P-1, I think our GPU end of things would turn red in a big hurry. The R9 290X just plain beats the GTX 780 Ti in DP computing, but I think the issue is actually using it. Too bad.

Apparently it's a good business practice to make the same chip and cripple its performance in different ways to market them as different GPUs? Unbelievable that some GTX 780 Ti is sitting in some dude's computer with idle hardware.

Obviously AMD and Nvidia aren't hurting all that much if they can afford to use drivers to disable a physical component on their GPU and sell it for $1000 less. Many poor sods who just don't know (hell, I didn't until now) are paying for a Tesla but are getting a Geforce with one or two switches pointing in other directions.

xilman 2013-11-11 18:49

[QUOTE=TheMawn;358993]Apparently it's a good business practice to make the same chip and cripple its performance in different ways to market them as different GPUs? [/QUOTE]To me, that sounds rather naive.

For whatever reasons, none of which are intended, some chips don't make the grade.

Which would you rather do --- discard the inferior devices or market them as functional but of poorer performance than intended?

My preference ought to be clear.

kracker 2013-11-11 20:19

[QUOTE=TheMawn;358993]If AMD were to adopt something like Cuda or anything similar which would be useful for LL and P-1, I think our GPU end of things would turn red in a big hurry. The R9 290X just plain beats the GTX 780 Ti in DP computing but I think the issue is actually using it. Too bad.

Apparently it's a good business practice to make the same chip and cripple its performance in different ways to market them as different GPUs? Unbelievable that some GTX 780 Ti is sitting in some dude's computer with idle hardware.

Obviously AMD and Nvidia aren't hurting all that much if they can afford to use drivers to disable a physical component on their GPU and sell it for $1000 less. Many poor sods who just don't know (hell, I didn't until now) are paying for a Tesla but are getting a Geforce with one or two switches pointing in other directions.[/QUOTE]

What Xilman said: the more expensive chips are binned higher, so they will actually work and be stable for what they do. However, most of the time I think they overprice them... But there is a reason there are cheap cards and expensive cards.

TheMawn 2013-11-11 22:09

[QUOTE=xilman;358995]To me, that sounds rather naive.

For whatever reasons, none of which are intended, some chips don't make the grade.

Which would you rather do --- discard the inferior devices or market them as functional but of poorer performance than intended?

My preference ought to be clear.[/QUOTE]

[QUOTE=kracker;359000]What Xilman said, many more expensive chips are binned higher so they will actually work/be stable for what it does. However, most of the time I think they overprice it... But there is a reason there are cheap cards and expensive cards.[/QUOTE]

Neither of you understood my meaning, by the looks of things.

I'm not giving them crap for pricing a GTX 780 Ti higher than a GTX 780 despite it having the same chip. I fully agree that doing that is perfectly fine. It's analogous to the same die being binned as an i7-4770K or a plain i7-4770, depending on whether the luck of the draw made it better or worse than the standard.

I'm giving them crap for [URL="http://www.newegg.ca/Product/Product.aspx?Item=N82E16814132008"]this[/URL]. And [URL="http://www.newegg.ca/Product/Product.aspx?Item=N82E16814162145"]this[/URL]. $3500 vs $730.


The GTX 780 Ti has its double-precision computing power crippled. The Tesla does not. Otherwise they're the same card. Go to the link in ixfd64's post, where the guy makes a few hardware and software hacks to fool his GTX 690 into thinking it's a Quadro.

Nvidia uses drivers that disallow the use of a number of the components based on the model. Driver says "You a 690 you no get fast DP" or "You a Quadro you get fast DP".

It's sort of similar to the story where you could flash the bios of your HD 6950 while shorting two pins and magically transform it into an HD 6970. The difference there was similar to the difference between the 780 and 780 Ti. The 780 has 1/15 of its hardware disabled because that portion of it didn't make the grade. The 780 Ti had all its hardware meet the cut but that was rare enough to justify binning it higher. Same as the 6950-6970 (though the two extra cores or whatever the hell it was DID actually function just fine).

I just can't see how only an eighth of the DP hardware "made the cut" on the GTX 780 Ti whereas the rest of the card is mint.

kracker 2013-11-11 23:34

:davieddy: To the best of my knowledge, the 780 and the 780 Ti are [I]not [/I]the same die.
Also, sure, you could hack a 6950 or a 690 into a higher-end/professional card, but I'll bet you the "stable" success rate is not 100%.

The professional cards are more expensive, sure; part of that is the higher bin, but the majority is... because it is a professional card. I mean, think of it this way:

Gamer: Oh noez, a $3000 card? I can't afford that... :no:
Business: Oh noez, it's $3000? Oh well, I'll have to deal with that, I don't have much choice.

You see nVidia/AMD's grand scheme of things?

Manpowre 2013-11-12 00:17

The actual GPU might come from the same production run; it is when testing the GPU that it turns out to clock lower, or that parts of it are not as good.

As I see it, the PCB might be the same in some cases, like the Titan and K20, but the differences are many:
* K20 has registered memory chips
* Titan and 780 Ti have higher clock speeds, but lower DP rates than the K20
* K20 is clocked lower, firstly due to the registered memory, and secondly to fully ensure that the "pixels", or compute units inside the GPU, are correct
* K5000/K6000 also have registered memory to reduce memory errors
* Who cares if a pixel here and there is wrong in Battlefield or any other game, blinking red for 2-3 frames?
* The guys in the TV business, who use the K5000/K6000, DO care if a red pixel is shown for 2-3 frames. It is a big deal (I work in this business, so I know).
* When you clock the GPU higher, parts of the GPU do not work as intended. It could mean the DP part of the GPU doesn't work as intended with the full 1/3 rate enabled. Instead, Nvidia cripples it to ensure it becomes a stable GTX board competing in 3D rendering for games.

Nvidia had to produce 19,000 K20s for the TITAN cluster first. Out of that run, they probably got enough chips to produce the Titan; that's probably WHY the Titan was born so early: make it a GTX and put unregistered memory on it. Remember that the Titan came in February 2013 and the 780 Ti in November 2013, 10 months later.

Nvidia and TSMC have since perfected the chip production enough to enable all CUDA cores for the 780 Ti. Also, there is no new TITAN-class cluster to take the top of the production, and there is competition to deal with now that the 290X is out.

As I see it, the professional cards are there to ensure stability: same GPU, better production, but that red dot for 2-3 frames isn't there on the K5000/K6000. The Tesla cards are clocked slower to ensure compute performance is as promised, with registered memory and thus less risk of FFT errors, for instance.

I really love the Titan cards; they did something right there, at least for me, who loves computing.

LaurV 2013-11-12 05:56

[QUOTE=kracker;359017]To the best of my knowledge, the 780 and the 780 Ti is [I]not [/I]the same die.[/QUOTE]
This I can confirm, from a very trustworthy source.
OTOH, the Titan and K20(x) are the same chip.

Talking about DP performance, 1/24 is quite ridiculous. I can't understand where they came up with that number, in fact. An SP float has 23 bits of mantissa, and a DP float has 53 bits, but most applications that can't live with SP floats will be comfortable with 46 bits of mantissa too, including most of our LL tests. The point is that when one uses SP floats to do FFT multiplications, there is not enough room in an SP float for the carry propagation, etc.

One may use two SP floats, having together 46 bits of mantissa and 16 bits of exponent, to hold the polynomial coefficients of a Karatsuba multiplication, and get 1/3 DP* performance in software (note the star: this is still 128 times "less accurate" than "true" DP, as there are 7 bits less in the mantissa, but still enough for most needs, and you get it 8 times faster!). Or one can use 3 SP floats, for a mantissa of 69 bits and 24 bits of exponent, with a software Toom-Cook-like multiplication, to get 1/5 (with some overhead for additions and constant multiplications) or 1/7 (with almost no overhead) DP performance, or just 1/9 with blind schoolbook multiplication, which gives more precision than "real" DP and is still 3-5 times faster (compared to 1/24). Considering that the cards could do all the Karatsuba/Toom-Cook multiplications and the overhead at the same time, in parallel, and that they even have 24-bit hardware multipliers, well... it seems a bit stupid to me. Why would they release cards with 1/24 DP performance?

If one has a card which can multiply SP floats very fast, nine of them at a time, then one can use 3 of them to fully cover a DP float and implement 1/9 DP performance in software. Or am I dreaming?
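For the curious, the "two SP floats" trick described above is a known technique (float-float or "double-single" arithmetic, built from classic Dekker/Knuth error-free transformations). Below is a minimal, illustrative sketch in Python, with NumPy float32 on the CPU standing in for the GPU's SP floats; the function names are my own and this is not cudaLucas code:

[CODE]import numpy as np

def two_sum(a, b):
    # Knuth's error-free transformation: a + b == s + e exactly
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

def split(a):
    # Dekker split of a float32 into 12-bit high and low halves
    c = np.float32(4097) * a          # 4097 = 2^12 + 1
    hi = c - (c - a)
    return hi, a - hi

def two_prod(a, b):
    # a * b == p + e exactly (barring overflow/underflow)
    p = a * b
    ahi, alo = split(a)
    bhi, blo = split(b)
    e = ((ahi * bhi - p) + ahi * blo + alo * bhi) + alo * blo
    return p, e

a, b = np.float32(1.1), np.float32(2.3)
p, e = two_prod(a, b)
# The pair (p, e) holds the product to ~48 bits, twice SP precision
print(np.float64(p) + np.float64(e) == np.float64(a) * np.float64(b))  # True[/CODE]

Each emulated operation costs several SP operations, which is exactly where software ratios like 1/3 or 1/9 of SP throughput come from.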

But this is another discussion...

xilman 2013-11-12 11:55

[QUOTE=TheMawn;359011]Neither of you understood my meaning, by the looks of things.[/QUOTE]Perhaps so.
[QUOTE=TheMawn;359011]
I'm giving them crap for [URL="http://www.newegg.ca/Product/Product.aspx?Item=N82E16814132008"]this[/URL]. And [URL="http://www.newegg.ca/Product/Product.aspx?Item=N82E16814162145"]this[/URL]. $3500 vs $730.[/QUOTE]It's a free market. You will pay what you think it is worth. Some people think it's worth paying $3500 whereas you do not. So be it.

LaurV 2013-11-12 12:09

The difference between the two is reliability, i.e. the ECC memory, which does not lose bits on the way, plus a few thousand (no joke) hours of intensive testing and a few hundred [URL="http://en.wikipedia.org/wiki/Environmental_chamber"]climate chamber cycles[/URL] (for each chip/card) to ensure that the cards perform according to spec. Imagine Retina in his evil lair building ballistic rockets with a K20. :smile: Do you think he would use 780s instead, because they are cheap, and risk missing his over-the-ocean targets :razz: because of a pixel failure?

Someone needs to pay for the power consumption, manpower, etc., during those tests. Good comfort is expensive, anywhere.

retina 2013-11-12 14:19

[QUOTE=LaurV;359076]Imagine Retina is his evil lair doing ballistic rockets with k20. :smile: Do you think he will use 780's instead, because they are cheap, risking to miss his over-the-ocean targets :razz: because a pixel fail?[/QUOTE]Absolutely. ECC is the only way to go if you need reliability and accuracy. But down clocking (or at least not overclocking) helps a lot towards achieving that on non-ECCed cheaper systems. But you'll never solve the cosmic ray, or PSU glitch, problem just by underclocking; those will still trip you up.

[size=1]As for the ICBMs: Have you been spying on me? I thought only the NSA/GCHQ was doing that sort of thing. Anyhow, it is not true and I deny it emphatically. Not doing that. Nope. Not me. Must be my next-door-neighbour in the lair next to mine. Or Mini-me. It's probably him. Yeah, that's it, the little bugger.[/size]

Manpowre 2013-11-12 14:51

Yep, so it is not just about rebranding existing cards. The 780 Ti is simply made for the consumer market: non-registered memory and higher clock speeds, which cost some reliability, but with perfected chip production.

Though I'm surprised Nvidia didn't drop the price tag on the Titan. I can't even imagine they sell many of these anymore; the 780 Ti is better for gaming, and only a few of us buy Titans for CUDA and computing.

For video editing, the 290X is even better. Folding and Mersenne work are maybe the only uses left for Titan cards.

TheMawn 2013-11-13 05:12

Alright. I stand corrected.

I still fail to see the justification in two thousand dollars more for a GPU but if someone else thinks it's worth it, so be it. To be fair, I didn't know about the ECC memory on the professional cards.

I still won't be surprised when someone finds out the 780 Ti can be successfully hacked into a Quadro in 95% of cases. I won't be surprised, nor will I get bit.

Manpowre 2013-11-13 14:13

[QUOTE=TheMawn;359162]Alright. I stand corrected.

I still fail to see the justification in two thousand dollars more for a GPU but if someone else thinks it's worth it, so be it. To be fair, I didn't know about the ECC memory on the professional cards.

I still won't be surprised when someone finds out the 780 Ti can be successfully hacked into a Quadro in 95% of cases. I won't be surprised, nor will I get bit.[/QUOTE]

Remember, the old hack was that someone found a way of using the Quadro drivers, which also have an API that allows grabbing frames and sending them to video boards. This is what the company I work for does.

The Quadro cards are clocked lower due to slower ECC memory and error correction through the GPU. Also, 80% of the cost of the board is not the board itself; it is the Quadro drivers, which are very different from the GTX drivers. The Quadro drivers include a full board API as well.

The hack didn't turn a GTX card into a Quadro; it made the GTX card report that it is a Quadro with similar HW, which allows the Quadro drivers to be used instead.

LaurV 2013-11-14 04:01

[QUOTE=Manpowre;359190]The hack didnt turn a gtx card into quadro, it made the gtx card report that it is a quadro with similar HW which allows quadro drivers to be used instead.[/QUOTE]I wanted to say that, but you were faster. The "hack" was in fact cheating the drivers into believing a Quadro was inserted where a GeForce was. It was in no way a "performance change" (in fact, Quadro performance is pretty lousy; they sacrifice a lot of performance for stability), but some people would rather lose a bit of stability to get more performance, and that is what the "cheat" did.

TheMawn 2013-11-14 04:39

[QUOTE=Manpowre;359190]The hack didnt turn a gtx card into quadro, it made the gtx card report that it is a quadro with similar HW which allows quadro drivers to be used instead.[/QUOTE]

My understanding was the drivers for the GTX did not allow for full use of the DP hardware whereas the drivers for the Quadro did.

Manpowre 2013-11-14 10:31

[QUOTE=TheMawn;359236]My understanding was the drivers for the GTX did not allow for full use of the DP hardware whereas the drivers for the Quadro did.[/QUOTE]

The problem with DP is that the section of the GPU that does DP cannot run at the higher clock speeds we now see with the 780 Ti.

I can easily overclock my Titans in SP mode, but if I do that for DP, they either start to report wrong residues or cudaLucas just crashes. Heat inside the GPU when doing DP is an issue.

Nvidia simply made a full GK110 chip with higher clock speeds, but had to ditch something: the DP part, also to make the card cheaper.

Did anyone test the GTX-to-Quadro hack enabling DP on the 680 boards, with residue checks and proper CUDA testing? At least I didn't see that. I just read that people were able to use the board with Quadro drivers.

But I am surprised that Nvidia didn't give the 780 Ti a 1/3 DP switch that just clocks the GPU back down to enable DP. I guess that would make Titan owners angry! But a better price, indeed.

jasonp 2013-11-14 12:30

The thread that ixfd64 refers to was first posted on the Nvidia user forums. My understanding is that there isn't anything you can do to turn on the full DP performance of a consumer-grade Nvidia GPU; it's disabled on the actual GPU chip. The extra features those threads talk about refer to things like (as the thread describes) three-monitor support. Basic stuff, like the number of cores or what those cores can do, is apparently determined by blowing fuses on the die ('hard-straps'). You don't want that to be otherwise, as you don't know whether the extra cores are turned off for marketing reasons or because the board is not stable with them turned on. The more advanced stuff, like direct PCIe data transfers from the GPU to another card, is controlled by the driver, which chooses not to allow them based on what the GPU board reports about itself. It's this latter class of stuff that can be turned on through various hardware mods.

LaurV 2013-11-15 05:39

In the discussion that followed about hacks and clocks, some of my questions remained unnoticed.

Say, I want to buy a new card, the one which performs best with cudaLucas. The main question is if I should buy a Titan or a 780Ti.

So, is [URL="http://www.mersenne.ca/cudalucas.php?model=508"]this result in James' site[/URL] real? Any of you having 780Ti can confirm it?

How many reports are behind the benchmark? One? Two? More? After reading around a lot, I am skeptical of the Ti's DP performance, and in fact I suspect the report is fake. I haven't held a monster card like that in my hands yet, but I am planning to look for one (to hold in my hands, at least :D). The people from anandtech and tech-blah-blah are more interested in gaming, and they are not very knowledgeable either. For example, they all claim that the Titan and the 780 are the same die, but they are not: both are Kepler architecture, but the Titan, as I mentioned before, is Tesla-like, the same die as the K20(x); it gets 1 DP block for every 3 FP blocks, like the Teslas do, while the 780 is GeForce-like, with 1 DP block in 8, like all the 6xx and 7xx, when the clock is not divided. For the 780 Ti, because so many cores are enabled, the chip couldn't dissipate the heat properly internally, so they divided the DP clock by 3; that is where the 1/24 DP performance comes from. So, I have strong reasons to believe that there are two different GK110 chips.

Anyhow, theoretically, the Ti [B][U]can't[/U][/B] achieve that cudaLucas performance, unless some "unblocking tricks" which I don't know about yet were applied to let the card unleash its 1/8 DP. In that case, the additional 200 cores might justify the additional GHzDays surplus, although it is hard to believe that 8% more cores multiplied by a 2-3% faster clock can compensate for 166% more clock ticks! (i.e. the Titan's 1/3 DP against a fictional 1/8 DP for the Ti, assuming someone unlocked the clock; otherwise it is 1/24, which would mean 700% more clock ticks!). No matter how you arrange the puzzle, there is only one way the Ti gets 13% more DP throughput in cudaLucas (from 61 to 69): only if it performs 1/3 DP. In that case, 2880/2688 CUDA cores, multiplied by the 928/876 clock boost in turbo mode, converted to percent, is [B][U]exactly[/U][/B] the 13% performance boost, or 69/61; see James's table.
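The core-count and clock arithmetic above is easy to check in a couple of lines (a quick sketch; 61 and 69 are the mersenne.ca throughput figures cited in the post):

[CODE]cores = 2880 / 2688           # CUDA cores: 780 Ti vs Titan
clock = 928 / 876             # boost clocks (MHz): 780 Ti vs Titan
edge = cores * clock * 100 - 100
print(round(edge, 1))                  # ~13.5% theoretical advantage
print(round(69 / 61 * 100 - 100, 1))   # ~13.1% advantage in the benchmark[/CODE]

So the two ratios do land within half a percent of each other, which is the coincidence LaurV is pointing at: the gap is explainable only if both cards run DP at the same 1/3 rate.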

If the DP benchmarks were really that good for the Ti, there would be no reason for nVidia to keep the Titan's price at $1000 while selling the Ti at $700. There must be a glitch. Both cards are non-ECC, and the 3GB of additional memory does not justify the difference. Or even if it does, nobody would buy Titans anymore, because everything you can do in 6 GB you can also do in 3 GB (maybe except LL testing in the OBD range, hehe).

So, there [U][B]must[/B][/U] be something I am missing... Can Ti cards be "hacked" into doing 1/3 DP? Or is the benchmark fake? I could not confirm this boost from the web, and, for example, Folding@home, which uses some intensive DP calculation in some of its benchmarks, shows Titans performing 2-3 times better than the 780 Ti. Not 20-30% better, but 2-3 TIMES better, i.e. double or triple the output.

axn 2013-11-15 06:17

[QUOTE=LaurV;359345]In the discussion that followed about hacks and clocks, some of my questions remained unnoticed.

Say, I want to buy a new card, the one which performs best with cudaLucas. The main question is if I should buy a Titan or a 780Ti.

So, is [URL="http://www.mersenne.ca/cudalucas.php?model=508"]this result in James' site[/URL] real? Any of you having 780Ti can confirm it? [/QUOTE]

Could be. I have reason to suspect that CUDALucas performance is bottlenecked by the memory interface. The 780, Ti, and Titan all have 384-bit buses ([url]http://en.wikipedia.org/wiki/Comparison_of_Nvidia_graphics_processing_units#GeForce_700_Series[/url]), but the Ti has higher effective bandwidth than the 780 & Titan. So I'd expect the 780 to perform roughly the same as the Titan, with the Ti performing better than both. Incidentally, the only other GPU with a 384-bit bus is the 580!

Please don't pay any attention to the ridiculous difference in theoretical peak DP GFLOPS (1300 for the Titan, 210 for the Ti).

LaurV 2013-11-15 06:40

Well, you may remember I sold a Tesla C2070 at the end of 2011, for which the interface was also 384 bits, but the performance was lousy... I didn't aim for stability, so I bought two 580s and some beer with the money :D

But you may be right about memory bottleneck... I have to digest this better.

axn 2013-11-15 09:37

[CODE]GPU                      Bus (bits)  B/w (GB/s)  SP (GFLOPS)  DP (GFLOPS)
-----------------------  ----------  ----------  -----------  -----------
GeForce GTX 480              384        177          1345         168(?)
GeForce GTX 560 Ti OEM       320        152          1031         129
GeForce GTX 560 Ti 448       320        152          1312         164
GeForce GTX 570              320        152          1405         176
GeForce GTX 580              384        192          1581         198
GeForce GTX 670              256        192          2460         102
GeForce GTX 680              256        192          3090         129
GeForce GTX 760              256        192          2257          94
GeForce GTX 760 Ti           256        192          2460         103
GeForce GTX 770              256        224          3213         134
GeForce GTX 780              384        288          3977         166
GeForce GTX 780 Ti           384        336          5046         210
GeForce GTX Titan            384        288          4500        1300
[/CODE]
I picked the highest-bandwidth GPUs (excluding the 590/690 dual GPUs) from the Wikipedia page. The key columns are B/w and DP.
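As a sanity check, the SP:DP ratios implied by a few rows of the table match the nominal hardware rates discussed in this thread (a small sketch; numbers copied from the table above):

[CODE]cards = {
    "GTX 580":    (1581, 198),   # (SP GFLOPS, DP GFLOPS)
    "GTX 780":    (3977, 166),
    "GTX 780 Ti": (5046, 210),
    "GTX Titan":  (4500, 1300),
}
for name, (sp, dp) in cards.items():
    print(f"{name}: DP = 1/{sp / dp:.1f} of SP")
# The 580 lands near 1/8, the 780 and 780 Ti near 1/24, and the Titan
# near 1/3.5 (its DP figure is quoted at the reduced DP clock).[/CODE]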

Manpowre 2013-11-17 13:25

It is interesting to read that the 580 board has better DP performance than the 780, and close to the 780 Ti's.

And this table is the reason for Nvidia not to drop the price on the Titan. They know the ones left buying Titans are CUDA developers using DP!

Stef42 2013-11-17 14:47

As a current GTX 580 owner, it would not be useful for me to upgrade to a 780 Ti, I guess. Only the Titan seems better, by a good margin.

Manpowre 2013-11-17 15:32

[QUOTE=Stef42;359616]As a current GTX580 owner, it would not be useful to upgrade to a 780Ti I guess. Only Titan seems to be better by a good margin.[/QUOTE]

I bought 2 Titans when I started this project in May... but digging into these numbers and understanding the cudaLucas and P-1 code more, I realized the DP rate (1/8, 1/24 or 1/3 of SP speed) is important.

Even though the 580/590 has only 500-some CUDA cores, it has 1/8 the DP speed of SP. That means that in DP mode, which we use for cudaLucas, it still competes 3 years later, and now we can buy these 580/590s second-hand for 50-100 USD each. Two 580 boards compete with one 1000 USD Titan. It is just a matter of hosting more boards on one machine and power supply.

flashjh 2013-11-17 17:16

[QUOTE=Manpowre;359619]It is a matter of hosting more boards on one machine and power supply.[/QUOTE]If I had the $$ I would switch to/use Titans to save power. Right now I use 580s for the reason of price vs performance ratio. Power usage is quite high though and it does take a toll on the budget.

TheMawn 2013-11-17 18:11

I have no idea where you're finding $50 - $100 GTX 580's. I know there are many who would upgrade from a 580 to a 670 or a 570 to a 660 Ti in a heartbeat, but even $100 isn't going to get you very far.

kracker 2013-11-17 18:27

[QUOTE=TheMawn;359627]I have no idea where you're finding $50 - $100 GTX 580's. I know there are many who would upgrade from a 580 to a 670 or a 570 to a 660 Ti in a heartbeat, but even $100 isn't going to get you very far.[/QUOTE]

Yes, you can get a 580 quite cheap. You just have to look in the right places. (kladner)

kladner 2013-11-17 18:40

[QUOTE=kracker;359631]Yes, getting a 580 quite cheap. You just have to look in the right places. (kladner)[/QUOTE]

My 580 came off Ebay, but it was quite a bargain at $165, shipping included. I think the seller put off potential buyers with his description. He had apparently run the card without the bracket. While he reattached the bracket, he did not locate the studs for attaching the video cables. I still haven't replaced them, as they aren't absolutely critical. I just happened to be at the right place at the right time to pick up on an auction with no bids, and win it with one bid. The seller was a good sport about it, but he was pretty disappointed by the price he got.

Manpowre 2013-11-17 19:43

I have 2x 580s waiting in my analogue mail at the post office, ready to be picked up. They are a little more expensive in Norway, but I paid 100 USD for each.

The 2x 590 boards I bought earlier I got for 175 USD each, but that was before the 290X came to market, so prices were still a little higher (late summer).

ET_ 2013-11-27 15:57

Does [COLOR="Red"]James[/COLOR] read this thread?

I'm also curious about the benchmark showing the 780 Ti being better than the Titan and the GTX 580...

Luigi

Jayder 2013-11-27 17:24

[QUOTE=Manpowre;359635]I have 2x 580s in my analogue mail at postoffice ready to be picked up, they are a little more expencive in Norway, but, I paid 100usd for each..

The 2x 590 boards I bought earlier, I got for 175usd each. but that was before 290x came to market.. so price was still a little higher (late summer)[/QUOTE]
Where are you buying your boards? The prices on ebay seem to be about $130-$200 for a 580 and about $300-$450 for a 590 before taxes and shipping.

Manpowre 2013-11-27 17:33

[QUOTE=Jayder;360445]Where are you buying your boards? The prices on ebay seem to be about $130-$200 for a 580 and about $300-$450 for a 590 before taxes and shipping.[/QUOTE]

A local marketplace for Norway. I don't think people here would be interested in shipping out of the country. In Norway there is a use-and-discard mentality, so once hardware is 1-2 generations old the prices drop. Now, with the 290X, they dropped even more.

TheMawn 2013-11-28 00:25

I think the 780 Ti wins by brute force. I don't care if your Prius is more efficient, my semi has more power.

kracker 2013-11-28 00:41

[QUOTE=TheMawn;360478]I think the 780 Ti wins by brute force. I don't care if your Prius is more efficient, my semi has more power.[/QUOTE]

What do you mean? a 780 Ti is [I][B]NOT [/B][/I]faster than a Titan in CuLu.

axn 2013-11-28 04:32

[QUOTE=kracker;360479]What do you mean? a 780 Ti is [I][B]NOT [/B][/I]faster than a Titan in CuLu.[/QUOTE]

But that is what the numbers from James' site are implying (as highlighted by LaurV's posts). See [url]http://www.mersenne.ca/cudalucas.php?model=508[/url].

However, having re-reviewed the numbers, I think the Titan numbers there might've been obtained without enabling the Titan's full-speed DP option (as described at [url]http://hothardware.com/Reviews/GeForce-GTX-Titan-Performance-Yes-It-CAN-Play-Crysis-3/?page=15[/url]). In that case, the 780 Ti numbers are legit, but the Titan's could be a lot faster.

EDIT:- Maybe this thread should be moved under "GPU Computing"?

LaurV 2013-11-28 05:30

[QUOTE=kracker;360479]What do you mean? a 780 Ti is [I][B]NOT [/B][/I]faster than a Titan in CuLu.[/QUOTE]
Well, [URL="http://www.mersenne.ca/cudalucas.php?model=508"]according to this[/URL], [COLOR=Red][B][U]it is[/U][/B][/COLOR]. And by [B][U]MUCH[/U][/B]. That is why I started all the discussion with [URL="http://www.mersenneforum.org/showthread.php?p=359345#post359345"]this post[/URL] (read the continuation carefully). It is still unclear whether the data is correct, and whether James reads these posts. To me, the data still looks fabricated: someone (a new kid) with a new Ti toy, boasting on the web. I may be wrong.

This (see axn's post #26) is easy to test if one has a Titan. If memory is indeed the limitation, one can feed the Titan (first test case) one BIG LL test (say a 65M exponent), or (second test case) two small LL tests (say 2x 33M exponents) in parallel. Due to the memory limitation, part of the FP64 power would go unused in the first case, while the second case would need only about 1.4 times the memory transfers (no, not two times; think about how the FFT works), so there should be a discrepancy between the two test cases. Also, if you do (case 1) one small LL, or (case 2) two small LLs in parallel (same exponent range, say 30M), then the second test case should be about 40% more efficient than the first, i.e. you should be able to get "half of a DC" for free. It just needs two cudaLucas copies in two folders, fed to the same Titan card; post the results here on the forum for both test cases and we will dissect them.

I mean, c'mon! The Titan has ~6 times more FP64 power (1/3 instead of 1/24, but a lower clock and ~10% fewer CUDA cores). If memory speed is indeed the limit, it would mean cudaLucas could be rewritten to access memory less, and we would have a 6-times-faster LL test. Doing a 65M LL test in 12 hours instead of 72, how does that sound? :razz: (I know this is difficult to impossible due to the "serial" structure of the LL test, but at least we could combine LL with some other FP64 activity which does not require so much memory access, and still get that other work for free!)

Therefore, either the results on James' page are inflated, or nVidia is lying about the 1/3 vs 1/24 ratios.

Unfortunately, I haven't laid my hands on a Titan or 780 Ti yet, to pluck all their feathers... But that time will come.

(edit: crosspost with axn, sorry; I opened this to answer in the morning, but could not finish editing until the lunch break, busy time here)

TheMawn 2013-11-28 06:27

[QUOTE=kracker;360479]What do you mean? a 780 Ti is [I][B]NOT [/B][/I]faster than a Titan in CuLu.[/QUOTE]

What I meant was: perhaps the 780 Ti is just so freaking much faster as a part that the Titan, which is better suited, still loses. Brute force vs. efficiency.

I don't know how much more obvious I can make the analogy.

kracker 2013-11-28 18:15

@LaurV: I agree 100%. If the 780 Ti has the DP switch, then definitely yes. Maybe it does have the speed that James's site shows, but that leads me to one of these two:[LIST][*]The Titan benchmarks were not run with the "DP" switch on.[*]The 780 Ti was based on the Titan[/LIST]
Hmm... looking [URL="http://www.anandtech.com/show/7492/the-geforce-gtx-780-ti-review"]here[/URL], maybe it's the 1000 MHz boost in memory clock? But still, something is not right. It is clear, though: the Titan has 1/3-rate FP64, the 780 Ti [I]1/24[/I].

Again, I don't know... Just speculation.
-AMD guy


@TheMawn: You made yourself perfectly clear. Please take a good look at the Titan, 780 and 780 Ti's specs and you'll come to a different conclusion.

EDIT: [URL="http://images.anandtech.com/graphs/graph7492/59703.png"]http://images.anandtech.com/graphs/graph7492/59703.png[/URL]

LaurV 2013-11-29 03:20

[QUOTE=kracker;360552]maybe the 1000 MHz boost in memory?[/QUOTE]
I considered the memory boost, [U]and[/U] the additional cuda cores; that is why I said "six times" and not "eight times" faster for DP, see my post. Otherwise, 1/3 is 8 times faster than 1/24 at the same clock and with the same number of cuda cores.

Now, if you are talking about a "DP switch", that would be a totally different thing, and it would explain a lot of things... [B][U]If[/U][/B] the 780 Ti [B][U]has[/U][/B] a "DP switch" that raises the DP clock to "normal" (instead of the 1/8 it runs at now) and makes it work "full speed", then both boards would have the same SP:DP ratio (1/3), and you would get about 5-10% from the clock ratios and about 8-12% from the additional cuda cores; THAT would perfectly justify the 13-17% [U]extra speed[/U] of the 780 Ti.


So, does the 780 Ti have a DP switch? Maybe that is all this is about?
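The 13-17% figure checks out as simple arithmetic; a quick sketch (boost clocks taken from the launch specs, assumed here as ballpark values):

```python
# If the 780 Ti really ran DP at the full 1/3 rate (hypothetical "DP switch"),
# its remaining advantage over a Titan would come only from clocks and cores.
# Boost clocks assumed from launch specs: Titan 876 MHz, 780 Ti 928 MHz.

clock_gain = 928 / 876      # ~5.9% higher boost clock
core_gain  = 2880 / 2688    # ~7.1% more CUDA cores

speedup = clock_gain * core_gain
print(f"expected 780 Ti advantage: {(speedup - 1) * 100:.1f}%")  # ~13.5%
```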

axn 2013-11-29 03:43

[QUOTE=LaurV;360582]Now, if you are talking about a "DP switch", that would be a totally different thing, and it would explain a lot of things... [B][U]If[/U][/B] the 780 Ti [B][U]has[/U][/B] a "DP switch" that raises the DP clock to "normal" (instead of the 1/8 it runs at now) and makes it work "full speed", then both boards would have the same SP:DP ratio (1/3), and you would get about 5-10% from the clock ratios and about 8-12% from the additional cuda cores; THAT would perfectly justify the 13-17% [U]extra speed[/U] of the 780 Ti.


So, does the 780 Ti have a DP switch? Maybe that is all this is about?[/QUOTE]

Then NVIDIA would be stupid to sell it for $700 (vs. $1000 for the Titan). More probable is that the Titan benchmarks were obtained with the DP switch off. If only I had a Titan to test my theory :wink:

kracker 2013-11-29 05:07

[QUOTE=LaurV;360582]I considered the memory boost, [U]and[/U] the additional cuda cores; that is why I said "six times" and not "eight times" faster for DP, see my post. Otherwise, 1/3 is 8 times faster than 1/24 at the same clock and with the same number of cuda cores.

Now, if you are talking about a "DP switch", that would be a totally different thing, and it would explain a lot of things... [B][U]If[/U][/B] the 780 Ti [B][U]has[/U][/B] a "DP switch" that raises the DP clock to "normal" (instead of the 1/8 it runs at now) and makes it work "full speed", then both boards would have the same SP:DP ratio (1/3), and you would get about 5-10% from the clock ratios and about 8-12% from the additional cuda cores; THAT would perfectly justify the 13-17% [U]extra speed[/U] of the 780 Ti.


So, does the 780 Ti have a DP switch? Maybe that is all this is about?[/QUOTE]

Sorry, my brain was foggy...

One thing I know: the 780 Ti does NOT have the DP switch.

LaurV 2013-11-29 05:44

[QUOTE=kracker;360586]One thing I know, the 780 TI does NOT have the DP switch.[/QUOTE]

Then it must be what axn said: the benchmark for the Titan was done with the DP switch disabled. That raises another question: how fast is a Titan with the switch enabled? 300 GHz-days/day of front-line LL?? 400? 500? :shock: (For comparison: the 580's value in the table is ~30, but it can do 35, max 40, when overclocked and well cooled; for the Titan, the value in the table is ~60, which could multiply by up to 8, assuming the theory is right, minus whatever the memory limitation costs, and assuming you can cool it properly to avoid thermal throttling, which for sure [U]I can[/U] do...)

Mother of the gun! Aren't your fingers twitching? I would sacrifice the additional $300 for a Titan over the price of a 780 Ti just to test this theory... if only it weren't so freaking difficult to get those Titans in Thailand. I will visit some shops this weekend to see what they offer, and if they have none, I may plan a trip to HK or Singapore soon... I may swap my whole park of 580s for Titans... dreaming... twitching... twitching... dreaming...

BTW, what is Santa Claus doing this year? Does anyone know? :razz:

kladner 2013-11-29 06:18

[QUOTE=LaurV;360588].....

BTW, what is Santa Claus doing this year? Does anyone know? :razz:[/QUOTE]

Santa LaurV could dispose of those cast-off 580's over by here. :razz:

ET_ 2013-11-29 11:48

A nice article running GPU benchmarks on Linux and Windows:

[URL="http://www.phoronix.com/scan.php?page=article&item=nvidia_gtx_titan&num=1"]Here[/URL].

Luigi

Manpowre 2013-11-29 13:06

Well, this is what I get from each of my 2x Titans with the DP switch on:
a 73M exponent in 68 hours, for 206 GHz-days of credit

= 72.79 GHz-days/day per Titan.

There is overhead in the GPU cards which actually slows down the Titan, and that is the memcopy back and forth between host and device.

73M exponents:
- Titan: 2.9-3.0 ms per iteration
- 580: 6.6-6.8 ms per iteration
- 590: 8.2-8.4 ms per iteration (though you get 2 running simultaneously on one board)
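Those figures are internally consistent; a quick sketch (taking the mid-point of the quoted ms/iteration range, and the GHz-days credit figure as reported above):

```python
# Back-of-envelope check of the reported Titan throughput.
# Exponent, credit and wall time are the figures quoted in the post above.

exponent       = 73_000_000   # LL test size; an LL test takes ~exponent iterations
ms_per_it      = 2.95         # mid of the reported 2.9-3.0 ms/iter range
credit         = 206.0        # GHz-days credit for the test (as reported)
reported_hours = 68.0

compute_hours = exponent * ms_per_it / 1000 / 3600   # pure iteration time, ~60 h
throughput    = credit / (reported_hours / 24)       # GHz-days per day

print(f"iteration time alone: {compute_hours:.1f} h")
print(f"reported wall time  : {reported_hours:.1f} h (the gap is overhead)")
print(f"throughput          : {throughput:.2f} GHz-days/day")
```

The ~8 hour gap between pure iteration time and wall time is the overhead being discussed, and the throughput comes out at the 72.79 GHz-days/day figure quoted above.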

Also,

I have run the FFT benchmark to choose the fastest FFT length for each platform within the error criterion of 0.25. The FFT length cudaLucas picks for 73M exponents is 43xx-something, while I found 4096K a lot quicker on the Titans. The 580/590 use the 43xx FFT length that cudaLucas picks.

So there are many factors that keep that benchmark from being reliable.

axn 2013-11-29 14:59

[QUOTE=Manpowre;360610]
= 72.79 ghz days per titan.[/QUOTE]

which is higher than the 68.1 reported for 780 Ti (and 60.7 reported for Titan itself).

Although, the 780 Ti still wins out when you factor in the cost.

kracker 2013-11-29 15:39

I stand by my original belief. The 780 Ti does not perform as well as in the charts.

Manpowre 2013-11-29 18:44

Honestly, I believe the 780 Ti will run at something like 4-4.5 ms per iteration for 73M exponents with a 4096K FFT length.

It is not only about the DP speed being 1/3 or 1/24 of SP; it is also that cudaLucas is written in such a way that there is a memcopy between host memory and the GPU card twice per iteration, and this takes a lot of time too. With the 780 Ti's better memory bus, I believe it will perform well, similar to the 290X, which I also expect to be at 4-4.5 ms for a 4096K FFT length.

That means something like 40-45 GHz-days/day on LL tests. I don't believe in 55-60 GHz-days/day for the 780 Ti.

Also, if you need 2x 780 Tis to compute almost as much as one Titan, you also lose a slot on the host machine. But if you are only going to have one machine anyway and can afford 2x 780s for gaming, then two of these are good.

I am looking for good Titan deals right now.

I also saw rumors that nVidia will make a dual-GPU 790 card and a Titan Ultra/Black edition with 1/3 DP and 2880 cuda cores. That means Titan prices will drop soon, probably to the level of the 780 Ti.

I do think the charts for the 780 Ti and Titan have to be updated!

owftheevil 2013-12-03 15:12

I get between 80 GHz-days/day and 90 GHz-days/day, depending on the exponent, on a Titan with a moderate core overclock.

And Manpowre, quit spreading those rumors about the host <-> device memcopies. They are not true.:razz:

