mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   Geforce GTX Titan 6GB (https://www.mersenneforum.org/showthread.php?t=17834)

chalsall 2013-03-11 22:00

[QUOTE=Redarm;332885]M( 57885161 )P, n = 3584K, CUDALucas v2.05 Alpha

proven in 55h[/QUOTE]

Sweet.

So that shows that the Titans are not DOA, although they need some TLC (read: tweaking).

What is the Cost / Benefit ratio?

TheJudger 2013-03-11 22:00

[QUOTE=ixfd64;332877]It's too bad the Kepler cards are so inefficient at trial factoring. In comparison, the three-year-old GTX 480 cranks out 373.6 GHz-days/day (according to James' benchmark page) with a mere 480 cores at 700 MHz.[/QUOTE]
[LIST]
[*]GF100/GF110 (e.g. GTX 470, 480, 570, 580): 3.04 billion transistors, 512 cores
[*]GK104 (e.g. GTX 670, 680): 3.54 billion transistors, 1536 cores
[/LIST]
So they did some magic voodoo (no, not 3dfx in this case) or the cores are just simpler or stripped down. On Fermi-class GPUs each core can do an integer multiply; on Kepler this is not the case, and only a fraction of the cores can. The same goes for integer add, compare, whatever...

Oliver

Ralf Recker 2013-03-11 22:20

[QUOTE=TheJudger;332889][LIST]
[*]GF100/GF110 (e.g. GTX 470, 480, 570, 580): 3.04 billion transistors, 512 cores
[*]GK104 (e.g. GTX 670, 680): 3.54 billion transistors, 1536 cores
[/LIST]
So they did some magic voodoo (no, not 3dfx in this case) or the cores are just simpler or stripped down. On Fermi-class GPUs each core can do an integer multiply; on Kepler this is not the case, and only a fraction of the cores can. The same goes for integer add, compare, whatever...

Oliver[/QUOTE]
Kepler chips have no double-clocked shaders, unlike Fermi and Tesla chips.

TheJudger 2013-03-11 22:49

[QUOTE=TheJudger;332889]So they did some magic voodoo (no, not 3dfx in this case)[...][/QUOTE]

Oh, btw: I'm still dreaming that one day I'll own a working (with "PCI rework") [B]Voodoo 5 6000[/B]...

[QUOTE=Ralf Recker;332893]Kepler chips have no double-clocked shaders, unlike Fermi and Tesla chips.[/QUOTE]

Yes, but Keplers have a higher base clock, so Fermi isn't really running at twice the clock. The GTX 580 runs at 1544 MHz (shaders), the GTX 680 at 1006+ MHz. For mfaktc, clock rates close to 1100 MHz seem to be normal for the boost clock.
So Fermi has a ~50% higher clock rate while Kepler has three times as many shaders (GF110 vs. GK104). But I guess you know this already.
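
As a rough sanity check on those two ratios, here is a short Python sketch that uses only the figures quoted in this thread (1544 MHz and 512 cores for GF110, 1006 MHz base clock and 1536 cores for GK104). The "cores x clock" product is an assumption of the sketch and only a naive upper bound: since only a fraction of Kepler's cores can do integer operations, real trial-factoring throughput will not scale this way.

```python
# Figures quoted in the thread; base clocks only, boost ignored.
fermi = {"cores": 512, "shader_mhz": 1544}    # GF110 (GTX 580)
kepler = {"cores": 1536, "shader_mhz": 1006}  # GK104 (GTX 680)

# How much faster Fermi's shaders are clocked.
clock_ratio = fermi["shader_mhz"] / kepler["shader_mhz"]

# How many more shaders Kepler has.
core_ratio = kepler["cores"] / fermi["cores"]

# Naive "cores x clock" product: an idealised bound, not a benchmark.
naive_ratio = (kepler["cores"] * kepler["shader_mhz"]) / (
    fermi["cores"] * fermi["shader_mhz"]
)

print(f"Fermi clock advantage:  {clock_ratio:.2f}x")
print(f"Kepler core advantage:  {core_ratio:.1f}x")
print(f"Naive Kepler/Fermi cores x clock: {naive_ratio:.2f}x")
```

On paper that product favours Kepler by a bit less than 2x, which is why the per-core integer capability matters so much for the trial-factoring numbers.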

Oliver

ET_ 2013-03-12 08:01

[QUOTE=nucleon;332883]
[CODE]$ ./CUDALucas-2.03-cuda4.2-sm_30-x86-64.exe -t
------- DEVICE 0 -------
name GeForce GTX TITAN
totalGlobalMem [COLOR="Red"] 2147287040[/COLOR]
sharedMemPerBlock 49152
regsPerBlock 65536
warpSize 32
memPitch [COLOR="Red"]2147483647[/COLOR]
maxThreadsPerBlock 1024
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] [COLOR="Red"]2147483647[/COLOR],65535,65535
totalConstMem 65536
Compatibility 3.5
clockRate (MHz) 875
textureAlignment 512
deviceOverlap 1
multiProcessorCount 14

doesn't begin with Test= or DoubleCheck=.
Starting M29319943 fft length = 1572864
iteration = 25 < 1000 && err = 0.264275 >= 0.25, increasing n from 1572864
Starting M29319943 fft length = 1835008
Iteration 10000 M( 29319943 )C, 0xf43887980b952e31, n = 1835008, CUDALucas v2.03 err = 0.0094 (0:20 real, 1.9694 ms/iter, ETA 16:01:43)
Iteration 20000 M( 29319943 )C, 0xfa7f1f3ff9688114, n = 1835008, CUDALucas v2.03 err = 0.0099 (0:19 real, 1.9429 ms/iter, ETA 15:48:26)
Iteration 30000 M( 29319943 )C, 0x1251164fa47a5274, n = 1835008, CUDALucas v2.03 err = 0.0099 (0:20 real, 1.9584 ms/iter, ETA 15:55:40)
Iteration 40000 M( 29319943 )C, 0xd0b78d06f7897616, n = 1835008, CUDALucas v2.03 err = 0.0099 (0:19 real, 1.9652 ms/iter, ETA 15:58:41)
Iteration 50000 M( 29319943 )C, 0xc9d9e96451672155, n = 1835008, CUDALucas v2.03 err = 0.0100 (0:20 real, 1.9489 ms/iter, ETA 15:50:23)
Iteration 60000 M( 29319943 )C, 0x0ccf100ccee6050c, n = 1835008, CUDALucas v2.03 err = 0.0100 (0:19 real, 1.9501 ms/iter, ETA 15:50:39)
[/CODE][/QUOTE]
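
The ETA column in the quoted log can be roughly reproduced from the exponent and the ms/iter figure. This is a minimal Python sketch, assuming about one iteration per bit of the exponent and that the ETA is simply remaining iterations times the current per-iteration time; the function names are made up for illustration and are not taken from CUDALucas.

```python
def eta_seconds(exponent, iteration, ms_per_iter):
    """Remaining time, assuming roughly one iteration per bit of the exponent."""
    remaining = exponent - iteration
    return remaining * ms_per_iter / 1000.0


def hms(seconds):
    """Format a duration in seconds as h:mm:ss."""
    s = int(round(seconds))
    return f"{s // 3600}:{s % 3600 // 60:02d}:{s % 60:02d}"


# First progress line of the log: iteration 10000 at 1.9694 ms/iter.
est = eta_seconds(29319943, 10000, 1.9694)
print(hms(est))
```

The result lands within about a minute of the logged ETA of 16:01:43; the small difference is presumably down to rounding of the displayed ms/iter value.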


Shouldn't the Titan have 6 GB of memory?

It seems that CuLu (CUDALucas) uses a signed Int32 to show the memory size: will it be the same for allocation?
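
One way to see how a 6 GB card can show up as roughly 2 GB: a byte count above 2^32 does not fit in a 32-bit integer, and keeping only the low 32 bits silently drops a multiple of 4 GiB. The Python sketch below only illustrates that arithmetic. Whether CUDALucas 2.03 actually truncates the value this way (for example through a 32-bit variable or format specifier) is an assumption that the thread does not confirm, and the helper name is made up for illustration.

```python
GIB = 2**30
MASK32 = 2**32 - 1
INT32_MAX = 2**31 - 1


def low32(n):
    """Keep only the low 32 bits, as a 32-bit unsigned field would."""
    return n & MASK32


reported = 2147287040   # totalGlobalMem as printed in the quoted log
nominal = 6 * GIB       # a nominal 6 GiB card

# A value of exactly 6 GiB would wrap to exactly 2 GiB.
print(low32(nominal))

# The only size between 4 GiB and 8 GiB that truncates to the reported
# figure is reported + 4 GiB, which is just under 6 GiB.
candidate = reported + 4 * GIB
print(candidate, round(candidate / GIB, 4))
print(low32(candidate) == reported)
```

The reported figure is just below 2^31, so it still fits in a signed Int32; the pattern is consistent with a total slightly under 6 GiB losing its upper bits in display. That says nothing by itself about how the program allocates memory.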

Luigi

nucleon 2013-03-12 11:30

ETA 1hr50mins, and still going.

Sweet.

-- Craig

henryzz 2013-03-12 11:54

I can't remember: is msieve poly selection double precision?

I assume gpu-ecm is. How do these run?
Once the program is ready, a Titan should be able to get through a load of P-1s as well.

ATH 2013-03-12 13:12

[QUOTE=Redarm;332885]M( 57885161 )P, n = 3584K, CUDALucas v2.05 Alpha

proven in 55h[/QUOTE]

Versus 86-87h on the 580: [URL="http://www.mersenneforum.org/showpost.php?p=332024&postcount=103"]http://www.mersenneforum.org/showpost.php?p=332024&postcount=103[/URL]

So roughly 50% faster, even when downclocked.
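
For reference, the ratio can be worked out directly from the two run times quoted above (55 h on the Titan, 86-87 h on the 580). A minimal Python sketch, using only those numbers:

```python
titan_hours = 55.0
gtx580_hours = (86.0, 87.0)  # range reported for the GTX 580

# Speedup = old time / new time, for both ends of the reported range.
speedups = [h / titan_hours for h in gtx580_hours]

for h, s in zip(gtx580_hours, speedups):
    print(f"{h:.0f}h vs {titan_hours:.0f}h: {s:.2f}x, i.e. {100 * (s - 1):.0f}% faster")
```

That puts the Titan somewhat above the rough 50% figure, at a bit under 60% faster for this exponent.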


Do you plan to test the stability at higher clock rates to find the highest stable clock? Or did you already do this?

Batalov 2013-03-12 16:46

For its price, shouldn't it be ~150% faster to break even?
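
To make the break-even argument concrete: "~150% faster" means 2.5x the throughput, which implies the Titan costs about 2.5x as much as the card it is compared with. That price ratio is inferred from the question, not stated in the thread. The Python sketch below combines it with the 55 h vs. 86-87 h timings reported earlier; the function name is hypothetical, and power draw, resale value and other costs are ignored.

```python
def perf_per_price(speedup, price_ratio):
    """Relative throughput per unit of money; 1.0 means break-even."""
    return speedup / price_ratio


# Inferred from "~150% faster to break even"; not a quoted price.
implied_price_ratio = 2.5

# Midpoint of the 86-87 h GTX 580 run vs. the 55 h Titan run.
measured_speedup = 86.5 / 55.0

value = perf_per_price(measured_speedup, implied_price_ratio)
print(f"Throughput per unit price relative to the 580: {value:.2f}")
```

Under those assumptions the Titan delivers roughly 0.6 of the 580's throughput per unit of money on this workload.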

xilman 2013-03-12 17:22

[QUOTE=henryzz;332951]I can't remember: is msieve poly selection double precision?

I assume gpu-ecm is. How do these run?
Once the program is ready, a Titan should be able to get through a load of P-1s as well.[/QUOTE]

You assume incorrectly.

henryzz 2013-03-12 17:49

[QUOTE=xilman;332988]You assume incorrectly.[/QUOTE]

Integer arithmetic, then? I thought stage 2 would be using FFTs. I knew I was forgetting something: it is stage 1 only.


All times are UTC.