[QUOTE=Redarm;332885]M( 57885161 )P, n = 3584K, CUDALucas v2.05 Alpha
proven in 55h[/QUOTE] Sweet. So that shows that the Titans are not DOA, although they need some TLC (read: tweaking). What is the Cost / Benefit ratio? |
[QUOTE=ixfd64;332877]It's too bad the Kepler cards are so inefficient at trial factoring. In comparison, the three-year-old GTX 480 cranks out 373.6 GHz-days/day (according to James' benchmark page) with a mere 480 cores at 700 MHz.[/QUOTE]
[LIST]
[*]GF100/GF110 (e.g. GTX 470, 480, 570, 580): 3.04 billion transistors, 512 cores
[*]GK104 (e.g. GTX 670, 680): 3.54 billion transistors, 1536 cores
[/LIST]
So they did some magic voodoo (no, not 3dfx in this case) or the cores are just simpler or stripped down. On Fermi-class GPUs each core can do integer multiply; on Kepler this is not the case, only a fraction can do it. The same goes for integer add, compare, whatever...
Oliver |
[QUOTE=TheJudger;332889][LIST]
[*]GF100/GF110 (e.g. GTX 470, 480, 570, 580): 3.04 billion transistors, 512 cores
[*]GK104 (e.g. GTX 670, 680): 3.54 billion transistors, 1536 cores
[/LIST]
So they did some magic voodoo (no, not 3dfx in this case) or the cores are just simpler or stripped down. On Fermi-class GPUs each core can do integer multiply; on Kepler this is not the case, only a fraction can do it. The same goes for integer add, compare, whatever...
Oliver[/QUOTE] Kepler chips have no double clocked shaders like Fermis or Teslas. |
[QUOTE=TheJudger;332889]So they did some magic voodoo (no, not 3dfx in this case)[...][/QUOTE]
Oh, btw: I'm still dreaming that one day I'll own a working (with "PCI rework") [B]Voodoo 5 6000[/B]...
[QUOTE=Ralf Recker;332893]Kepler chips have no double clocked shaders like Fermis or Teslas.[/QUOTE]
Yes, but Keplers have a higher base clock, so Fermi's shader clock isn't really twice Kepler's. The GTX 580 runs its shaders at 1544MHz; the GTX 680 runs at 1006+MHz. For mfaktc, boost clocks close to 1100MHz seem to be normal. So Fermi has a ~50% higher clock rate while Kepler has 3 times more shaders (GF110 vs. GK104). But I guess you know this already.
Oliver |
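To put numbers on that comparison, here is a rough sketch using only the clocks and shader counts quoted in the post above. The "naive throughput" figure (shaders times clock) is an illustrative assumption, not a benchmark; it ignores the point made earlier that only a fraction of Kepler's cores can do integer multiply.

```python
# Figures quoted in the post above: shader clock (MHz) and shader count.
gtx580 = {"clock_mhz": 1544, "shaders": 512}    # GF110 (Fermi)
gtx680 = {"clock_mhz": 1006, "shaders": 1536}   # GK104 (Kepler), base clock

# How much faster Fermi's shaders are clocked.
clock_ratio = gtx580["clock_mhz"] / gtx680["clock_mhz"]

# How many more shaders Kepler has.
shader_ratio = gtx680["shaders"] / gtx580["shaders"]

# Naive estimate (an assumption): throughput ~ shaders * clock.
naive_kepler_advantage = shader_ratio / clock_ratio

print(f"Fermi clock advantage:   {clock_ratio:.2f}x")
print(f"Kepler shader advantage: {shader_ratio:.2f}x")
print(f"Naive Kepler advantage:  {naive_kepler_advantage:.2f}x")
```

On paper the GTX 680 comes out at nearly twice the GTX 580, so the poor trial factoring numbers reported in this thread would have to come from what each core can do per clock, not from the raw shaders-times-clock product.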
[QUOTE=nucleon;332883]
[CODE]$ ./CUDALucas-2.03-cuda4.2-sm_30-x86-64.exe -t
------- DEVICE 0 -------
name                GeForce GTX TITAN
totalGlobalMem      [COLOR="Red"]2147287040[/COLOR]
sharedMemPerBlock   49152
regsPerBlock        65536
warpSize            32
memPitch            [COLOR="Red"]2147483647[/COLOR]
maxThreadsPerBlock  1024
maxThreadsDim[3]    1024,1024,64
maxGridSize[3]      [COLOR="Red"]2147483647[/COLOR],65535,65535
totalConstMem       65536
Compatibility       3.5
clockRate (MHz)     875
textureAlignment    512
deviceOverlap       1
multiProcessorCount 14
doesn't begin with Test= or DoubleCheck=.
Starting M29319943 fft length = 1572864
iteration = 25 < 1000 && err = 0.264275 >= 0.25, increasing n from 1572864
Starting M29319943 fft length = 1835008
Iteration 10000 M( 29319943 )C, 0xf43887980b952e31, n = 1835008, CUDALucas v2.03 err = 0.0094 (0:20 real, 1.9694 ms/iter, ETA 16:01:43)
Iteration 20000 M( 29319943 )C, 0xfa7f1f3ff9688114, n = 1835008, CUDALucas v2.03 err = 0.0099 (0:19 real, 1.9429 ms/iter, ETA 15:48:26)
Iteration 30000 M( 29319943 )C, 0x1251164fa47a5274, n = 1835008, CUDALucas v2.03 err = 0.0099 (0:20 real, 1.9584 ms/iter, ETA 15:55:40)
Iteration 40000 M( 29319943 )C, 0xd0b78d06f7897616, n = 1835008, CUDALucas v2.03 err = 0.0099 (0:19 real, 1.9652 ms/iter, ETA 15:58:41)
Iteration 50000 M( 29319943 )C, 0xc9d9e96451672155, n = 1835008, CUDALucas v2.03 err = 0.0100 (0:20 real, 1.9489 ms/iter, ETA 15:50:23)
Iteration 60000 M( 29319943 )C, 0x0ccf100ccee6050c, n = 1835008, CUDALucas v2.03 err = 0.0100 (0:19 real, 1.9501 ms/iter, ETA 15:50:39)
[/CODE][/QUOTE]
Shouldn't Titan have 6GB of memory? It seems that CuLu uses a signed Int32 to show memory usage: will it be the same for allocation?
Luigi |
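A quick way to check whether the odd totalGlobalMem figure looks like a 32-bit truncation is to redo the arithmetic by hand. The sketch below is only a sanity check on the numbers in the log; it assumes the true size is the reported value plus one 2**32 wrap, and it says nothing about how CUDALucas actually stores or prints the value. The helper names are made up for illustration.

```python
GIB = 1024 ** 3
reported = 2147287040            # totalGlobalMem as printed in the log

# Assumption: the real byte count wrapped around 2**32 exactly once.
assumed_true_size = reported + 2 ** 32

def as_uint32(n):
    """Keep only the low 32 bits (what an unsigned 32-bit cast keeps)."""
    return n & 0xFFFFFFFF

def as_int32(n):
    """Reinterpret the low 32 bits as a signed two's-complement value."""
    n &= 0xFFFFFFFF
    return n - 2 ** 32 if n >= 2 ** 31 else n

print(assumed_true_size)                 # candidate real size in bytes
print(6 * GIB - assumed_true_size)       # shortfall versus a full 6 GiB
print(as_uint32(assumed_true_size))      # low 32 bits of the candidate
print(as_int32(6 * GIB))                 # a full 6 GiB seen as signed 32-bit
```

Under that assumption the card would be reporting 6 GiB minus 192 KiB, and its low 32 bits are exactly the number in the log, so the display is consistent with a 32-bit truncation. The 2147483647 in maxGridSize is a different matter: it equals 2**31 - 1, which is the documented maximum x-dimension of a grid on compute capability 3.0 and later, so it need not be an overflow at all.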
ETA 1hr50mins, and still going.
Sweet. -- Craig |
I can't remember: is msieve poly selection double precision?
I assume gpu-ecm is. How do these run? Once the program is ready, a Titan should be able to get through a load of P-1s as well. |
[QUOTE=Redarm;332885]M( 57885161 )P, n = 3584K, CUDALucas v2.05 Alpha
proven in 55h[/QUOTE] Versus 86-87h on the 580: [URL="http://www.mersenneforum.org/showpost.php?p=332024&postcount=103"]http://www.mersenneforum.org/showpost.php?p=332024&postcount=103[/URL], so more than 50% faster even when downclocked. Do you plan to test stability at higher clock rates to find the highest stable clock? Or did you already do this? |
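For anyone who wants the speedup spelled out, here is a small sketch using only the run times quoted above (55h on the Titan, 86-87h on the 580). The function names are just for illustration.

```python
def speedup(old_hours, new_hours):
    """Throughput ratio: how many times faster the new run is."""
    return old_hours / new_hours

def time_saved(old_hours, new_hours):
    """Fraction of wall-clock time saved by the new run."""
    return 1 - new_hours / old_hours

titan_hours = 55
for gtx580_hours in (86, 87):
    s = speedup(gtx580_hours, titan_hours)
    t = time_saved(gtx580_hours, titan_hours)
    print(f"{gtx580_hours}h vs {titan_hours}h: "
          f"{(s - 1) * 100:.0f}% faster, {t * 100:.0f}% less time")
```

Note the two ways of stating the same result: "percent faster" is about throughput, while "percent less time" is about the wall clock, and the two figures differ.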
For its price, shouldn't it be ~150% faster to break even?
|
[QUOTE=henryzz;332951]I can't remember: is msieve poly selection double precision?
I assume gpu-ecm is. How do these run? Once the program is ready, a Titan should be able to get through a load of P-1s as well.[/QUOTE]You assume incorrectly. |
[QUOTE=xilman;332988]You assume incorrectly.[/QUOTE]
Integer arithmetic then? I thought stage 2 would be using FFTs. Knew I was forgetting something: it is stage 1 only. |