mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   Geforce GTX Titan 6GB (https://www.mersenneforum.org/showthread.php?t=17834)

Brain 2013-03-04 22:13

Very first Titan results
 
[QUOTE=Brain;331780]This time CL 2.03 - Win64 - CUDA 5.0 - SM 2.0 & 3.5.
Intended for Titan tests:
[URL]https://dl.dropbox.com/u/72392549/CUDALucas-2.03-5.0-x64.exe.zip[/URL]
[URL]https://dl.dropbox.com/u/72392549/CUDALucas_Libs_x64_50.zip[/URL][/QUOTE]
Very first results:
[CODE]CUDALucas-2.03-5.0-sm_35-x64.exe -f 3211264 57885161

Warning: No ini file detected. Using defaults for non-specified options.
Starting M57885161 fft length = 3211264
Iteration 10000 M( 57885161 )C, 0x76c27556683cd84d, n = 3211264, CUDALucas v2.03 err = 0.2578 (0:29 real, 2.8801 ms/iter, ETA 46:17:51)
Iteration 20000 M( 57885161 )C, 0xfd8e311d20ffe6ab, n = 3211264, CUDALucas v2.03 err = 0.2578 (0:29 real, 2.8939 ms/iter, ETA 46:30:43)
Iteration 30000 M( 57885161 )C, 0xce0d85ab0065a232, n = 3211264, CUDALucas v2.03 err = 0.2578 (0:29 real, 2.8696 ms/iter, ETA 46:06:49)
Iteration 40000 M( 57885161 )C, 0x744cba031972157f, n = 3211264, CUDALucas v2.03 err = 0.2578 (0:28 real, 2.8664 ms/iter, ETA 46:03:13)
Iteration 50000 M( 57885161 )C, 0x3c87ab95f73ad13a, n = 3211264, CUDALucas v2.03 err = 0.2617 (0:29 real, 2.8675 ms/iter, ETA 46:03:47)
[/CODE]
Enabling double precision (DP) in the driver makes it about 55% faster.

Will go to bed now. Btw, I had to update my BIOS. Otherwise, with Titan installed I could not even boot. :-(
I will have to check system stability. For now, everything's looking good.
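
As a rough sanity check on the log above, the reported ETA is approximately the number of remaining iterations times the per-iteration time. The sketch below assumes a test of M(p) needs about p iterations in total; the helper names `eta_seconds` and `format_hms` are made up for illustration and are not part of CUDALucas.

```python
def eta_seconds(exponent, iteration, ms_per_iter):
    """Estimate remaining wall time in seconds.

    Assumption: the full test takes roughly `exponent` iterations.
    """
    remaining = exponent - iteration
    return remaining * ms_per_iter / 1000.0


def format_hms(seconds):
    """Format a duration in seconds as H:MM:SS."""
    total = int(round(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Values taken from the first log line: iteration 10000 at 2.8801 ms/iter.
estimate = eta_seconds(57885161, 10000, 2.8801)
print(format_hms(estimate))
```

The result lands within a few seconds of the logged ETA of 46:17:51; the small gap is presumably due to rounding in the printed ms/iter value.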

axn 2013-03-05 03:28

[QUOTE=Brain;331979]Very first results:
[CODE]CUDALucas-2.03-5.0-sm_35-x64.exe -f 3211264 57885161

Warning: No ini file detected. Using defaults for non-specified options.
Starting M57885161 fft length = 3211264
Iteration 10000 M( 57885161 )C, 0x76c27556683cd84d, n = 3211264, CUDALucas v2.03 err = 0.2578 (0:29 real, 2.8801 ms/iter, ETA 46:17:51)
Iteration 20000 M( 57885161 )C, 0xfd8e311d20ffe6ab, n = 3211264, CUDALucas v2.03 err = 0.2578 (0:29 real, 2.8939 ms/iter, ETA 46:30:43)
Iteration 30000 M( 57885161 )C, 0xce0d85ab0065a232, n = 3211264, CUDALucas v2.03 err = 0.2578 (0:29 real, 2.8696 ms/iter, ETA 46:06:49)
Iteration 40000 M( 57885161 )C, 0x744cba031972157f, n = 3211264, CUDALucas v2.03 err = 0.2578 (0:28 real, 2.8664 ms/iter, ETA 46:03:13)
Iteration 50000 M( 57885161 )C, 0x3c87ab95f73ad13a, n = 3211264, CUDALucas v2.03 err = 0.2617 (0:29 real, 2.8675 ms/iter, ETA 46:03:47)
[/CODE]
[/QUOTE]
Does anyone know how this compares against a GTX 580?

Batalov 2013-03-05 03:38

Jerry's run used the safe FFT size of 3670016 and took 5.40 ms/iter:
[CODE]Iteration 100000 M( 57885161 )C, 0xe54ba81dac4ff3d8, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4058 ms/iter, ETA 86:45:49)
Iteration 200000 M( 57885161 )C, 0xd7128445f5c5747e, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4041 ms/iter, ETA 86:35:10)
Iteration 300000 M( 57885161 )C, 0x4332497b4eefcf79, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4031 ms/iter, ETA 86:25:10)
Iteration 400000 M( 57885161 )C, 0x8488c2c5dcfec2a1, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4049 ms/iter, ETA 86:17:53)
Iteration 500000 M( 57885161 )C, 0xf362026a2cd691fd, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4057 ms/iter, ETA 86:09:37)
Iteration 600000 M( 57885161 )C, 0x27700576a6eb689d, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4027 ms/iter, ETA 85:57:47)
Iteration 700000 M( 57885161 )C, 0x28fbaf6fdd566d5f, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4076 ms/iter, ETA 85:53:26)
Iteration 800000 M( 57885161 )C, 0x21414d014f00e9b1, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4085 ms/iter, ETA 85:45:19)
Iteration 900000 M( 57885161 )C, 0x183564c507d8d431, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4063 ms/iter, ETA 85:34:10)
[/CODE]
rcv had another run, but [URL="http://mersenneforum.org/showpost.php?p=331785&postcount=96"]his post[/URL] has no timing data.

flashjh 2013-03-05 04:57

1 Attachment(s)
[STRIKE]Residues don't match... they should.[/STRIKE]

Edit: For reference I attached the full run of 57885161.

It's nice to see the Titan double the speed, as long as it's accurate.

Edit 2: Silly me :blush:, I knew I was missing something.

I'll try it tomorrow, I'm heading to bed now.

Batalov 2013-03-05 05:06

He posted residues at 10,000-iteration intervals (not 100,000).

Can you try the other fft sizes for a better speed comparison on your 580?
-f 3211264, -f 3407872, -f 3276800 ?
-f 3145728 still bails out? (I am trying to guess available sizes)

axn 2013-03-05 06:05

Excerpting from jerry's updated file:
[CODE]Iteration 10000 M( 57885161 )C, 0x76c27556683cd84d, n = 3670016, CUDALucas v2.03 err = 0.0273 (0:54 real, 5.3929 ms/iter, ETA 86:41:25)
Iteration 20000 M( 57885161 )C, 0xfd8e311d20ffe6ab, n = 3670016, CUDALucas v2.03 err = 0.0273 (0:54 real, 5.4033 ms/iter, ETA 86:50:37)
Iteration 30000 M( 57885161 )C, 0xce0d85ab0065a232, n = 3670016, CUDALucas v2.03 err = 0.0273 (0:54 real, 5.4021 ms/iter, ETA 86:48:32)
Iteration 40000 M( 57885161 )C, 0x6746379dfc966410, n = 3670016, CUDALucas v2.03 err = 0.0273 (0:55 real, 5.4061 ms/iter, ETA 86:51:30)
Iteration 50000 M( 57885161 )C, 0xa5797ceaebc59091, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4083 ms/iter, ETA 86:52:44)
[/CODE]
The residues disagree from the fourth one (iteration 40000) onwards. Obviously the Titan run needs to be done with a larger FFT.
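
The comparison above can be automated. This is a minimal sketch that assumes the log line layout shown in this thread; the names `parse_residues` and `first_mismatch` are hypothetical, and the sample lines are cut off after the FFT length for brevity.

```python
import re

# Matches the start of a CUDALucas v2.03 progress line as posted in this thread.
LINE_RE = re.compile(
    r"Iteration (\d+) M\( (\d+) \)C, (0x[0-9a-f]{16}), n = (\d+)"
)


def parse_residues(log_text):
    """Return {iteration: residue} for every progress line in log_text."""
    residues = {}
    for match in LINE_RE.finditer(log_text):
        iteration, _exponent, residue, _fft_length = match.groups()
        residues[int(iteration)] = residue
    return residues


def first_mismatch(log_a, log_b):
    """Return the first shared iteration whose residues differ, or None."""
    a, b = parse_residues(log_a), parse_residues(log_b)
    for iteration in sorted(a.keys() & b.keys()):
        if a[iteration] != b[iteration]:
            return iteration
    return None


# Residues copied from the two runs posted above (lines truncated after n).
titan_log = """
Iteration 30000 M( 57885161 )C, 0xce0d85ab0065a232, n = 3211264
Iteration 40000 M( 57885161 )C, 0x744cba031972157f, n = 3211264
Iteration 50000 M( 57885161 )C, 0x3c87ab95f73ad13a, n = 3211264
"""
gtx580_log = """
Iteration 30000 M( 57885161 )C, 0xce0d85ab0065a232, n = 3670016
Iteration 40000 M( 57885161 )C, 0x6746379dfc966410, n = 3670016
Iteration 50000 M( 57885161 )C, 0xa5797ceaebc59091, n = 3670016
"""

print(first_mismatch(titan_log, gtx580_log))  # prints 40000
```

Only iterations present in both logs are compared, so runs logged at different intervals can still be checked against each other.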

rcv 2013-03-05 12:21

1 Attachment(s)
Attached is the head and tail of my full run of the New Prime on a GTX 570. (For speed comparison, my 570 is running at a 1464 MHz clock, with 15 SMs.)

At the head of my run, you will see a few test runs with different FFT lengths, made until I settled on a good balance between speed and accuracy. [Remember, good CuFFT lengths are a product of powers of 2, 3, 5, and 7, with the bigger factors generally being less efficient than the smaller factors.] (I finally chose n = 3402K = 2^11 * 3^5 * 5^0 * 7^1, which had max errors in the 0.04xx range.)
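
To illustrate the rule about CuFFT lengths, here is a small sketch that splits a candidate length into powers of 2, 3, 5, and 7. The function name `smooth_factorization` is made up for this example; it only checks the factorization and says nothing about how fast a given length actually runs.

```python
def smooth_factorization(n, primes=(2, 3, 5, 7)):
    """Divide out the given primes from n.

    Returns (exponents, leftover). A leftover of 1 means n is a pure
    product of powers of 2, 3, 5, and 7.
    """
    exponents = {}
    for p in primes:
        count = 0
        while n % p == 0:
            n //= p
            count += 1
        exponents[p] = count
    return exponents, n


# FFT lengths that appear in this thread.
for length in (3211264, 3276800, 3483648, 3670016):
    exponents, leftover = smooth_factorization(length)
    print(length, f"{length // 1024}K", exponents, leftover)
```

All four lengths leave a leftover of 1, so each is a valid candidate under the rule quoted above.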

This is with the 4.1.28 version of CuFFT:
[CODE]lrwxrwxrwx 1 root staff 13 Jan 27 2012 libcufft.so -> libcufft.so.4
lrwxrwxrwx 1 root staff 18 Jan 27 2012 libcufft.so.4 -> libcufft.so.4.1.28
-rwxr-xr-x 1 root staff 26557304 Jan 27 2012 libcufft.so.4.1.28[/CODE]
The smallest FFT size I tried (that appeared to be working) was n=3200K, with errors in the 0.14xxx range. Contrast that with Brain's run on Titan at n=3136K=2^16*7^2, reporting errors in the 0.26xxx range. Although it is on the edge, I think Brain's run *should* have worked. As axn points out, the residues are wrong!!! I hope somebody can try a Titan run with a slightly larger FFT size.
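
One way to see why a shorter FFT sits closer to the edge: the bits of the exponent are spread over the n FFT words, so a shorter FFT packs more bits into each word, which generally raises round-off error. The sketch below uses the simple average p/n as an assumption; the limits CUDALucas applies internally are not given in this thread.

```python
def bits_per_word(exponent, fft_length):
    """Average number of bits each FFT word carries (simple p / n estimate)."""
    return exponent / fft_length


EXPONENT = 57885161

# FFT lengths mentioned in this thread, from shortest to longest.
for fft_length in (3211264, 3276800, 3483648, 3670016):
    print(fft_length, round(bits_per_word(EXPONENT, fft_length), 2))
```

The length used for the first Titan run carries a little over 18 bits per word, against a little under 16 for the "safe" length of 3670016, which is consistent with the much higher err values in that log.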

Brain 2013-03-05 18:17

[QUOTE=flashjh;332024][STRIKE]Residues don't match... they should.[/STRIKE]

Edit: For reference I attached the full run of 57885161.

It's nice to see the Titan double the speed, as long as it's accurate.
[/QUOTE]
Now the Titan seems to be accurate with a larger FFT (residues match so far):
[CODE]Iteration 3370000 M( 57885161 )C, 0xa21631bf7db89b35, n = 3670016, CUDALucas v2.03 err = 0.0117 (0:39 real, 3.8927 ms/iter, ETA 58:56:31)
Iteration 3380000 M( 57885161 )C, 0x6afd076a23a5fbf8, n = 3670016, CUDALucas v2.03 err = 0.0117 (0:39 real, 3.8940 ms/iter, ETA 58:57:01)
Iteration 3390000 M( 57885161 )C, 0x4558cdaf71c60894, n = 3670016, CUDALucas v2.03 err = 0.0118 (0:38 real, 3.8303 ms/iter, ETA 57:58:33)
Iteration 3400000 M( 57885161 )C, 0x4f5fe35b1f269d80, n = 3670016, CUDALucas v2.03 err = 0.0118 (0:39 real, 3.8604 ms/iter, ETA 58:25:14)
Iteration 3410000 M( 57885161 )C, 0xf8234d4a1907d819, n = 3670016, CUDALucas v2.03 err = 0.0118 (0:38 real, 3.8606 ms/iter, ETA 58:24:48)
[/CODE]
flashjh's run, for comparison:
[CODE]Iteration 3370000 M( 57885161 )C, 0xa21631bf7db89b35, n = 3670016, CUDALucas v2.03 err = 0.0298 (0:54 real, 5.403 ms/iter, ETA 81:48:35)
Iteration 3380000 M( 57885161 )C, 0x6afd076a23a5fbf8, n = 3670016, CUDALucas v2.03 err = 0.0298 (0:54 real, 5.4008 ms/iter, ETA 81:45:43)
Iteration 3390000 M( 57885161 )C, 0x4558cdaf71c60894, n = 3670016, CUDALucas v2.03 err = 0.0298 (0:54 real, 5.4039 ms/iter, ETA 81:47:40)
Iteration 3400000 M( 57885161 )C, 0x4f5fe35b1f269d80, n = 3670016, CUDALucas v2.03 err = 0.0298 (0:54 real, 5.4104 ms/iter, ETA 81:52:38)
Iteration 3410000 M( 57885161 )C, 0xf8234d4a1907d819, n = 3670016, CUDALucas v2.03 err = 0.0298 (0:54 real, 5.4049 ms/iter, ETA 81:46:43)[/CODE]
Beware of greediness...
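
For a like-for-like speed figure, the per-iteration times need to come from runs at the same FFT length. A quick sketch using the ms/iter values posted in this thread; the helper name `speedup` is made up for illustration.

```python
def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


# Both cards at n = 3670016 (iteration 3370000 in the two logs above).
same_fft = speedup(5.4030, 3.8927)

# GTX 580 at n = 3670016 versus Titan at the too-small n = 3211264.
mixed_fft = speedup(5.4058, 2.8801)

print(round(same_fft, 2), round(mixed_fft, 2))
```

At the same FFT length the Titan comes out roughly 1.4 times faster, not the near doubling suggested by the first run with the shorter FFT.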

Brain 2013-03-06 15:30

Roundoff errors
 
[QUOTE=nucleon;331842]Default fan speed is around 45-50%. I've upped it to 80%, and temps went from 72-75degC down to 50degC.

Hopefully this improves reliability.

I used MSI afterburner app to adjust fan speeds.

-- Craig[/QUOTE]
I'm suffering from sporadic Titan roundoff errors, too. Sometimes after minutes, sometimes after hours. The FFT length should have been sufficient...

I will try a test with a lower clock frequency and a very long FFT. I think I will use a 2M FFT to do some double checks. Very annoying.
@nucleon: Anything new on your end?

ixfd64 2013-03-06 19:26

FYI: a full list of opportunities to win a Titan is available here: [url]http://www.overclock.net/t/1365601/facebook-giveaways-6-gtx-titans-asus-ares-ii-gtx-670-gtx-660-ti-7970-fx-8350-facebook-required[/url]

nucleon 2013-03-06 20:53

[QUOTE=Brain;332168]I'm suffering from sporadic Titan roundoff errors, too. Sometimes after minutes, sometimes after hours. The FFT length should have been sufficient...

I will try a test with a lower clock frequency and a very long FFT. I think I will use a 2M FFT to do some double checks. Very annoying.
@nucleon: Anything new on your end?[/QUOTE]

After running it for a few days, I'm not overly happy with the product. I've switched it to TF (trial factoring) to see how it goes. Even mfaktc crashes every so often.

-- Craig

