Very first Titan results
[QUOTE=Brain;331780]This time CL 2.03 - Win64 - CUDA 5.0 - SM 2.0 & 3.5.
Intended for Titan tests: [URL]https://dl.dropbox.com/u/72392549/CUDALucas-2.03-5.0-x64.exe.zip[/URL] [URL]https://dl.dropbox.com/u/72392549/CUDALucas_Libs_x64_50.zip[/URL][/QUOTE]
Very first results:
[CODE]CUDALucas-2.03-5.0-sm_35-x64.exe -f 3211264 57885161
Warning: No ini file detected. Using defaults for non-specified options.
Starting M57885161 fft length = 3211264
Iteration 10000 M( 57885161 )C, 0x76c27556683cd84d, n = 3211264, CUDALucas v2.03 err = 0.2578 (0:29 real, 2.8801 ms/iter, ETA 46:17:51)
Iteration 20000 M( 57885161 )C, 0xfd8e311d20ffe6ab, n = 3211264, CUDALucas v2.03 err = 0.2578 (0:29 real, 2.8939 ms/iter, ETA 46:30:43)
Iteration 30000 M( 57885161 )C, 0xce0d85ab0065a232, n = 3211264, CUDALucas v2.03 err = 0.2578 (0:29 real, 2.8696 ms/iter, ETA 46:06:49)
Iteration 40000 M( 57885161 )C, 0x744cba031972157f, n = 3211264, CUDALucas v2.03 err = 0.2578 (0:28 real, 2.8664 ms/iter, ETA 46:03:13)
Iteration 50000 M( 57885161 )C, 0x3c87ab95f73ad13a, n = 3211264, CUDALucas v2.03 err = 0.2617 (0:29 real, 2.8675 ms/iter, ETA 46:03:47)
[/CODE]
Enabling DP in the driver makes a ~55% difference (faster). Will go to bed now. Btw, I had to update my BIOS. Otherwise, with the Titan installed I could not even boot. :-( I will have to check system stability. For now, everything's looking good. |
[QUOTE=Brain;331979]Very first results:
[CODE]CUDALucas-2.03-5.0-sm_35-x64.exe -f 3211264 57885161
Warning: No ini file detected. Using defaults for non-specified options.
Starting M57885161 fft length = 3211264
Iteration 10000 M( 57885161 )C, 0x76c27556683cd84d, n = 3211264, CUDALucas v2.03 err = 0.2578 (0:29 real, 2.8801 ms/iter, ETA 46:17:51)
Iteration 20000 M( 57885161 )C, 0xfd8e311d20ffe6ab, n = 3211264, CUDALucas v2.03 err = 0.2578 (0:29 real, 2.8939 ms/iter, ETA 46:30:43)
Iteration 30000 M( 57885161 )C, 0xce0d85ab0065a232, n = 3211264, CUDALucas v2.03 err = 0.2578 (0:29 real, 2.8696 ms/iter, ETA 46:06:49)
Iteration 40000 M( 57885161 )C, 0x744cba031972157f, n = 3211264, CUDALucas v2.03 err = 0.2578 (0:28 real, 2.8664 ms/iter, ETA 46:03:13)
Iteration 50000 M( 57885161 )C, 0x3c87ab95f73ad13a, n = 3211264, CUDALucas v2.03 err = 0.2617 (0:29 real, 2.8675 ms/iter, ETA 46:03:47)
[/CODE] [/QUOTE]
Does anyone know how this compares against a 580? |
Jerry's run used the safe FFT size of 3670016 and took 5.40 ms/iter:
[CODE]Iteration 100000 M( 57885161 )C, 0xe54ba81dac4ff3d8, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4058 ms/iter, ETA 86:45:49)
Iteration 200000 M( 57885161 )C, 0xd7128445f5c5747e, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4041 ms/iter, ETA 86:35:10)
Iteration 300000 M( 57885161 )C, 0x4332497b4eefcf79, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4031 ms/iter, ETA 86:25:10)
Iteration 400000 M( 57885161 )C, 0x8488c2c5dcfec2a1, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4049 ms/iter, ETA 86:17:53)
Iteration 500000 M( 57885161 )C, 0xf362026a2cd691fd, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4057 ms/iter, ETA 86:09:37)
Iteration 600000 M( 57885161 )C, 0x27700576a6eb689d, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4027 ms/iter, ETA 85:57:47)
Iteration 700000 M( 57885161 )C, 0x28fbaf6fdd566d5f, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4076 ms/iter, ETA 85:53:26)
Iteration 800000 M( 57885161 )C, 0x21414d014f00e9b1, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4085 ms/iter, ETA 85:45:19)
Iteration 900000 M( 57885161 )C, 0x183564c507d8d431, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4063 ms/iter, ETA 85:34:10)
[/CODE]
rcv had another run, but [URL="http://mersenneforum.org/showpost.php?p=331785&postcount=96"]his post[/URL] has no timing data. |
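The ETA column in these logs can be roughly reproduced from the ms/iter figure. The sketch below is only an approximation: it assumes a Lucas-Lehmer test of M(p) needs p - 2 iterations and that the per-iteration time stays constant, and the helper names `eta_seconds` and `fmt_hms` are made up for illustration (they are not part of CUDALucas).

```python
def eta_seconds(exponent, iteration, ms_per_iter):
    """Estimated seconds left, assuming ms/iter stays constant (an assumption)."""
    remaining = exponent - 2 - iteration  # an LL test of M(p) takes p - 2 iterations
    return remaining * ms_per_iter / 1000.0


def fmt_hms(seconds):
    """Format a number of seconds as h:mm:ss, like the ETA column in the logs."""
    total = int(round(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


P = 57885161
titan_eta = eta_seconds(P, 10000, 2.8801)   # Titan log line at iteration 10000
jerry_eta = eta_seconds(P, 100000, 5.4058)  # Jerry's log line at iteration 100000

print("Titan:", fmt_hms(titan_eta))
print("Jerry:", fmt_hms(jerry_eta))
```

Both estimates should land within a minute or so of the ETAs printed in the logs; the small gap is expected because the logged ms/iter is rounded. The two runs use different FFT lengths, so the raw times are not a like-for-like comparison.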
[STRIKE]Residues don't match... they should.[/STRIKE]
Edit: For reference I attached the full run of 57885161. It's nice to see the Titan double the speed, as long as it's accurate. Edit 2: Silly me :blush:, I knew I was missing something. I'll try it tomorrow, I'm heading to bed now. |
He posted residues at 10,000-iteration intervals (not 100,000).
Can you try the other fft sizes for a better speed comparison on your 580? -f 3211264, -f 3407872, -f 3276800 ? -f 3145728 still bails out? (I am trying to guess available sizes) |
Excerpting from jerry's updated file:
[CODE]Iteration 10000 M( 57885161 )C, 0x76c27556683cd84d, n = 3670016, CUDALucas v2.03 err = 0.0273 (0:54 real, 5.3929 ms/iter, ETA 86:41:25)
Iteration 20000 M( 57885161 )C, 0xfd8e311d20ffe6ab, n = 3670016, CUDALucas v2.03 err = 0.0273 (0:54 real, 5.4033 ms/iter, ETA 86:50:37)
Iteration 30000 M( 57885161 )C, 0xce0d85ab0065a232, n = 3670016, CUDALucas v2.03 err = 0.0273 (0:54 real, 5.4021 ms/iter, ETA 86:48:32)
Iteration 40000 M( 57885161 )C, 0x6746379dfc966410, n = 3670016, CUDALucas v2.03 err = 0.0273 (0:55 real, 5.4061 ms/iter, ETA 86:51:30)
Iteration 50000 M( 57885161 )C, 0xa5797ceaebc59091, n = 3670016, CUDALucas v2.03 err = 0.0281 (0:54 real, 5.4083 ms/iter, ETA 86:52:44)
[/CODE]
The residues disagree with the Titan run from the 4th one (iteration 40000) onwards. Obviously the Titan run needs to be done with a larger FFT. |
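Comparing residues by eye is error-prone, so here is a minimal Python sketch that finds the first iteration at which two logs disagree. The regular expression is an assumption based only on the log lines quoted in this thread, the function names are hypothetical, and the sample lines are trimmed after the FFT length.

```python
import re

# Pattern inferred from the log lines in this thread (an assumption,
# not an official CUDALucas log format specification).
LINE_RE = re.compile(r"Iteration (\d+) M\( \d+ \)C, (0x[0-9a-f]{16}), n = \d+")


def residues(log_text):
    """Map iteration number -> 64-bit residue string for every log line found."""
    return {int(m.group(1)): m.group(2) for m in LINE_RE.finditer(log_text)}


def first_mismatch(a, b):
    """Return the lowest shared iteration whose residues differ, or None."""
    for iteration in sorted(set(a) & set(b)):
        if a[iteration] != b[iteration]:
            return iteration
    return None


titan_log = """
Iteration 10000 M( 57885161 )C, 0x76c27556683cd84d, n = 3211264
Iteration 20000 M( 57885161 )C, 0xfd8e311d20ffe6ab, n = 3211264
Iteration 30000 M( 57885161 )C, 0xce0d85ab0065a232, n = 3211264
Iteration 40000 M( 57885161 )C, 0x744cba031972157f, n = 3211264
Iteration 50000 M( 57885161 )C, 0x3c87ab95f73ad13a, n = 3211264
"""

jerry_log = """
Iteration 10000 M( 57885161 )C, 0x76c27556683cd84d, n = 3670016
Iteration 20000 M( 57885161 )C, 0xfd8e311d20ffe6ab, n = 3670016
Iteration 30000 M( 57885161 )C, 0xce0d85ab0065a232, n = 3670016
Iteration 40000 M( 57885161 )C, 0x6746379dfc966410, n = 3670016
Iteration 50000 M( 57885161 )C, 0xa5797ceaebc59091, n = 3670016
"""

print(first_mismatch(residues(titan_log), residues(jerry_log)))  # 40000
```

Because `finditer` scans the whole text, this also works on logs where the lines have been run together. Only iterations present in both logs are compared, so a 10,000-step log can be checked against a 100,000-step one.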
Attached is the head and tail of my full-run of the New Prime on a GTX 570. (For speed comparison, my 570 is running at a 1464 MHz clock, with 15 SMs.)
At the head of my run, you will see a few test runs with different FFT lengths, until I decided on a good balance between speed and accuracy. [Remember, good CuFFT lengths are a product of powers of 2, 3, 5, and 7, with the bigger factors generally being less efficient than the smaller factors.] (I finally chose n = 3402K = 2^11 * 3^5 * 5^0 * 7^1, which had max errors in the 0.04xx range.) This is with the 4.1.28 version of CuFFT:
[CODE]lrwxrwxrwx 1 root staff 13 Jan 27 2012 libcufft.so -> libcufft.so.4
lrwxrwxrwx 1 root staff 18 Jan 27 2012 libcufft.so.4 -> libcufft.so.4.1.28
-rwxr-xr-x 1 root staff 26557304 Jan 27 2012 libcufft.so.4.1.28[/CODE]
The smallest FFT size I tried (that appeared to be working) was n=3200K, with errors in the range 0.14xxx. Contrast this with Brain's run on Titan at n=3136K=2^16*7^2, reporting errors in the 0.26xxx range. Although it is on the edge, I think Brain's run *should* have worked. As axn points out, the residues are wrong!!! I hope somebody can try a Titan run with a slightly larger FFT size. |
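To check whether a candidate FFT length fits the "product of powers of 2, 3, 5, and 7" rule of thumb above, the length can simply be factored over those primes. This is a minimal sketch in Python; the function name `factor_smooth` is hypothetical, and it only tests the rule as stated in this post, not what any particular CuFFT version actually accepts.

```python
def factor_smooth(n, primes=(2, 3, 5, 7)):
    """Divide out the given primes; return (exponents, leftover cofactor).

    A leftover of 1 means n is a product of powers of 2, 3, 5 and 7 only.
    """
    exponents = {}
    for p in primes:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        exponents[p] = e
    return exponents, n


# FFT lengths mentioned in this thread.
for length in (3145728, 3211264, 3276800, 3407872, 3483648, 3670016):
    exponents, leftover = factor_smooth(length)
    terms = " * ".join(f"{p}^{e}" for p, e in exponents.items() if e)
    if leftover != 1:
        terms += f" * {leftover}"
    verdict = "ok" if leftover == 1 else "has a factor outside 2/3/5/7"
    print(f"{length} = {length // 1024}K = {terms} ({verdict})")
```

Under this rule, 3211264 (3136K), 3276800 (3200K), 3483648 (3402K) and 3670016 (3584K) all qualify, while 3407872 = 2^18 * 13 does not.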
[QUOTE=flashjh;332024][STRIKE]Residues don't match... they should.[/STRIKE]
Edit: For reference I attached the full run of 57885161. It's nice to see the Titan double the speed, as long as it's accurate. [/QUOTE]
Now the Titan seems to be accurate with a larger FFT (residues match so far):
[CODE]Iteration 3370000 M( 57885161 )C, 0xa21631bf7db89b35, n = 3670016, CUDALucas v2.03 err = 0.0117 (0:39 real, 3.8927 ms/iter, ETA 58:56:31)
Iteration 3380000 M( 57885161 )C, 0x6afd076a23a5fbf8, n = 3670016, CUDALucas v2.03 err = 0.0117 (0:39 real, 3.8940 ms/iter, ETA 58:57:01)
Iteration 3390000 M( 57885161 )C, 0x4558cdaf71c60894, n = 3670016, CUDALucas v2.03 err = 0.0118 (0:38 real, 3.8303 ms/iter, ETA 57:58:33)
Iteration 3400000 M( 57885161 )C, 0x4f5fe35b1f269d80, n = 3670016, CUDALucas v2.03 err = 0.0118 (0:39 real, 3.8604 ms/iter, ETA 58:25:14)
Iteration 3410000 M( 57885161 )C, 0xf8234d4a1907d819, n = 3670016, CUDALucas v2.03 err = 0.0118 (0:38 real, 3.8606 ms/iter, ETA 58:24:48)[/CODE]
flashjh:
[CODE]Iteration 3370000 M( 57885161 )C, 0xa21631bf7db89b35, n = 3670016, CUDALucas v2.03 err = 0.0298 (0:54 real, 5.403 ms/iter, ETA 81:48:35)
Iteration 3380000 M( 57885161 )C, 0x6afd076a23a5fbf8, n = 3670016, CUDALucas v2.03 err = 0.0298 (0:54 real, 5.4008 ms/iter, ETA 81:45:43)
Iteration 3390000 M( 57885161 )C, 0x4558cdaf71c60894, n = 3670016, CUDALucas v2.03 err = 0.0298 (0:54 real, 5.4039 ms/iter, ETA 81:47:40)
Iteration 3400000 M( 57885161 )C, 0x4f5fe35b1f269d80, n = 3670016, CUDALucas v2.03 err = 0.0298 (0:54 real, 5.4104 ms/iter, ETA 81:52:38)
Iteration 3410000 M( 57885161 )C, 0xf8234d4a1907d819, n = 3670016, CUDALucas v2.03 err = 0.0298 (0:54 real, 5.4049 ms/iter, ETA 81:46:43)[/CODE]
Beware of greediness... |
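With both cards now on the same FFT length (3670016), the timings can be compared like for like. A quick Python sketch using only the ms/iter values from the two excerpts above; the variable names are illustrative, and five samples per card is a small basis, so treat the ratio as a rough figure.

```python
from statistics import mean

# ms/iter values copied from the two log excerpts above (n = 3670016 for both).
titan_ms = [3.8927, 3.8940, 3.8303, 3.8604, 3.8606]
flashjh_ms = [5.403, 5.4008, 5.4039, 5.4104, 5.4049]

titan_avg = mean(titan_ms)
flashjh_avg = mean(flashjh_ms)
ratio = flashjh_avg / titan_avg

print(f"Titan:   {titan_avg:.4f} ms/iter")
print(f"flashjh: {flashjh_avg:.4f} ms/iter")
print(f"Titan is about {ratio:.2f}x faster at this FFT length")
```

On these samples the ratio comes out near 1.4x, noticeably less than the near-2x seen when the Titan was run with the too-small 3211264 FFT.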
Roundoff errors
[QUOTE=nucleon;331842]Default fan speed is around 45-50%. I've upped it to 80%, and temps went from 72-75 degC down to 50 degC.
Hopefully this improves reliability. I used the MSI Afterburner app to adjust fan speeds. -- Craig[/QUOTE] I'm suffering from sporadic Titan roundoff errors, too. Sometimes after minutes, sometimes after hours. The FFT length should have been sufficient... I will try a test with a lower frequency and a very long FFT. I think I will use a 2M FFT to do some double checks. Very annoying. @nucleon: Anything new on your end? |
FYI: a full list of opportunities to win a Titan is available here: [url]http://www.overclock.net/t/1365601/facebook-giveaways-6-gtx-titans-asus-ares-ii-gtx-670-gtx-660-ti-7970-fx-8350-facebook-required[/url]
|
[QUOTE=Brain;332168]I'm suffering from sporadic Titan roundoff errors, too. Sometimes after minutes, sometimes after hours. The FFT length should have been sufficient...
I will try a test with a lower frequency and a very long FFT. I think I will use a 2M FFT to do some double checks. Very annoying. @nucleon: Anything new on your end?[/QUOTE] After running it for a few days, I'm not overly happy with the product. I've switched it to TF to see how it goes. Even mfaktc crashes every so often. -- Craig |