[QUOTE=kracker;353006]Ok, well frankly, 3-4 DC's(all cpu) a day is my max at the moment. So not much.[/QUOTE]
That sounds like a good bit to me. My GPUs take between 20 and 22 hours per DC (~28M). |
FWIW:
[code]
Adapter 0 - AMD Radeon HD 7900 Series
New Core Peak : 1000
New Memory Peak : 1500
Platform :Advanced Micro Devices, Inc.
Device 0 : Tahiti
start M57885161 fft length = 3145728
err = 0.352051, increasing n from 3145728
start M57885161 fft length = 3276800
Iteration 10000 M( 57885161 )C, 0x76c27556683cd84d, n = 3276800, clLucas v1.00 err = 0.1289 (4:01 real, 24.1701 ms/iter, ETA 388:32:03)
Iteration 20000 M( 57885161 )C, 0xfd8e311d20ffe6ab, n = 3276800, clLucas v1.00 err = 0.1289 (4:00 real, 23.9985 ms/iter, ETA 385:42:34)
[/code]
I'd expect more with this 7970, but the 1x PCIe slot may be a bottleneck. GPU load is 99%, using the -aggressive option. I also tested a couple of 5870s, but they keep throwing this error
[code]
Error: CommandQueue::enqueueNDRangeKernel() failed.
Error code : CL_INVALID_WORK_GROUP_SIZE
Location : Kernels.cpp:425
[/code]
and tests with known primes fail. However, I'm quite happy with the overall development so far, nice work msft :cool: |
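For anyone wondering how the ETA column relates to the ms/iter figure: an LL test of M(p) needs roughly p iterations, so the remaining time is about (p - iterations done) × ms/iter. Below is a minimal sketch of that arithmetic. The helper names `eta_seconds` and `format_hms` are made up for illustration and are not taken from clLucas, which may compute its ETA differently.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: remaining wall-clock seconds, assuming every
   remaining iteration costs ms_per_iter milliseconds. */
static double eta_seconds(long p, long done, double ms_per_iter)
{
    assert(p > 0 && done >= 0 && done <= p && ms_per_iter >= 0.0);
    return (double)(p - done) * ms_per_iter / 1000.0;
}

/* Hypothetical helper: format seconds as H:MM:SS, like the log's ETA column. */
static void format_hms(double seconds, char *buf, size_t len)
{
    long total = (long)(seconds + 0.5);  /* round to the nearest second */
    snprintf(buf, len, "%ld:%02ld:%02ld",
             total / 3600, (total / 60) % 60, total % 60);
}
```

Calling `eta_seconds(57885161, 10000, 24.1701)` gives roughly 388.5 hours, which is within a few minutes of the 388:32:03 shown in the log above. |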
See [URL="http://mersenneforum.org/showpost.php?p=350303&postcount=64"]this[/URL]:
FFT lengths 2097152 and 4194304 work best (-f). Also, you might want to stick with DCs (for now) until we know clLucas can handle those ranges. |
[QUOTE=kracker;353065]See [URL="http://mersenneforum.org/showpost.php?p=350303&postcount=64"]this[/URL]:
2097152 and 4194304 fft work best.(-f) Also, you might want to stick with DC's(for now) until we know clLucas can do those ranges.[/QUOTE] Thanks :) Also, that 57885161 was just a speed test (M#48), I'm already working on a DC. |
HD 7770
[code]
Iteration 10000 M( 22256453 )C, 0x3d9450d492b7e880, n = 1179648, clLucas v1.00 err = 0.2656 (2:39 real, 15.9270 ms/iter, ETA 98:23:35)
Iteration 10000 M( 24732709 )C, 0x81a12a304a754572, n = 1310720, clLucas v1.00 err = 0.2813 (3:04 real, 18.4028 ms/iter, ETA 126:21:56)
Iteration 10000 M( 29412433 )C, 0x27d7d112a73aa203, n = 1572864, clLucas v1.00 err = 0.25 (4:02 real, 24.1517 ms/iter, ETA 197:14:18)
Iteration 10000 M( 30620113 )C, 0x212dca3cec0acde2, n = 1638400, clLucas v1.00 err = 0.25 (5:11 real, 31.0753 ms/iter, ETA 264:13:34)
Iteration 10000 M( 32993419 )C, 0xcf86a69b844e35c0, n = 1769472, clLucas v1.00 err = 0.2813 (7:18 real, 43.7117 ms/iter, ETA 400:26:52)
Iteration 10000 M( 36418493 )C, 0x2f1388379572d5b4, n = 1966080, clLucas v1.00 err = 0.25 (5:30 real, 33.0295 ms/iter, ETA 333:57:52)
Iteration 10000 M( 38955173 )C, 0x8a45e3bbd4e4fc9b, n = 2097152, clLucas v1.00 err = 0.25 (2:03 real, 12.2586 ms/iter, ETA 132:35:49)
Iteration 10000 M( 43792559 )C, 0x7048d84bbfb0f810, n = 2359296, clLucas v1.00 err = 0.2813 (5:54 real, 35.3544 ms/iter, ETA 429:56:54)
Iteration 10000 M( 48375209 )C, 0xf957e240d591a99e, n = 2621440, clLucas v1.00 err = 0.2188 (6:33 real, 39.2538 ms/iter, ETA 527:18:36)
Iteration 10000 M( 57899201 )C, 0xa2ac01bbc76d92ee, n = 3145728, clLucas v1.00 err = 0.25 (9:00 real, 53.9709 ms/iter, ETA 867:43:57)
Iteration 10000 M( 60622229 )C, 0xd81c849f11fd1054, n = 3276800, clLucas v1.00 err = 0.2813 (11:06 real, 66.5953 ms/iter, ETA 1121:12:22)
Iteration 10000 M( 65066623 )C, 0xde7aeb8cc7a2a826, n = 3538944, clLucas v1.00 err = 0.2539 (16:00 real, 96.0663 ms/iter, ETA 1735:51:51)
Iteration 10000 M( 67662869 )C, 0xf854d1dee3fbb5d7, n = 3932160, clLucas v1.00 err = 0.05078 (11:59 real, 71.8933 ms/iter, ETA 1350:59:44)
Iteration 10000 M( 76722161 )C, 0x4b6ba0a6078e4bbb, n = 4194304, clLucas v1.00 err = 0.25 (4:14 real, 24.9663 ms/iter, ETA 540:43:00)
[/code] |
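One pattern in the HD 7770 numbers above: n = 2097152 and n = 4194304 are exact powers of two (2^21 and 2^22), and both run far faster per iteration than their neighbours (12.26 ms versus 33 to 35 ms, and 24.97 ms versus 71.89 ms). A minimal sketch for checking a candidate length follows. The helper names are hypothetical and not part of clLucas, and whether rounding up to a power of two actually pays off on a given card would still need testing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if n is an exact power of two, else 0. */
static int is_power_of_two(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Hypothetical helper: smallest power of two >= n, for n in 1..2^31. */
static uint32_t next_power_of_two(uint32_t n)
{
    uint32_t p = 1;
    assert(n >= 1 && n <= 0x80000000u);
    while (p < n)
        p <<= 1;
    return p;
}
```

For example, `next_power_of_two(1966080)` is 2097152, the length that ran fastest per iteration in this log. |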
The errors are nicely low. Is this using the double double sin and cos data for the ffts?
|
[QUOTE=owftheevil;353091]The errors are nicely low. Is this using the double double sin and cos data for the ffts?[/QUOTE]
You're going to have to ask msft about that, I only compiled it... that's all. :razz: EDIT: Just realized I won't be able to submit my DC to Primenet (yet) when it is done. |
[QUOTE=LaurV;352974]Ok, played with it, but the result is very odd![/QUOTE]
Hi, can you check with the -threads option?
[code]
$ ./clLucas -threads 64 36666666
$ ./clLucas -threads 128 36666666
[/code] |
[QUOTE=owftheevil;353091]The errors are nicely low. Is this using the double double sin and cos data for the ffts?[/QUOTE]
Hi, plain double sin and cos; I made an FFTW-like function.
[code]
#include <math.h>  /* sin, cos */

/*
 * Improve accuracy by reducing x to range [0..1/8]
 * before multiplication by 2 * PI.
 */
#define K2PI 6.2831853071795864769252867665590057683943388
#define m2pi(m, n) ((K2PI * (m)) / (n))

/* Computes si = sin(2*PI*m/n) and co = cos(2*PI*m/n). */
static void ap_sincos(int m, int n, double * si, double * co)
{
    double s, c, theta;
    int n14, n24, n34, n44;
    int m14, m44;
    int n18, n28, n38, n48, n58, n68, n78;
    int m88;

    /* Quarter-turn multiples: n44 = 4n, m44 = 4m. */
    n14 = n;
    n24 = n14 + n14;
    n34 = n14 + n24;
    n44 = n24 + n24;
    m14 = m;
    m44 = m14 + m14;
    m44 = m44 + m44;

    /* Octant boundaries: m88 = 8m is compared against n, 2n, ..., 7n. */
    n18 = n;
    n28 = n18 + n18;
    n38 = n18 + n28;
    n48 = n28 + n28;
    n58 = n28 + n38;
    n68 = n38 + n38;
    n78 = n38 + n48;
    m88 = m + m;
    m88 = m88 + m88;
    m88 = m88 + m88;

    if (n18 > m88) {
        theta = m2pi(m44, n44);
        s = sin(theta);
        c = cos(theta);
    } else if (n28 > m88) {
        theta = m2pi(n14 - m44, n44);
        s = cos(theta);
        c = sin(theta);
    } else if (n38 > m88) {
        theta = m2pi(m44 - n14, n44);
        s = cos(theta);
        c = -sin(theta);
    } else if (n48 > m88) {
        theta = m2pi(n24 - m44, n44);
        s = sin(theta);
        c = -cos(theta);
    } else if (n58 > m88) {
        theta = m2pi(m44 - n24, n44);
        s = -sin(theta);
        c = -cos(theta);
    } else if (n68 > m88) {
        theta = m2pi(n34 - m44, n44);
        s = -cos(theta);
        c = -sin(theta);
    } else if (n78 > m88) {
        theta = m2pi(m44 - n34, n44);
        s = -cos(theta);
        c = sin(theta);
    } else {
        theta = m2pi(n44 - m44, n44);
        s = -sin(theta);
        c = cos(theta);
    }
    *si = s;
    *co = c;
}
[/code] |
[QUOTE=TeknoHog;353061]I also tested a couple of 5870s, but they keep throwing this error
[code]
Error: CommandQueue::enqueueNDRangeKernel() failed.
Error code : CL_INVALID_WORK_GROUP_SIZE
Location : Kernels.cpp:425
[/code]
and tests with known primes fail. [/QUOTE] Hi, I found the bug behind this issue; it will be fixed in the next version. |
[QUOTE=LaurV;352976]Trying to squeeze some more juice from the Tahiti, I believe this is what msft is after, isn't he? :smile:[/QUOTE]
Hi, the error occurs just at 30000. |