|
|
#859 |
|
"Jerry"
Nov 2011
Vancouver, WA
2143₈ Posts |
So I finished M26026433 with 1.49 and 1.61. They matched until iteration 20270000:
1.49 - Code:
Iteration 20260000 6.5 msec/Iter ETA:37481.8 sec M( 26026433 )C, 0xc33c4e6850d51e00, n = 1572864, CUDALucas v1.49
Iteration 20270000 6.5 msec/Iter ETA:37416.8 sec M( 26026433 )C, 0x20a4fdbbb8670afe, n = 1572864, CUDALucas v1.49
Code:
Iteration 20260000 M( 26026433 )C, 0xc33c4e6850d51e00, n = 1572864, CUDALucas v1.61 (0:58 real, 5.8703 ms/iter, ETA 9:23:32)
Iteration 20270000 M( 26026433 )C, 0xaf1bbcd34aa82da2, n = 1572864, CUDALucas v1.61 (0:59 real, 5.8674 ms/iter, ETA 9:22:17)
I attached the full run of both 1.49 and 1.61 in case anyone else wants to do some testing. Everyone else is having good luck with 1.58, so I think it must be OK; I probably have bad video card memory. I have a few more papers to write, and then I can concentrate on compiling the memory test. If anyone else beats me to it, no hard feelings.
BTW, 1.61 is a lot faster, so hopefully it will work. I'm going to start a few tests on 1.61 by itself with a couple of exponents that have been posted since M26026433. Maybe my video card just doesn't like that exponent?
PS - I'm also accepting more 1.58 and 1.61 results posted here so we can continue to troubleshoot and gain confidence. |
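For anyone who wants to compare two full logs like the attached ones, here is a minimal Python sketch that finds the first iteration where the reported residues differ. It is not part of CUDALucas; the function names and the regular expression are assumptions based only on the line formats quoted above, and the sample data is just the four lines from this post.

```python
import re

# Matches the residue lines quoted above (both the v1.49 and v1.61 layouts).
LINE_RE = re.compile(
    r"Iteration\s+(\d+).*?M\(\s*(\d+)\s*\)C,\s*0x([0-9a-fA-F]{16})"
)


def parse_residues(log_text):
    """Return {iteration: residue} for every residue line in a log."""
    residues = {}
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if m:
            residues[int(m.group(1))] = m.group(3).lower()
    return residues


def first_divergence(log_a, log_b):
    """Return the first iteration reported by both logs with different residues, or None."""
    a, b = parse_residues(log_a), parse_residues(log_b)
    for iteration in sorted(a.keys() & b.keys()):
        if a[iteration] != b[iteration]:
            return iteration
    return None


LOG_149 = """\
Iteration 20260000 6.5 msec/Iter ETA:37481.8 sec M( 26026433 )C, 0xc33c4e6850d51e00, n = 1572864, CUDALucas v1.49
Iteration 20270000 6.5 msec/Iter ETA:37416.8 sec M( 26026433 )C, 0x20a4fdbbb8670afe, n = 1572864, CUDALucas v1.49
"""

LOG_161 = """\
Iteration 20260000 M( 26026433 )C, 0xc33c4e6850d51e00, n = 1572864, CUDALucas v1.61 (0:58 real, 5.8703 ms/iter, ETA 9:23:32)
Iteration 20270000 M( 26026433 )C, 0xaf1bbcd34aa82da2, n = 1572864, CUDALucas v1.61 (0:59 real, 5.8674 ms/iter, ETA 9:22:17)
"""

if __name__ == "__main__":
    print(first_divergence(LOG_149, LOG_161))  # 20270000
```

The sketch only tells you where two runs disagree, not which one is wrong; a third run (or a Prime95 residue list) is still needed for that.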
|
|
|
|
|
#860 |
|
Jun 2011
2038 Posts |
BTW, we need to put flushing after each output line back in under Windows (the half lines and delays are driving me crazy).
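To illustrate the point: when output is redirected or piped, it is usually block-buffered, so a progress line can show up late or cut in half unless the program flushes after each line. CUDALucas is C/CUDA, where the corresponding call would be `fflush(stdout)`; the Python sketch below only demonstrates the same idea, and `report` is a hypothetical helper name.

```python
import sys


def report(line, stream=None):
    """Write one complete progress line and flush it immediately."""
    stream = sys.stdout if stream is None else stream
    stream.write(line + "\n")
    # Without this flush, a redirected/piped stream may hold the line
    # in its buffer until the buffer fills up.
    stream.flush()


if __name__ == "__main__":
    report("Iteration 10000 ...")
```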
|
|
|
|
|
|
#861 | |
|
Romulan Interpreter
Jun 2011
Thailand
3·3,221 Posts |
This morning I finished testing another two exponents; a few others still had some hours to go when I left the house for work an hour ago.
Both were done with v1.61, intentionally on the same card (GTX580, 3GB version, overclocked to 822MHz from the factory 791MHz). Strangely, 26166407 matched and 26177689 did not. I have the screen output for both the good and the bad run saved in text files, every 30k iterations, in case someone needs them (retest, comparisons, etc). Quote:
But since we started with the non-power-of-2 FFT optimization, everything got insane... I like v1.61: much faster, nice output format. I will help with testing, and I want to help make it more stable, but I don't have the time or the knowledge to help with programming right now. If you want me to do something special related to that, just say so.
Over the weekend I will redo your exponent (26026433) together with my mismatched one from above (26177689), using the same idea: test them in parallel on the same card, so that if the card goes nuts/hot/bored, both expos should be fk'd'up. I will let you know. This time I will save all the partial files, copying them every 30 minutes from c_xxxxxxx or t_xxxxxxx into c/t_xxxxxx_001/002/etc (I wrote a batch file last night to help with this). So if we get a mismatch, we only need to redo the bad part, which will help isolate the problem.
Last fiddled with by LaurV on 2012-03-02 at 02:34 |
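The checkpoint-copying idea can be sketched in Python as follows. This is not LaurV's batch file; `snapshot_checkpoints` and the directory layout are hypothetical, and the checkpoint names `c<exponent>` / `t<exponent>` are an assumption taken from the `ls` listing posted later in this thread.

```python
import shutil
from pathlib import Path


def snapshot_checkpoints(work_dir, exponent, backup_dir):
    """Copy c<exp>/t<exp> checkpoint files to numbered backups; return the new paths."""
    work_dir, backup_dir = Path(work_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for prefix in ("c", "t"):
        src = work_dir / f"{prefix}{exponent}"
        if not src.exists():
            continue  # that checkpoint has not been written yet
        # Next free sequence number for this file: _001, _002, ...
        existing = list(backup_dir.glob(f"{src.name}_[0-9][0-9][0-9]"))
        dst = backup_dir / f"{src.name}_{len(existing) + 1:03d}"
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

Calling this from a loop with `time.sleep(30 * 60)` would give the 30-minute cadence described above. One caveat of this simple approach: a copy taken while the program is in the middle of writing a checkpoint may be incomplete, so a snapshot should be sanity-checked before it is used to restart a run.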
|
|
|
|
|
|
#862 | |||||
|
"Jerry"
Nov 2011
Vancouver, WA
1,123 Posts |
I wonder if my video card is getting too hot? Doesn't seem like the problem though because 1.49 has matched like 4 or 5 times now while the rest match sometimes and not other times. |
|||||
|
|
|
|
|
#863 |
|
Jul 2009
Tokyo
2×5×61 Posts |
Hi,
Ver 1.62
Fix inv warning.
Fix dirty print when exponent is prime.
Add fflush (Thanks apsen).
Add device information print.
Change residue test exponent.
Add save all check point file option.
Add Set fft length option.
Code Cleanup.
Thank you for lots of help.
Code:
$ ./CUDALucas -r
DEVICE:0------------------------
name GeForce GTX 460
totalGlobalMem 804454400
sharedMemPerBlock 49152
regsPerBlock 32768
warpSize 32
memPitch 2147483647
maxThreadsPerBlock 1024
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 65535,65535,65535
totalConstMem 65536
major.minor 2.1
clockRate 1350000
textureAlignment 512
deviceOverlap 1
multiProcessorCount 7
Iteration 10000 M( 756893 )C, 0xb94c673f25fe7ded, n = 65536, CUDALucas v1.62 (0:04 real, 0.3873 ms/iter, ETA 4:46)
Iteration 10000 M( 859433 )C, 0x3c4ad525c2d0aed0, n = 65536, CUDALucas v1.62 (0:04 real, 0.3737 ms/iter, ETA 5:13)
Iteration 10000 M( 1257787 )C, 0x3f45bf9bea7213ea, n = 98304, CUDALucas v1.62 (0:05 real, 0.5869 ms/iter, ETA 12:07)
Iteration 10000 M( 1398269 )C, 0xa4a6d2f0e34629db, n = 98304, CUDALucas v1.62 (0:07 real, 0.6078 ms/iter, ETA 13:58)
Iteration 10000 M( 2976221 )C, 0x2a7111b7f70fea2f, n = 163840, CUDALucas v1.62 (0:09 real, 0.9124 ms/iter, ETA 45:00)
Iteration 10000 M( 3021377 )C, 0x6387a70a85d46baf, n = 163840, CUDALucas v1.62 (0:09 real, 0.9012 ms/iter, ETA 45:12)
Iteration 10000 M( 6972593 )C, 0x88f1d2640adb89e1, n = 393216, CUDALucas v1.62 (0:21 real, 2.1014 ms/iter, ETA 4:03:45)
Iteration 10000 M( 13466917 )C, 0x9fdc1f4092b15d69, n = 786432, CUDALucas v1.62 (0:42 real, 4.1782 ms/iter, ETA 15:36:37)
Iteration 10000 M( 20996011 )C, 0x5fc58920a821da11, n = 1179648, CUDALucas v1.62 (0:58 real, 5.7841 ms/iter, ETA 33:42:29)
Iteration 10000 M( 24036583 )C, 0xcbdef38a0bdc4f00, n = 1310720, CUDALucas v1.62 (1:07 real, 6.7199 ms/iter, ETA 44:50:12)
Iteration 10000 M( 25964951 )C, 0x62eb3ff0a5f6237c, n = 1572864, CUDALucas v1.62 (1:22 real, 8.1972 ms/iter, ETA 59:05:16)
Iteration 10000 M( 30402457 )C, 0x0b8600ef47e69d27, n = 1835008, CUDALucas v1.62 (1:32 real, 9.2709 ms/iter, ETA 78:15:42)
Iteration 10000 M( 32582657 )C, 0x02751b7fcec76bb1, n = 1835008, CUDALucas v1.62 (1:32 real, 9.2723 ms/iter, ETA 83:53:17)
err = 0.411325, increasing n from 1966080
Iteration 10000 M( 37156667 )C, 0x67ad7646a1fad514, n = 2097152, CUDALucas v1.62 (1:30 real, 8.9835 ms/iter, ETA 92:40:47)
Iteration 10000 M( 42643801 )C, 0x8f90d78d5007bba7, n = 2359296, CUDALucas v1.62 (1:58 real, 11.7533 ms/iter, ETA 139:10:43)
Iteration 10000 M( 43112609 )C, 0xe86891ebf6cd70c4, n = 2359296, CUDALucas v1.62 (1:58 real, 11.7494 ms/iter, ETA 140:39:59)
$ ./CUDALucas -c 1000 -s 756893
DEVICE:0------------------------
name GeForce GTX 460
~~~
start M756893 fft length = 65536
Iteration 1000 M( 756893 )C, 0x615ea033a371ca9a, n = 65536, CUDALucas v1.62 (0:01 real, 0.5074 ms/iter, ETA 6:23)
Iteration 2000 M( 756893 )C, 0x76f26a440d5ccbf0, n = 65536, CUDALucas v1.62 (0:00 real, 0.3674 ms/iter, ETA 4:37)
Iteration 3000 M( 756893 )C, 0x09ce424e95d1537d, n = 65536, CUDALucas v1.62 (0:01 real, 0.3669 ms/iter, ETA 4:36)
Iteration 4000 M( 756893 )C, 0x8d8b29a43e8bda9e, n = 65536, CUDALucas v1.62 (0:00 real, 0.3797 ms/iter, ETA 4:45)
Iteration 5000 M( 756893 )C, 0x4ed704ca77266721, n = 65536, CUDALucas v1.62 (0:00 real, 0.3876 ms/iter, ETA 4:51)
Iteration 6000 M( 756893 )C, 0x7c8272c8bdd405cb, n = 65536, CUDALucas v1.62 (0:01 real, 0.3858 ms/iter, ETA 4:49)
^C caught. Writing checkpoint.
$ ls
c756893 CUDALucas.cu cuda_safecalls.h s756893.1001 s756893.3001 s756893.5001 s756893.6452 timeval.c
CUDALucas CUDALucas.o Makefile s756893.2001 s756893.4001 s756893.6001 t756893 |
|
|
|
|
|
#864 |
|
Jul 2009
Tokyo
1142₈ Posts |
I made a mismatch score list.
Code:
M26176441,1.49,19283A19B247BA__ by "LaurV" on 2012-02-12(#733)
M26176441,Prime95,C7949A2F450242__ by "Laurent Deniel"
M26026433,1.50,190df3dc67d21885,flashjh(#762)
M26026433,1.50,ee1b55e2b3e0c8b5,flashjh(#762)
M26026433,1.49,457f73d49f90b822,flashjh(#806)
M26026433,1.58,457f73d49f90b822,frmky(#852)
M26026433,1.61,8785a2a489e2a25b,flashjh(#859)
M26026433,1.49,457F73D49F90B822 by "LaurV" on 2012-02-12
M26026433,Prime95,457F73D49F90B822 by "Phantomas" on 2009-04-09
M26166389,1.58,CA3209D01495D1__ by "LaurV" on 2012-02-29(#836)
M26166389,Prime95,074DDCFBF8F78__ by "Ahmer Ali"
M26176597,1.58,AD22DF54329720__ by "LaurV" on 2012-02-29(#836)
M26176597,Prime95,458D46360B696D__ by "Alessandro Polverini"
M26177689,1.61,4D941722F9216A__ by "LaurV" on 2012-03-02(#861)
M26177689,Prime95,943F047277037E__ by "C. Cooper / S. Boone" |
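A list like this can be checked mechanically. The following Python sketch groups rows by exponent and flags every CUDALucas result that disagrees with the Prime95 residue for the same exponent. The parsing rules are assumptions inferred from the rows above: residues are compared case-insensitively, and a residue ending in `__` is treated as a truncated prefix, so only the digits both sides have are compared. The helper names are hypothetical, and only a subset of the rows is used as sample data.

```python
import re
from collections import defaultdict

# "M<exponent>,<program or version>,<hex residue>"; anything after the hex digits is ignored.
ROW_RE = re.compile(r"^M(\d+),([^,]+),([0-9A-Fa-f]+)")


def parse_rows(text):
    """Return {exponent: [(program, residue), ...]} in input order."""
    rows = defaultdict(list)
    for line in text.splitlines():
        m = ROW_RE.match(line.strip())
        if m:
            rows[int(m.group(1))].append((m.group(2), m.group(3).lower()))
    return rows


def agrees(res_a, res_b):
    """Compare two residues on the leading digits that both of them report."""
    n = min(len(res_a), len(res_b))
    return res_a[:n] == res_b[:n]


def mismatches(rows, reference="Prime95"):
    """Return {exponent: [(program, residue), ...]} for results that contradict the reference."""
    bad = {}
    for exponent, entries in rows.items():
        refs = [res for prog, res in entries if prog == reference]
        wrong = [
            (prog, res)
            for prog, res in entries
            if prog != reference and refs and not any(agrees(res, ref) for ref in refs)
        ]
        if wrong:
            bad[exponent] = wrong
    return bad


SCORE = """\
M26026433,1.50,190df3dc67d21885,flashjh(#762)
M26026433,1.50,ee1b55e2b3e0c8b5,flashjh(#762)
M26026433,1.49,457f73d49f90b822,flashjh(#806)
M26026433,1.58,457f73d49f90b822,frmky(#852)
M26026433,1.61,8785a2a489e2a25b,flashjh(#859)
M26026433,Prime95,457F73D49F90B822 by "Phantomas" on 2009-04-09
M26177689,1.61,4D941722F9216A__ by "LaurV" on 2012-03-02(#861)
M26177689,Prime95,943F047277037E__ by "C. Cooper / S. Boone"
"""

if __name__ == "__main__":
    for exponent, wrong in mismatches(parse_rows(SCORE)).items():
        print(exponent, wrong)
```

On the sample rows, this flags both 1.50 runs and the 1.61 run of M26026433 and the 1.61 run of M26177689, while the 1.49 and 1.58 runs of M26026433 pass.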
|
|
|
|
|
#865 |
|
Jul 2009
Tokyo
2·5·61 Posts |
I have an enhancement. Please wait for the next version.
|
|
|
|
|
|
#866 |
|
Romulan Interpreter
Jun 2011
Thailand
3×3,221 Posts |
We have many other unreported ones. I can give you a list, but I don't think it would help at all. Let's concentrate on the one for which we already have the residue lists and try to see where the error is coming from. I will do this over the weekend with v1.61.
Eagerly waiting for the next version! KUTGW! Kudos! Last fiddled with by LaurV on 2012-03-02 at 08:34 |
|
|
|
|
|
#867 | |
|
Jul 2009
Tokyo
1001100010₂ Posts |
Quote:
Step 1) Retry Prime95 (or gpulucas!).
Step 2) Retry CUDALucas (use the version from the mismatch report).
I am now starting M26166409 on Prime95. |
|
|
|
|
|
|
#868 |
|
Jul 2009
Tokyo
2×5×61 Posts |
|
|
|
|
|
|
#869 |
|
Jul 2009
Tokyo
2×5×61 Posts |
Ver 1.63
Only use complex to complex fft.
Code:
$ ./CUDALucas -r
DEVICE:0------------------------
name GeForce GTX 460
totalGlobalMem 804454400
sharedMemPerBlock 49152
regsPerBlock 32768
warpSize 32
memPitch 2147483647
maxThreadsPerBlock 1024
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 65535,65535,65535
totalConstMem 65536
major.minor 2.1
clockRate 1350000
textureAlignment 512
deviceOverlap 1
multiProcessorCount 7
Iteration 10000 M( 756893 )C, 0xb94c673f25fe7ded, n = 65536, CUDALucas v1.63 (0:04 real, 0.3923 ms/iter, ETA 4:50)
Iteration 10000 M( 859433 )C, 0x3c4ad525c2d0aed0, n = 65536, CUDALucas v1.63 (0:04 real, 0.3830 ms/iter, ETA 5:21)
Iteration 10000 M( 1257787 )C, 0x3f45bf9bea7213ea, n = 98304, CUDALucas v1.63 (0:05 real, 0.5458 ms/iter, ETA 11:16)
Iteration 10000 M( 1398269 )C, 0xa4a6d2f0e34629db, n = 98304, CUDALucas v1.63 (0:06 real, 0.5427 ms/iter, ETA 12:28)
Iteration 10000 M( 2976221 )C, 0x2a7111b7f70fea2f, n = 163840, CUDALucas v1.63 (0:08 real, 0.7994 ms/iter, ETA 39:26)
Iteration 10000 M( 3021377 )C, 0x6387a70a85d46baf, n = 163840, CUDALucas v1.63 (0:08 real, 0.7788 ms/iter, ETA 39:04)
Iteration 10000 M( 6972593 )C, 0x88f1d2640adb89e1, n = 393216, CUDALucas v1.63 (0:19 real, 1.8962 ms/iter, ETA 3:39:57)
Iteration 10000 M( 13466917 )C, 0x9fdc1f4092b15d69, n = 786432, CUDALucas v1.63 (0:37 real, 3.7644 ms/iter, ETA 14:03:50)
Iteration 10000 M( 20996011 )C, 0x5fc58920a821da11, n = 1179648, CUDALucas v1.63 (0:51 real, 5.1375 ms/iter, ETA 29:56:25)
Iteration 10000 M( 24036583 )C, 0xcbdef38a0bdc4f00, n = 1310720, CUDALucas v1.63 (1:00 real, 6.0356 ms/iter, ETA 40:16:16)
Iteration 10000 M( 25964951 )C, 0x62eb3ff0a5f6237c, n = 1572864, CUDALucas v1.63 (1:14 real, 7.4174 ms/iter, ETA 53:28:01)
Iteration 10000 M( 30402457 )C, 0x0b8600ef47e69d27, n = 1835008, CUDALucas v1.63 (1:23 real, 8.2723 ms/iter, ETA 69:49:53)
Iteration 10000 M( 32582657 )C, 0x02751b7fcec76bb1, n = 1835008, CUDALucas v1.63 (1:23 real, 8.2783 ms/iter, ETA 74:53:45)
err = 0.367069, increasing n from 1966080
Iteration 10000 M( 37156667 )C, 0x67ad7646a1fad514, n = 2097152, CUDALucas v1.63 (1:30 real, 9.0108 ms/iter, ETA 92:57:40)
Iteration 10000 M( 42643801 )C, 0x8f90d78d5007bba7, n = 2359296, CUDALucas v1.63 (1:46 real, 10.5272 ms/iter, ETA 124:39:33)
Iteration 10000 M( 43112609 )C, 0xe86891ebf6cd70c4, n = 2359296, CUDALucas v1.63 (1:45 real, 10.5222 ms/iter, ETA 125:58:25) |
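To compare a self-test run like this against the v1.62 run posted earlier, a small Python sketch can check that the 10000-iteration residues agree and compute the per-exponent speedup from the ms/iter figures. The regular expression and function names are assumptions based on the output format shown in this thread, and only three lines from each run are included as sample data.

```python
import re

# Pulls exponent, residue, FFT length and ms/iter out of one self-test line.
LINE_RE = re.compile(
    r"M\(\s*(\d+)\s*\)C,\s*0x([0-9a-f]{16}),\s*n\s*=\s*(\d+)"
    r".*?\(\S+ real,\s*([\d.]+) ms/iter"
)


def parse_selftest(text):
    """Return {exponent: (residue, fft_length, ms_per_iter)}."""
    out = {}
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            out[int(m.group(1))] = (m.group(2), int(m.group(3)), float(m.group(4)))
    return out


def compare(old_text, new_text):
    """Return {exponent: (residues_match, speedup)} for exponents present in both runs."""
    old, new = parse_selftest(old_text), parse_selftest(new_text)
    return {
        exp: (old[exp][0] == new[exp][0], old[exp][2] / new[exp][2])
        for exp in sorted(old.keys() & new.keys())
    }


V162 = """\
Iteration 10000 M( 756893 )C, 0xb94c673f25fe7ded, n = 65536, CUDALucas v1.62 (0:04 real, 0.3873 ms/iter, ETA 4:46)
Iteration 10000 M( 25964951 )C, 0x62eb3ff0a5f6237c, n = 1572864, CUDALucas v1.62 (1:22 real, 8.1972 ms/iter, ETA 59:05:16)
Iteration 10000 M( 37156667 )C, 0x67ad7646a1fad514, n = 2097152, CUDALucas v1.62 (1:30 real, 8.9835 ms/iter, ETA 92:40:47)
"""

V163 = """\
Iteration 10000 M( 756893 )C, 0xb94c673f25fe7ded, n = 65536, CUDALucas v1.63 (0:04 real, 0.3923 ms/iter, ETA 4:50)
Iteration 10000 M( 25964951 )C, 0x62eb3ff0a5f6237c, n = 1572864, CUDALucas v1.63 (1:14 real, 7.4174 ms/iter, ETA 53:28:01)
Iteration 10000 M( 37156667 )C, 0x67ad7646a1fad514, n = 2097152, CUDALucas v1.63 (1:30 real, 9.0108 ms/iter, ETA 92:57:40)
"""

if __name__ == "__main__":
    for exp, (same, speedup) in compare(V162, V163).items():
        print(f"M{exp}: residues match={same}, speedup x{speedup:.3f}")
```

On the figures posted in this thread, the 1572864-length test (M25964951) goes from 8.1972 to 7.4174 ms/iter, roughly a 10% gain, while the 2097152-length test (M37156667) is marginally slower (8.9835 to 9.0108 ms/iter). Matching residues after 10000 iterations is only a quick sanity check, since the mismatch reported at the top of this page appeared after more than 20 million iterations.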
|
|
|