[QUOTE=flashjh;291141] Also, does adding (float) negate the speedup by changing s_inv from a double to float?... I could just ignore the warnings.
[/QUOTE] Bits and pieces. Thank you for your work! |
[QUOTE=Brain;291151](and night active) [/QUOTE]
Naw, jest a third of th' way aroun' th' wo'ld, cuss it all t' tarnation. ;);) |
[QUOTE=flashjh;291084]
So I've been testing resume. It works for quite a while and then I get a mismatch residue between 1.49 and 1.58. They still have ~11 hours to finish to see which one is correct (hopefully 1.58). I'll post when I know. I still haven't let 1.58 run all the way through by itself yet, that is next.[/QUOTE] 1.49 and 1.58 finished. 1.49 matched; 1.58 did not. I had to resume once, so I have no way of knowing whether that caused the problem. I wish it didn't take so long to test... I'm running 1.49 and 1.61 side by side now to see. Has anyone else tested 1.50 and up with good results (especially after a resume)? I'm inclined to think we need to move back to 1.49 and go from there. |
[QUOTE=flashjh;291157]Has anyone else tested 1.50 and up with good results (especially after resume)
[/QUOTE] Now I start test. |
[QUOTE=msft;290432]Hi ,
Ver 1.53 support residue test.
[code]
cudalucas.1.53$ ./CUDALucas -r
Iteration 10000 2.6 msec/Iter ETA 2:46:19 M( 5000000 )C, 0x8c6648628dd918a6, n = 524288, CUDALucas v1.53
Iteration 10000 4.3 msec/Iter ETA 11:05:59 M( 10000000 )C, 0x55318a84ffd14bc7, n = 786432, CUDALucas v1.53
Iteration 10000 4.9 msec/Iter ETA 16:39:19 M( 15000000 )C, 0x7a34e75acea86da1, n = 1048576, CUDALucas v1.53
Iteration 10000 6.9 msec/Iter ETA 33:18:59 M( 20000000 )C, 0xb6475f8cb0888740, n = 1310720, CUDALucas v1.53
Iteration 10000 8.5 msec/Iter ETA 55:31:59 M( 25000000 )C, 0x667565040b5b7aa3, n = 1572864, CUDALucas v1.53
Iteration 10000 9.5 msec/Iter ETA 74:58:29 M( 30000000 )C, 0xbf70feed29774eba, n = 1835008, CUDALucas v1.53
Iteration 10000 9.5 msec/Iter ETA 87:28:29 M( 35000000 )C, 0x1f1fb94d69da44f8, n = 2097152, CUDALucas v1.53
Iteration 10000 12.3 msec/Iter ETA 133:17:59 M( 40000000 )C, 0x2318fe9e59886055, n = 2359296, CUDALucas v1.53
Iteration 10000 14 msec/Iter ETA 174:57:39 M( 45000000 )C, 0x59b7ec093d4375da, n = 2621440, CUDALucas v1.53
Iteration 10000 17.2 msec/Iter ETA 259:40:29 M( 55000000 )C, 0x78f7b93265e51196, n = 3145728, CUDALucas v1.53
Iteration 10000 19.5 msec/Iter ETA 343:00:09 M( 65000000 )C, 0x679258d6765d8e5b, n = 3670016, CUDALucas v1.53
Iteration 10000 22.4 msec/Iter ETA 427:42:59 M( 70000000 )C, 0x652d4a670f44317e, n = 3932160, CUDALucas v1.53
Iteration 10000 19.5 msec/Iter ETA 395:46:49 M( 75000000 )C, 0xded1605fcc4a0f88, n = 4194304, CUDALucas v1.53
Iteration 10000 24.9 msec/Iter ETA 566:35:59 M( 85000000 )C, 0xf3211d616c78bc9f, n = 4718592, CUDALucas v1.53
Iteration 10000 28.6 msec/Iter ETA 738:48:39 M( 95000000 )C, 0x99f8f6024ac3c4d6, n = 5242880, CUDALucas v1.53
Iteration 10000 38.9 msec/Iter ETA 1055:26:59 M( 100000000 )C, 0x92ce4f1ec07668a3, n = 5767168, CUDALucas v1.53
[/code][/QUOTE]
msft,
Can you tell me how to use this function? Are the values you posted the reference ones or is there something else?
Here's my test of 1.61 (4.1 | sm2.1) GTX580:
[CODE]
Iteration 10000 M( 300000 )C, 0xccd3c6710ee96727, n = 32768, CUDALucas v1.61 (0:01 real, 0.0995 ms/iter, ETA 0:27)
Iteration 10000 M( 400000 )C, 0x49289743e63a185a, n = 32768, CUDALucas v1.61 (0:01 real, 0.1013 ms/iter, ETA 0:38)
Iteration 10000 M( 500000 )C, 0x0d2b64647b2790db, n = 32768, CUDALucas v1.61 (0:01 real, 0.1012 ms/iter, ETA 0:48)
Iteration 10000 M( 600000 )C, 0x52f766e46017f853, n = 32768, CUDALucas v1.61 (0:01 real, 0.1013 ms/iter, ETA 0:58)
Iteration 10000 M( 700000 )C, 0xcb71353c1c6057f3, n = 65536, CUDALucas v1.61 (0:02 real, 0.1446 ms/iter, ETA 1:38)
Iteration 10000 M( 800000 )C, 0xf7defb19e8c744a6, n = 65536, CUDALucas v1.61 (0:01 real, 0.1451 ms/iter, ETA 1:53)
Iteration 10000 M( 900000 )C, 0xe379617d2b298f08, n = 65536, CUDALucas v1.61 (0:02 real, 0.1471 ms/iter, ETA 2:09)
Iteration 10000 M( 1000000 )C, 0x8a7e626ebd1482d6, n = 65536, CUDALucas v1.61 (0:01 real, 0.1436 ms/iter, ETA 2:20)
Iteration 10000 M( 1600000 )C, 0xf8ed27349af79417, n = 98304, CUDALucas v1.61 (0:03 real, 0.2526 ms/iter, ETA 6:39)
Iteration 10000 M( 2000000 )C, 0xe97a31b61d296b22, n = 131072, CUDALucas v1.61 (0:02 real, 0.2432 ms/iter, ETA 8:01)
Iteration 10000 M( 3000000 )C, 0xc36698d45f9f1411, n = 163840, CUDALucas v1.61 (0:04 real, 0.4363 ms/iter, ETA 21:40)
Iteration 10000 M( 4000000 )C, 0x32e22ec56e2ec01a, n = 229376, CUDALucas v1.61 (0:05 real, 0.5372 ms/iter, ETA 35:38)
Iteration 10000 M( 5000000 )C, 0x8c6648628dd918a6, n = 294912, CUDALucas v1.61 (0:06 real, 0.6537 ms/iter, ETA 54:15)
Iteration 10000 M( 10000000 )C, 0x55318a84ffd14bc7, n = 589824, CUDALucas v1.61 (0:12 real, 1.2231 ms/iter, ETA 3:23:26)
Iteration 10000 M( 15000000 )C, 0x7a34e75acea86da1, n = 917504, CUDALucas v1.61 (0:19 real, 1.8592 ms/iter, ETA 7:44:10)
Iteration 10000 M( 19000000 )C, 0x11cdb0e202613cd7, n = 1048576, CUDALucas v1.61 (0:18 real, 1.8269 ms/iter, ETA 9:37:54)
Iteration 10000 M( 20000000 )C, 0xb6475f8cb0888740, n = 1179648, CUDALucas v1.61 (0:24 real, 2.3506 ms/iter, ETA 13:02:44)
Iteration 10000 M( 25000000 )C, 0x667565040b5b7aa3, n = 1572864, CUDALucas v1.61 (0:33 real, 3.3266 ms/iter, ETA 23:04:59)
Iteration 10000 M( 30000000 )C, 0xbf70feed29774eba, n = 1835008, CUDALucas v1.61 (0:38 real, 3.7484 ms/iter, ETA 31:12:57)
Iteration 10000 M( 35000000 )C, 0x1f1fb94d69da44f8, n = 1966080, CUDALucas v1.61 (0:42 real, 4.1539 ms/iter, ETA 40:21:42)
Iteration 10000 M( 38000000 )C, 0x94754a0a7c728963, n = 2097152, CUDALucas v1.61 (0:35 real, 3.4553 ms/iter, ETA 36:27:11)
Iteration 10000 M( 40000000 )C, 0x2318fe9e59886055, n = 2359296, CUDALucas v1.61 (0:48 real, 4.7923 ms/iter, ETA 53:13:17)
Iteration 10000 M( 43112609 )C, 0xe86891ebf6cd70c4, n = 2359296, CUDALucas v1.61 (0:47 real, 4.7161 ms/iter, ETA 56:27:45)
Iteration 10000 M( 45000000 )C, 0x59b7ec093d4375da, n = 2621440, CUDALucas v1.61 (0:54 real, 5.3904 ms/iter, ETA 67:21:01)
Iteration 10000 M( 55000000 )C, 0x78f7b93265e51196, n = 3145728, CUDALucas v1.61 (1:02 real, 6.2336 ms/iter, ETA 95:12:04)
Iteration 10000 M( 65000000 )C, 0x679258d6765d8e5b, n = 3670016, CUDALucas v1.61 (1:15 real, 7.5163 ms/iter, ETA 135:40:10)
Iteration 10000 M( 70000000 )C, 0x652d4a670f44317e, n = 3932160, CUDALucas v1.61 (1:24 real, 8.3895 ms/iter, ETA 163:04:55)
Iteration 10000 M( 75000000 )C, 0xded1605fcc4a0f88, n = 4194304, CUDALucas v1.61 (1:10 real, 6.9938 ms/iter, ETA 145:39:54)
Iteration 10000 M( 85000000 )C, 0xf3211d616c78bc9f, n = 4718592, CUDALucas v1.61 (1:34 real, 9.4066 ms/iter, ETA 222:02:52)
Iteration 10000 M( 95000000 )C, 0x99f8f6024ac3c4d6, n = 5242880, CUDALucas v1.61 (1:42 real, 10.1923 ms/iter, ETA 268:54:23)
Iteration 10000 M( 100000000 )C, 0x92ce4f1ec07668a3, n = 5767168, CUDALucas v1.61 (2:23 real, 14.3520 ms/iter, ETA 398:35:17)
[/CODE]
Jerry |
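Checking two of these lists against each other by eye is error-prone, so here is a minimal Python sketch of how the comparison could be automated. It is not part of CUDALucas; the function names `parse_residues` and `compare` are made up for illustration, and the regular expression only assumes the `M( p )C, 0x..., n = ...` record shape visible in the output above. The sample records are copied from the v1.53 and v1.61 lists in this thread.

```python
import re

# One record of the self-test output, e.g.
# "M( 5000000 )C, 0x8c6648628dd918a6, n = 524288, CUDALucas v1.53"
RECORD = re.compile(
    r"M\(\s*(\d+)\s*\)C,\s*0x([0-9a-fA-F]{16}),\s*n\s*=\s*(\d+)"
)

def parse_residues(text):
    """Return {exponent: (residue_hex, fft_length)} for every record in text."""
    return {
        int(m.group(1)): (m.group(2).lower(), int(m.group(3)))
        for m in RECORD.finditer(text)
    }

def compare(reference, candidate):
    """Split the exponents present in both runs into (matching, mismatching)."""
    ok, bad = [], []
    for p in sorted(reference.keys() & candidate.keys()):
        (ok if reference[p][0] == candidate[p][0] else bad).append(p)
    return ok, bad

# Records copied from the v1.53 and v1.61 output posted in this thread.
v153 = parse_residues("""
Iteration 10000 2.6 msec/Iter ETA 2:46:19 M( 5000000 )C, 0x8c6648628dd918a6, n = 524288, CUDALucas v1.53
Iteration 10000 4.3 msec/Iter ETA 11:05:59 M( 10000000 )C, 0x55318a84ffd14bc7, n = 786432, CUDALucas v1.53
""")
v161 = parse_residues("""
Iteration 10000 M( 5000000 )C, 0x8c6648628dd918a6, n = 294912, CUDALucas v1.61 (0:06 real, 0.6537 ms/iter, ETA 54:15)
Iteration 10000 M( 10000000 )C, 0x55318a84ffd14bc7, n = 589824, CUDALucas v1.61 (0:12 real, 1.2231 ms/iter, ETA 3:23:26)
Iteration 10000 M( 19000000 )C, 0x11cdb0e202613cd7, n = 1048576, CUDALucas v1.61 (0:18 real, 1.8269 ms/iter, ETA 9:37:54)
""")

ok, bad = compare(v153, v161)
print("match:", ok)
print("mismatch:", bad)
```

Only exponents that appear in both runs are compared, so M( 19000000 ), which is only in the v1.61 list, is skipped. Note that these are residues at iteration 10000; a match here would not by itself rule out a problem that only shows up hours into a run.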
[QUOTE=msft;291186]Now I start test.[/QUOTE]
1.49 and 1.61 residues mismatch after several hours. I have another test or two to run, but I think everything from 1.50 up has a problem. |
My "reference" values
[CODE]"F:\Eigene Dateien\Computing\CUDALucas\cudalucas.1.54\bin\CUDALucas.cuda4.1.sm_21.WIN64.exe" -r
Iteration 10000 2.2 msec/Iter ETA 2:46:19 M( 5000000 )C, 0x8c6648628dd918a6, n = 524288, CUDALucas v1.54
Iteration 10000 3.1 msec/Iter ETA 8:19:29 M( 10000000 )C, 0x55318a84ffd14bc7, n = 786432, CUDALucas v1.54
Iteration 10000 3.4 msec/Iter ETA 12:29:29 M( 15000000 )C, 0x7a34e75acea86da1, n = 1048576, CUDALucas v1.54
Iteration 10000 5.1 msec/Iter ETA 27:45:49 M( 20000000 )C, 0xb6475f8cb0888740, n = 1310720, CUDALucas v1.54
Iteration 10000 6.1 msec/Iter ETA 41:38:59 M( 25000000 )C, 0x667565040b5b7aa3, n = 1572864, CUDALucas v1.54
Iteration 10000 7 msec/Iter ETA 58:18:49 M( 30000000 )C, 0xbf70feed29774eba, n = 1835008, CUDALucas v1.54
Iteration 10000 6.8 msec/Iter ETA 58:18:59 M( 35000000 )C, 0x1f1fb94d69da44f8, n = 2097152, CUDALucas v1.54
Iteration 10000 8.9 msec/Iter ETA 88:51:59 M( 40000000 )C, 0x2318fe9e59886055, n = 2359296, CUDALucas v1.54
Iteration 10000 10.2 msec/Iter ETA 124:58:19 M( 45000000 )C, 0x59b7ec093d4375da, n = 2621440, CUDALucas v1.54
Iteration 10000 12.4 msec/Iter ETA 183:17:59 M( 55000000 )C, 0x78f7b93265e51196, n = 3145728, CUDALucas v1.54
Iteration 10000 14.1 msec/Iter ETA 252:44:19 M( 65000000 )C, 0x679258d6765d8e5b, n = 3670016, CUDALucas v1.54
Iteration 10000 16.1 msec/Iter ETA 311:03:59 M( 70000000 )C, 0x652d4a670f44317e, n = 3932160, CUDALucas v1.54
Iteration 10000 14 msec/Iter ETA 291:37:39 M( 75000000 )C, 0xded1605fcc4a0f88, n = 4194304, CUDALucas v1.54
Iteration 10000 17.9 msec/Iter ETA 401:20:29 M( 85000000 )C, 0xf3211d616c78bc9f, n = 4718592, CUDALucas v1.54
Iteration 10000 20.6 msec/Iter ETA 527:43:19 M( 95000000 )C, 0x99f8f6024ac3c4d6, n = 5242880, CUDALucas v1.54
Iteration 10000 28 msec/Iter ETA 777:41:59 M( 100000000 )C, 0x92ce4f1ec07668a3, n = 5767168, CUDALucas v1.54
[/CODE]
They match. I did many restarts with my latest 1.50 dc. Was good. Waiting for first 1.58 to finish. |
[QUOTE=flashjh;291188]msft,
Can you tell me how to use this function? Are the values you posted the reference ones or is there something else?[/QUOTE] Use the 1.58 choose_length routine, so that you compare under the same FFT length. |
I restarted 1.49 and 1.61, this time sending the output of each to a .txt file. I'll let them finish (~43 hours) and then I'll have the full list of residues for each in a file. Hopefully 1.49 will match again, and I can start using it to troubleshoot the problem, since I'll have a good list of M26026433 residues from start to finish.
|
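Once both .txt files exist, the first iteration at which the two runs disagree can be found mechanically. The following is a rough Python sketch, not a CUDALucas feature. It assumes each progress line contains `Iteration <count>` followed later by `M( p )C, 0x<16 hex digits>`, as in the output posted in this thread; the exact log format of 1.49 is an assumption, and the function names are hypothetical. The residues in the demo are placeholders invented for the example, not real M26026433 residues.

```python
import re

# Assumed progress-line shape (based on the output posted in this thread):
# "Iteration 10000 ... M( 26026433 )C, 0x<16 hex digits>, ..."
LINE = re.compile(
    r"Iteration\s+(\d+)\b.*?M\(\s*\d+\s*\)C,\s*0x([0-9a-fA-F]{16})"
)

def residues_by_iteration(lines):
    """Map iteration count -> residue for every progress line found."""
    out = {}
    for line in lines:
        m = LINE.search(line)
        if m:
            out[int(m.group(1))] = m.group(2).lower()
    return out

def first_divergence(log_a, log_b):
    """Return the lowest iteration logged by both runs whose residues differ,
    or None if every shared iteration agrees."""
    a = residues_by_iteration(log_a)
    b = residues_by_iteration(log_b)
    for it in sorted(a.keys() & b.keys()):
        if a[it] != b[it]:
            return it
    return None

# Placeholder data only: these residues are invented for the demo.
run_a = [
    "Iteration 10000 M( 26026433 )C, 0x00000000000000a1",
    "Iteration 20000 M( 26026433 )C, 0x00000000000000a2",
    "Iteration 30000 M( 26026433 )C, 0x00000000000000a3",
]
run_b = [
    "Iteration 10000 M( 26026433 )C, 0x00000000000000a1",
    "Iteration 20000 M( 26026433 )C, 0x00000000000000a2",
    "Iteration 30000 M( 26026433 )C, 0x00000000000000ff",
]

print(first_divergence(run_a, run_b))
```

With real logs, the two lists would come from something like `open("run149.txt")` and `open("run161.txt")` (file names hypothetical). Knowing the first bad iteration narrows down whether the mismatch starts right after a resume or somewhere in the middle of an uninterrupted stretch.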
[QUOTE=Brain;291194][CODE]"F:\Eigene Dateien\Computing\CUDALucas\cudalucas.1.54\bin\CUDALucas.cuda4.1.sm_21.WIN64.exe" -r
Iteration 10000 2.2 msec/Iter ETA 2:46:19 M( 5000000 )C, 0x8c6648628dd918a6, n = 524288, CUDALucas v1.54
Iteration 10000 3.1 msec/Iter ETA 8:19:29 M( 10000000 )C, 0x55318a84ffd14bc7, n = 786432, CUDALucas v1.54
Iteration 10000 3.4 msec/Iter ETA 12:29:29 M( 15000000 )C, 0x7a34e75acea86da1, n = 1048576, CUDALucas v1.54
Iteration 10000 5.1 msec/Iter ETA 27:45:49 M( 20000000 )C, 0xb6475f8cb0888740, n = 1310720, CUDALucas v1.54
Iteration 10000 6.1 msec/Iter ETA 41:38:59 M( 25000000 )C, 0x667565040b5b7aa3, n = 1572864, CUDALucas v1.54
Iteration 10000 7 msec/Iter ETA 58:18:49 M( 30000000 )C, 0xbf70feed29774eba, n = 1835008, CUDALucas v1.54
Iteration 10000 6.8 msec/Iter ETA 58:18:59 M( 35000000 )C, 0x1f1fb94d69da44f8, n = 2097152, CUDALucas v1.54
Iteration 10000 8.9 msec/Iter ETA 88:51:59 M( 40000000 )C, 0x2318fe9e59886055, n = 2359296, CUDALucas v1.54
Iteration 10000 10.2 msec/Iter ETA 124:58:19 M( 45000000 )C, 0x59b7ec093d4375da, n = 2621440, CUDALucas v1.54
Iteration 10000 12.4 msec/Iter ETA 183:17:59 M( 55000000 )C, 0x78f7b93265e51196, n = 3145728, CUDALucas v1.54
Iteration 10000 14.1 msec/Iter ETA 252:44:19 M( 65000000 )C, 0x679258d6765d8e5b, n = 3670016, CUDALucas v1.54
Iteration 10000 16.1 msec/Iter ETA 311:03:59 M( 70000000 )C, 0x652d4a670f44317e, n = 3932160, CUDALucas v1.54
Iteration 10000 14 msec/Iter ETA 291:37:39 M( 75000000 )C, 0xded1605fcc4a0f88, n = 4194304, CUDALucas v1.54
Iteration 10000 17.9 msec/Iter ETA 401:20:29 M( 85000000 )C, 0xf3211d616c78bc9f, n = 4718592, CUDALucas v1.54
Iteration 10000 20.6 msec/Iter ETA 527:43:19 M( 95000000 )C, 0x99f8f6024ac3c4d6, n = 5242880, CUDALucas v1.54
Iteration 10000 28 msec/Iter ETA 777:41:59 M( 100000000 )C, 0x92ce4f1ec07668a3, n = 5767168, CUDALucas v1.54
[/CODE]
They match. I did many restarts with my latest 1.50 dc. Was good. Waiting for first 1.58 to finish.[/QUOTE]
Thanks! Maybe I have a bad video card? Why would 1.49 work but 1.58|1.61 not work?
The reason I don't think it's the card is I've been running 1.49 at the same time on the same card (started at the same time). 1.49 results good - 1.58|1.61 not good. From above though, it looks like 1.50 is good. We'll see... |
I finished 4 DCs.
All mismatches. Two of them were done on Tesla boards (26166389 and 26176597). My money is on a software bug. Returning to v1.3 and mfaktc. |