1 Attachment(s)
[QUOTE=Dubslow;301858]What's sfv?[/QUOTE]
Google is your [URL="http://en.wikipedia.org/wiki/Simple_file_verification"]friend[/URL] :D (in this case, Wikipedia). Edit: the cuffts are on the way. [strike]What's the drop limit? (The zip file is 33 megs and I wonder if I should compress it harder or split it in two. I've already tried twice, and yer friend's site says I/O error and crashes about halfway. Is it my net being bad on a Sunday morning, as usual, or does the site have a limit of 15 megs?)[/strike] Edit 2: It went through. Here is [URL="http://filesmelt.com/dl/cuffts.zip"]the link[/URL]. The .sfv is attached to this post. BTW, yer Linux should have this built in by default, so there's no need for tools like Total Commander, which I am using in windoze. Ye should know it... WinSFV is very convenient: just double-click on any .sfv file and it tells you whether any of the files checksummed inside have changed. Of course, you may need to rename it back first (delete the .txt extension, which I added for forum reasons). |
Theoretically it should handle it, but with files that large you never know. Email it to me.
Edit: Wow, I'm amazed. I managed to DL the file in 2 seconds flat. FS must have a server nearby to me. (Edit2: No, it wasn't installed by default. :razz: MD5 is much more common. Addendum: [quote='man cksfv']cksfv is a tool for verifying CRC32 checksums of files. CRC32 checksums are used to verify that files are not corrupted. The algorithm is cryptographically crippled so it can not be used for security purposes. md5sum (1) or sha1sum (1) are much better tools for checksuming files. cksfv should only be used for compatibility with other systems.[/quote]) |
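For anyone curious what an .sfv checker actually computes, here is a minimal sketch of the CRC-32 variant used by zip and .sfv files (reflected polynomial 0xEDB88320). The function names crc32_update and crc32_file are made up for this example and are not taken from cksfv or WinSFV. As the man page says, this only detects accidental corruption; it is not a security check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Update a running CRC-32 (reflected polynomial 0xEDB88320) with len bytes
   from data. Start with crc = 0 and feed the chunks in order.
   Hypothetical helper name, not part of any existing tool. */
uint32_t crc32_update(uint32_t crc, const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);
    crc = ~crc;                       /* undo the final inversion of the previous call */
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial when the bit shifted out is 1. */
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return ~crc;
}

/* Compute the CRC-32 of a whole file.
   Returns 0 on success and stores the checksum in *out; returns -1 on I/O error. */
int crc32_file(const char *path, uint32_t *out)
{
    unsigned char buf[4096];
    uint32_t crc = 0;
    size_t n;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        crc = crc32_update(crc, buf, n);
    int failed = ferror(fp);
    fclose(fp);
    if (failed)
        return -1;
    *out = crc;
    return 0;
}
```

Calling crc32_file() on each file named in the .sfv and comparing the result with the 8 hex digits stored next to the name is all a verifier has to do.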
LaurV, for printing total time, does that need to appear in the results file, or just on the screen?
Edit: I've found a bug in 2.03; however, it will only manifest itself if you screw up the formatting of the SaveFolder, ResultsFile, or WorkFile options in the ini file. PS @LaurV: [code]bill@Gravemind:~/CUDALucas∰∂ ./new.CUDALucas 132049 -f 0
Warning: Couldn't parse ini file option ResultsFile; using default "results.txt"
Starting M132049 fft length = 7168
Iteration 30000 M( 132049 )C, 0xbcd4392925c8b6c9, n = 7K, CUDALucas v2.04 Alpha err = 0.0103 (0:03 real, 0.0892 ms/iter, ETA 0:08)
^C SIGINT caught. Writing checkpoint.
bill@Gravemind:~/CUDALucas∰∂ ./new.CUDALucas 132049 -f 0
Warning: Couldn't parse ini file option ResultsFile; using default "results.txt"
Continuing work from a partial result of M132049 fft length = 7168 iteration = 40129
Iteration 60000 M( 132049 )C, 0x1a3c4b80c267f04f, n = 7K, CUDALucas v2.04 Alpha err = 0.0097 (0:02 real, 0.0578 ms/iter, ETA 0:03)
Iteration 90000 M( 132049 )C, 0x28ecbb0541f5ec16, n = 7K, CUDALucas v2.04 Alpha err = 0.0097 (0:03 real, 0.0873 ms/iter, ETA 0:02)
Iteration 120000 M( 132049 )C, 0x816902f6d3a9764a, n = 7K, CUDALucas v2.04 Alpha err = 0.0103 (0:02 real, 0.0872 ms/iter, ETA 0:00)
M( 132049 )P, n = 7K, CUDALucas v2.04 Alpha.
Estimated total time: 0:11
bill@Gravemind:~/CUDALucas∰∂
[/code] :grin: The best part is that it can still read the old checkpoints, realize it's reading old checkpoints and then not print the time. |
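The "using default" warning in that output suggests a read-with-fallback pattern for ini options. Below is a minimal sketch of that idea, not the actual IniGetStr code from CUDALucas: the function name ini_get_str_or_default, its parameters, and the decision to treat an empty or over-long value as unparseable are all assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (NOT the real IniGetStr): look for a line of the form
   "name=value" in ini_text and copy value into out. If the option is missing,
   empty, or does not fit in out, warn, copy fallback instead and return 0.
   Returns 1 when the option from the ini text was used. */
int ini_get_str_or_default(const char *ini_text, const char *name,
                           const char *fallback, char *out, size_t out_size)
{
    assert(ini_text != NULL && name != NULL && fallback != NULL);
    assert(out != NULL && out_size > 0);

    size_t name_len = strlen(name);
    const char *line = ini_text;

    while (line != NULL && *line != '\0') {
        const char *end = strchr(line, '\n');
        size_t line_len = end ? (size_t)(end - line) : strlen(line);

        /* Match "name=" at the very start of the line. */
        if (line_len > name_len && strncmp(line, name, name_len) == 0
            && line[name_len] == '=') {
            const char *value = line + name_len + 1;
            size_t value_len = line_len - name_len - 1;

            /* Drop a trailing carriage return from Windows line endings. */
            if (value_len > 0 && value[value_len - 1] == '\r')
                value_len--;
            if (value_len > 0 && value_len < out_size) {
                memcpy(out, value, value_len);
                out[value_len] = '\0';
                return 1;
            }
            break;  /* option present but unusable: fall back */
        }
        line = end ? end + 1 : NULL;
    }

    fprintf(stderr,
            "Warning: Couldn't parse ini file option %s; using default \"%s\"\n",
            name, fallback);
    snprintf(out, out_size, "%s", fallback);
    return 0;
}
```

The point of the design is that a badly formatted option never leaves the output buffer uninitialised: the caller always gets either the parsed value or the default.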
[QUOTE=Dubslow;301514]After much fighting with SF Beta and SVN, there is now a CUDALucas SourceForge [URL="https://sourceforge.net/projects/cudalucas/"]page[/URL]. :smile:
In addition, timeval.c is gone entirely, not to mention various changes to the defines and includes. Any binaries compiled should be identical, but flash should also test and make sure this newer stuff does actually compile. Does anybody feel like writing a README? (I've marked it as GPL, but that's certainly open to discussion.) msft and flash, please report your SF usernames so I can add you. Anyone else is welcome to join as a "Member". (Or Developer if you ask nicely. :smile:)[/QUOTE] Ok, SVN was a pretty big learning curve :smile: I updated the files per the comments to rev20. @Dubslow, can you recompile and test in Linux? I changed the IniGetStr function and added a custom sprintf_s for MSVS that should only affect writing results.txt files, but I need you to test compile/run again. Otherwise, everything seems to be working well. After you test compile/run, I'll recompile and post to SourceForge. |
[QUOTE=flashjh;302037]Ok, SVN was a pretty big learning curve :smile:
[/quote]Yeah, me too. Check out the comments for r19 :razz: [QUOTE=flashjh;302037] I updated the files per the comments to rev20. @Dubslow, can you recompile and test in Linux? I changed the IniGetStr function and added a custom sprintf_s for MSVS that should only affect writing results.txt files, but I need you to test compile/run again. Otherwise, everything seems to be working well. After you test compile/run, I'll recompile and post to SourceForge.[/QUOTE] That was indeed the bug I was referring to. I'll test it, but the code looks good. (spritf?) I wasn't intending to post executables to SF until it moves to at least Beta. I'm only about halfway through the changes, not done yet :smile: (Among other things, results file locking isn't implemented yet.) If you want, feel free to post executables here, but I do warn everyone, it's still in Alpha. :razz: In particular, flash, you can compile that "test" version I mentioned; use /DTEST, as defined by the makefile rule "test". I was waiting to ask you to make it until I had written the Python I need locally to interpret the results, but you can make it now if you want. :smile: |
Edit (I wish): Yes, it does compile fine.
In other news, I suddenly can't commit to svn/tags. [code]bill@Gravemind:~/CUDALucas∰∂ svn commit --username=dubslow tags/v2.03-final/
svn: Commit failed (details follow):
svn: Server sent unexpected return value (403 Forbidden) in response to MKACTIVITY request for '/p/cudalucas/code/!svn/act/2c8fdab0-c8b5-425a-b904-ad76555a2b37'
svn: Your commit message was left in a temporary file:
svn: '/home/bill/CUDALucas/tags/svn-commit.tmp'
bill@Gravemind:~/CUDALucas∰∂ svn commit --username dubslow tags/v2.03-final/
svn: Commit failed (details follow):
svn: Server sent unexpected return value (403 Forbidden) in response to MKACTIVITY request for '/p/cudalucas/code/!svn/act/56dab32b-3b9e-4bfe-877f-0520584bae98'
svn: Your commit message was left in a temporary file:
svn: '/home/bill/CUDALucas/tags/svn-commit.2.tmp'
bill@Gravemind:~/CUDALucas∰∂
[/code] It didn't even give me a chance to enter my password. :huh: |
[QUOTE=Dubslow;302040] I'm only about halfway through the changes, not done yet :smile:[/QUOTE]
I got off my butt and coded some of them. A silly bug took 20 minutes of my time, but the results are pretty. :grin: [code]bill@Gravemind:~/CUDALucas/test∰∂ cat worktodo.txt
DoubleCheck=N/A,216091,24K,1,69
Test=12K,N/A,69,216091
Test=86243,4K
DoubleCheck=CA9CAECD26710FC828DFBBB8________,26458577,69,1
bill@Gravemind:~/CUDALucas/test∰∂ ./new.CUDALucas
Starting M216091 fft length = 24K
Iteration 10000 M( 216091 )C, 0x30247786758b8792, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1734 ms/iter, ETA 0:34)
Iteration 20000 M( 216091 )C, 0x13e968bf40fda4d7, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1752 ms/iter, ETA 0:33)
Iteration 30000 M( 216091 )C, 0x540772c2abb7833a, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1708 ms/iter, ETA 0:30)
Iteration 40000 M( 216091 )C, 0xc26da9695ac418c1, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:01 real, 0.1750 ms/iter, ETA 0:29)
Iteration 50000 M( 216091 )C, 0x95ce3ff44abdd1e5, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1723 ms/iter, ETA 0:27)
Iteration 60000 M( 216091 )C, 0x99aa87c495daffe7, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1713 ms/iter, ETA 0:25)
Iteration 70000 M( 216091 )C, 0x505d249be3145893, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1716 ms/iter, ETA 0:24)
Iteration 80000 M( 216091 )C, 0xddf612c72037b8a1, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:01 real, 0.1703 ms/iter, ETA 0:22)
Iteration 90000 M( 216091 )C, 0xb5d8309a1ce9e2b6, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1747 ms/iter, ETA 0:20)
Iteration 100000 M( 216091 )C, 0x4de7f101ee1cb7a5, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1702 ms/iter, ETA 0:18)
Iteration 110000 M( 216091 )C, 0x10aa3286c0b03369, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:01 real, 0.1694 ms/iter, ETA 0:16)
Iteration 120000 M( 216091 )C, 0x3981b56788b529e2, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1718 ms/iter, ETA 0:15)
Iteration 130000 M( 216091 )C, 0x80438af231f8fccd, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1734 ms/iter, ETA 0:13)
Iteration 140000 M( 216091 )C, 0x669382faea06df89, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1744 ms/iter, ETA 0:12)
Iteration 150000 M( 216091 )C, 0x1b73cb121df7d6fa, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:01 real, 0.1715 ms/iter, ETA 0:10)
Iteration 160000 M( 216091 )C, 0xb391010f29c70ee1, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1710 ms/iter, ETA 0:08)
Iteration 170000 M( 216091 )C, 0x04055d84a77be1d8, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1709 ms/iter, ETA 0:06)
Iteration 180000 M( 216091 )C, 0xe3d74c104f02967d, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:01 real, 0.1711 ms/iter, ETA 0:05)
Iteration 190000 M( 216091 )C, 0x54b2a8b9cb149f9f, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1713 ms/iter, ETA 0:03)
Iteration 200000 M( 216091 )C, 0xf433496947b7b103, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1708 ms/iter, ETA 0:01)
Iteration 210000 M( 216091 )C, 0xcfe091c8f59f8a7b, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1700 ms/iter, ETA 0:00)
M( 216091 )P, n = 24K, CUDALucas v2.04 Alpha.
Estimated total time: 0:38
Starting M216091 fft length = 12K
Iteration 10000 M( 216091 )C, 0x30247786758b8792, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1511 ms/iter, ETA 0:30)
Iteration 20000 M( 216091 )C, 0x13e968bf40fda4d7, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1476 ms/iter, ETA 0:28)
Iteration 30000 M( 216091 )C, 0x540772c2abb7833a, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1479 ms/iter, ETA 0:26)
Iteration 40000 M( 216091 )C, 0xc26da9695ac418c1, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1494 ms/iter, ETA 0:25)
Iteration 50000 M( 216091 )C, 0x95ce3ff44abdd1e5, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1464 ms/iter, ETA 0:23)
Iteration 60000 M( 216091 )C, 0x99aa87c495daffe7, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1490 ms/iter, ETA 0:22)
Iteration 70000 M( 216091 )C, 0x505d249be3145893, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1551 ms/iter, ETA 0:21)
Iteration 80000 M( 216091 )C, 0xddf612c72037b8a1, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1506 ms/iter, ETA 0:19)
Iteration 90000 M( 216091 )C, 0xb5d8309a1ce9e2b6, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1480 ms/iter, ETA 0:17)
Iteration 100000 M( 216091 )C, 0x4de7f101ee1cb7a5, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1499 ms/iter, ETA 0:16)
Iteration 110000 M( 216091 )C, 0x10aa3286c0b03369, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1463 ms/iter, ETA 0:14)
Iteration 120000 M( 216091 )C, 0x3981b56788b529e2, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1481 ms/iter, ETA 0:13)
Iteration 130000 M( 216091 )C, 0x80438af231f8fccd, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:01 real, 0.1458 ms/iter, ETA 0:11)
Iteration 140000 M( 216091 )C, 0x669382faea06df89, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:01 real, 0.1460 ms/iter, ETA 0:10)
Iteration 150000 M( 216091 )C, 0x1b73cb121df7d6fa, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:02 real, 0.1463 ms/iter, ETA 0:08)
Iteration 160000 M( 216091 )C, 0xb391010f29c70ee1, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:01 real, 0.1469 ms/iter, ETA 0:07)
Iteration 170000 M( 216091 )C, 0x04055d84a77be1d8, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:02 real, 0.1470 ms/iter, ETA 0:05)
Iteration 180000 M( 216091 )C, 0xe3d74c104f02967d, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:01 real, 0.1449 ms/iter, ETA 0:04)
Iteration 190000 M( 216091 )C, 0x54b2a8b9cb149f9f, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:02 real, 0.1469 ms/iter, ETA 0:02)
Iteration 200000 M( 216091 )C, 0xf433496947b7b103, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:01 real, 0.1482 ms/iter, ETA 0:01)
Iteration 210000 M( 216091 )C, 0xcfe091c8f59f8a7b, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:02 real, 0.1485 ms/iter, ETA 0:00)
M( 216091 )P, n = 12K, CUDALucas v2.04 Alpha.
Estimated total time: 0:32
Starting M86243 fft length = 4K
Iteration 10000 M( 86243 )C, 0x26d11035920b3773, n = 4K, CUDALucas v2.04 Alpha err = 0.2617 (0:01 real, 0.1160 ms/iter, ETA 0:08)
Iteration 20000 M( 86243 )C, 0x233a5255467a4c6e, n = 4K, CUDALucas v2.04 Alpha err = 0.2617 (0:01 real, 0.1137 ms/iter, ETA 0:06)
Iteration 30000 M( 86243 )C, 0x88e3195a12367bb8, n = 4K, CUDALucas v2.04 Alpha err = 0.2617 (0:01 real, 0.1141 ms/iter, ETA 0:05)
Iteration 40000 M( 86243 )C, 0x70b63ef639328851, n = 4K, CUDALucas v2.04 Alpha err = 0.2617 (0:01 real, 0.1153 ms/iter, ETA 0:04)
Iteration 50000 M( 86243 )C, 0x0ff1f54cfeeb4909, n = 4K, CUDALucas v2.04 Alpha err = 0.2617 (0:01 real, 0.1136 ms/iter, ETA 0:03)
Iteration 60000 M( 86243 )C, 0x25a4a96c66e7f897, n = 4K, CUDALucas v2.04 Alpha err = 0.2812 (0:02 real, 0.1183 ms/iter, ETA 0:02)
Iteration 70000 M( 86243 )C, 0xb639453c818baba2, n = 4K, CUDALucas v2.04 Alpha err = 0.2812 (0:01 real, 0.1139 ms/iter, ETA 0:01)
Iteration 80000 M( 86243 )C, 0xdd477c413184da18, n = 4K, CUDALucas v2.04 Alpha err = 0.2812 (0:01 real, 0.1089 ms/iter, ETA 0:00)
M( 86243 )C, 0x2de7056ebffee28b, n = 4K, CUDALucas v2.04 Alpha.
Estimated total time: 0:09
Starting M26458577 fft length = 1536K
^C SIGINT caught. Writing checkpoint.
bill@Gravemind:~/CUDALucas/test∰∂
[/code] (It can't actually handle underscores, but I edited the output for obvious reasons. :razz:) PS Would any code gurus be willing to examine parse_worktodo_line() starting from line 317 and check for any stupids? |
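The worktodo.txt lines above put the exponent, FFT length, and key in different orders, so the parser has to classify fields by shape instead of by position. Here is one way that could be sketched. This is not the code in parse_worktodo_line(): the struct, the function name parse_work_line, and especially the rule "the largest plain integer is the exponent" are assumptions made only for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type; the field names are illustrative only. */
struct work_item {
    unsigned long exponent;    /* p in M(p); 0 if none was found */
    unsigned long fft_length;  /* in elements; 0 means "let the program choose" */
};

/* Returns 1 if s[0..len) is non-empty and consists only of decimal digits. */
static int all_digits(const char *s, size_t len)
{
    if (len == 0)
        return 0;
    for (size_t i = 0; i < len; i++)
        if (!isdigit((unsigned char)s[i]))
            return 0;
    return 1;
}

/* Order-independent parser for lines such as "Test=12K,N/A,69,216091".
   ASSUMPTION: the largest plain integer on the line is the exponent, and any
   32-character field is an assignment key. The real parser may decide
   differently. Returns 1 on success, 0 if the line is not usable. */
int parse_work_line(const char *line, struct work_item *item)
{
    assert(line != NULL && item != NULL);
    item->exponent = 0;
    item->fft_length = 0;

    if (strncmp(line, "Test=", 5) == 0)
        line += 5;
    else if (strncmp(line, "DoubleCheck=", 12) == 0)
        line += 12;
    else
        return 0;

    while (*line != '\0') {
        size_t len = strcspn(line, ",\r\n");   /* length of the current field */

        if (len > 1 && (line[len - 1] == 'K' || line[len - 1] == 'k')
            && all_digits(line, len - 1)) {
            /* "24K" style field: FFT length in multiples of 1024. */
            item->fft_length = strtoul(line, NULL, 10) * 1024UL;
        } else if (len != 32 && all_digits(line, len)) {
            unsigned long value = strtoul(line, NULL, 10);
            if (value > item->exponent)
                item->exponent = value;
        }
        /* Anything else (an assignment key, "N/A") is ignored in this sketch. */

        line += len;
        if (*line == ',')
            line++;
        else
            break;  /* end of line */
    }
    return item->exponent != 0;
}
```

Classifying by shape is what lets "DoubleCheck=N/A,216091,24K,1,69" and "Test=12K,N/A,69,216091" resolve to the same exponent, at the cost of needing some rule to tell the exponent apart from the small trailing numbers.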
Reproducible error in cufftbench
@msft: I've stumbled across this seemingly reproducible error in cufftbench().
[code]bill@Gravemind:~/CUDALucas∰∂ CUDALucas -threads 128 -cufftbench 5881856 5914624 64
CUFFT bench start = 5881856 end = 5914624 distance = 64
CUFFT_Z2Z size= 5881856 time= 986.398254 msec
CUDALucas.cu(1066) : cufftSafeCall() CUFFT error 2: CUFFT_ALLOC_FAILED CUFFT_INVALID_TYPE CUFFT_INVALID_VALUE CUFFT_INTERNAL_ERROR CUFFT_EXEC_FAILED CUFFT_SETUP_FAILED CUFFT_INVALID_SIZE CUFFT_UNALIGNED_DATA CUFFT Unknown error code
bill@Gravemind:~/CUDALucas∰∂ CUDALucas -cufftbench 5881856 5914624 64
CUFFT bench start = 5881856 end = 5914624 distance = 64
CUFFT_Z2Z size= 5881856 time= 986.098572 msec
CUDALucas.cu(1066) : cufftSafeCall() CUFFT error 2: CUFFT_ALLOC_FAILED CUFFT_INVALID_TYPE CUFFT_INVALID_VALUE CUFFT_INTERNAL_ERROR CUFFT_EXEC_FAILED CUFFT_SETUP_FAILED CUFFT_INVALID_SIZE CUFFT_UNALIGNED_DATA CUFFT Unknown error code
bill@Gravemind:~/CUDALucas∰∂
[/code] (This is with v2.03, although it also occurs in v2.04_test. In the latter case, it continued to test more lengths, but it did stop before it was supposed to.) Also, as I previously reported, cufftbench() still uses 1-2 full cores. Is that a bug or the nature of the function? |
More data on cufft crash
[code]bill@Gravemind:~/CUDALucas∰∂ CUDALucas -cufftbench $((256*128)) $((65535*128)) $((256*128))
CUFFT bench start = 32768 end = 8388480 distance = 32768
<good output snipped>
CUFFT_Z2Z size= 6815744 time= 17.126163 msec
CUFFT_Z2Z size= 6848512 time= 21.510880 msec
CUFFT_Z2Z size= 6881280 time= 13.638905 msec
CUFFT_Z2Z size= 6914048 time= 699.387634 msec
CUFFT_Z2Z size= 6946816 time= 22.775032 msec
CUFFT_Z2Z size= 6979584 time= 30.465769 msec
CUFFT_Z2Z size= 7012352 time= 37.825619 msec
CUFFT_Z2Z size= 7045120 time= 20.284300 msec
CUFFT_Z2Z size= 7077888 time= 12.884492 msec
CUFFT_Z2Z size= 7110656 time= 18.780321 msec
CUFFT_Z2Z size= 7143424 time= 39.204491 msec
CUFFT_Z2Z size= 7176192 time= 31.473606 msec
CUFFT_Z2Z size= 7208960 time= 18.138344 msec
CUFFT_Z2Z size= 7241728 time= 23.035593 msec
CUFFT_Z2Z size= 7274496 time= 22.267868 msec
CUDALucas.cu(1066) : cufftSafeCall() CUFFT error 2: CUFFT_ALLOC_FAILED CUFFT_INVALID_TYPE CUFFT_INVALID_VALUE CUFFT_INTERNAL_ERROR CUFFT_EXEC_FAILED CUFFT_SETUP_FAILED CUFFT_INVALID_SIZE CUFFT_UNALIGNED_DATA CUFFT Unknown error code
[/code] It's a different size this time. |
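A side note on the error line itself: it prints every name from CUFFT_ALLOC_FAILED onward, which is what a switch statement with missing break statements would produce. That is a guess from the output alone, not something confirmed from the source. Error 2 is CUFFT_ALLOC_FAILED in cufft.h, so only that first name is meaningful. A minimal sketch of a lookup that returns exactly one name follows; the function names are hypothetical, and the numeric values are copied by hand so it compiles without the CUDA toolkit (real code would use cufftResult from cufft.h).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Map a CUFFT status code to its name. Each case returns immediately, so
   there is no fall-through into the following cases.
   The numbers mirror the cufftResult values in cufft.h. */
const char *cufft_error_name(int code)
{
    switch (code) {
    case 0: return "CUFFT_SUCCESS";
    case 1: return "CUFFT_INVALID_PLAN";
    case 2: return "CUFFT_ALLOC_FAILED";
    case 3: return "CUFFT_INVALID_TYPE";
    case 4: return "CUFFT_INVALID_VALUE";
    case 5: return "CUFFT_INTERNAL_ERROR";
    case 6: return "CUFFT_EXEC_FAILED";
    case 7: return "CUFFT_SETUP_FAILED";
    case 8: return "CUFFT_INVALID_SIZE";
    case 9: return "CUFFT_UNALIGNED_DATA";
    default: return "CUFFT Unknown error code";
    }
}

/* Format a message in the same shape as the log line above.
   Returns the value of snprintf (the length that was or would be written). */
int format_cufft_error(char *buf, size_t size, const char *file, int line, int code)
{
    assert(buf != NULL && file != NULL);
    return snprintf(buf, size, "%s(%d) : cufftSafeCall() CUFFT error %d: %s",
                    file, line, code, cufft_error_name(code));
}
```

With this, the line from the log would read "CUDALucas.cu(1066) : cufftSafeCall() CUFFT error 2: CUFFT_ALLOC_FAILED" and nothing more.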
Hi, Dubslow
I believe you can read source code. |
[QUOTE=msft;302625]Hi, Dubslow
I believe you can read source code.[/QUOTE] Yes I can, but I don't have the first clue about CUDA in general or cufft in particular. Just in case, I did look through it and I see the line that's causing the issue, but I have no idea what's wrong or how to fix it. [code]void cufftbench (int cufftbench_s, int cufftbench_e, int cufftbench_d)
{
  cudaEvent_t start, stop;
  double *x;
  float outerTime;
  int i, j;
  printf ("CUFFT bench start = %d end = %d distance = %d\n", cufftbench_s, cufftbench_e, cufftbench_d);
  cutilSafeCall (cudaMalloc ((void **) &g_x, sizeof (double) * cufftbench_e));
  x = ((double *) malloc (sizeof (double) * cufftbench_e + 1));
  for (i = 0; i <= cufftbench_e; i++)
    x[i] = 0;
  cutilSafeCall (cudaMemcpy (g_x, x, sizeof (double) * cufftbench_e, cudaMemcpyHostToDevice));
  cutilSafeCall (cudaEventCreate (&start));
  cutilSafeCall (cudaEventCreate (&stop));
  for (j = cufftbench_s; j <= cufftbench_e; j += cufftbench_d)
    {
      [B]cufftSafeCall (cufftPlan1d (&plan, j / 2, CUFFT_Z2Z, 1));[/B]
      cufftSafeCall (cufftExecZ2Z (plan, (cufftDoubleComplex *) g_x, (cufftDoubleComplex *) g_x, CUFFT_INVERSE));
      cutilSafeCall (cudaEventRecord (start, 0));
      for (i = 0; i < 100; i++)
        cufftSafeCall (cufftExecZ2Z (plan, (cufftDoubleComplex *) g_x, (cufftDoubleComplex *) g_x, CUFFT_INVERSE));
      cutilSafeCall (cudaEventRecord (stop, 0));
      cutilSafeCall (cudaEventSynchronize (stop));
      cutilSafeCall (cudaEventElapsedTime (&outerTime, start, stop));
      printf ("CUFFT_Z2Z size= %d time= %f msec\n", j, outerTime / 100);
      cufftSafeCall (cufftDestroy (plan));
    }
  cutilSafeCall (cudaFree ((char *) g_x));
  cutilSafeCall (cudaEventDestroy (start));
  cutilSafeCall (cudaEventDestroy (stop));
  free ((char *) x);
}[/code] The bolded line is the one that's barfing. (I do recognize that it's the line that sets up the FFT, and it's the next line and inner loop that actually execute the FFT.) |
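Separately from the cufftPlan1d failure, and with no claim that it is the cause, the host buffer in the quoted code looks undersized: malloc (sizeof (double) * cufftbench_e + 1) asks for one extra byte, while the zeroing loop runs to i <= cufftbench_e and so writes one extra double. The small sketch below just makes that arithmetic checkable; the helper names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Bytes requested by the expression in the quoted code:
   sizeof (double) * n + 1 */
size_t bytes_requested(size_t n)
{
    return sizeof(double) * n + 1;
}

/* Bytes touched by "for (i = 0; i <= n; i++) x[i] = 0;",
   which writes n + 1 doubles. */
size_t bytes_written(size_t n)
{
    return sizeof(double) * (n + 1);
}

/* One possible fix (a sketch, not the project's actual patch): allocate
   n + 1 zero-initialised doubles in one call. Returns NULL on failure;
   the caller frees the buffer with free(). */
double *alloc_zeroed_doubles(size_t n)
{
    assert(n + 1 > n);  /* guard against size_t wrap-around */
    return calloc(n + 1, sizeof(double));
}
```

For any n, the loop writes sizeof(double) - 1 bytes past the end of the allocation (7 bytes where double is 8 bytes). Moving the + 1 inside the multiplication, or using calloc as above, removes the overrun.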