#1420 |
|
Romulan Interpreter
Jun 2011
Thailand
22665₈ Posts |
Google is your friend :D (in this case, Wikipedia)
Edit: cuffts are on the way.
Edit 2: It went through. Here is the link. The .sfv is attached to this post. BTW, yer Linux should have this built in by default; no need for tools like Total Commander, which I am using in windoze. Ye should know it... WinSFV is very convenient: just double-click on any .sfv file and it tells you if any of the files checksummed inside was changed. Of course, you may need to rename it back (delete the .txt extension, which I added for forum reasons).
Last fiddled with by LaurV on 2012-06-10 at 02:17 |
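For readers who have not met the format: an .sfv file lists one CRC-32 value per file, and a checker recomputes that value and compares. A minimal sketch of the checksum in C follows, assuming the usual reflected CRC-32 polynomial 0xEDB88320; `crc32_buf` is a hypothetical helper name and this is not WinSFV's or Total Commander's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320.
   crc32_buf is a hypothetical name used only for this illustration. */
uint32_t crc32_buf(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;              /* initial value */

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* shift right, folding in the polynomial when the low bit is set */
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;                /* final inversion */
}
```

A checker would feed the whole file through this function and compare the result with the eight hex digits stored next to the file name in the .sfv.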
|
|
|
|
|
#1421 | |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3·29·83 Posts |
Theoretically it should handle it, but with files that large you never know. Email it to me.
Edit: Wow, I'm amazed. I managed to DL the file in 2 seconds flat. FS must have a server near me.
Edit2: No, it wasn't installed by default. MD5 is much more common.
Addendum: Quote:
Last fiddled with by Dubslow on 2012-06-10 at 02:25 |
|
|
|
|
|
|
#1422 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts |
LaurV, for printing total time, does that need to appear in the results file, or just on the screen?
Edit: I've found a bug in 2.03, however it will only manifest itself if you screw up the formatting of the SaveFolder, ResultsFile, or WorkFile options in the ini file. PS @LaurV: Code:
bill@Gravemind:~/CUDALucas∰∂ ./new.CUDALucas 132049 -f 0
Warning: Couldn't parse ini file option ResultsFile; using default "results.txt"
Starting M132049 fft length = 7168
Iteration 30000 M( 132049 )C, 0xbcd4392925c8b6c9, n = 7K, CUDALucas v2.04 Alpha err = 0.0103 (0:03 real, 0.0892 ms/iter, ETA 0:08)
^C SIGINT caught. Writing checkpoint.
bill@Gravemind:~/CUDALucas∰∂ ./new.CUDALucas 132049 -f 0
Warning: Couldn't parse ini file option ResultsFile; using default "results.txt"
Continuing work from a partial result of M132049 fft length = 7168 iteration = 40129
Iteration 60000 M( 132049 )C, 0x1a3c4b80c267f04f, n = 7K, CUDALucas v2.04 Alpha err = 0.0097 (0:02 real, 0.0578 ms/iter, ETA 0:03)
Iteration 90000 M( 132049 )C, 0x28ecbb0541f5ec16, n = 7K, CUDALucas v2.04 Alpha err = 0.0097 (0:03 real, 0.0873 ms/iter, ETA 0:02)
Iteration 120000 M( 132049 )C, 0x816902f6d3a9764a, n = 7K, CUDALucas v2.04 Alpha err = 0.0103 (0:02 real, 0.0872 ms/iter, ETA 0:00)
M( 132049 )P, n = 7K, CUDALucas v2.04 Alpha.
Estimated total time: 0:11
bill@Gravemind:~/CUDALucas∰∂
The best part is that it can still read the old checkpoints, realize it's reading old checkpoints, and then not print the time.
Last fiddled with by Dubslow on 2012-06-10 at 03:59 |
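The total-time behaviour described in this post (accumulate time across restarts, but stay silent when resuming from an old checkpoint that carries no time) can be sketched roughly as follows. Everything here is an assumption made for illustration: the `ckpt_meta` struct, both function names, and the h:mm:ss form for runs over an hour are hypothetical, since the thread shows neither the checkpoint layout nor the real code. Only the m:ss form ("0:11") is taken from the output above.

```c
#include <stdio.h>

/* Hypothetical checkpoint metadata; the real CUDALucas layout is not
   shown in the thread. */
struct ckpt_meta {
    int has_time;             /* 0 for old-format checkpoints */
    unsigned long elapsed_s;  /* seconds accumulated before this run */
};

/* Total seconds over all runs, or -1 when an old checkpoint makes the
   total unknown. A NULL pointer means the test started from scratch. */
long total_time_s(const struct ckpt_meta *m, unsigned long this_run_s)
{
    if (m != NULL && !m->has_time)
        return -1;
    return (long)((m != NULL ? m->elapsed_s : 0UL) + this_run_s);
}

/* Format seconds as m:ss, or h:mm:ss from one hour upward (assumed). */
int format_time(char *buf, size_t n, long secs)
{
    if (secs >= 3600)
        return snprintf(buf, n, "%ld:%02ld:%02ld",
                        secs / 3600, (secs / 60) % 60, secs % 60);
    return snprintf(buf, n, "%ld:%02ld", secs / 60, secs % 60);
}
```

A caller would print the "Estimated total time" line only when `total_time_s` returns a non-negative value.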
|
|
|
|
|
#1423 | |
|
"Jerry"
Nov 2011
Vancouver, WA
10001100011₂ Posts |
Quote:
I updated the files per the comments to rev20. @Dubslow, can you recompile and test in Linux? I changed the IniGetStr function and added a custom sprintf_s for MSVS that should only affect writing results.txt files, but I need you to test compile/run again. Otherwise, everything seems to be working well. After you test compile/run, I'll recompile and post to SourceForge. |
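The fallback behaviour seen earlier in the thread ("Couldn't parse ini file option ResultsFile; using default") might look something like the sketch below. This is not the project's IniGetStr: `ini_get_str_or_default` is a hypothetical name, and it reads from an in-memory string rather than a file so the example stays self-contained.

```c
#include <stdio.h>
#include <string.h>

/* Look up "key=value" in ini-style text and copy the value into out.
   Falls back to dflt when the key is missing, the value is empty, or the
   value does not fit. Returns 1 if the option was parsed, 0 if the
   default was used. Hypothetical helper, not the real IniGetStr. */
int ini_get_str_or_default(const char *ini_text, const char *key,
                           const char *dflt, char *out, size_t out_size)
{
    size_t key_len = strlen(key);
    const char *line = ini_text;

    while (line != NULL && *line != '\0') {
        const char *eol = strchr(line, '\n');
        size_t line_len = eol ? (size_t)(eol - line) : strlen(line);

        if (line_len > key_len && strncmp(line, key, key_len) == 0
            && line[key_len] == '=') {
            const char *val = line + key_len + 1;
            size_t val_len = line_len - key_len - 1;

            /* tolerate Windows line endings */
            if (val_len > 0 && val[val_len - 1] == '\r')
                val_len--;
            if (val_len > 0 && val_len < out_size) {
                memcpy(out, val, val_len);
                out[val_len] = '\0';
                return 1;
            }
            break;              /* key present but value unusable */
        }
        line = eol ? eol + 1 : NULL;
    }

    fprintf(stderr,
            "Warning: Couldn't parse ini file option %s; using default \"%s\"\n",
            key, dflt);
    snprintf(out, out_size, "%s", dflt);
    return 0;
}
```

Bounding every copy by `out_size` is the point of the exercise: a malformed SaveFolder, ResultsFile, or WorkFile line then degrades to the default instead of overrunning the buffer.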
|
|
|
|
|
|
#1424 | |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3·29·83 Posts |
Yeah, me too. Check out the comments for r19
Quote:
I wasn't intending to post executables to SF until it moves to at least Beta. I'm only about halfway through the changes (among other things, results file locking isn't implemented yet). If you want, feel free to post executables here, but I do warn everyone, it's still in Alpha. In particular, flash, you can compile that "test" version I mentioned: use /DTEST, as defined by the makefile rule "test". I was waiting to ask you to make it until I had written the Python I need locally to interpret the results, but you can make it now if you want.
|
|
|
|
|
|
|
#1425 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3·29·83 Posts |
Edit (I wish): Yes, it does compile fine.
In other news, I suddenly can't commit to svn/tags. Code:
bill@Gravemind:~/CUDALucas∰∂ svn commit --username=dubslow tags/v2.03-final/
svn: Commit failed (details follow):
svn: Server sent unexpected return value (403 Forbidden) in response to MKACTIVITY request for '/p/cudalucas/code/!svn/act/2c8fdab0-c8b5-425a-b904-ad76555a2b37'
svn: Your commit message was left in a temporary file:
svn: '/home/bill/CUDALucas/tags/svn-commit.tmp'
bill@Gravemind:~/CUDALucas∰∂ svn commit --username dubslow tags/v2.03-final/
svn: Commit failed (details follow):
svn: Server sent unexpected return value (403 Forbidden) in response to MKACTIVITY request for '/p/cudalucas/code/!svn/act/56dab32b-3b9e-4bfe-877f-0520584bae98'
svn: Your commit message was left in a temporary file:
svn: '/home/bill/CUDALucas/tags/svn-commit.2.tmp'
bill@Gravemind:~/CUDALucas∰∂
|
|
|
|
|
|
#1426 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
16065₈ Posts |
I got off my butt and coded some of them. A silly bug took 20 minutes of my time, but the results are pretty.
Code:
bill@Gravemind:~/CUDALucas/test∰∂ cat worktodo.txt
DoubleCheck=N/A,216091,24K,1,69
Test=12K,N/A,69,216091
Test=86243,4K
DoubleCheck=CA9CAECD26710FC828DFBBB8________,26458577,69,1
bill@Gravemind:~/CUDALucas/test∰∂ ./new.CUDALucas
Starting M216091 fft length = 24K
Iteration 10000 M( 216091 )C, 0x30247786758b8792, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1734 ms/iter, ETA 0:34)
Iteration 20000 M( 216091 )C, 0x13e968bf40fda4d7, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1752 ms/iter, ETA 0:33)
Iteration 30000 M( 216091 )C, 0x540772c2abb7833a, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1708 ms/iter, ETA 0:30)
Iteration 40000 M( 216091 )C, 0xc26da9695ac418c1, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:01 real, 0.1750 ms/iter, ETA 0:29)
Iteration 50000 M( 216091 )C, 0x95ce3ff44abdd1e5, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1723 ms/iter, ETA 0:27)
Iteration 60000 M( 216091 )C, 0x99aa87c495daffe7, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1713 ms/iter, ETA 0:25)
Iteration 70000 M( 216091 )C, 0x505d249be3145893, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1716 ms/iter, ETA 0:24)
Iteration 80000 M( 216091 )C, 0xddf612c72037b8a1, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:01 real, 0.1703 ms/iter, ETA 0:22)
Iteration 90000 M( 216091 )C, 0xb5d8309a1ce9e2b6, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1747 ms/iter, ETA 0:20)
Iteration 100000 M( 216091 )C, 0x4de7f101ee1cb7a5, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1702 ms/iter, ETA 0:18)
Iteration 110000 M( 216091 )C, 0x10aa3286c0b03369, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:01 real, 0.1694 ms/iter, ETA 0:16)
Iteration 120000 M( 216091 )C, 0x3981b56788b529e2, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1718 ms/iter, ETA 0:15)
Iteration 130000 M( 216091 )C, 0x80438af231f8fccd, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1734 ms/iter, ETA 0:13)
Iteration 140000 M( 216091 )C, 0x669382faea06df89, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1744 ms/iter, ETA 0:12)
Iteration 150000 M( 216091 )C, 0x1b73cb121df7d6fa, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:01 real, 0.1715 ms/iter, ETA 0:10)
Iteration 160000 M( 216091 )C, 0xb391010f29c70ee1, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1710 ms/iter, ETA 0:08)
Iteration 170000 M( 216091 )C, 0x04055d84a77be1d8, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1709 ms/iter, ETA 0:06)
Iteration 180000 M( 216091 )C, 0xe3d74c104f02967d, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:01 real, 0.1711 ms/iter, ETA 0:05)
Iteration 190000 M( 216091 )C, 0x54b2a8b9cb149f9f, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1713 ms/iter, ETA 0:03)
Iteration 200000 M( 216091 )C, 0xf433496947b7b103, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1708 ms/iter, ETA 0:01)
Iteration 210000 M( 216091 )C, 0xcfe091c8f59f8a7b, n = 24K, CUDALucas v2.04 Alpha err = 0.0000 (0:02 real, 0.1700 ms/iter, ETA 0:00)
M( 216091 )P, n = 24K, CUDALucas v2.04 Alpha.
Estimated total time: 0:38
Starting M216091 fft length = 12K
Iteration 10000 M( 216091 )C, 0x30247786758b8792, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1511 ms/iter, ETA 0:30)
Iteration 20000 M( 216091 )C, 0x13e968bf40fda4d7, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1476 ms/iter, ETA 0:28)
Iteration 30000 M( 216091 )C, 0x540772c2abb7833a, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1479 ms/iter, ETA 0:26)
Iteration 40000 M( 216091 )C, 0xc26da9695ac418c1, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1494 ms/iter, ETA 0:25)
Iteration 50000 M( 216091 )C, 0x95ce3ff44abdd1e5, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1464 ms/iter, ETA 0:23)
Iteration 60000 M( 216091 )C, 0x99aa87c495daffe7, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1490 ms/iter, ETA 0:22)
Iteration 70000 M( 216091 )C, 0x505d249be3145893, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1551 ms/iter, ETA 0:21)
Iteration 80000 M( 216091 )C, 0xddf612c72037b8a1, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1506 ms/iter, ETA 0:19)
Iteration 90000 M( 216091 )C, 0xb5d8309a1ce9e2b6, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1480 ms/iter, ETA 0:17)
Iteration 100000 M( 216091 )C, 0x4de7f101ee1cb7a5, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1499 ms/iter, ETA 0:16)
Iteration 110000 M( 216091 )C, 0x10aa3286c0b03369, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:01 real, 0.1463 ms/iter, ETA 0:14)
Iteration 120000 M( 216091 )C, 0x3981b56788b529e2, n = 12K, CUDALucas v2.04 Alpha err = 0.0045 (0:02 real, 0.1481 ms/iter, ETA 0:13)
Iteration 130000 M( 216091 )C, 0x80438af231f8fccd, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:01 real, 0.1458 ms/iter, ETA 0:11)
Iteration 140000 M( 216091 )C, 0x669382faea06df89, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:01 real, 0.1460 ms/iter, ETA 0:10)
Iteration 150000 M( 216091 )C, 0x1b73cb121df7d6fa, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:02 real, 0.1463 ms/iter, ETA 0:08)
Iteration 160000 M( 216091 )C, 0xb391010f29c70ee1, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:01 real, 0.1469 ms/iter, ETA 0:07)
Iteration 170000 M( 216091 )C, 0x04055d84a77be1d8, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:02 real, 0.1470 ms/iter, ETA 0:05)
Iteration 180000 M( 216091 )C, 0xe3d74c104f02967d, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:01 real, 0.1449 ms/iter, ETA 0:04)
Iteration 190000 M( 216091 )C, 0x54b2a8b9cb149f9f, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:02 real, 0.1469 ms/iter, ETA 0:02)
Iteration 200000 M( 216091 )C, 0xf433496947b7b103, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:01 real, 0.1482 ms/iter, ETA 0:01)
Iteration 210000 M( 216091 )C, 0xcfe091c8f59f8a7b, n = 12K, CUDALucas v2.04 Alpha err = 0.0046 (0:02 real, 0.1485 ms/iter, ETA 0:00)
M( 216091 )P, n = 12K, CUDALucas v2.04 Alpha.
Estimated total time: 0:32
Starting M86243 fft length = 4K
Iteration 10000 M( 86243 )C, 0x26d11035920b3773, n = 4K, CUDALucas v2.04 Alpha err = 0.2617 (0:01 real, 0.1160 ms/iter, ETA 0:08)
Iteration 20000 M( 86243 )C, 0x233a5255467a4c6e, n = 4K, CUDALucas v2.04 Alpha err = 0.2617 (0:01 real, 0.1137 ms/iter, ETA 0:06)
Iteration 30000 M( 86243 )C, 0x88e3195a12367bb8, n = 4K, CUDALucas v2.04 Alpha err = 0.2617 (0:01 real, 0.1141 ms/iter, ETA 0:05)
Iteration 40000 M( 86243 )C, 0x70b63ef639328851, n = 4K, CUDALucas v2.04 Alpha err = 0.2617 (0:01 real, 0.1153 ms/iter, ETA 0:04)
Iteration 50000 M( 86243 )C, 0x0ff1f54cfeeb4909, n = 4K, CUDALucas v2.04 Alpha err = 0.2617 (0:01 real, 0.1136 ms/iter, ETA 0:03)
Iteration 60000 M( 86243 )C, 0x25a4a96c66e7f897, n = 4K, CUDALucas v2.04 Alpha err = 0.2812 (0:02 real, 0.1183 ms/iter, ETA 0:02)
Iteration 70000 M( 86243 )C, 0xb639453c818baba2, n = 4K, CUDALucas v2.04 Alpha err = 0.2812 (0:01 real, 0.1139 ms/iter, ETA 0:01)
Iteration 80000 M( 86243 )C, 0xdd477c413184da18, n = 4K, CUDALucas v2.04 Alpha err = 0.2812 (0:01 real, 0.1089 ms/iter, ETA 0:00)
M( 86243 )C, 0x2de7056ebffee28b, n = 4K, CUDALucas v2.04 Alpha.
Estimated total time: 0:09
Starting M26458577 fft length = 1536K
^C SIGINT caught. Writing checkpoint.
bill@Gravemind:~/CUDALucas/test∰∂
PS: Would any code gurus be willing to examine parse_worktodo_line() starting from line 317 and check for any stupids?
Last fiddled with by Dubslow on 2012-06-12 at 02:01 |
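The worktodo.txt lines in this post put the exponent, the FFT length, and the other fields in different orders, so a tolerant parser has to classify each comma-separated token by its shape. The sketch below shows one way to do that. It is not the project's parse_worktodo_line(): `parse_work_line` and `struct work_item` are hypothetical, and the rule "the largest plain integer is the exponent" is an assumption chosen only because it separates the exponents from the small numbers (69, 1) in these sample lines. The K suffix is read as a multiple of 1024, which matches "fft length = 7168" being reported as "n = 7K" earlier in the thread.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this illustration. */
struct work_item {
    unsigned long exponent;    /* 0 if none was found */
    unsigned long fft_length;  /* 0 if the line did not give one */
};

/* True when the first n characters of s are all decimal digits. */
static int all_digits(const char *s, size_t n)
{
    if (n == 0)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (!isdigit((unsigned char)s[i]))
            return 0;
    return 1;
}

/* Classify each token after '=' by shape:
     digits followed by K  -> FFT length in units of 1024
     plain digits          -> candidate exponent; the largest wins (assumed rule)
     anything else         -> ignored (assignment ID, "N/A")
   Returns 1 when an exponent was found, 0 otherwise. */
int parse_work_line(const char *line, struct work_item *out)
{
    const char *p = strchr(line, '=');

    out->exponent = 0;
    out->fft_length = 0;
    if (p == NULL)
        return 0;
    p++;

    while (*p != '\0') {
        size_t len = strcspn(p, ",\r\n");

        if (len > 1 && len <= 8 && (p[len - 1] == 'K' || p[len - 1] == 'k')
            && all_digits(p, len - 1)) {
            out->fft_length = strtoul(p, NULL, 10) * 1024UL;
        } else if (len <= 9 && all_digits(p, len)) {
            unsigned long v = strtoul(p, NULL, 10);
            if (v > out->exponent)
                out->exponent = v;
        }

        p += len;
        if (*p == ',')
            p++;
        else
            break;
    }
    return out->exponent != 0;
}
```

The length caps keep an all-digit assignment ID from being mistaken for an exponent and keep `strtoul` well inside the range of `unsigned long` on every platform.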
|
|
|
|
|
#1427 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
1110000110101₂ Posts |
@msft: I've stumbled across this seemingly reproducible error in cufftbench().
Code:
bill@Gravemind:~/CUDALucas∰∂ CUDALucas -threads 128 -cufftbench 5881856 5914624 64
CUFFT bench start = 5881856 end = 5914624 distance = 64
CUFFT_Z2Z size= 5881856 time= 986.398254 msec
CUDALucas.cu(1066) : cufftSafeCall() CUFFT error 2: CUFFT_ALLOC_FAILED CUFFT_INVALID_TYPE CUFFT_INVALID_VALUE CUFFT_INTERNAL_ERROR CUFFT_EXEC_FAILED CUFFT_SETUP_FAILED CUFFT_INVALID_SIZE CUFFT_UNALIGNED_DATA CUFFT Unknown error code
bill@Gravemind:~/CUDALucas∰∂ CUDALucas -cufftbench 5881856 5914624 64
CUFFT bench start = 5881856 end = 5914624 distance = 64
CUFFT_Z2Z size= 5881856 time= 986.098572 msec
CUDALucas.cu(1066) : cufftSafeCall() CUFFT error 2: CUFFT_ALLOC_FAILED CUFFT_INVALID_TYPE CUFFT_INVALID_VALUE CUFFT_INTERNAL_ERROR CUFFT_EXEC_FAILED CUFFT_SETUP_FAILED CUFFT_INVALID_SIZE CUFFT_UNALIGNED_DATA CUFFT Unknown error code
bill@Gravemind:~/CUDALucas∰∂
Also, as I previously reported, cufftbench() still uses 1-2 full cores. Is that a bug or the nature of the function? |
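A side note on the error text itself: error 2 is followed by every later status name and then "Unknown error code", which is what a `switch` with missing `break` statements would print. That is only a guess from the output, since the reporting code is not shown in the thread. A sketch of a lookup that returns exactly one name is below. `cufft_error_name` is a hypothetical helper, and the integer values are assumed to match the `cufftResult` enum in cufft.h; they are written as plain numbers so the example compiles without the CUDA toolkit.

```c
/* Map a cuFFT status code to a single name.
   Values are assumed to mirror the cufftResult enum in cufft.h.
   cufft_error_name is a hypothetical helper, not the CUDALucas function. */
const char *cufft_error_name(int code)
{
    switch (code) {
    case 0:  return "CUFFT_SUCCESS";
    case 1:  return "CUFFT_INVALID_PLAN";
    case 2:  return "CUFFT_ALLOC_FAILED";
    case 3:  return "CUFFT_INVALID_TYPE";
    case 4:  return "CUFFT_INVALID_VALUE";
    case 5:  return "CUFFT_INTERNAL_ERROR";
    case 6:  return "CUFFT_EXEC_FAILED";
    case 7:  return "CUFFT_SETUP_FAILED";
    case 8:  return "CUFFT_INVALID_SIZE";
    case 9:  return "CUFFT_UNALIGNED_DATA";
    default: return "CUFFT Unknown error code";
    }
}
```

Because each case returns, no fall-through is possible, and the message for error 2 would read just CUFFT_ALLOC_FAILED.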
|
|
|
|
|
#1428 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts |
Code:
bill@Gravemind:~/CUDALucas∰∂ CUDALucas -cufftbench $((256*128)) $((65535*128)) $((256*128))
CUFFT bench start = 32768 end = 8388480 distance = 32768
<good output snipped>
CUFFT_Z2Z size= 6815744 time= 17.126163 msec
CUFFT_Z2Z size= 6848512 time= 21.510880 msec
CUFFT_Z2Z size= 6881280 time= 13.638905 msec
CUFFT_Z2Z size= 6914048 time= 699.387634 msec
CUFFT_Z2Z size= 6946816 time= 22.775032 msec
CUFFT_Z2Z size= 6979584 time= 30.465769 msec
CUFFT_Z2Z size= 7012352 time= 37.825619 msec
CUFFT_Z2Z size= 7045120 time= 20.284300 msec
CUFFT_Z2Z size= 7077888 time= 12.884492 msec
CUFFT_Z2Z size= 7110656 time= 18.780321 msec
CUFFT_Z2Z size= 7143424 time= 39.204491 msec
CUFFT_Z2Z size= 7176192 time= 31.473606 msec
CUFFT_Z2Z size= 7208960 time= 18.138344 msec
CUFFT_Z2Z size= 7241728 time= 23.035593 msec
CUFFT_Z2Z size= 7274496 time= 22.267868 msec
CUDALucas.cu(1066) : cufftSafeCall() CUFFT error 2: CUFFT_ALLOC_FAILED CUFFT_INVALID_TYPE CUFFT_INVALID_VALUE CUFFT_INTERNAL_ERROR CUFFT_EXEC_FAILED CUFFT_SETUP_FAILED CUFFT_INVALID_SIZE CUFFT_UNALIGNED_DATA CUFFT Unknown error code
Last fiddled with by Dubslow on 2012-06-18 at 05:06 |
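One pattern worth checking in these timings is the prime factorization of the transform length. cufftbench() plans a transform of `j / 2` points for each reported size `j`, and cuFFT documents its fastest code paths for lengths of the form 2^a·3^b·5^c·7^d. The slow 6914048 entry corresponds to 3457024 = 2^14 · 211 points, the 5881856 entry from the previous post to 2940928 = 2^13 · 359, while the fast 6881280 entry is 2^14 · 210 with 210 = 2·3·5·7. Whether the large prime factor is also behind CUFFT_ALLOC_FAILED is not established in the thread; extra scratch memory for such lengths is only a plausible reading. A small helper to inspect a length, with the hypothetical name `largest_prime_factor`:

```c
/* Largest prime factor of n by trial division; returns 1 for n < 2.
   Hypothetical helper for inspecting FFT lengths, not CUDALucas code. */
unsigned long largest_prime_factor(unsigned long n)
{
    unsigned long largest = 1;

    /* p <= n / p avoids overflow in p * p */
    for (unsigned long p = 2; p <= n / p; p++) {
        while (n % p == 0) {
            largest = p;
            n /= p;
        }
    }
    if (n > 1)          /* whatever remains is itself prime */
        largest = n;
    return largest;
}
```

Running it over `j / 2` for each benchmark size makes it easy to see which lengths fall outside the 2, 3, 5, 7 family before timing them.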
|
|
|
|
|
#1429 |
|
Jul 2009
Tokyo
2×5×61 Posts |
Hi, Dubslow
I believe you can read source code. |
|
|
|
|
|
#1430 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts |
Yes I can, but I don't have the first clue about CUDA in general or cufft in particular. Just in case, I did look through it and I see the line that's causing the issue, but I have no idea what's wrong or how to fix it.
Code:
void
cufftbench (int cufftbench_s, int cufftbench_e, int cufftbench_d)
{
  cudaEvent_t start, stop;
  double *x;
  float outerTime;
  int i, j;

  printf ("CUFFT bench start = %d end = %d distance = %d\n", cufftbench_s,
          cufftbench_e, cufftbench_d);

  /* Device buffer sized for the largest transform in the sweep. */
  cutilSafeCall (cudaMalloc ((void **) &g_x, sizeof (double) * cufftbench_e));

  /* Zero-filled host buffer.  As posted, the code allocated
     sizeof (double) * cufftbench_e + 1 bytes but wrote cufftbench_e + 1
     doubles, one element past the end; the size and loop bound now agree. */
  x = (double *) malloc (sizeof (double) * cufftbench_e);
  for (i = 0; i < cufftbench_e; i++)
    x[i] = 0;
  cutilSafeCall (cudaMemcpy
                 (g_x, x, sizeof (double) * cufftbench_e,
                  cudaMemcpyHostToDevice));

  cutilSafeCall (cudaEventCreate (&start));
  cutilSafeCall (cudaEventCreate (&stop));

  for (j = cufftbench_s; j <= cufftbench_e; j += cufftbench_d)
    {
      /* j doubles are treated as j / 2 complex values. */
      cufftSafeCall (cufftPlan1d (&plan, j / 2, CUFFT_Z2Z, 1));

      /* One untimed warm-up transform. */
      cufftSafeCall (cufftExecZ2Z
                     (plan, (cufftDoubleComplex *) g_x,
                      (cufftDoubleComplex *) g_x, CUFFT_INVERSE));

      /* Time 100 in-place transforms and report the average. */
      cutilSafeCall (cudaEventRecord (start, 0));
      for (i = 0; i < 100; i++)
        cufftSafeCall (cufftExecZ2Z
                       (plan, (cufftDoubleComplex *) g_x,
                        (cufftDoubleComplex *) g_x, CUFFT_INVERSE));
      cutilSafeCall (cudaEventRecord (stop, 0));
      cutilSafeCall (cudaEventSynchronize (stop));
      cutilSafeCall (cudaEventElapsedTime (&outerTime, start, stop));
      printf ("CUFFT_Z2Z size= %d time= %f msec\n", j, outerTime / 100);

      cufftSafeCall (cufftDestroy (plan));
    }

  cutilSafeCall (cudaFree ((char *) g_x));
  cutilSafeCall (cudaEventDestroy (start));
  cutilSafeCall (cudaEventDestroy (stop));
  free ((char *) x);
}
Last fiddled with by Dubslow on 2012-06-19 at 05:08 |
|
|
|