#1431 |
|
Jun 2003
2·3·7·11² Posts |
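There is a bug in cuda_safecalls.h: the cases in the switch have no break statements, so execution falls through and prints every message after the matching one.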
Code:
#ifdef _CUFFT_H_
inline void __cufftSafeCall( cufftResult err, const char *file, const int line ) {
    if( CUFFT_SUCCESS != err) {
        fprintf(stderr, "%s(%i) : cufftSafeCall() CUFFT error %d: ",
                file, line, (int)err);
        switch (err) {
            case CUFFT_INVALID_PLAN: fprintf(stderr, "CUFFT_INVALID_PLAN\n");
            case CUFFT_ALLOC_FAILED: fprintf(stderr, "CUFFT_ALLOC_FAILED\n");
            case CUFFT_INVALID_TYPE: fprintf(stderr, "CUFFT_INVALID_TYPE\n");
            case CUFFT_INVALID_VALUE: fprintf(stderr, "CUFFT_INVALID_VALUE\n");
            case CUFFT_INTERNAL_ERROR: fprintf(stderr, "CUFFT_INTERNAL_ERROR\n");
            case CUFFT_EXEC_FAILED: fprintf(stderr, "CUFFT_EXEC_FAILED\n");
            case CUFFT_SETUP_FAILED: fprintf(stderr, "CUFFT_SETUP_FAILED\n");
            case CUFFT_INVALID_SIZE: fprintf(stderr, "CUFFT_INVALID_SIZE\n");
            case CUFFT_UNALIGNED_DATA: fprintf(stderr, "CUFFT_UNALIGNED_DATA\n");
            default: fprintf(stderr, "CUFFT Unknown error code\n");
        }
        exit(-1);
    }
}
#endif
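For comparison, here is a minimal sketch of the same switch with a break after every case. To keep it buildable without the CUDA toolkit it uses a stand-in enum; the `demo_result` type, the `DEMO_*` names and `demo_report` are hypothetical and only illustrate the pattern, they are not taken from cuda_safecalls.h.

```c
#include <stdio.h>

/* Stand-in for cufftResult so the sketch compiles without cufft.h.
   The DEMO_* names are hypothetical; real code would switch on the
   CUFFT_* values instead. */
typedef enum {
    DEMO_SUCCESS,
    DEMO_INVALID_PLAN,
    DEMO_ALLOC_FAILED,
    DEMO_INVALID_TYPE,
    DEMO_SOMETHING_ELSE
} demo_result;

/* Print the name of err to out and return how many messages were printed.
   Each case ends in break, so exactly one message is printed per call. */
int demo_report(FILE *out, demo_result err) {
    int printed = 0;
    switch (err) {
        case DEMO_INVALID_PLAN:
            fprintf(out, "CUFFT_INVALID_PLAN\n");
            printed++;
            break;
        case DEMO_ALLOC_FAILED:
            fprintf(out, "CUFFT_ALLOC_FAILED\n");
            printed++;
            break;
        case DEMO_INVALID_TYPE:
            fprintf(out, "CUFFT_INVALID_TYPE\n");
            printed++;
            break;
        default:
            fprintf(out, "CUFFT Unknown error code\n");
            printed++;
            break;
    }
    return printed;
}
```

Calling `demo_report(stderr, DEMO_INVALID_PLAN)` prints one line. Without the breaks, the same call would print all four messages.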
Last fiddled with by axn on 2012-06-19 at 05:42 Reason: CUFFT_ALLOC_FAILED |
#1432 |
|
Romulan Interpreter
Jun 2011
Thailand
25B5₁₆ Posts |
Some more "todo" items for you:
- Log files. With the old method we were forced to use batch files, and those had the advantage that the last line in the batch was usually a "pause" line, so when CL crashed we could still see the error. With the new method with worktodo files we no longer need the batch file, but when there is a crash we can't see the error (the window is closed). It would be nice to have a log file whose name, and possibly verbosity level, could be set from the ini file.
- Binaries for the beta. If you want it tested by third parties, it would be nice to supply all the testing tools for it and not force the guy to crawl through the build process.
OTOH, I am now doing a first-time LL with two cards in parallel (to be sure of the residue). I am at 21M iterations out of 46M and everything matches so far.
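The log-file request above could be sketched roughly as follows. This is only an illustration under assumptions: the `LOG_*` levels and the `log_line` function are hypothetical, and the idea that the file handle and maximum level come from ini settings is the request itself, not an existing CUDALucas feature.

```c
#include <stdio.h>

/* Hypothetical verbosity levels; the thread does not define any. */
enum { LOG_ERROR = 0, LOG_INFO = 1, LOG_DEBUG = 2 };

/* Print msg to stderr and, if its level is within max_level, append it to
   logfile as well. Returns 1 when the line reached the log file, else 0.
   logfile and max_level are assumed to be set up from the ini file. */
int log_line(FILE *logfile, int max_level, int level, const char *msg) {
    fprintf(stderr, "%s\n", msg);
    if (logfile == NULL || level > max_level)
        return 0;
    if (fprintf(logfile, "%s\n", msg) < 0)
        return 0;
    /* Flush right away so the line is on disk even if the program
       crashes and the console window closes. */
    return fflush(logfile) == 0;
}
```

With `max_level` set to `LOG_INFO`, errors and informational lines go to the file while debug chatter stays on the console only.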
#1433 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
7221₁₀ Posts |
Sigh... it had been on my personal todo list to clean up a lot of the functions, which would be necessary to produce logging functionality... it'd probably take me a couple of days.
If you've been following the SourceForge, you should be aware I've been having a lot of issues with it. With one of the suggested fixes, I appear to be able to interact with Subversion just fine, but SourceForge's view of the repository does not display correctly.
I have only to implement the results file locking functionality, which as you might have seen from mfakto's thread should be as simple as copy and paste. Once that is done and I am satisfied I'm not going to make any more minor/cosmetic changes to the code, I will post the code here for compiling as well as commit it to SVN. (It should be sometime in the next 12 hours.)
(PS flash, do not use https to commit to SourceForge; there's a small chance that it could corrupt the repository.)
Last fiddled with by Dubslow on 2012-06-21 at 15:42
#1434 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts |
Well, here it is. svn/trunk has been updated to r32, which of course isn't showing up properly in SourceForge per the link in the above post. As such, the 2.04 Beta code is attached here for flash. It should be bug-free, but of course finding out is the whole point of a Beta.
At a minimum, README and CUDALucas.ini should be shipped with any executables; the rest of the files are optional. (Of course, none of us really need the README, but hey.)
____________________________________________________________
Changes from 2.03 are pretty much all new features:
The "-i" option from 2.03 (print device info) has been moved to "-info", and in its place we have the same "-i <ini file name>" from mfakto. This means you can run two instances from the same directory, although they each need different work files.
It is safe to have two (or more) different instances writing to the same work file, thanks to Bdot's file locking code. It is NOT safe to have two different instances testing the same exponent. LaurV, if you want that ability, how should the checkpoint files be named? (Perhaps "cxxxxxxxx-<ini name>" or "cxxxxxxxx.<device number>"? Such renaming would only happen if the -i option was used.)
The "total time" estimate has been added as LaurV requested. The estimated total time is printed whenever you pause a test, and when it finishes. However 2.04 can still read 2.03's save files, will recognize them as such, and won't print any messages in that case.
All FFT lengths are now printed as a multiple of 1024, e.g. "n = 1440K" instead of "n = 1474560".
The FFT length selection code has been spiffed up, though it isn't near the ability of Prime95's jump tables. To help mitigate that, the average error of the first 1000 iterations of every test is calculated, the same as with Prime95's "soft crossover" points.
You can also now specify the FFT length for an individual assignment via the work file. To do that, add a field to the "Test=..." assignment line in the work file. To use (e.g.) a 1440K length for a test, the line should look like
Code:
Test=<assignment key>,<exponent>,1440K
The work file line parser now works apositionally, so there's no fuss about "Test=N/A,26204951" like in 2.03 with the comma counts and whatnot. It's smart enough to recognize an AID or exponent (or FFT length).
The results file default is now "results.txt" (not "result.txt") due to a minor case of OCD on my part.
The AID (if any, not including "N/A") is now printed in the results file. (I plan to add V5UserID and ComputerID much like mfakto in 2.05, in preparation for when Christenson automates mfaktc.)
The "err = " line now prints the maximum error since the last checkpoint, not the maximum error since the last (re)start.
----------------------------------------------------------------------------
What I'm looking for mostly is any cosmetic changes anybody wants, or things to be specifiable via command line/ini file. Examples:
Should the estimated time be presented differently?
Should I have an option to not print the info from the initial round off test?
Should ResultsFile be specifiable via command line?
Should the err printed be specifiable via ini or cmd line?
...or anything else you want changed/added. Please, don't hesitate.
(PS I still have no idea what might cause cufftbench to crash. I didn't change that except for the 'break's that axn pointed out were missing.)
Last fiddled with by Dubslow on 2012-06-21 at 23:34 Reason: PS
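To illustrate what "apositional" parsing can look like, here is a small sketch that classifies a single comma-separated field by its shape alone. It is not the 2.04 parser: `field_kind`, `classify_field` and `fft_length_from_field` are hypothetical names, and the rules (all digits is a number, digits followed by K is an FFT length, all hex digits is a key) are assumptions chosen for the example.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical field categories for one work file field. */
typedef enum {
    FIELD_NA,       /* the literal "N/A" */
    FIELD_FFT,      /* digits followed by K, e.g. "1440K" */
    FIELD_NUMBER,   /* digits only, e.g. an exponent */
    FIELD_KEY,      /* hex digits only, e.g. an assignment key */
    FIELD_UNKNOWN
} field_kind;

/* Classify a field by its shape, independent of its position in the line. */
field_kind classify_field(const char *s) {
    size_t n = strlen(s), digits = 0, hex = 0;
    if (n == 0)
        return FIELD_UNKNOWN;
    if (strcmp(s, "N/A") == 0)
        return FIELD_NA;
    for (size_t i = 0; i < n; i++) {
        if (isdigit((unsigned char)s[i])) digits++;
        if (isxdigit((unsigned char)s[i])) hex++;
    }
    if (digits == n)
        return FIELD_NUMBER;
    if (n > 1 && digits == n - 1 && (s[n - 1] == 'K' || s[n - 1] == 'k'))
        return FIELD_FFT;
    if (hex == n)
        return FIELD_KEY;
    return FIELD_UNKNOWN;
}

/* Convert an FFT field such as "1440K" to 1440 * 1024 = 1474560.
   Returns 0 if the field is not an FFT length. */
long fft_length_from_field(const char *s) {
    if (classify_field(s) != FIELD_FFT)
        return 0;
    return strtol(s, NULL, 10) * 1024L;
}
```

A real parser needs more than this. For instance, it must decide which of several plain numbers on a line is the exponent, and a key that happens to contain only decimal digits would be classified as a number here.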
#1435 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
1110000110101₂ Posts |
I tried to edit it into the previous post, but it was too long.
Code:
bill@Gravemind:~/CUDALucas/test/test∰∂ cat worktodo.txt
DoubleCheck=N/A,216091,24K,1,69
Test=12K,N/A,69,216091
Test=86243,4K,CA9CAECD26710FC828DFBBB8
DoubleCheck=CA9CAECD26710FC828DFBBB8________,26458577,69,1
bill@Gravemind:~/CUDALucas/test/test∰∂ ./CUDALucas

Starting M216091 fft length = 24K
Running careful round off test for 1000 iterations. If average error >= 0.25, the test will restart with a larger FFT length.
Iteration 100, average error = 0.00000, max error = 0.00000
Iteration 200, average error = 0.00000, max error = 0.00000
Iteration 300, average error = 0.00000, max error = 0.00000
Iteration 400, average error = 0.00000, max error = 0.00000
Iteration 500, average error = 0.00000, max error = 0.00000
Iteration 600, average error = 0.00000, max error = 0.00000
Iteration 700, average error = 0.00000, max error = 0.00000
Iteration 800, average error = 0.00000, max error = 0.00000
Iteration 900, average error = 0.00000, max error = 0.00000
Iteration 1000, average error = 0.00000 < 0.25 (max error = 0.00000), continuing test.
Iteration 20000 M( 216091 )C, 0x13e968bf40fda4d7, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:03 real, 0.1236 ms/iter, ETA 0:22)
Iteration 40000 M( 216091 )C, 0xc26da9695ac418c1, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:02 real, 0.1172 ms/iter, ETA 0:18)
Iteration 60000 M( 216091 )C, 0x99aa87c495daffe7, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:02 real, 0.1173 ms/iter, ETA 0:16)
Iteration 80000 M( 216091 )C, 0xddf612c72037b8a1, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:03 real, 0.1175 ms/iter, ETA 0:14)
Iteration 100000 M( 216091 )C, 0x4de7f101ee1cb7a5, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:02 real, 0.1172 ms/iter, ETA 0:11)
Iteration 120000 M( 216091 )C, 0x3981b56788b529e2, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:02 real, 0.1172 ms/iter, ETA 0:09)
Iteration 140000 M( 216091 )C, 0x669382faea06df89, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:03 real, 0.1178 ms/iter, ETA 0:07)
Iteration 160000 M( 216091 )C, 0xb391010f29c70ee1, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:02 real, 0.1177 ms/iter, ETA 0:04)
Iteration 180000 M( 216091 )C, 0xe3d74c104f02967d, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:02 real, 0.1172 ms/iter, ETA 0:02)
Iteration 200000 M( 216091 )C, 0xf433496947b7b103, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:03 real, 0.1179 ms/iter, ETA 0:00)
M( 216091 )P, n = 24K, CUDALucas v2.04 Beta, estimated total time = 0:26

Starting M216091 fft length = 12K
Running careful round off test for 1000 iterations. If average error >= 0.25, the test will restart with a larger FFT length.
Iteration 100, average error = 0.00310, max error = 0.00415
Iteration 200, average error = 0.00341, max error = 0.00427
Iteration 300, average error = 0.00349, max error = 0.00415
Iteration 400, average error = 0.00351, max error = 0.00415
Iteration 500, average error = 0.00355, max error = 0.00430
Iteration 600, average error = 0.00355, max error = 0.00415
Iteration 700, average error = 0.00356, max error = 0.00421
Iteration 800, average error = 0.00358, max error = 0.00439
Iteration 900, average error = 0.00361, max error = 0.00452
Iteration 1000, average error = 0.00361 < 0.25 (max error = 0.00427), continuing test.
Iteration 20000 M( 216091 )C, 0x13e968bf40fda4d7, n = 12K, CUDALucas v2.04 Beta err = 0.0042 (0:02 real, 0.0967 ms/iter, ETA 0:17)
Iteration 40000 M( 216091 )C, 0xc26da9695ac418c1, n = 12K, CUDALucas v2.04 Beta err = 0.0042 (0:02 real, 0.0927 ms/iter, ETA 0:14)
Iteration 60000 M( 216091 )C, 0x99aa87c495daffe7, n = 12K, CUDALucas v2.04 Beta err = 0.0042 (0:01 real, 0.0930 ms/iter, ETA 0:13)
Iteration 80000 M( 216091 )C, 0xddf612c72037b8a1, n = 12K, CUDALucas v2.04 Beta err = 0.0044 (0:02 real, 0.0928 ms/iter, ETA 0:11)
Iteration 100000 M( 216091 )C, 0x4de7f101ee1cb7a5, n = 12K, CUDALucas v2.04 Beta err = 0.0044 (0:02 real, 0.0929 ms/iter, ETA 0:09)
Iteration 120000 M( 216091 )C, 0x3981b56788b529e2, n = 12K, CUDALucas v2.04 Beta err = 0.0045 (0:02 real, 0.0930 ms/iter, ETA 0:07)
Iteration 140000 M( 216091 )C, 0x669382faea06df89, n = 12K, CUDALucas v2.04 Beta err = 0.0046 (0:02 real, 0.0931 ms/iter, ETA 0:05)
Iteration 160000 M( 216091 )C, 0xb391010f29c70ee1, n = 12K, CUDALucas v2.04 Beta err = 0.0042 (0:02 real, 0.0928 ms/iter, ETA 0:03)
Iteration 180000 M( 216091 )C, 0xe3d74c104f02967d, n = 12K, CUDALucas v2.04 Beta err = 0.0044 (0:02 real, 0.0928 ms/iter, ETA 0:01)
Iteration 200000 M( 216091 )C, 0xf433496947b7b103, n = 12K, CUDALucas v2.04 Beta err = 0.0041 (0:01 real, 0.0928 ms/iter, ETA 0:00)
M( 216091 )P, n = 12K, CUDALucas v2.04 Beta, estimated total time = 0:20

Starting M86243 fft length = 4K
Running careful round off test for 1000 iterations. If average error >= 0.25, the test will restart with a larger FFT length.
Iteration 100, average error = 0.16577, max error = 0.20703
Iteration 200, average error = 0.18153, max error = 0.25000
Iteration 300, average error = 0.18649, max error = 0.25586
Iteration 400, average error = 0.18799, max error = 0.26172
Iteration 500, average error = 0.19219, max error = 0.25000
Iteration 600, average error = 0.19283, max error = 0.25000
Iteration 700, average error = 0.19504, max error = 0.25000
Iteration 800, average error = 0.19555, max error = 0.22656
Iteration 900, average error = 0.19565, max error = 0.21973
Iteration 1000, average error = 0.19539 < 0.25 (max error = 0.23438), continuing test.
Iteration 20000 M( 86243 )C, 0x233a5255467a4c6e, n = 4K, CUDALucas v2.04 Beta err = 0.2188 (0:01 real, 0.0651 ms/iter, ETA 0:03)
Iteration 40000 M( 86243 )C, 0x70b63ef639328851, n = 4K, CUDALucas v2.04 Beta err = 0.2344 (0:01 real, 0.0616 ms/iter, ETA 0:02)
Iteration 60000 M( 86243 )C, 0x25a4a96c66e7f897, n = 4K, CUDALucas v2.04 Beta err = 0.2812 (0:02 real, 0.0617 ms/iter, ETA 0:01)
Iteration 80000 M( 86243 )C, 0xdd477c413184da18, n = 4K, CUDALucas v2.04 Beta err = 0.2219 (0:01 real, 0.0618 ms/iter, ETA 0:00)
M( 86243 )C, 0x2de7056ebffee28b, n = 4K, CUDALucas v2.04 Beta, estimated total time = 0:05

Starting M26458577 fft length = 1440K
Running careful round off test for 1000 iterations. If average error >= 0.25, the test will restart with a larger FFT length.
Iteration 100, average error = 0.08321, max error = 0.11523
Iteration 200, average error = 0.09496, max error = 0.11719
Iteration 300, average error = 0.09992, max error = 0.12500
Iteration 400, average error = 0.10188, max error = 0.12500
Iteration 500, average error = 0.10267, max error = 0.11792
Iteration 600, average error = 0.10390, max error = 0.12109
Iteration 700, average error = 0.10444, max error = 0.11914
Iteration 800, average error = 0.10462, max error = 0.11230
Iteration 900, average error = 0.10468, max error = 0.11328
Iteration 1000, average error = 0.10509 < 0.25 (max error = 0.12305), continuing test.
Iteration 20000 M( 26458577 )C, 0xc0983d54298faf9e, n = 1440K, CUDALucas v2.04 Beta err = 0.1211 (1:50 real, 5.4642 ms/iter, ETA 40:06:03)
Iteration 40000 M( 26458577 )C, 0xc1989436f611c782, n = 1440K, CUDALucas v2.04 Beta err = 0.1172 (1:48 real, 5.4034 ms/iter, ETA 39:37:28)
^C SIGINT caught, writing checkpoint.
Estimated time spent so far: 3:53
bill@Gravemind:~/CUDALucas/test/test∰∂ ./CUDALucas

Continuing work from a partial result of M26458577 fft length = 1440K iteration = 42802
Iteration 60000 M( 26458577 )C, 0xdcc248d33c956284, n = 1440K, CUDALucas v2.04 Beta err = 0.1230 (1:33 real, 4.6479 ms/iter, ETA 34:03:31)
Iteration 80000 M( 26458577 )C, 0x281ebeaddc336d4b, n = 1440K, CUDALucas v2.04 Beta err = 0.1211 (1:48 real, 5.4052 ms/iter, ETA 39:34:42)
^C SIGINT caught, writing checkpoint.
Estimated time spent so far: 7:16
bill@Gravemind:~/CUDALucas/test/test∰∂ cat results.txt
M( 216091 )P, n = 24K, CUDALucas v2.04 Beta
M( 216091 )P, n = 12K, CUDALucas v2.04 Beta
M( 86243 )C, 0x2de7056ebffee28b, n = 4K, CUDALucas v2.04 Beta, AID: CA9CAECD26710FC828DFBBB8
bill@Gravemind:~/CUDALucas/test/test∰∂
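The start-up check in the log above compares the average round off error with 0.25; the 4K run continues even though its maximum error reached 0.26172, which suggests only the average decides. A minimal sketch of that bookkeeping follows. The `roundoff_stats` type and the function names are hypothetical, and the actual CUDALucas code may track these values differently.

```c
#include <stddef.h>

/* Running statistics for the start-up round off check. */
typedef struct {
    double sum;    /* sum of the per-iteration errors */
    double max;    /* largest error seen so far */
    size_t count;  /* number of iterations recorded */
} roundoff_stats;

/* Record the round off error of one iteration. */
void roundoff_add(roundoff_stats *st, double err) {
    st->sum += err;
    if (err > st->max)
        st->max = err;
    st->count++;
}

/* Average error so far; 0.0 if nothing has been recorded. */
double roundoff_average(const roundoff_stats *st) {
    return st->count ? st->sum / (double)st->count : 0.0;
}

/* Returns 1 if the average has reached the threshold, meaning the test
   should restart with a larger FFT length, else 0. */
int roundoff_needs_larger_fft(const roundoff_stats *st, double threshold) {
    return roundoff_average(st) >= threshold;
}
```

A caller would zero-initialise a `roundoff_stats`, call `roundoff_add` once per iteration for the first 1000 iterations, and then test `roundoff_needs_larger_fft(&st, 0.25)`.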
#1436 |
|
Romulan Interpreter
Jun 2011
Thailand
22665₈ Posts |
Very good, man! Msft would be proud :D
Now seriously, you are putting a lot of effort into it. Do you never sleep? |
#1439 | |
|
Jun 2005
3×43 Posts |
Quote:
C is on life support in Visual Studio. Any reason not to build the code as C++? |
#1440 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
7221₁₀ Posts |
#1441 | |
|
"Jerry"
Nov 2011
Vancouver, WA
10001100011₂ Posts |
Quote:
@Dubslow: As soon as you fix my 'mistake' on SourceForge, I'll get the code uploaded and see if I can get the 2.04 Beta binaries there as well. If not, I'll post here until I can get that fixed. |