mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

axn 2012-06-19 05:42

There is a bug in cuda_safecalls.h
[CODE]#ifdef _CUFFT_H_
inline void __cufftSafeCall( cufftResult err, const char *file, const int line ) {
    if( CUFFT_SUCCESS != err) {
        fprintf(stderr, "%s(%i) : cufftSafeCall() CUFFT error %d: ",
                file, line, (int)err);
        switch (err) {
            case CUFFT_INVALID_PLAN: fprintf(stderr, "CUFFT_INVALID_PLAN\n");
            case CUFFT_ALLOC_FAILED: fprintf(stderr, "CUFFT_ALLOC_FAILED\n");
            case CUFFT_INVALID_TYPE: fprintf(stderr, "CUFFT_INVALID_TYPE\n");
            case CUFFT_INVALID_VALUE: fprintf(stderr, "CUFFT_INVALID_VALUE\n");
            case CUFFT_INTERNAL_ERROR: fprintf(stderr, "CUFFT_INTERNAL_ERROR\n");
            case CUFFT_EXEC_FAILED: fprintf(stderr, "CUFFT_EXEC_FAILED\n");
            case CUFFT_SETUP_FAILED: fprintf(stderr, "CUFFT_SETUP_FAILED\n");
            case CUFFT_INVALID_SIZE: fprintf(stderr, "CUFFT_INVALID_SIZE\n");
            case CUFFT_UNALIGNED_DATA: fprintf(stderr, "CUFFT_UNALIGNED_DATA\n");
            default: fprintf(stderr, "CUFFT Unknown error code\n");
        }
        exit(-1);
    }
}
#endif
[/CODE]
A "break" is missing after every case statement, so execution falls through to all the cases below the matching one. That is why the error dumped all the codes, starting from CUFFT_ALLOC_FAILED.

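For reference, a minimal sketch of what the corrected function could look like. The enum below is only a stand-in for cufftResult so that the sketch builds without the CUDA toolkit, and cufft_error_name() and cufft_safe_call() are hypothetical names, not the ones used in cuda_safecalls.h.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the cufftResult type from <cufft.h> (assumption: used here
   only so the sketch compiles on its own; drop it when using the real header). */
typedef enum {
    CUFFT_SUCCESS,
    CUFFT_INVALID_PLAN,
    CUFFT_ALLOC_FAILED,
    CUFFT_INVALID_TYPE,
    CUFFT_INVALID_VALUE,
    CUFFT_INTERNAL_ERROR,
    CUFFT_EXEC_FAILED,
    CUFFT_SETUP_FAILED,
    CUFFT_INVALID_SIZE,
    CUFFT_UNALIGNED_DATA
} cufftResult;

/* Hypothetical helper: map an error code to its name.
   Every case ends in "break", so only the matching name is selected. */
const char *cufft_error_name(cufftResult err)
{
    const char *name;

    switch (err) {
    case CUFFT_INVALID_PLAN:   name = "CUFFT_INVALID_PLAN";       break;
    case CUFFT_ALLOC_FAILED:   name = "CUFFT_ALLOC_FAILED";       break;
    case CUFFT_INVALID_TYPE:   name = "CUFFT_INVALID_TYPE";       break;
    case CUFFT_INVALID_VALUE:  name = "CUFFT_INVALID_VALUE";      break;
    case CUFFT_INTERNAL_ERROR: name = "CUFFT_INTERNAL_ERROR";     break;
    case CUFFT_EXEC_FAILED:    name = "CUFFT_EXEC_FAILED";        break;
    case CUFFT_SETUP_FAILED:   name = "CUFFT_SETUP_FAILED";       break;
    case CUFFT_INVALID_SIZE:   name = "CUFFT_INVALID_SIZE";       break;
    case CUFFT_UNALIGNED_DATA: name = "CUFFT_UNALIGNED_DATA";     break;
    default:                   name = "CUFFT Unknown error code"; break;
    }
    return name;
}

/* Print one line naming the error, then exit, as the original intends. */
void cufft_safe_call(cufftResult err, const char *file, int line)
{
    if (err != CUFFT_SUCCESS) {
        fprintf(stderr, "%s(%i) : cufftSafeCall() CUFFT error %d: %s\n",
                file, line, (int)err, cufft_error_name(err));
        exit(-1);
    }
}
```

Adding a "break;" to the end of each case in the original switch has the same effect; splitting the name lookup out just makes it easy to check on its own.
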
LaurV 2012-06-21 13:09

Some more "todo" :razz: for you:
- log files. With the old method we were forced to use batch files, and those had the advantage that the last line in the batch was usually a "pause" line. So, when CL crashed, we could still see the error. With the new method with worktodo files, we do not need the batch anymore, but when there is a crash, we can't see the error (the window is closed). It would be nice to have a log file, whose name and possibly verbosity level could be set from the ini file.
- binaries for the beta. If you want it tested by third parties, it would be nice if you supplied all the testing tools for it and did not force the guy to crawl through the build process.

OTOH, I am now doing a first-time LL test on two cards in parallel (to be sure of the residue). I am at iteration 21M of 46M and everything matches so far.

Dubslow 2012-06-21 15:39

Sigh... it had been on my personal todo list to clean up a lot of the functions, which would be necessary to produce logging functionality... it'd probably take me a couple of days.

If you've been following the SourceForge, you should be aware I've been having [URL="https://sourceforge.net/apps/trac/sourceforge/ticket/26276"]a lot of issues[/URL] with it. With one of the suggested fixes, I appear to be able to interact with [i]Subversion[/i] just fine, but SourceForge's [i]display[/i] of the repository still does not render correctly.

I only have to implement the results file locking functionality, which as you might have seen from mfakto's thread should be as simple as copy and paste. Once that is done and I am satisfied that I am not going to make any more minor/cosmetic changes to the code, I will post the code here for compiling and commit it to SVN. (It should be sometime in the next 12 hours.)

(PS flash, do not use https to commit to SourceForge; there's a small chance that it could corrupt the repository.)

Dubslow 2012-06-21 23:32

2.04 (Beta)
 
[QUOTE=Dubslow;302866]It should be sometime in the next 12 hours...[/QUOTE]
Well, here it is. svn/trunk has been updated to r32, which of course isn't showing up properly in SourceForge per the link in the above post. As such, the 2.04 Beta code is attached here for flash. It [i]should[/i] be bug-free, but of course that's the whole point of a Beta :smile:

At a minimum, README and CUDALucas.ini should be shipped with any executables; the rest of the files are optional. (Of course, none of us really need the README, but hey.)
____________________________________________________________

The changes from 2.03 are pretty much all new features:

The "-i" option from 2.03 (print device info) has been moved to "-info", and in its place we have the same "-i <ini file name>" from mfakto. This means you can run two instances from the same directory, although they each need different work files. It is safe to have two (or more) different instances writing to the same work file, thanks to Bdot's file locking code. It is NOT safe to have two different instances testing the same exponent. LaurV, if you want that ability, how should the checkpoint files be named? (Perhaps "cxxxxxxxx-<ini name>" or "cxxxxxxxx.<device number>"? Such re-naming would only happen if the -i option was used.)

The "total time" estimate has been added as LaurV requested. The estimated total time is printed whenever you pause a test, and when it finishes. However, 2.04 can still read 2.03's save files; it will recognize them as such and won't print any such messages in that case.

All FFT lengths are now printed as a multiple of 1024, e.g. "n = 1440K" instead of "n = 1474560".
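
As a rough illustration of that formatting (not the actual CUDALucas code), a length can be divided by 1024 and printed with a "K" suffix. format_fft_length() is a hypothetical name, and the fallback for lengths that are not a multiple of 1024 is my own assumption.

```c
#include <stdio.h>

/* Hypothetical helper: write an FFT length as a multiple of 1024,
   e.g. 1474560 -> "1440K". Lengths that are not a multiple of 1024
   are printed unchanged (an assumption, the post does not cover them). */
int format_fft_length(char *buf, size_t size, unsigned long n)
{
    if (n % 1024UL == 0)
        return snprintf(buf, size, "%luK", n / 1024UL);
    return snprintf(buf, size, "%lu", n);
}
```

With a 32-byte buffer, format_fft_length(buf, sizeof buf, 1474560UL) leaves "1440K" in buf, matching the example above.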

The FFT length selection code has been spiffed up, though it isn't near the ability of Prime95's jump tables. To help mitigate that, the average error of the first 1000 iterations of every test is calculated, the same as with Prime95's "soft crossover" points.
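
A minimal sketch of the averaging step, under stated assumptions: the per-iteration round off errors are taken as already collected in an array (how CUDALucas measures them is not shown here), roundoff_average_ok() is a hypothetical name, and the 0.25 limit is taken from the program output quoted later in this thread ("If average error >= 0.25, the test will restart with a larger FFT length").

```c
#include <stddef.h>

/* Hypothetical helper: average the first `count` per-iteration round off
   errors and report whether the test may continue with the current FFT
   length. Returns 1 if the average is below `threshold`, 0 otherwise
   (0 also for an empty sample). The average is stored in *average. */
int roundoff_average_ok(const double *errors, size_t count,
                        double threshold, double *average)
{
    double sum = 0.0;
    size_t i;

    if (count == 0) {
        *average = 0.0;
        return 0;
    }
    for (i = 0; i < count; i++)
        sum += errors[i];
    *average = sum / (double)count;
    return *average < threshold;
}
```

A caller would pass the 1000 collected errors with a threshold of 0.25 and restart with a larger FFT length when the function returns 0.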

You can also now specify FFT length for an individual assignment via the work file. To do that, add a field to the "Test=..." assignment line in the work file. To use (e.g.) a 1440K length for a test, the line should look like
[code]Test=<assignment key>,<exponent>,1440K[/code]
Note that no space is allowed between the number (1440) and the K. You must have a K or M (e.g. "...,<exponent>,3M" for a 3M length) for the program to recognize the field as an FFT length. This feature should render the FFTLength ini option and the -f command line option obsolete, though of course they still work for backwards compatibility.
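
To make the rule concrete, here is a sketch of how such a field could be recognized. It is not the parser from the 2.04 source: parse_fft_field() is a hypothetical name, and accepting only upper-case K and M is an assumption on my part.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: return the FFT length encoded in a work file field,
   e.g. "1440K" -> 1474560 and "3M" -> 3145728. Returns 0 when the field is
   not an FFT length (no K/M suffix, or a space before the suffix).
   Overflow is not checked in this sketch. */
unsigned long parse_fft_field(const char *field)
{
    char *end;
    unsigned long value;

    if (field == NULL || !isdigit((unsigned char)field[0]))
        return 0;
    value = strtoul(field, &end, 10);
    /* The suffix must follow the digits directly and end the field. */
    if (*end == 'K' && end[1] == '\0')
        return value * 1024UL;
    if (*end == 'M' && end[1] == '\0')
        return value * 1024UL * 1024UL;
    return 0;
}
```

A plain exponent such as "26204951" has no suffix, so it comes back as 0 and can be handled as a different kind of field.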

The work file line parser now works apositionally (the fields can come in any order), so there's no fuss about "Test=N/A,26204951" like in 2.03 with the comma counts and whatnot. It's smart enough to recognize an AID or exponent (or FFT length) :smile:
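
One way to picture position-independent parsing is to classify each comma-separated field by its shape instead of its index. The sketch below is only an illustration with made-up names (field_kind, classify_field); how 2.04 actually tells an exponent from the other numeric fields, or validates an AID, is not shown in this post, so the sketch stops at sorting fields into broad kinds.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical field kinds. FIELD_OTHER is where an assignment key (AID)
   would land in this sketch. */
typedef enum {
    FIELD_NA,          /* the literal "N/A"           */
    FIELD_FFT_LENGTH,  /* digits followed by K or M   */
    FIELD_NUMBER,      /* digits only                 */
    FIELD_OTHER        /* anything else, e.g. an AID  */
} field_kind;

field_kind classify_field(const char *field)
{
    size_t len = strlen(field);
    size_t i = 0;

    if (strcmp(field, "N/A") == 0)
        return FIELD_NA;
    /* Count the leading decimal digits. */
    while (i < len && isdigit((unsigned char)field[i]))
        i++;
    if (len > 0 && i == len)
        return FIELD_NUMBER;
    if (i > 0 && i == len - 1 && (field[i] == 'K' || field[i] == 'M'))
        return FIELD_FFT_LENGTH;
    return FIELD_OTHER;
}
```

With this, "12K", "N/A" and "216091" each get the same classification wherever they appear on the line.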

The results file default is now "results.txt" (not "result.txt") due to a minor case of OCD on my part :razz:

The AID (if any, not including "N/A") is now printed in the results file. (I plan to add V5UserID and ComputerID much like mfakto in 2.05, in preparation for when Christenson automates mfaktc.)

The "err = " line now prints the maximum error since the last checkpoint, not the maximum error since the last (re)start.
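
The difference comes down to when the running maximum is reset. A minimal sketch, with hypothetical names (err_tracker and its two functions) rather than the ones in the source:

```c
/* Hypothetical tracker for the largest round off error seen since the
   last checkpoint. */
typedef struct {
    double max_err;
} err_tracker;

/* Call once per iteration with that iteration's round off error. */
void err_tracker_update(err_tracker *t, double err)
{
    if (err > t->max_err)
        t->max_err = err;
}

/* Call when a checkpoint line is printed: returns the maximum for the
   interval that just ended and resets it for the next interval. */
double err_tracker_checkpoint(err_tracker *t)
{
    double result = t->max_err;

    t->max_err = 0.0;
    return result;
}
```

Without the reset in err_tracker_checkpoint(), the printed value could never decrease between checkpoints, which is the "since the last (re)start" behaviour described above.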

----------------------------------------------------------------------------

What I'm looking for mostly is any cosmetic changes anybody wants, or things to be specifiable via command line/ini file. Examples:

Should the estimated time be presented differently?
Should I have an option to not print the info from the initial round off test?
Should ResultsFile be specifiable via command line?
Should the err printed be specifiable via ini or cmd line?

...or anything else you want changed/added. Please, don't hesitate. :smile:

(PS I still have no idea what might cause cufftbench to crash. I didn't change that except for the 'break's that axn pointed out were missing.)

Dubslow 2012-06-22 00:13

Demonstration
 
I tried to edit it into the previous post, but it was too long.

[code]bill@Gravemind:~/CUDALucas/test/test∰∂ cat worktodo.txt
DoubleCheck=N/A,216091,24K,1,69
Test=12K,N/A,69,216091
Test=86243,4K,CA9CAECD26710FC828DFBBB8
DoubleCheck=CA9CAECD26710FC828DFBBB8________,26458577,69,1
bill@Gravemind:~/CUDALucas/test/test∰∂ ./CUDALucas

Starting M216091 fft length = 24K
Running careful round off test for 1000 iterations. If average error >= 0.25, the test will restart with a larger FFT length.
Iteration 100, average error = 0.00000, max error = 0.00000
Iteration 200, average error = 0.00000, max error = 0.00000
Iteration 300, average error = 0.00000, max error = 0.00000
Iteration 400, average error = 0.00000, max error = 0.00000
Iteration 500, average error = 0.00000, max error = 0.00000
Iteration 600, average error = 0.00000, max error = 0.00000
Iteration 700, average error = 0.00000, max error = 0.00000
Iteration 800, average error = 0.00000, max error = 0.00000
Iteration 900, average error = 0.00000, max error = 0.00000
Iteration 1000, average error = 0.00000 < 0.25 (max error = 0.00000), continuing test.
Iteration 20000 M( 216091 )C, 0x13e968bf40fda4d7, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:03 real, 0.1236 ms/iter, ETA 0:22)
Iteration 40000 M( 216091 )C, 0xc26da9695ac418c1, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:02 real, 0.1172 ms/iter, ETA 0:18)
Iteration 60000 M( 216091 )C, 0x99aa87c495daffe7, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:02 real, 0.1173 ms/iter, ETA 0:16)
Iteration 80000 M( 216091 )C, 0xddf612c72037b8a1, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:03 real, 0.1175 ms/iter, ETA 0:14)
Iteration 100000 M( 216091 )C, 0x4de7f101ee1cb7a5, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:02 real, 0.1172 ms/iter, ETA 0:11)
Iteration 120000 M( 216091 )C, 0x3981b56788b529e2, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:02 real, 0.1172 ms/iter, ETA 0:09)
Iteration 140000 M( 216091 )C, 0x669382faea06df89, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:03 real, 0.1178 ms/iter, ETA 0:07)
Iteration 160000 M( 216091 )C, 0xb391010f29c70ee1, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:02 real, 0.1177 ms/iter, ETA 0:04)
Iteration 180000 M( 216091 )C, 0xe3d74c104f02967d, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:02 real, 0.1172 ms/iter, ETA 0:02)
Iteration 200000 M( 216091 )C, 0xf433496947b7b103, n = 24K, CUDALucas v2.04 Beta err = 0.0000 (0:03 real, 0.1179 ms/iter, ETA 0:00)
M( 216091 )P, n = 24K, CUDALucas v2.04 Beta, estimated total time = 0:26

Starting M216091 fft length = 12K
Running careful round off test for 1000 iterations. If average error >= 0.25, the test will restart with a larger FFT length.
Iteration 100, average error = 0.00310, max error = 0.00415
Iteration 200, average error = 0.00341, max error = 0.00427
Iteration 300, average error = 0.00349, max error = 0.00415
Iteration 400, average error = 0.00351, max error = 0.00415
Iteration 500, average error = 0.00355, max error = 0.00430
Iteration 600, average error = 0.00355, max error = 0.00415
Iteration 700, average error = 0.00356, max error = 0.00421
Iteration 800, average error = 0.00358, max error = 0.00439
Iteration 900, average error = 0.00361, max error = 0.00452
Iteration 1000, average error = 0.00361 < 0.25 (max error = 0.00427), continuing test.
Iteration 20000 M( 216091 )C, 0x13e968bf40fda4d7, n = 12K, CUDALucas v2.04 Beta err = 0.0042 (0:02 real, 0.0967 ms/iter, ETA 0:17)
Iteration 40000 M( 216091 )C, 0xc26da9695ac418c1, n = 12K, CUDALucas v2.04 Beta err = 0.0042 (0:02 real, 0.0927 ms/iter, ETA 0:14)
Iteration 60000 M( 216091 )C, 0x99aa87c495daffe7, n = 12K, CUDALucas v2.04 Beta err = 0.0042 (0:01 real, 0.0930 ms/iter, ETA 0:13)
Iteration 80000 M( 216091 )C, 0xddf612c72037b8a1, n = 12K, CUDALucas v2.04 Beta err = 0.0044 (0:02 real, 0.0928 ms/iter, ETA 0:11)
Iteration 100000 M( 216091 )C, 0x4de7f101ee1cb7a5, n = 12K, CUDALucas v2.04 Beta err = 0.0044 (0:02 real, 0.0929 ms/iter, ETA 0:09)
Iteration 120000 M( 216091 )C, 0x3981b56788b529e2, n = 12K, CUDALucas v2.04 Beta err = 0.0045 (0:02 real, 0.0930 ms/iter, ETA 0:07)
Iteration 140000 M( 216091 )C, 0x669382faea06df89, n = 12K, CUDALucas v2.04 Beta err = 0.0046 (0:02 real, 0.0931 ms/iter, ETA 0:05)
Iteration 160000 M( 216091 )C, 0xb391010f29c70ee1, n = 12K, CUDALucas v2.04 Beta err = 0.0042 (0:02 real, 0.0928 ms/iter, ETA 0:03)
Iteration 180000 M( 216091 )C, 0xe3d74c104f02967d, n = 12K, CUDALucas v2.04 Beta err = 0.0044 (0:02 real, 0.0928 ms/iter, ETA 0:01)
Iteration 200000 M( 216091 )C, 0xf433496947b7b103, n = 12K, CUDALucas v2.04 Beta err = 0.0041 (0:01 real, 0.0928 ms/iter, ETA 0:00)
M( 216091 )P, n = 12K, CUDALucas v2.04 Beta, estimated total time = 0:20

Starting M86243 fft length = 4K
Running careful round off test for 1000 iterations. If average error >= 0.25, the test will restart with a larger FFT length.
Iteration 100, average error = 0.16577, max error = 0.20703
Iteration 200, average error = 0.18153, max error = 0.25000
Iteration 300, average error = 0.18649, max error = 0.25586
Iteration 400, average error = 0.18799, max error = 0.26172
Iteration 500, average error = 0.19219, max error = 0.25000
Iteration 600, average error = 0.19283, max error = 0.25000
Iteration 700, average error = 0.19504, max error = 0.25000
Iteration 800, average error = 0.19555, max error = 0.22656
Iteration 900, average error = 0.19565, max error = 0.21973
Iteration 1000, average error = 0.19539 < 0.25 (max error = 0.23438), continuing test.
Iteration 20000 M( 86243 )C, 0x233a5255467a4c6e, n = 4K, CUDALucas v2.04 Beta err = 0.2188 (0:01 real, 0.0651 ms/iter, ETA 0:03)
Iteration 40000 M( 86243 )C, 0x70b63ef639328851, n = 4K, CUDALucas v2.04 Beta err = 0.2344 (0:01 real, 0.0616 ms/iter, ETA 0:02)
Iteration 60000 M( 86243 )C, 0x25a4a96c66e7f897, n = 4K, CUDALucas v2.04 Beta err = 0.2812 (0:02 real, 0.0617 ms/iter, ETA 0:01)
Iteration 80000 M( 86243 )C, 0xdd477c413184da18, n = 4K, CUDALucas v2.04 Beta err = 0.2219 (0:01 real, 0.0618 ms/iter, ETA 0:00)
M( 86243 )C, 0x2de7056ebffee28b, n = 4K, CUDALucas v2.04 Beta, estimated total time = 0:05

Starting M26458577 fft length = 1440K
Running careful round off test for 1000 iterations. If average error >= 0.25, the test will restart with a larger FFT length.
Iteration 100, average error = 0.08321, max error = 0.11523
Iteration 200, average error = 0.09496, max error = 0.11719
Iteration 300, average error = 0.09992, max error = 0.12500
Iteration 400, average error = 0.10188, max error = 0.12500
Iteration 500, average error = 0.10267, max error = 0.11792
Iteration 600, average error = 0.10390, max error = 0.12109
Iteration 700, average error = 0.10444, max error = 0.11914
Iteration 800, average error = 0.10462, max error = 0.11230
Iteration 900, average error = 0.10468, max error = 0.11328
Iteration 1000, average error = 0.10509 < 0.25 (max error = 0.12305), continuing test.
Iteration 20000 M( 26458577 )C, 0xc0983d54298faf9e, n = 1440K, CUDALucas v2.04 Beta err = 0.1211 (1:50 real, 5.4642 ms/iter, ETA 40:06:03)
Iteration 40000 M( 26458577 )C, 0xc1989436f611c782, n = 1440K, CUDALucas v2.04 Beta err = 0.1172 (1:48 real, 5.4034 ms/iter, ETA 39:37:28)
^C SIGINT caught, writing checkpoint. Estimated time spent so far: 3:53

bill@Gravemind:~/CUDALucas/test/test∰∂ ./CUDALucas

Continuing work from a partial result of M26458577 fft length = 1440K iteration = 42802
Iteration 60000 M( 26458577 )C, 0xdcc248d33c956284, n = 1440K, CUDALucas v2.04 Beta err = 0.1230 (1:33 real, 4.6479 ms/iter, ETA 34:03:31)
Iteration 80000 M( 26458577 )C, 0x281ebeaddc336d4b, n = 1440K, CUDALucas v2.04 Beta err = 0.1211 (1:48 real, 5.4052 ms/iter, ETA 39:34:42)
^C SIGINT caught, writing checkpoint. Estimated time spent so far: 7:16

bill@Gravemind:~/CUDALucas/test/test∰∂ cat results.txt
M( 216091 )P, n = 24K, CUDALucas v2.04 Beta
M( 216091 )P, n = 12K, CUDALucas v2.04 Beta
M( 86243 )C, 0x2de7056ebffee28b, n = 4K, CUDALucas v2.04 Beta, AID: CA9CAECD26710FC828DFBBB8
bill@Gravemind:~/CUDALucas/test/test∰∂[/code]

LaurV 2012-06-22 00:24

Very good, man! Msft would be proud :D
Now seriously, you are putting a lot of effort into it.
Do you never sleep?

flashjh 2012-06-22 04:30

I am trying to compile... there are still some lingering issues because of the change from /Tp (C++) during compile. See [URL="http://andre.stechert.org/urwhatu/2006/01/error_c2143_syn.html"]here[/URL] for an example.

I will work on it some more tomorrow.

Dubslow 2012-06-22 04:46

[QUOTE=flashjh;302949]See [URL="http://andre.stechert.org/urwhatu/2006/01/error_c2143_syn.html"]here[/URL] for an example.[/QUOTE]

...compiler error. That's [i]way[/i] stupid. Blegh...

Edit: Perhaps if you copy/paste the errors here?

kjaget 2012-06-22 13:58

[QUOTE=Dubslow;302951]...compiler error. That's [i]way[/i] stupid. Blegh...

Edit: Perhaps if you copy/paste the errors here?[/QUOTE]

It's working like a C89 compiler should. Declaring variables anywhere but the start of a scope isn't valid C89 (C99 allows it, and some compilers accept it in C89 mode as an extension to the language). Annoying, but I think GCC will do the same thing if you set it up to be a strict C89 compiler (i.e. -ansi -pedantic).
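
A small example of the rule, written so that it compiles as strict C89. The function is made up for illustration and is not taken from CUDALucas.

```c
/* C89 layout: in every block, all declarations come before the first
   statement. */
int sum_of_squares(const int *values, int count)
{
    int i;      /* declared at the top of the function body */
    int total;

    total = 0;
    for (i = 0; i < count; i++) {
        int square;  /* fine: this is the start of a new (nested) scope */

        square = values[i] * values[i];
        total += square;
    }
    /* Writing "int last = total;" at this point, after the statements
       above, is what a C89 compiler rejects; C99 and C++ accept it. */
    return total;
}
```

Moving each offending declaration to the top of its enclosing block, or wrapping the code in its own braces, is usually enough to satisfy a C89 compiler.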

C is on life support in Visual Studio. Any reason not to build the code as C++?

Dubslow 2012-06-22 18:32

[QUOTE=kjaget;303001]Any reason not to build the code as C++?[/QUOTE]

Because it's not C++? :razz:

I realize that it [i]shouldn't[/i] cause problems, but there are a few obscure cases where a C++ compiler would compile C code differently than a C compiler, and I don't really want to look for those.

flashjh 2012-06-23 18:45

[QUOTE=Dubslow;303023]Because it's not C++? :razz:

I realize that it [i]shouldn't[/i] cause problems, but there are a few obscure cases where a C++ compiler would compile C code differently than a C compiler, and I don't really want to look for those.[/QUOTE]

The C++ idiosyncrasies have been fixed. It compiles fine now as C only.

@Dubslow: As soon as you fix my 'mistake' on SourceForge, I'll get the code uploaded and see if I can get the 2.04 Beta binaries there as well. If not, I'll post them here until I can get that fixed.

