mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

Dubslow 2012-08-04 22:12

Okay, could you please post those save files? I'll put error-handling on the list for 2.05. Whenever you're awake, please also give the program a test. I'll run my own test, but the more the merrier (especially since we're on different platforms). If we can in fact change FFT lengths like that, then it'll make the error handling a lot easier.

Just to be clear, here's the todo list for 2.05:
[code]4. Separate check() from a hypothetical test(int q, char* expectres, int iters) function used for self testing and roundoff testing
4a. Refine print_bits() to print only the residue (related to 9a)
10. Extending 4, implement logging abilities.
6. Add V5UserID and ComputerID ini file options for later use
8. Figure out compiling arches/versions
9. Add option to not print residues at checkpoints? Option to skip extra initial error checking?
11. Extend the self test to include a lot more expos to test all FFT lengths, as well as near crossovers.
11a. Get a better idea of where crossovers are necessary.
12. Reinstall signal handler
13. Add an error handler for deep in the test
14. Print overall maxerr at end of test[/code]
Because of the triviality of 12 and 14, I'll do those after the filelocking gets fixed. (Still need a Windows compiler while flash is MIA...)
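For illustration, item 4's hypothetical test(int q, char* expectres, int iters) could look roughly like the CPU-only sketch below. It is not CUDALucas code: it uses plain 64-bit integers instead of an FFT, so it only handles q up to 31, and the residue format (16 lowercase hex digits, no "0x" prefix) is an assumption. The `const` on expectres is also an addition.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Sketch only: run `iters` Lucas-Lehmer iterations modulo 2^q - 1 and
 * compare the resulting residue with the expected hex string.
 * Returns 1 on match, 0 on mismatch, -1 if q is outside the range this
 * toy version can handle (3 <= q <= 31, so s*s fits in 64 bits). */
int test(int q, const char *expectres, int iters)
{
    if (q < 3 || q > 31)
        return -1;

    uint64_t m = ((uint64_t)1 << q) - 1;  /* the Mersenne number 2^q - 1 */
    uint64_t s = 4;                       /* Lucas-Lehmer starting value */

    for (int i = 0; i < iters; i++)
        s = (s * s + m - 2) % m;          /* s = s^2 - 2 (mod m), kept non-negative */

    /* Assumed residue format: 16 lowercase hex digits. */
    char buf[17];
    snprintf(buf, sizeof buf, "%016llx", (unsigned long long)s);
    return strcmp(buf, expectres) == 0;
}
```

A full test uses iters = q - 2, e.g. test(13, "0000000000000000", 11) for the prime M13. A smaller iters value gives a quick partial-residue check, which is the kind of thing the self test and roundoff test would want.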

flashjh 2012-08-05 02:31

[QUOTE=Dubslow;306951](Still need a Windows compiler while flash is MIA...)[/QUOTE]
I should be able to get back into things. Sorry for the disappearance; things got really crazy around here and I've barely been able to keep up with factoring work.

Dubslow 2012-08-05 02:38

[QUOTE=flashjh;306976]I should be able to get back into things. Sorry for the disappearance; things got really crazy around here and I've barely been able to keep up with factoring work.[/QUOTE]

Just in time :smile: It seems you've survived whatever mess it is/was.

As for debugging, my only suggestion would be to try the version I originally committed with Bdot's function definitions and see if that works. I think I saw someone mention it in that mess way up in the thread, but the mess was so confusing, especially since I wasn't following it too closely... take your time, I suppose :smile:

kladner 2012-08-11 14:35

I just completed a sixth DC. At least the last 2-3 of them were run with 'CUDALucas-2.04-Beta-3.2-sm_13-x64'. I have checked residues before reporting and all have matched.

Am I correct in thinking that the version above (3.2-sm_13) is preferred? This is running on a GTX 460 with driver 285.62.

flashjh 2012-08-11 14:38

@dubslow: what needs to be done now?

Dubslow 2012-08-11 14:43

[QUOTE=kladner;307655]
Am I correct in thinking that the version above (3.2-sm_13) is preferred? This is running on a GTX 460 with driver 285.62.[/QUOTE]
Whichever of the versions is fastest. If you'd like, try them all and tell us which is best :smile:

[QUOTE=flashjh;307656]@dubslow: what needs to be done now?[/QUOTE]
The filelocking on Windows (i.e. the lock file isn't getting deleted). AFAIK, that was never fixed (but I'd be happy to be wrong :smile:).
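A minimal sketch of the create/delete pattern for a lock file, in case it helps with debugging. The helper names are made up and this is not how CUDALucas actually does it. The relevant detail is that on Windows remove() generally fails while the file is still open, so the handle is closed right after creation. Whether that is the cause of the bug here is only a guess. The "wx" mode is C11 exclusive create and may not be available on every compiler.

```c
#include <stdio.h>

/* Hypothetical helper: try to create the lock file exclusively.
 * Returns 1 if we now hold the lock, 0 if it already exists or
 * could not be created. */
static int acquire_lock(const char *path)
{
    FILE *f = fopen(path, "wx");  /* C11: fail if the file already exists */
    if (f == NULL)
        return 0;
    fclose(f);                    /* close now so a later remove() can succeed */
    return 1;
}

/* Hypothetical helper: delete the lock file.
 * Returns 1 on success, 0 if remove() failed. */
static int release_lock(const char *path)
{
    return remove(path) == 0;
}
```

Checking the return value of release_lock() and printing the error with perror() should at least show whether the delete is being attempted and why it fails.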

kladner 2012-08-11 14:52

[QUOTE=Dubslow;307657]Whichever of the versions is fastest. If you'd like, try them all and tell us which is best :smile:.............[/QUOTE]

I'll experiment with that if there's no problem with changing between "x.x-sm_x" varieties in mid run.

Dubslow 2012-08-11 15:43

[QUOTE=kladner;307658]I'll experiment with that if there's no problem with changing between "x.x-sm_x" varieties in mid run.[/QUOTE]

I can't see why it'd make a difference. :smile:

flashjh 2012-08-11 16:02

[QUOTE=Dubslow;307660]I can't see why it'd make a difference. :smile:[/QUOTE]

Yes, you can switch between them.

kladner 2012-08-11 16:26

OK. Thanks guys. I went ahead and started the comparisons, since I realized that I've got all the check files saved.

So far, 3.2-sm_13 is better than 4.0-sm_20 by maybe half a millisecond. I can't say exactly until I run the latter again. I let it get to the second report, but then absent-mindedly copied the time for the first.

kladner 2012-08-11 17:57

Here's what I got:
[CODE]GTX 460 (GF104) @ 715MHz (factory OC), Polite 15, Priority Normal, GPU usage 98-99%
Driver 285.62

M27680xxx FFT 1536K

CUDALucas-2.04-Beta-3.2-sm_13-x64 5.5720 ms/iter
CUDALucas-2.04Beta-4.0-sm_20-x64 6.0391 ms/iter
CUDALucas-2.04Beta-4.1-sm_21-x64 6.1662 ms/iter
CUDALucas-2.04-Beta-4.2-sm_30-x64 device_number >= device_count ... exiting
(This is probably a driver problem)[/CODE]
I would upgrade the drivers, except I just fought my way out of a can of worms getting the nVidia drivers installed correctly, and they still aren't quite right. From the start, GPU-Z has shown the CUDA and DirectCompute boxes unchecked for the GTX 570, even though CUDA is clearly working with mfaktc on that card. When the 460 was absent, the 570 reported correctly.
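To put the timings above in relative terms, here is a small sketch that computes how much slower one build is than another. The function name is made up for illustration. With the numbers from the table, 4.0-sm_20 comes out roughly 8.4% slower than 3.2-sm_13, and 4.1-sm_21 roughly 10.7% slower.

```c
#include <stdio.h>

/* Hypothetical helper: percentage by which `other_ms` is slower than
 * `base_ms` (both in ms/iter). A negative result means it is faster. */
double slowdown_percent(double base_ms, double other_ms)
{
    return (other_ms / base_ms - 1.0) * 100.0;
}

/* Print the comparison for the three builds that ran in the table above. */
void print_comparison(void)
{
    const double base = 5.5720;  /* 3.2-sm_13 */
    printf("4.0-sm_20: %+.1f%%\n", slowdown_percent(base, 6.0391));
    printf("4.1-sm_21: %+.1f%%\n", slowdown_percent(base, 6.1662));
}
```

Since the per-iteration time is multiplied by the full iteration count of a test, the same percentages apply to the total run time.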
