[QUOTE=wombatman;394445]Yeah, I had used the batch file method too. Just wondered if there was a way to just have it run without crashing. Thanks![/QUOTE]
Take a look at [URL="http://www.mersenneforum.org/showthread.php?p=364436#post364436"]this[/URL]. It has been over a year since I did the testing and I don't know how CUDA 6.5 will do with everything. Either way, it really appears to be a driver issue. Let me know what you find... :smile: BTW - Yes, my binaries are for Windows. |
Yeah, I'll try to run it through its paces. The little bit I've found so far (and you may already know all this) is that the error comes from Timeout Detection and Recovery, or TDR ([url]http://http.developer.nvidia.com/NsightVisualStudio/2.2/Documentation/UserGuide/HTML/Content/Timeout_Detection_Recovery.htm[/url]). So basically, if the GPU stops responding for 2 seconds (by default), Windows restarts the driver, which is the error we get.
You can increase the delay time either through the registry or using NSight Monitor (under Options/General). Increasing it to 20 seconds got me through the benchmark of -r 0 and up to 5000K on the cufftbench 1 8192 6. But it still hung and crashed. The error given is below:
[CODE]CUDALucas.cu(2366) : cudaSafeCall() Runtime API error 30: unknown error.
CUDALucas.cu(1049) : cudaSafeCall() Runtime API error 46: all CUDA-capable devices are busy or unavailable.[/CODE]
The first line, 2366, refers to:
[CODE]cutilSafeCall (cudaEventRecord (stop, 0));
err = cutilSafeCall1(cudaEventSynchronize (stop));[/CODE]
The second, 1049, refers to:
[CODE]cutilSafeCall (cudaMalloc ((void **) &g_x, sizeof (double) * n));//size_d));[/CODE]
in the alloc_gpu_mem function.

So maybe there is an issue with the synchronizing that causes a hang/error, and when you try to fix it by restarting the device, everything happens too quickly and you can't do the memory allocation? As you might imagine, I'm totally guessing here.

Edit: Also, it's worth mentioning that when I turned TDR off completely, I still got a hang from CUDALucas. So TDR is not directly responsible (I think) for the error. It's just what we see when the driver restarts. Also, the point at which CUDALucas hangs is inconsistent, even with the same command line parameters. |
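For anyone who would rather script the registry change than click through NSight Monitor, here is a minimal Python sketch. It only builds the reg.exe command line for the TdrDelay value and does not run it. The key path and value name are the ones Microsoft documents for TDR; the helper name tdr_delay_command is made up for this example. As far as I know, applying the command needs an elevated prompt and a reboot before it takes effect.

```python
# Sketch only: build (do not execute) the reg.exe command that sets TdrDelay.
# TDR_KEY and tdr_delay_command are illustrative names, not part of CUDALucas.
TDR_KEY = r"HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def tdr_delay_command(seconds):
    """Return the argument list for `reg add` that sets TdrDelay to `seconds`."""
    # Reject bools explicitly, since bool is a subclass of int in Python.
    if isinstance(seconds, bool) or not isinstance(seconds, int) or seconds <= 0:
        raise ValueError("seconds must be a positive integer")
    return [
        "reg", "add", TDR_KEY,
        "/v", "TdrDelay",      # value name
        "/t", "REG_DWORD",     # value type
        "/d", str(seconds),    # timeout in seconds
        "/f",                  # overwrite without prompting
    ]
```

Calling " ".join(tdr_delay_command(20)) gives a line you can paste into an administrator command prompt for the 20 second setting mentioned above.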
Here's what I know about the bug.
It hangs during a cufft call.
It is specific to compute 2.0 cards.
It is most likely not a problem with cufft:
cufft4.2 with Nvidia driver 295.?? works
cufft4.2 with >30?.?? drivers shows the bug
In Linux, we can recover by resetting the device inside CUDALucas. In Windows, the devices are deactivated after the timeout, so instead of continuing merrily on our way, we get that memory allocation error you are seeing. CUDALucas needs to be restarted to continue. |
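Since the Windows failure ends with that allocation error, a wrapper script could look for it in the captured output before deciding to restart. A rough Python sketch follows, assuming the messages keep the format of the two log lines posted earlier in the thread. The function names are hypothetical, and the numeric code 46 is taken from that log only, so it may differ with other CUDA versions.

```python
import re

# Pattern modelled on the posted log lines, e.g.
# "CUDALucas.cu(1049) : cudaSafeCall() Runtime API error 46: all CUDA-capable ..."
ERROR_RE = re.compile(
    r"^(?P<file>\S+)\((?P<line>\d+)\) : cudaSafeCall\(\) "
    r"Runtime API error (?P<code>\d+): (?P<message>.*?)\.?$"
)


def parse_cuda_error(line):
    """Return a dict describing one error line, or None if it does not match."""
    match = ERROR_RE.match(line.strip())
    if match is None:
        return None
    return {
        "file": match.group("file"),
        "line": int(match.group("line")),
        "code": int(match.group("code")),
        "message": match.group("message"),
    }


def needs_restart(output, restart_code=46):
    """True if any line of `output` reports the assumed 'devices unavailable' code."""
    for line in output.splitlines():
        error = parse_cuda_error(line)
        if error is not None and error["code"] == restart_code:
            return True
    return False
```

Matching on the message text instead of the number would be an equally reasonable choice if the codes turn out to vary.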
[QUOTE=flashjh;394429]2.05 Beta CUDA 6.5 x64 is [URL="https://sourceforge.net/projects/cudalucas/files/2.05%20Beta/"]here[/URL] (passed self test)
I can also build the CUDA 7.0 binaries, but 7.0 is still Release Candidate, so if you experience bugs... CUDA Libs are [URL="https://sourceforge.net/projects/cudalucas/files/CUDA%20Libs/"]here[/URL]. I'll upload the 7.0 libs when 7.0 is final.[/QUOTE] Thanks! Reworking that doublecheck now (38 hours to do M38635771,71,1 on a GTX970). |
[QUOTE=owftheevil;394530]Here's what I know about the bug.
It hangs during a cufft call.
It is specific to compute 2.0 cards.
It is most likely not a problem with cufft:
cufft4.2 with Nvidia driver 295.?? works
cufft4.2 with >30?.?? drivers shows the bug
In Linux, we can recover by resetting the device inside CUDALucas. In Windows, the devices are deactivated after the timeout, so instead of continuing merrily on our way, we get that memory allocation error you are seeing. CUDALucas needs to be restarted to continue.[/QUOTE]
How odd that it is specific to 2.0 cards (which is what I have). For what it's worth, I started a CUDALucas run overnight (M65911957 with FFT size of 3584K), and at least as far back as I can see, which is around 2-3 hours, it has not errored out. So maybe increasing the TdrDelay registry value helps? |
[QUOTE=wombatman;394540]How odd that it is specific to 2.0 cards (which is what I have). For what it's worth, I started a CUDALucas run overnight (M65911957 with FFT size of 3584K), and at least as far back as I can see, which is around 2-3 hours, it has not errored out. So maybe increasing the TdrDelay registry value helps?[/QUOTE]
If you use this batch file, it will count the restarts and put the number in the title of the window. You can also send it to a log, if you want.
[CODE]
@echo off
Set count=0
Set program=CUDALucas2.05Beta-CUDA5.0-Win32-r60
:loop
TITLE %program% Current Reset Count = %count%
Set /A count+=1
rem echo %count% >> log.txt
rem echo %count%
%program%.exe
GOTO loop
[/CODE]
For what it's worth, I did a lot of testing and found that the restart problem, though irritating, didn't affect the results. So once you get it going, you should be OK. It was a hassle when trying to set up the cufftbench though. |
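The same restart-and-count idea can be sketched in Python for anyone who prefers it to a batch file. This is not a drop-in replacement: unlike the batch loop, it stops when the program exits with code 0 or after a restart limit, and it assumes CUDALucas returns 0 on a clean finish, which I have not verified. The names run_with_restarts, run and on_restart are made up for the example.

```python
import subprocess


def run_with_restarts(cmd, max_restarts, run=subprocess.call, on_restart=None):
    """Rerun `cmd` until it exits with 0 or `max_restarts` is reached.

    `run` takes the command list and returns an exit code (subprocess.call
    by default). `on_restart`, if given, is called with the running restart
    count, e.g. to append it to a log. Returns the number of restarts.
    """
    restarts = 0
    while True:
        exit_code = run(cmd)
        # Assumption: exit code 0 means a clean finish, anything else a crash.
        if exit_code == 0 or restarts >= max_restarts:
            return restarts
        restarts += 1
        if on_restart is not None:
            on_restart(restarts)
```

A typical call would be run_with_restarts(["CUDALucas2.05Beta-CUDA5.0-Win32-r60.exe"], max_restarts=1000, on_restart=print), which prints the count each time the program is relaunched.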
[STRIKE][QUOTE=owftheevil;394530]
It is specific to compute 2.0 cards.[/QUOTE] The 970 is [URL="http://en.wikipedia.org/wiki/CUDA#Supported_GPUs"]CC 5.2[/URL], should it be affected? @wombatman, do you want a CUDA 6.5, CC 5.2 only build to see if it's any faster?[/STRIKE]
Never mind... got people mixed up. Us old 2.0 card holders need to get with the times :smile: |
I only have a CC 2.0 card, so I wouldn't be able to run it, unfortunately.
|
[QUOTE=monsted;394533]Thanks! Reworking that doublecheck now (38 hours to do M38635771,71,1 on a GTX970).[/QUOTE]
@monsted, do you want a CUDA 6.5, CC 5.2 only build to see if it's any faster? I put them on [URL="https://sourceforge.net/projects/cudalucas/files/2.05%20Beta/"]SourceForge[/URL] for you, if you want to try. |
[QUOTE=flashjh;394554]@monsted, do you want a CUDA 6.5, CC 5.2 only build to see if it's any faster?
I put them on [URL="https://sourceforge.net/projects/cudalucas/files/2.05%20Beta/"]SourceForge[/URL] for you, if you want to try.[/QUOTE]
Tried it out, but it doesn't seem to have made any noticeable difference. ms/It is just about 3.6080 with both binaries. It does cut down the size of the binary, so I'm guessing it just doesn't carry the cores it wouldn't use anyway? |
[QUOTE=monsted;394685]Tried it out, but it doesn't seem to have made any noticeable difference. ms/It is just about 3.6080 with both binaries.
It does cut down the size of the binary, so I'm guessing it just doesn't carry the cores it wouldn't use anyway?[/QUOTE]
Yes, this binary was built only for 5.2. I tried the self test on a 580; it runs but gives all 0 residues. |