[QUOTE=Joshua2;203235][code]error (line 195): CUDA_ERROR_UNKNOWN[/code]
I tried twice and I got this error both times?[/QUOTE]
Check that the ptx files from msieve are in the directory that you are working in. If not, copy them there. |
I had all 4 in my directory. If I remove them I get an error at line 151, file not found, so I'm guessing that's not it. With my original error I get a Windows message saying the kernel crashed and the NVIDIA driver was recovered, after maybe 20 seconds of thinking. I have a GTX 275.
|
I think CUDA_ERROR_UNKNOWN is a sign that the GPU code crashed while running. I'll try to investigate locally.
|
It's reproducible for me, but I don't see any crash log files. I kept trying, but the same number worked for someone else:[code]6805362893736004031478480651829214338493080836379804529992054517869989193736891609130750957354816012909862839815014712354705998904676563244757467581729953604645013238303945872220207597531[/code] Someone said that an unknown CUDA error was fixed by casting to void* instead of void**; I don't know if that's applicable to you.
|
I'm using a binary compiled in late November (Nov 26 to be exact), and it's running fine on a Tesla C1060. I'm not sure what has changed since then.
|
I'm using the binary at the beginning of this thread. Is there a later one? I have Win7 x64. I tried a random number made by taking off some of the digits and it seems to work!? (Of course it's the wrong number then.) Once, instead of the unknown error, I got this: error (line 195): CUDA_ERROR_LAUNCH_TIMEOUT. Maybe the number is too big for the memory or something?
|
From [url]http://developer.download.nvidia.com/compute/cuda/2_3/toolkit/docs/cudatoolkit_release_notes_windows.txt[/url]. I haven't been able to do anything helpful by messing around in my registry (the info is mostly for Vista, not Win 7) or display settings.
[code]
Known Issues
--------------------------------------------------------------------------------

Vista and Server 2008 Specific Issues:

o In order to run CUDA on a non-TESLA GPU, either the Windows desktop must be extended onto the GPU, or the GPU must be selected as the PhysX GPU.

o [B]Individual kernels are limited to a 2-second runtime by Windows Vista[/B]. Kernels that run for longer than 2 seconds will trigger the Timeout Detection and Recovery (TDR) mechanism. For more information, see http://www.microsoft.com/whdc/device/display/wddm_timeout.mspx.

  GPUs without a display attached are not subject to the 2 second runtime restriction. For this reason it is recommended that CUDA be run on a GPU that is NOT attached to a display and does not have the Windows desktop extended onto it. In this case, the system must contain at least one NVIDIA GPU that serves as the primary graphics adapter.

  [B]Thus, for devices like S1070 that do not have an attached display, users may disable the Windows TDR timeout. Disabling the TDR timeout will allow kernels to run for extended periods of time without triggering an error.[/B] The following is an example .reg script:

    Windows Registry Editor Version 5.00

    [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers]
    "TdrLevel"=dword:00000000

o The maximum size of a single allocation created by cudaMalloc or cuMemAlloc is limited to:

    MIN ( ( System Memory Size in MB - 512 MB ) / 2, PAGING_BUFFER_SEGMENT_SIZE )

  For Vista, PAGING_BUFFER_SEGMENT_SIZE is approximately 2GB.

XP Specific Issues:

o Individual GPU program launches are limited to a run time of less than 5 seconds on a GPU with a display attached. Exceeding this time limit usually causes a launch failure reported through the CUDA driver or the CUDA runtime. GPUs without a display attached are not subject to the 5 second runtime restriction. For this reason it is recommended that CUDA be run on a GPU that is NOT attached to a display and does not have the Windows desktop extended onto it. In this case, the system must contain at least one NVIDIA GPU that serves as the primary graphics adapter.

Issues Common to XP and Vista:

o GPU enumeration order on multi-GPU systems is non-deterministic and may change with this or future releases. Users should make sure to enumerate all CUDA-capable GPUs in the system and select the most appropriate one(s) to use.

o Applications that try to use too much memory may cause a CUDA memcopy or kernel to fail with the error CUDA_ERROR_OUT_OF_MEMORY. If this happens, the CUDA Context is placed into an error state and must be destroyed and recreated if the application wants to continue using CUDA.

o Malloc may fail due to running out of virtual memory space. The address space limitation is fixed by a Microsoft issued hotfix. Please install the patch located at http://support.microsoft.com/kb/940105 if this is an issue. Windows Vista SP1 includes this hotfix.

o When compiling a source file that includes vector_types.h with the Microsoft compiler on a 32-bit Windows system, the 16-byte aligned vector types are not properly aligned at 16 bytes.

o It is a known issue that cudaThreadExit() may not be called implicitly on host thread exit. Due to this, developers are recommended to explicitly call cudaThreadExit() while the issue is being resolved.

o For maximum performance when using multiple byte sizes to access the same data, coalesce adjacent loads and stores when possible rather than using a union or individual byte accesses. Accessing the data via a union may result in the compiler reserving extra memory for the object, and accessing the data as individual bytes may result in non-coalesced accesses. This will be improved in a future compiler release.
[/code] |
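For anyone wondering whether memory could be the limit, the allocation cap quoted in the release notes is easy to work out by hand. Here is a rough Python sketch of that MIN formula. The function name is made up for illustration, the 2048 MB default is my reading of "approximately 2GB", and clamping to zero for tiny systems is my own assumption, not something the notes state.

```python
def max_single_allocation_mb(system_memory_mb, paging_buffer_segment_mb=2048):
    """Evaluate MIN((system memory in MB - 512 MB) / 2, PAGING_BUFFER_SEGMENT_SIZE).

    paging_buffer_segment_mb defaults to 2048, an assumed value for the
    "approximately 2GB" figure the release notes give for Vista.
    """
    # Assumption: never report a negative cap on systems with < 512 MB.
    half_of_remaining = max(0, (system_memory_mb - 512) // 2)
    return min(half_of_remaining, paging_buffer_segment_mb)


if __name__ == "__main__":
    for ram_mb in (2048, 4096, 8192):
        print(ram_mb, "MB system RAM ->", max_single_allocation_mb(ram_mb), "MB cap")
```

With 4 GB of system RAM the cap comes out at 1792 MB, which is more than a GTX 275 has on board, so on a machine like that this particular limit seems unlikely to be what is biting.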
I downloaded the Windows executable at the top of this thread (October 28th, 2009) on a laptop with a GeForce 9500 GS and 64-bit Windows 7. The package worked immediately. The archive had 4 .ptx files.
My question is: is there a newer/faster/better precompiled Windows version to test out there? :smile: Is this version suitable to run under the new Python script? Luigi |
Try it with my local snapshot:
[url]www.boo.net/~jasonp/msieve144a_gpu.zip[/url] It now uses 6 PTX files, but I doubt this will do anything about the crash.
PS: Most of the time a GPU kernel takes around a second in my experience, and the amount of work a single kernel call performs is scaled up or down depending on how much work Nvidia's driver says the GPU can handle. In Windows XP the watchdog timeout is 5 seconds, but if it is 2 seconds in Vista then I can see a chance for a miscalculation to force the GPU to be rebooted. |
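To make the watchdog arithmetic concrete, here is a small Python sketch of how a per-launch batch size could be rescaled to stay under a timeout. This is not how msieve does it (per the post above, msieve scales by what the driver reports). It is a hypothetical timing-based alternative, and the function name, the `safety` fraction, and all parameters are invented for illustration.

```python
def next_batch_size(prev_batch, prev_seconds, watchdog_seconds, safety=0.5, min_batch=1):
    """Pick a batch size whose predicted runtime is safety * watchdog_seconds.

    Assumes runtime grows linearly with batch size, which is only a rough
    approximation for a real GPU kernel.
    """
    if prev_batch < 1 or prev_seconds <= 0 or watchdog_seconds <= 0:
        raise ValueError("batch size and timings must be positive")
    target_seconds = safety * watchdog_seconds
    scaled = int(prev_batch * target_seconds / prev_seconds)
    return max(min_batch, scaled)


if __name__ == "__main__":
    # A launch of 1000 work items that took 1 second:
    print("5 s watchdog (XP):   ", next_batch_size(1000, 1.0, 5.0))
    print("2 s watchdog (Vista):", next_batch_size(1000, 1.0, 2.0))
```

The point of the sketch is only that a batch tuned with 5 seconds of headroom has much less margin under a 2-second limit, so one slow launch is enough to trip the timeout.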
That version didn't work either. I have NVIDIA driver 195.62. I don't really think it's a problem on my end, since Folding@home and BOINC CUDA stuff works fine. I've never had trouble with a game either. Even other numbers I have tested work with msieve GPU. It appears that just numbers above a certain size don't work.
|
@Joshua2: I couldn't get your input to fail with the latest SVN; of course the latest code now has a lot of GPU changes.
@everyone: thanks to TheJudger I've figured out how to build inline asm into the GPU code, so there's been a lot of churn in that part of the source. I've added 48-bit and 72-bit arithmetic to the GPU code, which uses the native multiply size of current GPUs more effectively. When applicable (inputs < 135 digits, inputs ~175-190 digits) it's 35% faster than 64-bit or 96-bit arithmetic. Give it a try if you can build from source. |
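For readers curious why 48 and 72 bits are convenient sizes, here is a Python sketch of a 48-bit multiply built from 24-bit pieces. It assumes that "native multiply size" means the 24-bit integer multiply on GPUs of this generation. It is a schoolbook illustration with made-up helper names, not the inline asm in msieve, and `mul24` models a full 48-bit product, which simplifies what the hardware instruction returns.

```python
MASK24 = (1 << 24) - 1


def mul24(a, b):
    """Model a 24-bit x 24-bit multiply that yields the full 48-bit product."""
    if not (0 <= a <= MASK24 and 0 <= b <= MASK24):
        raise ValueError("operands must fit in 24 bits")
    return a * b


def split48(x):
    """Split a 48-bit value into (low, high) 24-bit limbs."""
    if not 0 <= x < (1 << 48):
        raise ValueError("operand must fit in 48 bits")
    return x & MASK24, x >> 24


def mul48(x, y):
    """Schoolbook 48-bit x 48-bit -> 96-bit product using four 24-bit multiplies."""
    x0, x1 = split48(x)
    y0, y1 = split48(y)
    low = mul24(x0, y0)
    middle = mul24(x0, y1) + mul24(x1, y0)
    high = mul24(x1, y1)
    return low + (middle << 24) + (high << 48)


if __name__ == "__main__":
    a, b = 0x123456789ABC, 0xFEDCBA987654
    print(hex(mul48(a, b)), mul48(a, b) == a * b)
```

A 48-bit value is exactly two 24-bit limbs and a 72-bit value is exactly three, so every partial product maps onto one native multiply with no wasted bits.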