mersenneforum.org

Msieve with GPU support (https://www.mersenneforum.org/showthread.php?t=12562)

jrk 2010-01-26 03:33

[QUOTE=Joshua2;203235][code]error (line 195): CUDA_ERROR_UNKNOWN[/code]
I tried twice and I got this error both times?[/QUOTE]

Check that the ptx files from msieve are in the directory that you are working in. If not, copy them there.

Joshua2 2010-01-26 03:45

I had all 4 in my directory. If I remove them I get a "file not found" error at line 151 instead, so I'm guessing that's not it. With my original error I get a Windows message saying the kernel crashed and the NVIDIA driver was recovered, after maybe 20 seconds of thinking. I have a GTX 275.

jasonp 2010-01-26 04:36

I think CUDA_ERROR_UNKNOWN is a sign that the GPU code crashed while running. I'll try to investigate locally.

Joshua2 2010-01-26 06:42

It's reproducible for me, but I don't see any crash log files. This is the number I was trying, although the same number worked for someone else:[code]6805362893736004031478480651829214338493080836379804529992054517869989193736891609130750957354816012909862839815014712354705998904676563244757467581729953604645013238303945872220207597531[/code] Someone said that an unknown CUDA error was fixed by casting to void* instead of void**; I don't know if that's applicable to you.

frmky 2010-01-26 06:55

I'm using a binary compiled in late November (Nov 26 to be exact), and it's running fine on a Tesla C1060. I'm not sure what has changed since then.

Joshua2 2010-01-26 08:54

I'm using the binary at the beginning of this thread. Is there a later one? I have Win7 x64. I tried a random number made by taking off some of the digits, and it seems to work!? (Of course it's the wrong number then.) Once I got error (line 195): CUDA_ERROR_LAUNCH_TIMEOUT instead of the unknown error. Maybe the number is too big for the memory or something?

Joshua2 2010-01-26 09:52

From [url]http://developer.download.nvidia.com/compute/cuda/2_3/toolkit/docs/cudatoolkit_release_notes_windows.txt[/url]. I haven't been able to do anything helpful by messing around in my registry (the info is mostly for Vista, not Win 7) or display settings.
[code]
Known Issues
--------------------------------------------------------------------------------

Vista and Server 2008 Specific Issues:

o In order to run CUDA on a non-TESLA GPU, either the Windows desktop
must be extended onto the GPU, or the GPU must be selected as the
PhysX GPU.

o [B]Individual kernels are limited to a 2-second runtime by Windows
Vista[/B]. Kernels that run for longer than 2 seconds will trigger
the Timeout Detection and Recovery (TDR) mechanism. For more
information, see
http://www.microsoft.com/whdc/device/display/wddm_timeout.mspx.

GPUs without a display attached are not subject to the 2 second
runtime restriction. For this reason it is recommended that
CUDA be run on a GPU that is NOT attached to a display and
does not have the Windows desktop extended onto it. In this
case, the system must contain at least one NVIDIA GPU that
serves as the primary graphics adapter. [B]Thus, for devices like S1070
that do not have an attached display, users may disable the Windows TDR
timeout. Disabling the TDR timeout will allow kernels to run for
extended periods of time without triggering an error.[/B]

The following is an example .reg script:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers]
"TdrLevel"=dword:00000000

o The maximum size of a single allocation created by cudaMalloc
or cuMemAlloc is limited to:
MIN ( ( System Memory Size in MB - 512 MB ) / 2, PAGING_BUFFER_SEGMENT_SIZE )
For Vista, PAGING_BUFFER_SEGMENT_SIZE is approximately 2GB.

XP Specific Issues:

o Individual GPU program launches are limited to a run time
of less than 5 seconds on a GPU with a display attached.
Exceeding this time limit usually causes a launch failure
reported through the CUDA driver or the CUDA runtime. GPUs
without a display attached are not subject to the 5 second
runtime restriction. For this reason it is recommended that
CUDA be run on a GPU that is NOT attached to a display and
does not have the Windows desktop extended onto it. In this
case, the system must contain at least one NVIDIA GPU that
serves as the primary graphics adapter.

Issues Common to XP and Vista:

o GPU enumeration order on multi-GPU systems is non-deterministic and
may change with this or future releases. Users should make sure to
enumerate all CUDA-capable GPUs in the system and select the most
appropriate one(s) to use.

o Applications that try to use too much memory may cause a
CUDA memcopy or kernel to fail with the error
CUDA_ERROR_OUT_OF_MEMORY. If this happens, the CUDA Context is
placed into an error state and must be destroyed and recreated
if the application wants to continue using CUDA.

o Malloc may fail due to running out of virtual memory space.
The address space limitation is fixed by a Microsoft issued
hotfix. Please install the patch located at
http://support.microsoft.com/kb/940105 if this is an issue.
Windows Vista SP1 includes this hotfix.

o When compiling a source file that includes vector_types.h with the Microsoft
compiler on a 32-bit Windows system, the 16-byte aligned vector types are not
properly aligned at 16 bytes.

o It is a known issue that cudaThreadExit() may not be called implicitly on
host thread exit. Due to this, developers are recommended to explicitly
call cudaThreadExit() while the issue is being resolved.

o For maximum performance when using multiple byte sizes to access the
same data, coalesce adjacent loads and stores when possible rather
than using a union or individual byte accesses. Accessing the data via
a union may result in the compiler reserving extra memory for the object,
and accessing the data as individual bytes may result in non-coalesced
accesses. This will be improved in a future compiler release.
[/code]
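
To put the single-allocation limit quoted above into numbers, here is a minimal Python sketch of the MIN(...) formula. The function name and the 4096 MB example machine are made up for illustration, and the 2048 MB default only reflects the "approximately 2GB" figure the notes give for Vista. It is not msieve or NVIDIA code, and it says nothing about whether memory is actually the cause of the crash.

```python
def max_single_alloc_mb(system_memory_mb, paging_buffer_segment_mb=2048):
    """Upper bound on one cudaMalloc/cuMemAlloc, per the quoted release notes.

    Hypothetical helper implementing
    MIN((System Memory Size in MB - 512 MB) / 2, PAGING_BUFFER_SEGMENT_SIZE).
    The 2048 MB default is the approximate Vista value from the notes.
    """
    return min((system_memory_mb - 512) / 2, paging_buffer_segment_mb)


# Example: a machine with 4 GB of system memory (assumed figure).
print(max_single_alloc_mb(4096))  # (4096 - 512) / 2 = 1792.0
```

On a machine with 8 GB the first term is 3840 MB, so the roughly 2 GB paging buffer segment becomes the binding limit instead.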

ET_ 2010-01-26 11:27

I have downloaded the Windows executable at the top of this thread (October 28th, 2009) on a laptop with a GeForce 9500 GS and 64-bit Windows 7. The package worked immediately. The archive had 4 .ptx files.

My question is: is there a newer/faster/better precompiled Windows version to test out there? :smile:

Is this version suitable to run under the new Python script?

Luigi

jasonp 2010-01-26 14:12

Try it with my local snapshot:

[url]www.boo.net/~jasonp/msieve144a_gpu.zip[/url]

It now uses 6 PTX files, but I doubt this will do anything about the crash.

PS: Most of the time a GPU kernel takes around a second in my experience, and the amount of work a single kernel call performs is scaled up or down depending on how much work Nvidia's driver says the GPU can handle. In Windows XP the watchdog timeout is 5 seconds, but if it is 2 seconds in Vista then I can see how a miscalculation could run past the limit and force the GPU to be reset.
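
As a rough illustration of scaling the work per kernel call against a watchdog limit, here is a Python sketch that adjusts a batch size from the measured time of the previous launch. Everything in it is an assumption made for the example: the function name, the 1-second target, the 75% safety margin, and the idea that run time grows roughly linearly with batch size. It is not how msieve sizes its kernels, which goes by what the driver reports about the GPU.

```python
def next_batch_size(batch_size, elapsed_s, target_s=1.0, watchdog_s=2.0,
                    min_batch=1):
    """Pick the batch size for the next kernel launch.

    Hypothetical helper, not msieve's actual logic: assumes run time is
    roughly proportional to batch size, aims for target_s per launch, and
    halves the batch when a launch used 75% or more of the watchdog limit.
    """
    if elapsed_s <= 0:
        return batch_size  # no usable timing, keep the current size
    if elapsed_s >= 0.75 * watchdog_s:
        return max(min_batch, batch_size // 2)  # too close to a driver reset
    scaled = int(batch_size * target_s / elapsed_s)
    # Grow by at most 2x per step so one noisy timing cannot overshoot.
    return max(min_batch, min(scaled, 2 * batch_size))


print(next_batch_size(1000, 0.25))  # fast launch: grows, capped at 2000
print(next_batch_size(1000, 1.6))   # near the 2 s limit: halves to 500
```

With a 5-second limit as on XP, the same 1.6-second launch would only be scaled back toward the target rather than halved, which is why a sizing rule tuned for XP could plausibly overrun on Vista or Windows 7.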

Joshua2 2010-01-26 17:13

That version didn't work either. I have NVIDIA driver 195.62. I don't really think it's a problem on my end, since Folding@home and BOINC CUDA stuff works fine. I've never had trouble with a game either. Even other numbers I have tested work with msieve GPU. It appears that just numbers above a certain size don't work.

jasonp 2010-02-16 13:22

@Joshua2: I couldn't get your input to fail with the latest SVN; of course the latest code now has a lot of GPU changes.

@everyone: thanks to TheJudger I've figured out how to build inline asm into the GPU code, so there's been a lot of churn in that part of the source. I've added 48-bit and 72-bit arithmetic to the GPU code, which uses the native multiply size of current GPUs more effectively. When applicable (inputs < 135 digits, inputs ~175-190 digits) it's 35% faster than 64-bit or 96-bit arithmetic. Give it a try if you can build from source.
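
For anyone wondering why 48 and 72 bits are convenient sizes, here is a small Python sketch that assumes the native multiply is 24 bits wide, which is what 48 = 2 x 24 and 72 = 3 x 24 suggest. It builds a 48-bit multiply from four 24-bit limb products. The names `mul24` and `mul48` are made up for the example, and this is only a model of the arithmetic, not the inline asm in the msieve source.

```python
MASK24 = (1 << 24) - 1


def mul24(a, b):
    """Stand-in for a 24-bit hardware multiply: 24-bit inputs, 48-bit product."""
    assert 0 <= a <= MASK24 and 0 <= b <= MASK24
    return a * b


def mul48(x, y):
    """Full 96-bit product of two 48-bit values from four 24-bit multiplies."""
    x_lo, x_hi = x & MASK24, x >> 24
    y_lo, y_hi = y & MASK24, y >> 24
    lo = mul24(x_lo, y_lo)                        # weight 2^0
    mid = mul24(x_lo, y_hi) + mul24(x_hi, y_lo)   # weight 2^24
    hi = mul24(x_hi, y_hi)                        # weight 2^48
    return lo + (mid << 24) + (hi << 48)


# Check against Python's arbitrary-precision multiply.
x, y = 0xFEDCBA987654, 0x123456789ABC
assert mul48(x, y) == x * y
```

Under the same assumption, a 64-bit operand does not fit in two 24-bit limbs and needs three, so a schoolbook product costs nine limb multiplies instead of four. That is one plausible reading of where the speedup comes from when the inputs are small enough for the narrower sizes.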


All times are UTC.
