The CUDA compiler refuses to make any binary under CC3.0, so 2.0 is not possible without getting an earlier CUDA SDK version.
Here is a binary for CC 3.0: [url]https://cubox.me/files/gimps/CC3.0-CUDAPm1.exe[/url]
[QUOTE=Cubox;511483]The CUDA compiler refuses to make any binary under CC3.0, so 2.0 is not possible without getting an earlier CUDA SDK version.[/QUOTE]Congrats on getting a build to work. Sounds like you're making good progress.
I think you would need a CUDA 8 or earlier SDK for CC2.0 output. The 10.1 SDK can't target CC2.0, but have you tried multiple CC levels & PTX output? Supposedly multiple versions can be included in the same exe file, so a variety of gpus can each have what's optimal for them. Then from the CUDA 8 SDK, CC2.0-6.5 or so could be covered in one exe. About half my gpus are CC2.x. In a CUDA 9.2 SDK mfaktc makefile, it looks like:
[CODE]# generate code for various compute capabilities
# not available in cuda9.2
# NVCCFLAGS += --generate-code arch=compute_11,code=sm_11 # CC 1.1, 1.2 and 1.3 GPUs will use this code (1.0 is not possible for mfaktc)
# not available in cuda9.2
# NVCCFLAGS += --generate-code arch=compute_20,code=sm_20 # CC 2.x GPUs will use this code, one code fits all!
NVCCFLAGS += --generate-code arch=compute_30,code=sm_30 # all CC 3.x GPUs _COULD_ use this code
NVCCFLAGS += --generate-code arch=compute_35,code=sm_35 # but CC 3.5 (3.2?) _CAN_ use funnel shift which is useful for mfaktc
NVCCFLAGS += --generate-code arch=compute_50,code=sm_50 # CC 5.x GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_52,code=sm_52 # CC 5.2 GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_61,code=sm_61 # CC 6.1+ GPUs will use this code, GTX 10xx for example
NVCCFLAGS += --generate-code arch=compute_70,code=sm_70 # CC 7.x GPUs will use this code[/CODE]
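Along those lines, here is a tiny shell sketch (hypothetical, not from any actual makefile) of how such a flag list could be generated for several CC levels, with a trailing PTX entry for the newest one:

```shell
#!/bin/sh
# Hypothetical sketch: expand a list of compute capabilities into
# nvcc --generate-code flags, so one exe carries a cubin for each CC.
CCS="30 35 50 52 61 70"
NVCCFLAGS=""
for cc in $CCS; do
  NVCCFLAGS="$NVCCFLAGS --generate-code arch=compute_${cc},code=sm_${cc}"
done
# also embed PTX for the newest CC, so future GPUs can JIT-compile it
NVCCFLAGS="$NVCCFLAGS --generate-code arch=compute_70,code=compute_70"
echo "$NVCCFLAGS"
```

nvcc would then be invoked with those flags appended, and at run time each GPU selects the embedded code that matches its CC.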
I thought you could only have one CC level per binary. I'll try multiples; I was going to make one binary for each CC level...
If I get CUDA 8 working, I can do one binary that covers 2.0-6.5 like you mentioned, and then one from the latest CUDA (10.1 right now) that covers 3.0 up to the max.
The links I post to builds can become dead at any time.
Here is my latest build, using CUDA 10.1, with all compute capabilities from 3.0 to 7.1 (using the list kriesel took from mfaktc). I changed the version name in the code to v0.22.notstable, since it is not tested at all, but it has no changes from the 0.22 code (yet). It should behave the same: [url]https://cubox.dev/files/gimps/CUDAPm1.exe[/url]
[QUOTE=Cubox;511547]The links I give with links to builds can become dead links at any time.
Here is my latest build, using CUDA 10.1, with all compute capabilities from 3.0 to 7.1 (used the list kriesel took from mfaktc). Changed the version name in the code to v0.22.notstable, since it is not tested at all, but has no changes from 0.22 code (yet). It should behave the same: [URL]https://cubox.dev/files/gimps/CUDAPm1.exe[/URL][/QUOTE] I see a couple of your recent build links are already 404s. My CUDA 9.2 example from an edited mfaktc makefile ended at CC 7.0; I couldn't tell whether you added CC 7.5 for your CUDA 10.1 build. From the NVIDIA "Turing Compatibility Guide" available at the documentation url for the CUDA 10 SDK:
[CODE]nvcc.exe -ccbin "C:\vs2010\VC\bin" -Xcompiler "/EHsc /W3 /nologo /O2 /Zi /MT" -gencode=arch=compute_50,code=sm_50 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_75,code=compute_75 --compile -o "Release\mykernel.cu.obj" "mykernel.cu"[/CODE]
Note the last -gencode entry, which looks at a casual glance to be duplication, but isn't. As the Turing compatibility guide explains (as did previous ones), code=compute_xx is PTX (future-proofing), while code=sm_xx is cubin, which is compute-capability specific. (There's a just-in-time final compile of PTX to cubin, if I've understood correctly.) Some do these compiles from batch files, for example:
[CODE]nvcc -ccbin "C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\bin" -cubin -DWIN32 -D_CONSOLE -D_MBCS -Xcompiler /EHsc,/W3,/nologo,/Wp64,/O2,/Zi,/MT -I"C:\CUDA\include" -I./ -I"C:\Program Files\NVIDIA Corporation\NVIDIA SDK 10\NVIDIA CUDA SDK\common\inc" %1
:I use it from command line like this : runnvcccubin.bat file_name.cu
:You may need to change the Visual Studio and NVIDIA SDK directories to make it work in your environment.
:https://devtalk.nvidia.com/default/topic/368105/cuda-occupancy-calculator-helps-pick-optimal-thread-block-size/[/CODE]
Not sure if that's helpful; see the included url for context. The best reference I've found yet for correlating CC, CUDA version, gpu family, gpu model, etc. is [URL]https://en.wikipedia.org/wiki/CUDA[/URL]
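To make the sm_xx vs compute_xx distinction concrete, here is a small hypothetical shell sketch (the gencode strings are just examples) that classifies -gencode targets the way the driver treats them:

```shell
#!/bin/sh
# Hypothetical sketch: classify -gencode targets as cubin vs PTX.
# code=sm_xx      -> cubin, machine code for that compute capability
# code=compute_xx -> PTX, JIT-compiled by the driver on newer GPUs
GENCODES="arch=compute_70,code=sm_70 arch=compute_75,code=sm_75 arch=compute_75,code=compute_75"
for g in $GENCODES; do
  case "$g" in
    *,code=sm_*)      echo "$g => cubin (CC-specific)" ;;
    *,code=compute_*) echo "$g => PTX (future-proofing via JIT)" ;;
  esac
done
```

This is why the doubled compute_75 line in the Turing guide's example isn't redundant: one entry embeds a cubin, the other embeds PTX.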
Rebuilt the binary (same URL) with those CCs:
[C] -gencode=arch=compute_50,code=sm_50 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_75,code=compute_75[/C]
Card: GeForce 1050 Ti
OS: Win7 x64, latest Nvidia drivers. I needed to download cufft64_10.dll to make CUDAPm1 work. The DLL is 110 MB.
You can find all the CUDA DLLs you need at
[url]https://download.mersenne.ca/CUDA-DLLs[/url] The DLLs are confusingly named, but you're looking for the CUDA v10.1 versions: [url]https://download.mersenne.ca/CUDA-DLLs/CUDA-10.1[/url]
[QUOTE=Cubox;511578]Rebuild the binary (same URL) with those CCs:
[C] -gencode=arch=compute_50,code=sm_50 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_61,code=sm_61 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_75,code=compute_75[/C][/QUOTE] I didn't mean for you to drop 3.0 and 3.5 from the 10.1 build; that was just a verbatim copy/paste from the reference. Do you plan to do a CUDA 8 SDK build going back to CC 2.0? Something like:
[CODE]#CUDA SDK 8.0 cc 2.0-6.1 +PTX, requires compatible drivers and dlls
# CC 1.x unsupported; other omitted CC steps: 2.1, 3.2, 3.7, 5.3, 6.0, 6.2, 7.2
NVCCFLAGS += --generate-code arch=compute_20,code=sm_20 # CC 2.x GPUs will use this code; cc2.0 has 32 alu lanes (int & single precision)
# NVCCFLAGS += --generate-code arch=compute_21,code=sm_21 # CC 2.1 GPUs could use this code in SP apps like mfaktc; cc2.1 has 48 alu lanes
NVCCFLAGS += --generate-code arch=compute_30,code=sm_30 # all CC 3.x GPUs _COULD_ use this code
NVCCFLAGS += --generate-code arch=compute_35,code=sm_35 # but CC 3.5 (3.2?) _CAN_ use funnel shift which is useful for mfaktc
NVCCFLAGS += --generate-code arch=compute_50,code=sm_50 # CC 5.x GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_52,code=sm_52 # CC 5.2 GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_61,code=sm_61 # CC 6.1+ GPUs will use this code, GTX 10xx for example
NVCCFLAGS += --generate-code arch=compute_61,code=compute_61 # future-proof with PTX, eg CC 7.x+ GPUs will use this code[/CODE]
[CODE]#CUDA SDK 10.x cc 3.0-7.5 +PTX, requires compatible drivers and dlls
# CC 2.x and lower unsupported; other omitted CC steps: 3.2, 3.7, 5.3, 6.0, 6.2, 7.2
NVCCFLAGS += --generate-code arch=compute_30,code=sm_30 # all CC 3.x GPUs _COULD_ use this code
NVCCFLAGS += --generate-code arch=compute_35,code=sm_35 # but CC 3.5 (3.2?) _CAN_ use funnel shift which is useful for mfaktc
NVCCFLAGS += --generate-code arch=compute_50,code=sm_50 # CC 5.x GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_52,code=sm_52 # CC 5.2 GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_61,code=sm_61 # CC 6.1+ GPUs will use this code, GTX 10xx for example
NVCCFLAGS += --generate-code arch=compute_70,code=sm_70 # CC 7.x GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_75,code=sm_75 # CC 7.5+ GPUs will use this code, RTX 20xx for example
NVCCFLAGS += --generate-code arch=compute_75,code=compute_75 # future-proof with PTX, eg CC 8.x+ GPUs will use this code[/CODE]
An older CUDAPm1 makefile I had here went all the way back to CC 1.3, but I think there are few if any of those cards left.
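A detail worth keeping in mind with such lists: a cubin built for sm_X.y only runs on devices of the same major CC X with minor >= y; anything else must fall back to JIT from embedded PTX (and the PTX arch must not exceed the device CC either). A hypothetical shell sketch of that compatibility rule, using a made-up CC 6.0 device against roughly the CUDA 8 target list:

```shell
#!/bin/sh
# Hypothetical sketch of cubin binary compatibility: an sm_X.y cubin
# runs only on devices of the SAME major CC X with minor >= y.
EMBEDDED="30 35 50 52 61"   # sm targets baked into the exe
DEVICE_CC=60                # e.g. a hypothetical CC 6.0 card
dev_major=$((DEVICE_CC / 10)); dev_minor=$((DEVICE_CC % 10))
best=""
for cc in $EMBEDDED; do
  major=$((cc / 10)); minor=$((cc % 10))
  # keep the highest compatible target (list is ascending)
  if [ "$major" -eq "$dev_major" ] && [ "$minor" -le "$dev_minor" ]; then
    best=$cc
  fi
done
if [ -n "$best" ]; then
  echo "device CC ${DEVICE_CC}: runs sm_${best} cubin"
else
  echo "device CC ${DEVICE_CC}: no compatible cubin; driver must JIT from embedded PTX"
fi
```

With this list a CC 6.0 card gets no cubin (sm_52 is the wrong major version, sm_61 is too new), which is exactly why makefiles enumerate so many targets per architecture.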
Added CC 3.0 and 3.5 to the latest build (same URL).
Flags are: [C]compute_30,sm_30 compute_35,sm_35 compute_50,sm_50 compute_52,sm_52 compute_60,sm_60 compute_61,sm_61 compute_70,sm_70 compute_75,sm_75 compute_75,compute_75[/C]
[url]https://cubox.dev/files/gimps/[/url] contains the latest exe, the DLLs required to run the program, and the .ini you can use. Regarding CUDA 8: it does not support VS2017, only VS2015. Supporting it is a lot of work, and I'd rather spend the time actually editing the code.
Many thanks to all of the developers of CUDAPm1. I have v0.22 running an assignment on my dinky GT 1030. First P-1 assignment in years for me.