mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Msieve (https://www.mersenneforum.org/forumdisplay.php?f=83)
-   -   CUDA 9 and sm_20 (https://www.mersenneforum.org/showthread.php?t=22866)

Happy5214 2021-01-28 02:34

[QUOTE=Brian Gladman;570224]Anyone who is having problems in building Windows x64 CUDA versions of MSIEVE or GMP-ECM using Visual Studio 2019 is welcome to report their issues here and I will do what I can to help.[/QUOTE]
FWIW I think xilman's post is a more appropriate quote than mine, since I was describing a Kubuntu (Linux) computer. Do you have any advice for us Linux users?

Gimarel 2021-01-28 06:31

[QUOTE=Happy5214;570302]FWIW I think xilman's post is a more appropriate quote than mine, since I was describing a Kubuntu (Linux) computer. Do you have any advice for us Linux users?[/QUOTE]
I'm using Debian and an RTX 2060 Super.
First you have to remove the copy of cub bundled with msieve and use the one that ships with CUDA. On Debian I ran these commands in the msieve directory:
[CODE]cd cub
rm -rf cub
ln -s /usr/include/cub[/CODE]Then you need the following patch:
[CODE]Index: Makefile
===================================================================
--- Makefile (revision 1037)
+++ Makefile (working copy)
@@ -34,6 +34,9 @@
-DMSIEVE_SVN_VERSION="\"$(SVN_VERSION)\"" \
-I. -Iaprcl -Iinclude -Ignfs -Ignfs/poly -Ignfs/poly/stage1

+CUDA = 1
+NO_ZLIB = 1
+
# tweak the compile flags

ifeq ($(ECM),1)
@@ -197,10 +200,7 @@
#---------------------------------- GPU file lists -------------------------

GPU_OBJS += \
- stage1_core_sm20.ptx \
- stage1_core_sm30.ptx \
- stage1_core_sm35.ptx \
- stage1_core_sm50.ptx \
+ stage1_core_sm75.ptx \
cub/built

#---------------------------------- NFS file lists -------------------------
@@ -346,5 +346,8 @@
stage1_core_sm50.ptx: $(NFS_GPU_HDR)
$(NVCC) -arch sm_50 -ptx -o $@ $<

+stage1_core_sm75.ptx: $(NFS_GPU_HDR)
+ $(NVCC) -arch sm_75 -ptx -o $@ $<
+
cub/built:
- cd cub && make WIN=$(WIN) WIN64=$(WIN64) sm=200,300,350,520 && cd ..
+ cd cub && make WIN=$(WIN) WIN64=$(WIN64) sm=750 && cd ..
Index: cub/Makefile
===================================================================
--- cub/Makefile (revision 1037)
+++ cub/Makefile (working copy)
@@ -16,7 +16,7 @@
NVCC = "$(shell which nvcc)"
CUDA_ROOT = $(shell dirname $(NVCC))/../
EXT = so
- NVCCFLAGS += -Xptxas -v -Xcudafe -\# -shared -Xptxas -abi=no \
+ NVCCFLAGS += -Xptxas -v -Xcudafe -\# -shared \
-Xcompiler -fPIC -Xcompiler -fvisibility=hidden
endif

@@ -27,6 +27,10 @@
SM_ARCH = 200
endif

+ifeq (750, $(findstring 750, $(SM_ARCH)))
+ SM_TARGETS += -gencode=arch=compute_75,code=\"sm_75,compute_75\"
+ SM_DEF += -DSM750
+endif
ifeq (520, $(findstring 520, $(SM_ARCH)))
SM_TARGETS += -gencode=arch=compute_52,code=\"sm_52,compute_52\"
SM_DEF += -DSM520
Index: gnfs/poly/stage1/stage1_sieve_gpu.c
===================================================================
--- gnfs/poly/stage1/stage1_sieve_gpu.c (revision 1037)
+++ gnfs/poly/stage1/stage1_sieve_gpu.c (working copy)
@@ -1113,7 +1113,7 @@
CUDA_TRY(cuModuleLoad(&t->gpu_module, "stage1_core_sm35.ptx"))
}
else if (d->gpu_info->compute_version_major >= 5) {
- CUDA_TRY(cuModuleLoad(&t->gpu_module, "stage1_core_sm50.ptx"))
+ CUDA_TRY(cuModuleLoad(&t->gpu_module, "stage1_core_sm75.ptx"))
}
else
{
[/CODE]Note that the msieve built this way only works with CUDA arch 7.5.
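The patched loader requests stage1_core_sm75.ptx for every card with major version 5 or higher, and the Makefile now builds only that one file. A more general selection could be sketched as below. This is only an illustration: `stage1_ptx_for` is a hypothetical helper, not an msieve function, and it assumes the Makefile still builds a PTX file for each listed architecture that the installed toolkit supports.

```c
#include <stddef.h>

/* Hypothetical helper (not part of msieve): choose the PTX file for a
   device with compute capability major.minor. The file names are the ones
   that appear in the patch above; the newest file that does not exceed the
   device's capability is picked. Returns NULL if nothing fits. */
const char *stage1_ptx_for(int major, int minor)
{
    int cc = 10 * major + minor;   /* e.g. 7.5 -> 75 */

    if (cc >= 75) return "stage1_core_sm75.ptx";
    if (cc >= 50) return "stage1_core_sm50.ptx";
    if (cc >= 35) return "stage1_core_sm35.ptx";
    if (cc >= 30) return "stage1_core_sm30.ptx";
    if (cc >= 20) return "stage1_core_sm20.ptx";
    return NULL;
}
```

The returned name would then be passed to cuModuleLoad in place of the hard-coded string, with the NULL case handled as an unsupported card.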

Gimarel 2021-01-28 06:47

For ecm to compile with current CUDA I had to apply this patch:
[CODE]Index: cudakernel_default.cu
===================================================================
--- cudakernel_default.cu (revision 3092)
+++ cudakernel_default.cu (working copy)
@@ -7,7 +7,7 @@
carry_t cytemp;
unsigned int thm1;

- while(__any(cy[threadIdx.x])!=0)
+ while(__any_sync(__activemask(), cy[threadIdx.x])!=0)
{
thm1 = (threadIdx.x - 1) % ECM_GPU_NB_DIGITS;
cytemp = cy[thm1];
[/CODE]I don't know for sure if this is correct, but it works.

Happy5214 2021-01-28 09:28

Thank you, thank you, thank you! They both build now. yafu doesn't build, though. I'll report that to Ben.

firejuggler 2021-01-28 10:06

3 Attachment(s)
I do have the PTX but I still have trouble with the sort_engine...

Edit: added a second screenshot.

Gimarel 2021-01-28 10:25

I think the 1660 Ti needs a different shader model, but I don't know which one.

firejuggler 2021-01-28 10:37

A quick Google search tells me it is shader model 6.5... so should I change the sm_arch to 65?

Gimarel 2021-01-28 15:38

[QUOTE=firejuggler;570323]a quick google search tell me it is shader model 6.5... so I should modify the sm_arch to 65?[/QUOTE]
The CUDA Samples include a deviceQuery program. With CUDA 11.1, its build uses these options:
[CODE]-gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86[/CODE]
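The "shader model" figure from a web search is a Direct3D term and does not map onto nvcc's sm_XX values. What matters is the CUDA compute capability, which deviceQuery prints as "CUDA Capability Major/Minor version number". As far as NVIDIA's published tables go, the GTX 1660 Ti is a Turing card reporting 7.5, but it is worth confirming on the actual machine. Turning that pair into a -gencode option can be sketched as follows; `gencode_flag` is a hypothetical helper for illustration, not part of any build system mentioned here.

```c
#include <stdio.h>

/* Hypothetical helper: format the nvcc -gencode option for a device with
   compute capability major.minor, in the same form as the deviceQuery
   build options quoted above. Returns the snprintf result, i.e. the number
   of characters the full string needs (excluding the terminating NUL). */
int gencode_flag(int major, int minor, char *buf, size_t buflen)
{
    int cc = 10 * major + minor;   /* e.g. 7.5 -> 75 */

    return snprintf(buf, buflen,
                    "-gencode arch=compute_%d,code=sm_%d", cc, cc);
}
```

For a card reporting 7.5 this yields the compute_75/sm_75 pair that already appears in the list above.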

bsquared 2021-06-08 16:28

[QUOTE=Brian Gladman;570224]Anyone who is having problems in building Windows x64 CUDA versions of MSIEVE or GMP-ECM using Visual Studio 2019 is welcome to report their issues here and I will do what I can to help.[/QUOTE]

After adjusting the project files to point to my NVIDIA CUDA Toolkit location, I just built msieve with compute_75. I have an up-to-date sort_engine.dll and stage1_core_sm75.ptx file. But when I try to run msieve-gpu.exe, it gives me this error:

[CODE]Msieve v. 1.54 (SVN 998)
Tue Jun 8 11:24:02 2021
random seeds: 2f8aea7c 74608973
factoring 138924029959401366454963864059579437250850355925904953363654825080008713183159095653855715163496880698665441863162263 (117 digits)
searching for 15-digit factors
commencing number field sieve (117-digit input)
commencing number field sieve polynomial selection
polynomial degree: 5
max stage 1 norm: 2.24e+18
max stage 2 norm: 7.93e+14
min E-value: 4.22e-10
poly select deadline: 6326
time limit set to 1.76 CPU-hours
expecting poly E from 5.27e-10 to > 6.06e-10
searching leading coefficients from 1 to 213804
using GPU 0 (TITAN RTX)
selected card has CUDA arch 7.5
deadline: 6326 CPU-seconds per coefficient
error (line 1116): CUDA_ERROR_FILE_NOT_FOUND[/CODE]

I'm not sure what file it's not finding or where it's not finding it.

Gimarel 2021-06-08 16:41

My SVN revision is a bit newer, but the solution is the same:
[CODE]Index: gnfs/poly/stage1/stage1_sieve_gpu.c
===================================================================
--- gnfs/poly/stage1/stage1_sieve_gpu.c (revision 1030)
+++ gnfs/poly/stage1/stage1_sieve_gpu.c (working copy)
@@ -1113,7 +1113,7 @@
CUDA_TRY(cuModuleLoad(&t->gpu_module, "stage1_core_sm35.ptx"))
}
else if (d->gpu_info->compute_version_major >= 5) {
- CUDA_TRY(cuModuleLoad(&t->gpu_module, "stage1_core_sm50.ptx"))
+ CUDA_TRY(cuModuleLoad(&t->gpu_module, "stage1_core_sm75.ptx"))
}
else
{
[/CODE]
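The "line 1116" in the error message appears to correspond to the cuModuleLoad call for stage1_core_sm50.ptx in this hunk, a file that a compute_75-only build never produces. Since the file name is passed without a directory, it is presumably resolved against the current working directory. A pre-flight check along the following lines would make the missing file obvious; `ptx_is_readable` is a hypothetical helper, not something msieve provides.

```c
#include <stdio.h>

/* Hypothetical pre-flight check (not part of msieve): returns 1 if `path`
   can be opened for reading, 0 otherwise. A relative path is resolved
   against the process's current working directory by fopen. */
int ptx_is_readable(const char *path)
{
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return 0;
    fclose(f);
    return 1;
}
```

Calling it with the PTX name just before cuModuleLoad, and printing that name when it returns 0, would show which file is missing instead of only CUDA_ERROR_FILE_NOT_FOUND.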

bsquared 2021-06-08 16:59

Working now, thanks!


All times are UTC.