#650 |
|
Jul 2009
Tokyo
2·5·61 Posts |
Ver 1.41
Fixes a minor bug.
|
|
|
|
|
#651 |
|
Jul 2009
Tokyo
2×5×61 Posts |
1.41 binary file for Linux64.
|
|
|
|
|
|
#652 |
|
Dec 2009
Peine, Germany
331 Posts |
I can now successfully compile a 32 bit version but not a 64 bit one.
What I have done so far:
32 bit:
1. Installed a WinXP 32 bit VM
2. Installed the Nvidia GPU Toolkit and GPU SDK (e.g. version 4.0)
3. Installed Make for Windows
4. Installed MS Visual Studio 2010 Express Edition
5. Set the Path for nvcc, make and cl.exe (from VS/bin)
6. Adapted makefiles taken from apsen's 1.2b for my needs. (He posted several sets for CUDA 3.2/4.0 and 32/64 bit.)
7. Compile with make ... --> SUCCESS. But the test with exponent 216091 failed due to a wrong residue.
64 bit:
1. Used my real Win7 64 bit
2.-6. As for 32 bit, with the exception of the special 64 bit flags (-m64 and -Dx64, see apsen's makefiles)
7. Compile with make ... --> FAILED:
Code:
nvcc fatal due to (null) configuration file
For now, I have to give up. So again, could anybody with a real MS VS (2010) edition please compile CUDALucas, or does anybody have a suggestion for me?
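Step 5 above (nvcc, make and cl.exe on the Path) is easy to get wrong, so a quick check before running make may save some time. The following is only a minimal sketch in Python; the helper names `find_tools` and `missing_tools` are made up for illustration and are not part of CUDALucas or the CUDA toolkit.

```python
import shutil

# Tool names taken from step 5 above; "cl" is the MSVC compiler driver (cl.exe).
REQUIRED_TOOLS = ("nvcc", "make", "cl")


def find_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names=REQUIRED_TOOLS):
    """Return the tool names that could not be resolved on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]


if __name__ == "__main__":
    # Print one line per tool so a missing entry is obvious at a glance.
    for name, path in find_tools().items():
        print(f"{name:5s} -> {path or 'NOT FOUND'}")
```

Running it from the same command prompt that will run make matters, since the Visual Studio prompts set up their own Path.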
|
|
|
|
|
|
#653 |
|
Dec 2009
Peine, Germany
331 Posts |
To add for step 7:
For 32 bit, I had to call vcvars32.bat first (this solved the missing mspdb100.dll error). For 64 bit, I have nothing locally, only vcvarsall.bat with dead references to the other batch files. My main wish is to be able to compile CUDALucas for Windows myself without spending money...
|
|
|
|
|
#654 |
|
Dec 2009
Peine, Germany
331 Posts |
New day, new try. I'm still trying to compile CUDALucas 1.4.1 with CUDA 4.0 for Win64.
First of all, I've installed the Visual Studio 2010 Professional Trial. The former error ('null') has vanished. Then I found apsen's compile readme, which is attached as well. As a consequence, I switched to the Visual Studio x64 Command Prompt.
1. I made some changes to my makefile: I changed my includes to "-I$(CUDA)/include" "-I$(CUDA)/include/cudart".
Makefile:
Code:
CUDA_VERSION = 4.0
CUDA_ARCH = sm_21
BIT = WIN64
CUDA_BIT = x64
CUDA = C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v$(CUDA_VERSION)
NVIDIA_SDK = C:/ProgramData/NVIDIA Corporation/NVIDIA GPU Computing SDK $(CUDA_VERSION)
VCINSTALLDIR = C:/Program Files (x86)/Microsoft Visual Studio 10.0/VC
LIBS = "$(CUDA)/lib/$(CUDA_BIT)/cudart.lib" "$(CUDA)/lib/$(CUDA_BIT)/cufft.lib"
CUFLAGS = -m64 --ptxas-options=-v "-ccbin=$(VCINSTALLDIR)/bin" -D$(BIT) -Xcompiler /EHsc,/W3,/nologo,/Ox,/Oy,/GL -arch=$(CUDA_ARCH) -DMERS_PACKAGE -DBIT_SIEVE -DTESTING_SMALL_EXPONENTS -DSIEVE_SIZE_IN_BYTES=32 -DNUM_SMALL_PRIMES=32768 -DDO_NOT_USE_LONG_DOUBLE "-I$(CUDA)/include" "-I$(CUDA)/include/cudart" "-I$(NVIDIA_SDK)/C/common/inc" -D__x86_64_ -O3
LINK = link
LFLAGS = /nologo /LTCG #/ltcg:pgo
CUSRC = CUDALucas.cu setup.cu rw.cu balance.cu zero.cu
CUOBJS = $(CUSRC:.cu=.obj)
CUDALucas.exe: $(CUOBJS)
	$(LINK) $(LFLAGS) $^ $(LIBS) /out:$@
%.obj: %.cu
	nvcc -c $< -o $@ $(CUFLAGS)
2. If I try to compile with these includes, I get lots of errors, see below.
3. If I try to compile without these includes, it compiles successfully (see below) but does not run correctly, as the residues mismatch, see below.
Compile failing with includes:
Code:
nvcc -c CUDALucas.cu -o CUDALucas.obj -m64 --ptxas-options=-v "-ccbin=C:/Program Files (x86)/Microsoft Visual Studio 10.0/VC/bin" -DWIN64 -Xcompiler /EHsc,/W3,/nologo,/Ox,/Oy,/GL -arch=sm_21 -DMERS_PACKAGE -DBIT_SIEVE -DTESTING_SMALL_EXPONENTS -DSIEVE_SIZE_IN_BYTES=32 -DNUM_SMALL_PRIMES=32768 -DDO_NOT_USE_LONG_DOUBLE "-IC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v4.0/include" "-IC:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v4.0/include/cudart" "-IC:/ProgramData/NVIDIA Corporation/NVIDIA GPU Computing SDK 4.0/C/common/inc" -D__x86_64_ -O3
c:\program files\nvidia gpu computing toolkit\cuda\v4.0\include\driver_types.h(387): error: "cudaErrorSetOnActiveProcess" has already been declared in the current scope
[...many more...]
c:\program files\nvidia gpu computing toolkit\cuda\v4.0\include\driver_types.h(809): error: "cudaComputeMode" has already been declared in the current scope
Error limit reached.
100 errors detected in the compilation of "C:/Users/FAMILI~1/AppData/Local/Temp/tmpxft_000013c4_00000000-7_CUDALucas.cpp2.i".
Compilation terminated.
make: *** [CUDALucas.obj] Fehler 4
Code:
nvcc -c CUDALucas.cu -o CUDALucas.obj -m64 --ptxas-options=-v "-ccbin=C:/Program Files (x86)/Microsoft Visual Studio 10.0/VC/bin" -DWIN64 -Xcompiler /EHsc,/W3,/nologo,/Ox,/Oy,/GL -arch=sm_21 -DMERS_PACKAGE -DBIT_SIEVE -DTESTING_SMALL_EXPONENTS -DSIEVE_SIZE_IN_BYTES=32 -DNUM_SMALL_PRIMES=32768 -DDO_NOT_USE_LONG_DOUBLE "-IC:/ProgramData/NVIDIA Corporation/NVIDIA GPU Computing SDK 4.0/C/common/inc" -D__x86_64_ -O3
tmpxft_000010bc_00000000-14_CUDALucas.ii
CUDALucas.cu(602) : warning C4018: '<' : signed/unsigned mismatch
CUDALucas.cu(609) : warning C4018: '<' : signed/unsigned mismatch
CUDALucas.cu(669) : warning C4018: '>=' : signed/unsigned mismatch
CUDALucas.cu(793) : warning C4244: 'argument' : conversion from 'float' to 'size_t', possible loss of data
nvcc -c setup.cu -o setup.obj -m64 --ptxas-options=-v "-ccbin=C:/Program Files (x86)/Microsoft Visual Studio 10.0/VC/bin" -DWIN64 -Xcompiler /EHsc,/W3,/nologo,/Ox,/Oy,/GL -arch=sm_21 -DMERS_PACKAGE -DBIT_SIEVE -DTESTING_SMALL_EXPONENTS -DSIEVE_SIZE_IN_BYTES=32 -DNUM_SMALL_PRIMES=32768 -DDO_NOT_USE_LONG_DOUBLE "-IC:/ProgramData/NVIDIA Corporation/NVIDIA GPU Computing SDK 4.0/C/common/inc" -D__x86_64_ -O3
tmpxft_000002c0_00000000-14_setup.ii
nvcc -c rw.cu -o rw.obj -m64 --ptxas-options=-v "-ccbin=C:/Program Files (x86)/Microsoft Visual Studio 10.0/VC/bin" -DWIN64 -Xcompiler /EHsc,/W3,/nologo,/Ox,/Oy,/GL -arch=sm_21 -DMERS_PACKAGE -DBIT_SIEVE -DTESTING_SMALL_EXPONENTS -DSIEVE_SIZE_IN_BYTES=32 -DNUM_SMALL_PRIMES=32768 -DDO_NOT_USE_LONG_DOUBLE "-IC:/ProgramData/NVIDIA Corporation/NVIDIA GPU Computing SDK 4.0/C/common/inc" -D__x86_64_ -O3
tmpxft_0000015c_00000000-14_rw.ii
rw.cu(1479) : warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data
rw.cu(1479) : warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data
rw.cu(1479) : warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data
rw.cu(1479) : warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data
rw.cu(1479) : warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data
rw.cu(1479) : warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data
rw.cu(1615) : warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data
rw.cu(1622) : warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data
rw.cu(1629) : warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data
rw.cu(1637) : warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data
rw.cu(1645) : warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data
rw.cu(1653) : warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data
rw.cu(1826) : warning C4018: '>' : signed/unsigned mismatch
rw.cu(1917) : warning C4018: '>' : signed/unsigned mismatch
nvcc -c balance.cu -o balance.obj -m64 --ptxas-options=-v "-ccbin=C:/Program Files (x86)/Microsoft Visual Studio 10.0/VC/bin" -DWIN64 -Xcompiler /EHsc,/W3,/nologo,/Ox,/Oy,/GL -arch=sm_21 -DMERS_PACKAGE -DBIT_SIEVE -DTESTING_SMALL_EXPONENTS -DSIEVE_SIZE_IN_BYTES=32 -DNUM_SMALL_PRIMES=32768 -DDO_NOT_USE_LONG_DOUBLE "-IC:/ProgramData/NVIDIA Corporation/NVIDIA GPU Computing SDK 4.0/C/common/inc" -D__x86_64_ -O3
tmpxft_00001048_00000000-14_balance.ii
nvcc -c zero.cu -o zero.obj -m64 --ptxas-options=-v "-ccbin=C:/Program Files (x86)/Microsoft Visual Studio 10.0/VC/bin" -DWIN64 -Xcompiler /EHsc,/W3,/nologo,/Ox,/Oy,/GL -arch=sm_21 -DMERS_PACKAGE -DBIT_SIEVE -DTESTING_SMALL_EXPONENTS -DSIEVE_SIZE_IN_BYTES=32 -DNUM_SMALL_PRIMES=32768 -DDO_NOT_USE_LONG_DOUBLE "-IC:/ProgramData/NVIDIA Corporation/NVIDIA GPU Computing SDK 4.0/C/common/inc" -D__x86_64_ -O3
tmpxft_00001244_00000000-14_zero.ii
link /nologo /LTCG CUDALucas.obj setup.obj rw.obj balance.obj zero.obj "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v4.0/lib/x64/cudart.lib" "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v4.0/lib/x64/cufft.lib" /out:CUDALucas.exe
Generating code
Finished generating code
Code:
F:\Eigene Dateien\Computing\cudalucas.1.4.1\win64\CUDA4.0\sm_21>CUDALucas.exe 216091
Iteration 10000 2.5 msec/Iter M( 216091 )C, 0xfffffffffffffffe, n = 524288, CUDALucas v1.41
Iteration 20000 2.8 msec/Iter M( 216091 )C, 0xfffffffffffffffe, n = 524288, CUDALucas v1.41
Iteration 30000 2.5 msec/Iter M( 216091 )C, 0xfffffffffffffffe, n = 524288, CUDALucas v1.41
Iteration 40000 2.6 msec/Iter M( 216091 )C, 0xfffffffffffffffd, n = 524288, CUDALucas v1.41
Iteration 50000 2.6 msec/Iter M( 216091 )C, 0xfffffffffffffffe, n = 524288, CUDALucas v1.41
Iteration 60000 2.6 msec/Iter M( 216091 )C, 0xfffffffffffffffd, n = 524288, CUDALucas v1.41
Iteration 70000 2.6 msec/Iter M( 216091 )C, 0xfffffffffffffffe, n = 524288, CUDALucas v1.41
Iteration 80000 2.5 msec/Iter M( 216091 )C, 0xfffffffffffffffe, n = 524288, CUDALucas v1.41
Iteration 90000 2.6 msec/Iter M( 216091 )C, 0xfffffffffffffffd, n = 524288, CUDALucas v1.41
Iteration 100000 2.6 msec/Iter M( 216091 )C, 0xfffffffffffffffe, n = 524288, CUDALucas v1.41
Iteration 110000 2.6 msec/Iter M( 216091 )C, 0xfffffffffffffffd, n = 524288, CUDALucas v1.41
Iteration 120000 2.5 msec/Iter M( 216091 )C, 0xfffffffffffffffe, n = 524288, CUDALucas v1.41
Iteration 130000 2.5 msec/Iter M( 216091 )C, 0xfffffffffffffffe, n = 524288, CUDALucas v1.41
Iteration 140000 2.5 msec/Iter M( 216091 )C, 0xfffffffffffffffd, n = 524288, CUDALucas v1.41
Iteration 150000 2.5 msec/Iter M( 216091 )C, 0xfffffffffffffffd, n = 524288, CUDALucas v1.41
Iteration 160000 2.6 msec/Iter M( 216091 )C, 0xfffffffffffffffe, n = 524288, CUDALucas v1.41
Iteration 170000 2.5 msec/Iter M( 216091 )C, 0xfffffffffffffffd, n = 524288, CUDALucas v1.41
Iteration 180000 2.5 msec/Iter M( 216091 )C, 0xfffffffffffffffd, n = 524288, CUDALucas v1.41
Iteration 190000 2.6 msec/Iter M( 216091 )C, 0xfffffffffffffffe, n = 524288, CUDALucas v1.41
Iteration 200000 2.6 msec/Iter M( 216091 )C, 0xfffffffffffffffd, n = 524288, CUDALucas v1.41
Iteration 210000 2.6 msec/Iter M( 216091 )C, 0xfffffffffffffffd, n = 524288, CUDALucas v1.41
M( 216091 )C, 0xfffffffffffffffd, n = 524288, CUDALucas v1.41
CUDALucas: Could not find a checkpoint file to resume from
Any ideas?
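For readers wondering why this output is clearly wrong: M( 216091 ) is a known Mersenne prime, so a correct Lucas-Lehmer run has to end with a zero residue and report it as prime, as the working runs further down the thread do. A plain CPU version of the test can be sketched in a few lines of Python. This is only an illustration of the textbook recurrence, not how CUDALucas computes it (CUDALucas uses FFT multiplication on the GPU), and the function name `lucas_lehmer` is made up here.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def lucas_lehmer(p):
    """Return (is_prime, low64) for M(p) = 2**p - 1, where p is an odd prime.

    low64 is the low 64 bits of the final residue. How CUDALucas numbers its
    interim iterations is not assumed here, so only the final value is returned.
    """
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):          # p - 2 squarings, starting from s = 4
        s = (s * s - 2) % m
    return s == 0, s & MASK64


if __name__ == "__main__":
    for p in (7, 11, 13):
        print(p, lucas_lehmer(p))
```

As a side note, 0xfffffffffffffffe and 0xfffffffffffffffd are simply -2 and -3 written as 64-bit two's complement values. That may hint that the result of the squaring got lost somewhere in this build, but that is only a guess.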
|
|
|
|
|
#655 |
|
Dec 2009
Peine, Germany
331 Posts |
No comments please: my self-made makefile must have an error. I took apsen's one without modifications; the paths fit without changes. Now it seems to work:
GTX 560 Ti @ default clock:
Code:
F:\Eigene Dateien\Computing\cudalucas.1.4.1\win64\CUDA4.0\sm_13>CUDALucas.cuda4.0.sm_13.WIN64.exe 216091
Iteration 10000 2.4 msec/Iter M( 216091 )C, 0x30247786758b8792, n = 524288, CUDALucas v1.41
Iteration 20000 2.4 msec/Iter M( 216091 )C, 0x13e968bf40fda4d7, n = 524288, CUDALucas v1.41
Iteration 30000 2.4 msec/Iter M( 216091 )C, 0x540772c2abb7833a, n = 524288, CUDALucas v1.41
Iteration 40000 2.3 msec/Iter M( 216091 )C, 0xc26da9695ac418c1, n = 524288, CUDALucas v1.41
Iteration 50000 2.4 msec/Iter M( 216091 )C, 0x95ce3ff44abdd1e5, n = 524288, CUDALucas v1.41
Iteration 60000 2.5 msec/Iter M( 216091 )C, 0x99aa87c495daffe7, n = 524288, CUDALucas v1.41
Iteration 70000 2.4 msec/Iter M( 216091 )C, 0x505d249be3145893, n = 524288, CUDALucas v1.41
Iteration 80000 2.5 msec/Iter M( 216091 )C, 0xddf612c72037b8a1, n = 524288, CUDALucas v1.41
Iteration 90000 2.4 msec/Iter M( 216091 )C, 0xb5d8309a1ce9e2b6, n = 524288, CUDALucas v1.41
Iteration 100000 2.4 msec/Iter M( 216091 )C, 0x4de7f101ee1cb7a5, n = 524288, CUDALucas v1.41
Iteration 110000 2.5 msec/Iter M( 216091 )C, 0x10aa3286c0b03369, n = 524288, CUDALucas v1.41
Iteration 120000 2.4 msec/Iter M( 216091 )C, 0x3981b56788b529e2, n = 524288, CUDALucas v1.41
Iteration 130000 2.4 msec/Iter M( 216091 )C, 0x80438af231f8fccd, n = 524288, CUDALucas v1.41
Iteration 140000 2.4 msec/Iter M( 216091 )C, 0x669382faea06df89, n = 524288, CUDALucas v1.41
Iteration 150000 2.4 msec/Iter M( 216091 )C, 0x1b73cb121df7d6fa, n = 524288, CUDALucas v1.41
Iteration 160000 2.6 msec/Iter M( 216091 )C, 0xb391010f29c70ee1, n = 524288, CUDALucas v1.41
Iteration 170000 2.4 msec/Iter M( 216091 )C, 0x04055d84a77be1d8, n = 524288, CUDALucas v1.41
Iteration 180000 2.5 msec/Iter M( 216091 )C, 0xe3d74c104f02967d, n = 524288, CUDALucas v1.41
Iteration 190000 2.4 msec/Iter M( 216091 )C, 0x54b2a8b9cb149f9f, n = 524288, CUDALucas v1.41
Iteration 200000 2.4 msec/Iter M( 216091 )C, 0xf433496947b7b103, n = 524288, CUDALucas v1.41
Iteration 210000 2.4 msec/Iter M( 216091 )C, 0xcfe091c8f59f8a7b, n = 524288, CUDALucas v1.41
M( 216091 )P, n = 524288, CUDALucas v1.41
CUDALucas: Could not find a checkpoint file to resume from
Starting to test this compile...
P.S.: We should try the non-power-of-2 FFT sizes.
|
|
|
|
|
#656 |
|
Dec 2009
Peine, Germany
331 Posts |
And one compiled for Compute Capability 2.1 (sm_21) instead of 1.3. I don't know yet whether this is relevant or not. The file size differs by 3 KB, and the performance:
Code:
F:\Eigene Dateien\Computing\cudalucas.1.4.1\win64\CUDA4.0\sm_21>CUDALucas.cuda4.0.sm_21.WIN64.exe 216091
Iteration 10000 2.4 msec/Iter M( 216091 )C, 0x30247786758b8792, n = 524288, CUDALucas v1.41
Iteration 20000 2.3 msec/Iter M( 216091 )C, 0x13e968bf40fda4d7, n = 524288, CUDALucas v1.41
Iteration 30000 2.5 msec/Iter M( 216091 )C, 0x540772c2abb7833a, n = 524288, CUDALucas v1.41
Iteration 40000 2.4 msec/Iter M( 216091 )C, 0xc26da9695ac418c1, n = 524288, CUDALucas v1.41
Iteration 50000 2.4 msec/Iter M( 216091 )C, 0x95ce3ff44abdd1e5, n = 524288, CUDALucas v1.41
Iteration 60000 2.4 msec/Iter M( 216091 )C, 0x99aa87c495daffe7, n = 524288, CUDALucas v1.41
Iteration 70000 2.5 msec/Iter M( 216091 )C, 0x505d249be3145893, n = 524288, CUDALucas v1.41
Iteration 80000 2.5 msec/Iter M( 216091 )C, 0xddf612c72037b8a1, n = 524288, CUDALucas v1.41
Iteration 90000 2.4 msec/Iter M( 216091 )C, 0xb5d8309a1ce9e2b6, n = 524288, CUDALucas v1.41
Iteration 100000 2.4 msec/Iter M( 216091 )C, 0x4de7f101ee1cb7a5, n = 524288, CUDALucas v1.41
Iteration 110000 2.6 msec/Iter M( 216091 )C, 0x10aa3286c0b03369, n = 524288, CUDALucas v1.41
Iteration 120000 2.5 msec/Iter M( 216091 )C, 0x3981b56788b529e2, n = 524288, CUDALucas v1.41
Iteration 130000 2.4 msec/Iter M( 216091 )C, 0x80438af231f8fccd, n = 524288, CUDALucas v1.41
Iteration 140000 2.4 msec/Iter M( 216091 )C, 0x669382faea06df89, n = 524288, CUDALucas v1.41
Iteration 150000 2.4 msec/Iter M( 216091 )C, 0x1b73cb121df7d6fa, n = 524288, CUDALucas v1.41
Iteration 160000 2.4 msec/Iter M( 216091 )C, 0xb391010f29c70ee1, n = 524288, CUDALucas v1.41
Iteration 170000 2.4 msec/Iter M( 216091 )C, 0x04055d84a77be1d8, n = 524288, CUDALucas v1.41
Iteration 180000 2.4 msec/Iter M( 216091 )C, 0xe3d74c104f02967d, n = 524288, CUDALucas v1.41
Iteration 190000 2.4 msec/Iter M( 216091 )C, 0x54b2a8b9cb149f9f, n = 524288, CUDALucas v1.41
Iteration 200000 2.5 msec/Iter M( 216091 )C, 0xf433496947b7b103, n = 524288, CUDALucas v1.41
Iteration 210000 2.3 msec/Iter M( 216091 )C, 0xcfe091c8f59f8a7b, n = 524288, CUDALucas v1.41
M( 216091 )P, n = 524288, CUDALucas v1.41
CUDALucas: Could not find a checkpoint file to resume from
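Comparing two builds by eye gets tedious with this many interim residues. A small script can pull the (iteration, residue) pairs out of two saved logs and report the first iteration where they disagree. This is a rough sketch that assumes the log layout shown in this thread; the regular expression and the function names `residues` and `first_mismatch` are illustrative only.

```python
import re

# Matches e.g. "Iteration 10000 2.4 msec/Iter M( 216091 )C, 0x30247786758b8792,"
# and also the 1.2b layout, where no timing sits between the count and "M(".
RESIDUE_RE = re.compile(
    r"Iteration\s+(\d+)\b.*?M\(\s*\d+\s*\)C,\s*(0x[0-9a-fA-F]{16})"
)


def residues(log_text):
    """Return {iteration: residue} for every interim line found in the log."""
    return {int(it): res.lower() for it, res in RESIDUE_RE.findall(log_text)}


def first_mismatch(log_a, log_b):
    """Return the first shared iteration whose residues differ, or None."""
    a, b = residues(log_a), residues(log_b)
    for it in sorted(a.keys() & b.keys()):
        if a[it] != b[it]:
            return it
    return None


if __name__ == "__main__":
    good = "Iteration 10000 2.4 msec/Iter M( 216091 )C, 0x30247786758b8792, n = 524288, CUDALucas v1.41"
    bad = "Iteration 10000 2.5 msec/Iter M( 216091 )C, 0xfffffffffffffffe, n = 524288, CUDALucas v1.41"
    print(first_mismatch(good, good), first_mismatch(good, bad))
```

Matching residues between the sm_13 and sm_21 builds only show that the two builds agree with each other; a comparison against a known good run (e.g. 1.2b) is still needed.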
|
|
|
|
|
#657 | |
|
"Jerry"
Nov 2011
Vancouver, WA
46316 Posts |
Quote:
EDIT: The test works well. The command line changed from 1.2b: -t is not needed anymore, and to specify GPU 1 or 2 I had to use -D00 or -D01. I'll compare results to my old runs in a bit. Last fiddled with by flashjh on 2011-12-29 at 23:18
|
|
|
|
|
|
#658 | |
|
"Jerry"
Nov 2011
Vancouver, WA
46316 Posts |
Quote:
The 1.41 build is slower than 1.2b for me. I had to install CUDA 4.0 to get the latest .dll files. Times are below.
1.2b with nVidia 290.53:
Code:
C:\CUDA>cuda12b -t 216091
CUDALucas: Could not find a checkpoint file to resume from
Iteration 10000 M( 216091 )C, 0x30247786758b8792, n = 524288, CUDALucas v1.2b (0:13 real, 1.3126 ms/iter, ETA 4:22)
Iteration 20000 M( 216091 )C, 0x13e968bf40fda4d7, n = 524288, CUDALucas v1.2b (0:13 real, 1.2928 ms/iter, ETA 4:05)
Iteration 30000 M( 216091 )C, 0x540772c2abb7833a, n = 524288, CUDALucas v1.2b (0:13 real, 1.2923 ms/iter, ETA 3:52)
Iteration 40000 M( 216091 )C, 0xc26da9695ac418c1, n = 524288, CUDALucas v1.2b (0:13 real, 1.2925 ms/iter, ETA 3:39)
Iteration 50000 M( 216091 )C, 0x95ce3ff44abdd1e5, n = 524288, CUDALucas v1.2b (0:13 real, 1.2923 ms/iter, ETA 3:26)
Iteration 60000 M( 216091 )C, 0x99aa87c495daffe7, n = 524288, CUDALucas v1.2b (0:13 real, 1.2924 ms/iter, ETA 3:13)
Iteration 70000 M( 216091 )C, 0x505d249be3145893, n = 524288, CUDALucas v1.2b (0:13 real, 1.2924 ms/iter, ETA 3:00)
Iteration 80000 M( 216091 )C, 0xddf612c72037b8a1, n = 524288, CUDALucas v1.2b (0:13 real, 1.2920 ms/iter, ETA 2:47)
Iteration 90000 M( 216091 )C, 0xb5d8309a1ce9e2b6, n = 524288, CUDALucas v1.2b (0:13 real, 1.2923 ms/iter, ETA 2:35)
Iteration 100000 M( 216091 )C, 0x4de7f101ee1cb7a5, n = 524288, CUDALucas v1.2b (0:12 real, 1.2921 ms/iter, ETA 2:22)
Iteration 110000 M( 216091 )C, 0x10aa3286c0b03369, n = 524288, CUDALucas v1.2b (0:13 real, 1.2924 ms/iter, ETA 2:09)
Iteration 120000 M( 216091 )C, 0x3981b56788b529e2, n = 524288, CUDALucas v1.2b (0:13 real, 1.2924 ms/iter, ETA 1:56)
Iteration 130000 M( 216091 )C, 0x80438af231f8fccd, n = 524288, CUDALucas v1.2b (0:13 real, 1.2924 ms/iter, ETA 1:43)
Iteration 140000 M( 216091 )C, 0x669382faea06df89, n = 524288, CUDALucas v1.2b (0:13 real, 1.2925 ms/iter, ETA 1:30)
Iteration 150000 M( 216091 )C, 0x1b73cb121df7d6fa, n = 524288, CUDALucas v1.2b (0:13 real, 1.2923 ms/iter, ETA 1:17)
Iteration 160000 M( 216091 )C, 0xb391010f29c70ee1, n = 524288, CUDALucas v1.2b (0:13 real, 1.2952 ms/iter, ETA 1:04)
Iteration 170000 M( 216091 )C, 0x04055d84a77be1d8, n = 524288, CUDALucas v1.2b (0:13 real, 1.3023 ms/iter, ETA 0:52)
Iteration 180000 M( 216091 )C, 0xe3d74c104f02967d, n = 524288, CUDALucas v1.2b (0:13 real, 1.2959 ms/iter, ETA 0:38)
Iteration 190000 M( 216091 )C, 0x54b2a8b9cb149f9f, n = 524288, CUDALucas v1.2b (0:13 real, 1.2948 ms/iter, ETA 0:25)
Iteration 200000 M( 216091 )C, 0xf433496947b7b103, n = 524288, CUDALucas v1.2b (0:13 real, 1.2940 ms/iter, ETA 0:12)
Iteration 210000 M( 216091 )C, 0xcfe091c8f59f8a7b, n = 524288, CUDALucas v1.2b (0:13 real, 1.2941 ms/iter, ETA 0:00)
M( 216091 )P, n = 524288, CUDALucas v1.2b
no more input
Code:
C:\CUDA>cuda411 216091
Iteration 10000 1.5 msec/Iter M( 216091 )C, 0x30247786758b8792, n = 524288, CUDALucas v1.41
Iteration 20000 1.5 msec/Iter M( 216091 )C, 0x13e968bf40fda4d7, n = 524288, CUDALucas v1.41
Iteration 30000 1.5 msec/Iter M( 216091 )C, 0x540772c2abb7833a, n = 524288, CUDALucas v1.41
Iteration 40000 1.5 msec/Iter M( 216091 )C, 0xc26da9695ac418c1, n = 524288, CUDALucas v1.41
Iteration 50000 1.4 msec/Iter M( 216091 )C, 0x95ce3ff44abdd1e5, n = 524288, CUDALucas v1.41
Iteration 60000 1.5 msec/Iter M( 216091 )C, 0x99aa87c495daffe7, n = 524288, CUDALucas v1.41
Iteration 70000 1.5 msec/Iter M( 216091 )C, 0x505d249be3145893, n = 524288, CUDALucas v1.41
Iteration 80000 1.5 msec/Iter M( 216091 )C, 0xddf612c72037b8a1, n = 524288, CUDALucas v1.41
Iteration 90000 1.5 msec/Iter M( 216091 )C, 0xb5d8309a1ce9e2b6, n = 524288, CUDALucas v1.41
Iteration 100000 1.5 msec/Iter M( 216091 )C, 0x4de7f101ee1cb7a5, n = 524288, CUDALucas v1.41
Iteration 110000 1.5 msec/Iter M( 216091 )C, 0x10aa3286c0b03369, n = 524288, CUDALucas v1.41
Iteration 120000 1.4 msec/Iter M( 216091 )C, 0x3981b56788b529e2, n = 524288, CUDALucas v1.41
Iteration 130000 1.5 msec/Iter M( 216091 )C, 0x80438af231f8fccd, n = 524288, CUDALucas v1.41
Iteration 140000 1.5 msec/Iter M( 216091 )C, 0x669382faea06df89, n = 524288, CUDALucas v1.41
Iteration 150000 1.5 msec/Iter M( 216091 )C, 0x1b73cb121df7d6fa, n = 524288, CUDALucas v1.41
Iteration 160000 1.5 msec/Iter M( 216091 )C, 0xb391010f29c70ee1, n = 524288, CUDALucas v1.41
Iteration 170000 1.4 msec/Iter M( 216091 )C, 0x04055d84a77be1d8, n = 524288, CUDALucas v1.41
Iteration 180000 1.5 msec/Iter M( 216091 )C, 0xe3d74c104f02967d, n = 524288, CUDALucas v1.41
Iteration 190000 1.5 msec/Iter M( 216091 )C, 0x54b2a8b9cb149f9f, n = 524288, CUDALucas v1.41
Iteration 200000 1.5 msec/Iter M( 216091 )C, 0xf433496947b7b103, n = 524288, CUDALucas v1.41
Iteration 210000 1.4 msec/Iter M( 216091 )C, 0xcfe091c8f59f8a7b, n = 524288, CUDALucas v1.41
M( 216091 )P, n = 524288, CUDALucas v1.41
CUDALucas: Could not find a checkpoint file to resume from
Code:
C:\CUDA>cuda411 -D1 216091
Iteration 10000 1.5 msec/Iter M( 216091 )C, 0x30247786758b8792, n = 524288, CUDALucas v1.41
Iteration 20000 1.4 msec/Iter M( 216091 )C, 0x13e968bf40fda4d7, n = 524288, CUDALucas v1.41
Iteration 30000 1.4 msec/Iter M( 216091 )C, 0x540772c2abb7833a, n = 524288, CUDALucas v1.41
Iteration 40000 1.4 msec/Iter M( 216091 )C, 0xc26da9695ac418c1, n = 524288, CUDALucas v1.41
Iteration 50000 1.5 msec/Iter M( 216091 )C, 0x95ce3ff44abdd1e5, n = 524288, CUDALucas v1.41
Iteration 60000 1.4 msec/Iter M( 216091 )C, 0x99aa87c495daffe7, n = 524288, CUDALucas v1.41
Iteration 70000 1.4 msec/Iter M( 216091 )C, 0x505d249be3145893, n = 524288, CUDALucas v1.41
Iteration 80000 1.4 msec/Iter M( 216091 )C, 0xddf612c72037b8a1, n = 524288, CUDALucas v1.41
Iteration 90000 1.5 msec/Iter M( 216091 )C, 0xb5d8309a1ce9e2b6, n = 524288, CUDALucas v1.41
Iteration 100000 1.4 msec/Iter M( 216091 )C, 0x4de7f101ee1cb7a5, n = 524288, CUDALucas v1.41
Iteration 110000 1.4 msec/Iter M( 216091 )C, 0x10aa3286c0b03369, n = 524288, CUDALucas v1.41
Iteration 120000 1.5 msec/Iter M( 216091 )C, 0x3981b56788b529e2, n = 524288, CUDALucas v1.41
Iteration 130000 1.4 msec/Iter M( 216091 )C, 0x80438af231f8fccd, n = 524288, CUDALucas v1.41
Iteration 140000 1.4 msec/Iter M( 216091 )C, 0x669382faea06df89, n = 524288, CUDALucas v1.41
Iteration 150000 1.4 msec/Iter M( 216091 )C, 0x1b73cb121df7d6fa, n = 524288, CUDALucas v1.41
Iteration 160000 1.5 msec/Iter M( 216091 )C, 0xb391010f29c70ee1, n = 524288, CUDALucas v1.41
Iteration 170000 1.4 msec/Iter M( 216091 )C, 0x04055d84a77be1d8, n = 524288, CUDALucas v1.41
Iteration 180000 1.4 msec/Iter M( 216091 )C, 0xe3d74c104f02967d, n = 524288, CUDALucas v1.41
Iteration 190000 1.4 msec/Iter M( 216091 )C, 0x54b2a8b9cb149f9f, n = 524288, CUDALucas v1.41
Iteration 200000 1.5 msec/Iter M( 216091 )C, 0xf433496947b7b103, n = 524288, CUDALucas v1.41
Iteration 210000 1.4 msec/Iter M( 216091 )C, 0xcfe091c8f59f8a7b, n = 524288, CUDALucas v1.41
M( 216091 )P, n = 524288, CUDALucas v1.41
CUDALucas: Could not find a checkpoint file to resume from
Last fiddled with by flashjh on 2011-12-30 at 00:12
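To put a number on "slower", the per-iteration timings can be averaged from the saved logs. The sketch below handles both layouts seen in this thread ("1.2923 ms/iter" from 1.2b and "1.5 msec/Iter" from 1.41); the pattern and the function name `mean_ms_per_iter` are assumptions made for illustration.

```python
import re
from statistics import mean

# 1.41 prints "1.5 msec/Iter"; 1.2b prints "1.2923 ms/iter".
TIMING_RE = re.compile(r"(\d+(?:\.\d+)?)\s*ms(?:ec)?/iter", re.IGNORECASE)


def mean_ms_per_iter(log_text):
    """Average all per-iteration timings found in a log, in milliseconds."""
    values = [float(v) for v in TIMING_RE.findall(log_text)]
    if not values:
        raise ValueError("no timing entries found")
    return mean(values)


if __name__ == "__main__":
    old = "(0:13 real, 1.3126 ms/iter, ETA 4:22) (0:13 real, 1.2928 ms/iter, ETA 4:05)"
    new = "Iteration 10000 1.5 msec/Iter ... Iteration 50000 1.4 msec/Iter"
    print(mean_ms_per_iter(old), mean_ms_per_iter(new))
```

Keep in mind that 1.41 only prints one decimal place, so small differences between two 1.41 runs (for example with and without -D1) are within rounding.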
|
|
|
|
|
|
#659 | |
|
Dec 2009
Peine, Germany
5138 Posts |
Quote:
Code:
F:\Eigene Dateien\Computing\CUDALucas\cudalucas.1.4.1\src>make
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v3.2/bin/nvcc" -c CUDALucas.cu -o CUDALucas.cuda3.2.sm_13.WIN64.obj -m64 --ptxas-options=-v "-ccbin=C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\/bin" -DWIN64 -Xcompiler /EHsc,/W3,/nologo,/Ox,/Oy,/GL -arch=sm_13 -DMERS_PACKAGE -DBIT_SIEVE -DTESTING_SMALL_EXPONENTS -DSIEVE_SIZE_IN_BYTES=32 -DNUM_SMALL_PRIMES=32768 -DDO_NOT_USE_LONG_DOUBLE "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v3.2/include" "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v3.2/include/cudart" "-IC:/ProgramData/NVIDIA Corporation/NVIDIA GPU Computing SDK 3.2/C/common/inc" -D__x86_64__ -O3
nvcc fatal : nvcc cannot find a supported cl version. Only MSVC 8.0 and MSVC 9.0 are supported
make: *** [CUDALucas.cuda3.2.sm_13.WIN64.obj] Fehler -1
Maybe one of our precious CUDA experts (TheJudger...) has an opinion...? There have been attempts to speed CUDALucas up. User Ethan (EO) eliminated one cudaMemcpy() call, but that made no great difference. For me, it only became laggy. I cannot run the CUDA profiler (yet), but I'm wondering about the comparatively weak utilisation of 90%...
|
|
|
|
|
|
#660 |
|
Dec 2009
Peine, Germany
331 Posts |
Code:
F:\Eigene Dateien\Computing\CUDALucas\cudalucas.1.4.1\win64\CUDA4.0\sm_21>CUDALucas.cuda4.0.sm_21.WIN64.exe -c10000 44128291
err = 0.363031, increasing n from 2359296
Iteration 10000 11.6 msec/Iter M( 44128291 )C, 0xed435ed5c3d2f2d9, n = 2621440, CUDALucas v1.41
Iteration 20000 11.1 msec/Iter M( 44128291 )C, 0xa13d48abcc4f92bc, n = 2621440, CUDALucas v1.41
Iteration 30000 10.9 msec/Iter M( 44128291 )C, 0xa3d71bee2f8c9ebd, n = 2621440, CUDALucas v1.41
Iteration 40000 10.8 msec/Iter M( 44128291 )C, 0x8c175ce356b74bf2, n = 2621440, CUDALucas v1.41
... it now takes one full core of my Core i5-750 (the task demands ~20% of my 4 core CPU). Further tests show that this only seems to happen if the "-c" switch is given on the command line. *relieved*
Code:
F:\Eigene Dateien\Computing\CUDALucas\cudalucas.1.4.1\win64\CUDA4.0\sm_21>CUDALucas.cuda4.0.sm_21.WIN64.exe -c10000 37500083
CUDALucas: Could not find a checkpoint file to resume from
F:\Eigene Dateien\Computing\CUDALucas\cudalucas.1.4.1\win64\CUDA4.0\sm_21>CUDALucas.cuda4.0.sm_21.WIN64.exe 37500083
CUDALucas: Resuming from checkpoint file c37500083
caso 2
Iteration 10000 0.3 msec/Iter M( 37500083 )C, 0x1c79920a5816ec18, n = 2097152, CUDALucas v1.41
Iteration 20000 8.3 msec/Iter M( 37500083 )C, 0xd14e10268cf86636, n = 2097152, CUDALucas v1.41
F:\Eigene Dateien\Computing\CUDALucas\cudalucas.1.4.1\win64\CUDA4.0\sm_21>CUDALucas.cuda4.0.sm_21.WIN64.exe 37500083
CUDALucas: Resuming from checkpoint file c37500083
caso 2
Iteration 30000 2.4 msec/Iter M( 37500083 )C, 0x4a2607f9f4ed9248, n = 2097152, CUDALucas v1.41
Iteration 40000 8.4 msec/Iter M( 37500083 )C, 0x914ca98d0383db54, n = 2097152, CUDALucas v1.41
Iteration 50000 8.3 msec/Iter M( 37500083 )C, 0xc9bdc23854802d33, n = 2097152, CUDALucas v1.41
By the way, the first iteration time after each resume is so low because the run did not cover the complete 10000 iterations. Kind of a bug. Last but not least, utilisation for state-of-the-art exponents (40M range) is 97% as before. Low utilisation is understandable for small FFT sizes...
Last fiddled with by Brain on 2011-12-30 at 14:22 Reason: Last but not least, utilisation
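The too-low figures right after a resume (0.3 and 2.4 msec/Iter) would be explained if the elapsed time is divided by the full 10000-iteration interval even though fewer iterations were actually run since the checkpoint. That is an assumption about the cause, not something checked against the CUDALucas source. Dividing by the iterations really completed would look roughly like this in Python; the function name and the resume point 9640 in the example are hypothetical.

```python
def msec_per_iter(elapsed_msec, current_iter, start_iter, interval=10000):
    """Average time per iteration since the last report or the resume point.

    elapsed_msec: wall time since the previous report (or since the resume)
    current_iter: iteration count at this report
    start_iter:   iteration the run was resumed from (0 for a fresh run)
    interval:     iterations between reports
    """
    # Iteration at which the previous report would have been printed.
    last_report = (current_iter - 1) // interval * interval
    done = current_iter - max(start_iter, last_report)
    if done <= 0:
        raise ValueError("no iterations completed in this reporting window")
    return elapsed_msec / done


if __name__ == "__main__":
    # Hypothetical numbers: resumed at iteration 9640, report at 10000 after 3 s.
    print(3000.0 / 10000)                        # naive figure: 0.3
    print(msec_per_iter(3000.0, 10000, 9640))    # about 8.33
```

With these made-up numbers the naive division gives 0.3 msec/Iter, while the corrected one lands near the 8.3 msec/Iter seen on the following full intervals.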
|
|
|