[QUOTE=Brain;290955]I get the following:
[CODE]------------------------ Compile output for 1.58 ------------------------
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1/bin/nvcc" -c CUDALucas.cu -o CUDALucas.cuda4.1.sm_21.WIN64.obj -m64 --ptxas-options=-v "-ccbin=C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\/bin" -DWIN64 -Xcompiler /EHsc,/W3,/nologo,/Ox,/Oy,/GL -arch=sm_21 -DMERS_PACKAGE -DBIT_SIEVE -DTESTING_SMALL_EXPONENTS -DSIEVE_SIZE_IN_BYTES=32 -DNUM_SMALL_PRIMES=32768 -DDO_NOT_USE_LONG_DOUBLE "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1/include" "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1/include/cudart" "-IQ:\NVIDIA GPU Computing SDK 4.1/C/common/inc" -D__x86_64__ -O3
tmpxft_00000ca8_00000000-14_CUDALucas.ii
CUDALucas.cu(524) : warning C4244: 'argument' : conversion from 'float' to 'size_t', possible loss of data
CUDALucas.cu(845) : warning C4018: '<' : signed/unsigned mismatch
CUDALucas.cu(1359) : warning C4018: '<' : signed/unsigned mismatch
CUDALucas.cu(1560) : warning C4018: '<' : signed/unsigned mismatch
cl /Ox /Oy /GL /W4 /fp:fast /nologo /c /Tp timeval.c /Fotimeval.WIN64.obj
timeval.c
link /nologo /LTCG CUDALucas.cuda4.1.sm_21.WIN64.obj timeval.WIN64.obj "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1/lib/x64/cudart.lib" "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1/lib/x64/cufft.lib" /out:CUDALucas.cuda4.1.sm_21.WIN64.exe
Generating code
Finished generating code
[/CODE]
Ehm, hard to remember. The Nvidia tools set the path automatically; make was added manually, cl.exe probably also manually, not sure.
Path is now:
[CODE]------------------------ Path settings ------------------------
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\BIN\amd64;C:\Windows\Microsoft.NET\Framework64\v4.0.30319;C:\Windows\Microsoft.NET\Framework64\v3.5;C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\VCPackages;C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE;C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\Tools;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools\x64;C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\x64;C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\libnvvp\;C:\Program Files\Perl\site\bin;C:\Program Files\Perl\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.0\bin\;C:\Program Files\Common Files\Microsoft Shared\Windows Live;C:\Program Files (x86)\Common Files\Microsoft Shared\Windows Live;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files (x86)\Windows Live\Shared;C:\Program Files (x86)\QuickTime\QTSystem\;C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.0\C\common\bin;C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\DTS\Binn\;;C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.1\C\common\bin;C:\Program Files\TortoiseSVN\bin;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\libnvvp\;C:\Program Files\Perl\site\bin;C:\Program Files\Perl\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.0\bin\;C:\Program Files\Common Files\Microsoft Shared\Windows Live;C:\Program Files (x86)\Common Files\Microsoft Shared\Windows Live;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files (x86)\Windows Live\Shared;C:\Program Files (x86)\QuickTime\QTSystem\;C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.0\C\common\bin;C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\DTS\Binn\;;C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.1\C\common\bin;C:\Program Files\TortoiseSVN\bin;C:\Program Files (x86)\GnuWin32\bin;C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin
[/CODE]
Hard to read, I know. Me lazy.[/QUOTE]
No worries |
Updated code and versions required?
1 Attachment(s)
I updated the source to remove the warnings at lines 524, 845, 1359, and 1560. There's probably a setting to ignore them, but I'd rather not.
I included the source files this time. Testing shows a match with 1.49, but it's still not done. If anyone tests these, let me know how they work for you. Also, which versions are still needed/requested? Right now I'm set up for 64-bit / CUDA 4.1 / sm 2.0 & sm 2.1. I can accommodate others if needed. Does anyone need 32-bit anymore? |
[CODE]Processing result: M( 27996233 )C, 0x0469884db1349fec, n = 1572864, CUDALucas v1.54
LL test successfully completes double-check of M27996233[/CODE]Switching to flashjh's 1.58. |
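For readers new to the thread: the "LL test" in the log above is the Lucas-Lehmer primality test for Mersenne numbers. A minimal sketch in Python is shown here. It uses plain big-integer arithmetic, not the FFT-based squaring that CUDALucas runs on the GPU, so it is only practical for small exponents. The function name `lucas_lehmer` is made up for this illustration.

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if the Mersenne number 2**p - 1 is prime.

    p must be an odd prime; the test is not valid for other inputs.
    """
    if p < 3:
        raise ValueError("this sketch only handles odd prime exponents")
    m = (1 << p) - 1          # the Mersenne number M(p)
    s = 4                     # standard starting value
    for _ in range(p - 2):    # p - 2 squarings modulo M(p)
        s = (s * s - 2) % m
    return s == 0             # zero final residue means M(p) is prime


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        print(p, lucas_lehmer(p))
```

A nonzero final value means the number is composite; the hexadecimal value in the CUDALucas result line is the residue that two independent runs are expected to agree on.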
[QUOTE=Brain;290851]One way to compile CUDALucas for Win64:
0. Have Win7 64 bit
1. Install Nvidia GPU Toolkit (e.g. version 4.1)
2. Install Nvidia GPU SDK (e.g. version 4.1)
3. Install Make for Windows
4. Install MS Visual Studio 2010 Professional Trial Edition (needed for 64bit, trial will not run out as only command line usage)
5. Set Path for nvcc, make and cl.exe (from VS/bin)
6. Edit given makefile for Win64: Adapt CUDA and SM parameter (e.g. 4.1/2.0). Rename it to makefile.
7. Enter "make" in console being in the CUDALucas/src directory.
8. Delete *.obj files.
9. Find the exe and be happy.
This should be it. The day will come I won't be there to compile it. So a backup person/compiler will be needed. Any volunteers?[/QUOTE]
Step 2 is not needed since my changes have been incorporated. That dependency has been removed by bringing a couple of macros from the SDK into cuda_safecall.h (thanks to Ethan for that file). Also you could run "make -f MAKEFILENAME" instead of renaming... I also could compile CUDA 4.0 Windows versions at the moment but I do not monitor this thread as closely...
Thanks, Andriy |
[QUOTE=apsen;291072]Step 2 is not needed since my changes have been incorporated. That dependency has been removed by bringing a couple of macros from the SDK into cuda_safecall.h (thanks to Ethan for that file).
Also you could run "make -f MAKEFILENAME" instead of renaming... I also could compile CUDA 4.0 Windows versions at the moment but I do not monitor this thread as closely... Thanks, Andriy[/QUOTE] Thanks for the update. I'll remove the Nvidia GPU SDK and make sure everything still compiles. That will save some room on my drive. |
[QUOTE=Brain;290955] Hard to read, I know. Me lazy.[/QUOTE]
UNIX has nice tools for text processing. Just splitting it up:
chris@dhcppc0:~/ggnfs/tests> tr <t ';' '\n'
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\BIN\amd64
C:\Windows\Microsoft.NET\Framework64\v4.0.30319
C:\Windows\Microsoft.NET\Framework64\v3.5
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\VCPackages
C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE
C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\Tools
C:\Program Files (x86)\HTML Help Workshop
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools\x64
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\x64
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\libnvvp\
C:\Program Files\Perl\site\bin
C:\Program Files\Perl\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.0\bin\
C:\Program Files\Common Files\Microsoft Shared\Windows Live
C:\Program Files (x86)\Common Files\Microsoft Shared\Windows Live
C:\Windows\system32
C:\Windows
C:\Windows\System32\Wbem
C:\Windows\System32\WindowsPowerShell\v1.0\
C:\Program Files (x86)\Windows Live\Shared
C:\Program Files (x86)\QuickTime\QTSystem\
C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.0\C\common\bin
C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\
C:\Program Files\Microsoft SQL Server\100\Tools\Binn\
C:\Program Files\Microsoft SQL Server\100\DTS\Binn\
C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.1\C\common\bin
C:\Program Files\TortoiseSVN\bin
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\libnvvp\
C:\Program Files\Perl\site\bin
C:\Program Files\Perl\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.0\bin\
C:\Program Files\Common Files\Microsoft Shared\Windows Live
C:\Program Files (x86)\Common Files\Microsoft Shared\Windows Live
C:\Windows\system32
C:\Windows
C:\Windows\System32\Wbem
C:\Windows\System32\WindowsPowerShell\v1.0\
C:\Program Files (x86)\Windows Live\Shared
C:\Program Files (x86)\QuickTime\QTSystem\
C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.0\C\common\bin
C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\
C:\Program Files\Microsoft SQL Server\100\Tools\Binn\
C:\Program Files\Microsoft SQL Server\100\DTS\Binn\
C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.1\C\common\bin
C:\Program Files\TortoiseSVN\bin
C:\Program Files (x86)\GnuWin32\bin
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin
And sorted to remove duplicates.
chris@dhcppc0:~/ggnfs/tests> tr <t ';' '\n' | sort -u
C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.0\C\common\bin
C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.1\C\common\bin
C:\Program Files\Common Files\Microsoft Shared\Windows Live
C:\Program Files\Microsoft SQL Server\100\DTS\Binn\
C:\Program Files\Microsoft SQL Server\100\Tools\Binn\
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.0\bin\
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\libnvvp\
C:\Program Files\Perl\bin
C:\Program Files\Perl\site\bin
C:\Program Files\TortoiseSVN\bin
C:\Program Files (x86)\Common Files\Microsoft Shared\Windows Live
C:\Program Files (x86)\GnuWin32\bin
C:\Program Files (x86)\HTML Help Workshop
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools\x64
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\x64
C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\
C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE
C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\Tools
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\BIN\amd64
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\VCPackages
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common
C:\Program Files (x86)\QuickTime\QTSystem\
C:\Program Files (x86)\Windows Live\Shared
C:\Windows
C:\Windows\Microsoft.NET\Framework64\v3.5
C:\Windows\Microsoft.NET\Framework64\v4.0.30319
C:\Windows\system32
C:\Windows\System32\Wbem
C:\Windows\System32\WindowsPowerShell\v1.0\
Chris K |
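One caveat with `sort -u`: it reorders the entries, and on Windows the order of PATH decides which copy of a tool is found first (the posted PATH lists both the CUDA v4.1 and v4.0 `bin` directories). Below is a sketch in Python that drops duplicates but keeps the first-seen order. The helper name `dedupe_path` is hypothetical, and comparing entries case-insensitively while ignoring trailing backslashes is an assumption made for this illustration.

```python
def dedupe_path(path: str, sep: str = ";") -> str:
    """Remove duplicate PATH entries while preserving first-seen order."""
    seen = set()
    kept = []
    for entry in path.split(sep):
        entry = entry.strip()
        if not entry:
            continue  # skip empty entries such as the ";;" in the posted PATH
        # Assumption: treat entries as equal regardless of case or a trailing backslash.
        key = entry.rstrip("\\").lower()
        if key not in seen:
            seen.add(key)
            kept.append(entry)
    return sep.join(kept)


if __name__ == "__main__":
    sample = (r"C:\Windows\system32;C:\Windows;;"
              r"C:\Program Files\Perl\bin;C:\Windows\System32;C:\Windows")
    print(dedupe_path(sample))
```

The result can be pasted back into the environment variable dialog; nothing here modifies the system PATH by itself.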
[QUOTE=apsen;291072]Step 2 is not needed since my changes have been incorporated. That dependency has been removed by bringing a couple of macros from the SDK into cuda_safecall.h (thanks to Ethan for that file).
Also you could run "make -f MAKEFILENAME" instead of renaming... I also could compile CUDA 4.0 Windows versions at the moment but I do not monitor this thread as closely... Thanks, Andriy[/QUOTE]
With this info I removed the Nvidia GPU SDK and installed CUDA Toolkit 4.0. I can now compile 64-bit 4.0 / 4.1 | sm 1.3 / 2.0 / 2.1.
[Break]
So I've been testing resume. It works for quite a while, and then I get a residue mismatch between 1.49 and 1.58. Both runs still have ~11 hours to go before I can see which one is correct (hopefully 1.58). I'll post when I know. I still haven't let 1.58 run all the way through by itself yet; that is next.
EDIT: I temporarily cleared my path in Windows and set up the makefile to work independently of Windows' path. This should help for future installs/compiles as the path gets messy (as seen above, thanks Chris K). |
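Comparing residues between two versions by eye is error-prone, so a small script can help. The following is a sketch in Python, assuming the output line format seen in this thread; the names `LINE_RE`, `parse_line`, and `residues_match` are made up for this illustration, and the two sample lines are the M( 15000000 ) results msft posted in this thread for v1.58 and v1.61.

```python
import re

# Pattern modelled on the CUDALucas output lines quoted in this thread.
LINE_RE = re.compile(
    r"(?:Iteration (?P<iteration>\d+) )?"
    r"M\( (?P<exponent>\d+) \)C, "
    r"0x(?P<residue>[0-9a-f]{16}), "
    r"n = (?P<fft_len>\d+), "
    r"CUDALucas v(?P<version>[\d.]+)"
)


def parse_line(line: str) -> dict:
    """Extract iteration, exponent, residue, FFT length and version from one line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError("not a recognised CUDALucas output line: " + line)
    return match.groupdict()


def residues_match(line_a: str, line_b: str) -> bool:
    """Compare residues of two lines taken at the same exponent and iteration."""
    a, b = parse_line(line_a), parse_line(line_b)
    if (a["exponent"], a["iteration"]) != (b["exponent"], b["iteration"]):
        raise ValueError("lines are not from the same exponent and iteration")
    return a["residue"] == b["residue"]


if __name__ == "__main__":
    v158 = ("Iteration 10000 M( 15000000 )C, 0x7a34e75acea86da1, n = 1048576, "
            "CUDALucas v1.58 (0:45 real, 4.4715 ms/iter, ETA 18:36:22)")
    v161 = ("Iteration 10000 M( 15000000 )C, 0x7a34e75acea86da1, n = 1048576, "
            "CUDALucas v1.61 (0:41 real, 4.0819 ms/iter, ETA 16:59:07)")
    print(residues_match(v158, v161))
```

Feeding it the interim lines from both runs shows the first iteration at which the residues diverge, which narrows down where a resume went wrong.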
1 Attachment(s)
Hi,
Ver 1.61 reduces memory access.
Ver1.58 on GTX-460:
[code]
Iteration 10000 M( 10000000 )C, 0x55318a84ffd14bc7, n = 786432, CUDALucas v1.58 (0:39 real, 3.9227 ms/iter, ETA 10:52:28)
Iteration 10000 M( 15000000 )C, 0x7a34e75acea86da1, n = 1048576, CUDALucas v1.58 (0:45 real, 4.4715 ms/iter, ETA 18:36:22)
Iteration 10000 M( 20000000 )C, 0xb6475f8cb0888740, n = 1310720, CUDALucas v1.58 (1:02 real, 6.2355 ms/iter, ETA 34:36:25)
Iteration 10000 M( 25000000 )C, 0x667565040b5b7aa3, n = 1572864, CUDALucas v1.58 (1:18 real, 7.7141 ms/iter, ETA 53:31:38)
Iteration 10000 M( 30000000 )C, 0xbf70feed29774eba, n = 1835008, CUDALucas v1.58 (1:27 real, 8.6816 ms/iter, ETA 72:17:54)
Iteration 10000 M( 35000000 )C, 0x1f1fb94d69da44f8, n = 2097152, CUDALucas v1.58 (1:29 real, 8.9629 ms/iter, ETA 87:05:21)
[/code]
Ver1.61 on GTX-460:
[code]
Iteration 10000 M( 15000000 )C, 0x7a34e75acea86da1, n = 1048576, CUDALucas v1.61 (0:41 real, 4.0819 ms/iter, ETA 16:59:07)
Iteration 10000 M( 20000000 )C, 0xb6475f8cb0888740, n = 1310720, CUDALucas v1.61 (0:59 real, 5.8862 ms/iter, ETA 32:40:05)
Iteration 10000 M( 25000000 )C, 0x667565040b5b7aa3, n = 1572864, CUDALucas v1.61 (1:13 real, 7.2828 ms/iter, ETA 50:32:03)
Iteration 10000 M( 30000000 )C, 0xbf70feed29774eba, n = 1835008, CUDALucas v1.61 (1:23 real, 8.2897 ms/iter, ETA 69:02:05)
Iteration 10000 M( 35000000 )C, 0x1f1fb94d69da44f8, n = 2097152, CUDALucas v1.61 (1:25 real, 8.5208 ms/iter, ETA 82:47:36)
[/code] |
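To put a number on the improvement, the ms/iter figures from the two listings can be compared directly. This is a small Python sketch; the table name `MS_PER_ITER` and the helper `speedup_percent` are made up for this illustration, and the values are copied from the timings above (M( 10000000 ) is left out because no v1.61 figure was posted for it).

```python
# ms/iter on a GTX-460, copied from the two listings: exponent -> (v1.58, v1.61)
MS_PER_ITER = {
    15000000: (4.4715, 4.0819),
    20000000: (6.2355, 5.8862),
    25000000: (7.7141, 7.2828),
    30000000: (8.6816, 8.2897),
    35000000: (8.9629, 8.5208),
}


def speedup_percent(old_ms: float, new_ms: float) -> float:
    """Percentage reduction in time per iteration going from old to new."""
    return (old_ms - new_ms) / old_ms * 100.0


if __name__ == "__main__":
    for exponent, (v158, v161) in sorted(MS_PER_ITER.items()):
        pct = speedup_percent(v158, v161)
        print(f"M( {exponent} ): {pct:.1f}% less time per iteration")
```

On these figures v1.61 needs roughly 4.5% to 8.7% less time per iteration than v1.58, with the largest gain at the smallest FFT length listed.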
[QUOTE=msft;291133]Hi,
Ver 1.61 reduces memory access.
Ver1.58 on GTX-460:
[code]
Iteration 10000 M( 10000000 )C, 0x55318a84ffd14bc7, n = 786432, CUDALucas v1.58 (0:39 real, 3.9227 ms/iter, ETA 10:52:28)
Iteration 10000 M( 15000000 )C, 0x7a34e75acea86da1, n = 1048576, CUDALucas v1.58 (0:45 real, 4.4715 ms/iter, ETA 18:36:22)
Iteration 10000 M( 20000000 )C, 0xb6475f8cb0888740, n = 1310720, CUDALucas v1.58 (1:02 real, 6.2355 ms/iter, ETA 34:36:25)
Iteration 10000 M( 25000000 )C, 0x667565040b5b7aa3, n = 1572864, CUDALucas v1.58 (1:18 real, 7.7141 ms/iter, ETA 53:31:38)
Iteration 10000 M( 30000000 )C, 0xbf70feed29774eba, n = 1835008, CUDALucas v1.58 (1:27 real, 8.6816 ms/iter, ETA 72:17:54)
Iteration 10000 M( 35000000 )C, 0x1f1fb94d69da44f8, n = 2097152, CUDALucas v1.58 (1:29 real, 8.9629 ms/iter, ETA 87:05:21)
[/code]
Ver1.61 on GTX-460:
[code]
Iteration 10000 M( 15000000 )C, 0x7a34e75acea86da1, n = 1048576, CUDALucas v1.61 (0:41 real, 4.0819 ms/iter, ETA 16:59:07)
Iteration 10000 M( 20000000 )C, 0xb6475f8cb0888740, n = 1310720, CUDALucas v1.61 (0:59 real, 5.8862 ms/iter, ETA 32:40:05)
Iteration 10000 M( 25000000 )C, 0x667565040b5b7aa3, n = 1572864, CUDALucas v1.61 (1:13 real, 7.2828 ms/iter, ETA 50:32:03)
Iteration 10000 M( 30000000 )C, 0xbf70feed29774eba, n = 1835008, CUDALucas v1.61 (1:23 real, 8.2897 ms/iter, ETA 69:02:05)
Iteration 10000 M( 35000000 )C, 0x1f1fb94d69da44f8, n = 2097152, CUDALucas v1.61 (1:25 real, 8.5208 ms/iter, ETA 82:47:36)
[/code][/QUOTE]
I tried to compile but received some warnings that I need to fix before I post the binaries. Shouldn't be too long. |
v1.61 binaries
1 Attachment(s)
Zip includes v1.61 Win64[LIST][*]CUDA 4.0 / SM 2.0[*]CUDA 4.1 / SM 2.0[*]CUDA 4.1 / SM 2.1[/LIST]All untested.
Source was updated because msft changed s_inv from a double to a float. I added (float) casts at lines 453, 460, 476, 481, 1013, 1020, 1036, & 1041. I don't know the code well enough to know whether data could be lost going from double to float, but that's what testing is for. Also, does adding (float) negate the speedup gained by changing s_inv from a double to a float?... I could just ignore the warnings. Jerry |
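On the question of whether data can be lost going from double to float: in general yes, since a float keeps a 24-bit significand against 53 bits for a double. The sketch below, in Python, only shows the size of that rounding error; it says nothing about whether CUDALucas tolerates it or about speed. The helper names `to_float32` and `relative_error` are made up for this illustration, and using `1.0 / n` as a stand-in for s_inv is a guess, not something taken from the source.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (an IEEE double) to the nearest float32 and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def relative_error(x: float) -> float:
    """Relative error introduced by storing x as a float32."""
    return abs(to_float32(x) - x) / abs(x)


if __name__ == "__main__":
    n = 1572864            # an FFT length that appears in the logs in this thread
    s_inv = 1.0 / n        # hypothetical stand-in value, not taken from CUDALucas
    print("as double :", repr(s_inv))
    print("as float32:", repr(to_float32(s_inv)))
    print("rel. error:", relative_error(s_inv))
```

For values in the normal float range the relative error is at most 2**-24 (about 6e-8); whether that matters depends on how the value is used, which is what the residue tests are for.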
[QUOTE=flashjh;291141]Zip includes v1.61 Win64[/QUOTE]
You are very fast (and night active) ;-). |