mersenneforum.org

CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

flashjh 2012-02-26 16:27

[QUOTE=Brain;290955]I get the following:
[CODE]------------------------
Compile output for 1.58
------------------------
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1/bin/nvcc" -c CUDALucas.cu -o CUDALucas.cuda4.1.sm_21.WIN64.obj -m64 --ptxas-options=-v "-ccbin=C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\/bin" -DWIN64 -Xcompiler /EHsc,/W3,/nologo,/Ox,/Oy,/GL -arch=sm_21 -DMERS_PACKAGE -DBIT_SIEVE -DTESTING_SMALL_EXPONENTS -DSIEVE_SIZE_IN_BYTES=32 -DNUM_SMALL_PRIMES=32768 -DDO_NOT_USE_LONG_DOUBLE "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1/include" "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1/include/cudart" "-IQ:\NVIDIA GPU Computing SDK 4.1/C/common/inc" -D__x86_64__ -O3
tmpxft_00000ca8_00000000-14_CUDALucas.ii
CUDALucas.cu(524) : warning C4244: 'argument' : conversion from 'float' to 'size_t', possible loss of data
CUDALucas.cu(845) : warning C4018: '<' : signed/unsigned mismatch
CUDALucas.cu(1359) : warning C4018: '<' : signed/unsigned mismatch
CUDALucas.cu(1560) : warning C4018: '<' : signed/unsigned mismatch
cl /Ox /Oy /GL /W4 /fp:fast /nologo /c /Tp timeval.c /Fotimeval.WIN64.obj
timeval.c
link /nologo /LTCG CUDALucas.cuda4.1.sm_21.WIN64.obj timeval.WIN64.obj "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1/lib/x64/cudart.lib" "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1/lib/x64/cufft.lib" /out:CUDALucas.cuda4.1.sm_21.WIN64.exe
Generating code
Finished generating code
[/CODE]
Ehm, hard to remember. The Nvidia tools set their path entries automatically; make was added manually, cl.exe probably also manually, not sure. The path is now:
[CODE]
------------------------
Path settings
------------------------
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\BIN\amd64;C:\Windows\Microsoft.NET\Framework64\v4.0.30319;C:\Windows\Microsoft.NET\Framework64\v3.5;C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\VCPackages;C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE;C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\Tools;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools\x64;C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\x64;C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\libnvvp\;C:\Program Files\Perl\site\bin;C:\Program Files\Perl\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.0\bin\;C:\Program Files\Common Files\Microsoft Shared\Windows Live;C:\Program Files (x86)\Common Files\Microsoft Shared\Windows Live;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files (x86)\Windows Live\Shared;C:\Program Files (x86)\QuickTime\QTSystem\;C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.0\C\common\bin;C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\DTS\Binn\;;C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.1\C\common\bin;C:\Program Files\TortoiseSVN\bin;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\libnvvp\;C:\Program Files\Perl\site\bin;C:\Program Files\Perl\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.0\bin\;C:\Program Files\Common Files\Microsoft Shared\Windows Live;C:\Program Files (x86)\Common Files\Microsoft Shared\Windows Live;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files (x86)\Windows Live\Shared;C:\Program Files (x86)\QuickTime\QTSystem\;C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.0\C\common\bin;C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\Tools\Binn\;C:\Program Files\Microsoft SQL Server\100\DTS\Binn\;;C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.1\C\common\bin;C:\Program Files\TortoiseSVN\bin;C:\Program Files (x86)\GnuWin32\bin;C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin
[/CODE]
Hard to read, I know. Me lazy.[/QUOTE]

No worries

flashjh 2012-02-26 20:36

Updated code and versions required?
 
1 Attachment(s)
I updated the source to remove the warnings at lines 524, 845, 1359, and 1560. There's probably a setting to ignore them, but I'd rather not.

I included the source files this time. So far testing shows a match with 1.49, but the test is still not done.

If anyone tests these, let me know how they work for you.

Also, which versions are still needed or requested? Right now I'm set up for 64-bit / CUDA 4.1 / sm 2.0 and sm 2.1. I can accommodate others if needed.

Does anyone still need 32-bit?

Brain 2012-02-27 05:26

[CODE]Processing result: M( 27996233 )C, 0x0469884db1349fec, n = 1572864, CUDALucas v1.54
LL test successfully completes double-check of M27996233[/CODE]
Switching to flashjh's 1.58.

apsen 2012-02-27 17:07

[QUOTE=Brain;290851]One way to compile CUDALucas for Win64:
0. Have Win7 64 bit
1. Install Nvidia GPU Toolkit (e.g. version 4.1)
2. Install Nvidia GPU SDK (e.g. version 4.1)
3. Install Make for Windows
4. Install MS Visual Studio 2010 Professional Trial Edition (needed for 64-bit; the trial will not run out because only the command-line tools are used)
5. Set Path for nvcc, make and cl.exe (from VS/bin)
6. Edit given makefile for Win64: Adapt CUDA and SM parameter (e.g. 4.1/2.0). Rename it to makefile.
7. Run "make" in a console from the CUDALucas/src directory.
8. Delete *.obj files.
9. Find the exe and be happy.

This should be it.

The day will come when I won't be there to compile it. So a backup person/compiler will be needed. Any volunteers?[/QUOTE]

Step 2 is not needed since my changes have been incorporated. That dependency has been removed by bringing a couple of macros from the SDK into cuda_safecall.h (thanks to Ethan for that file).

Also you could run "make -f MAKEFILENAME" instead of renaming...
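
The path setup from step 5 and the "make -f" suggestion can be checked before starting a build. A minimal Python sketch, assuming the tools are invoked as nvcc, make, and cl; the makefile name makefile.win64 is a hypothetical placeholder, not a file name taken from the CUDALucas source:

```python
import shutil

# Tools the build steps above expect to find on the PATH.
REQUIRED_TOOLS = ["nvcc", "make", "cl"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def make_command(makefile_name):
    """Build the make invocation that uses a makefile without renaming it."""
    return ["make", "-f", makefile_name]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        # "makefile.win64" is a placeholder name for illustration only.
        print("Ready to run:", " ".join(make_command("makefile.win64")))
```

The script only reports what is missing and prints the command; it does not start the build itself.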


I also could compile CUDA 4.0 Windows versions at the moment but I do not monitor this thread as closely...

Thanks,
Andriy

flashjh 2012-02-27 17:29

[QUOTE=apsen;291072]Step 2 is not needed since my changes have been incorporated. That dependency has been removed by bringing a couple of macros from the SDK into cuda_safecall.h (thanks to Ethan for that file).

Also you could run "make -f MAKEFILENAME" instead of renaming...


I also could compile CUDA 4.0 Windows versions at the moment but I do not monitor this thread as closely...

Thanks,
Andriy[/QUOTE]

Thanks for the update.

I'll remove the Nvidia GPU SDK and make sure everything still compiles. That will save some room on my drive.

chris2be8 2012-02-27 18:05

[QUOTE=Brain;290955] Hard to read, I know. Me lazy.[/QUOTE]

UNIX has nice tools for text processing.

Just splitting it up:
chris@dhcppc0:~/ggnfs/tests> tr <t ';' '\n'
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\BIN\amd64
C:\Windows\Microsoft.NET\Framework64\v4.0.30319
C:\Windows\Microsoft.NET\Framework64\v3.5
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\VCPackages
C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE
C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\Tools
C:\Program Files (x86)\HTML Help Workshop
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools\x64
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\x64
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\libnvvp\
C:\Program Files\Perl\site\bin
C:\Program Files\Perl\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.0\bin\
C:\Program Files\Common Files\Microsoft Shared\Windows Live
C:\Program Files (x86)\Common Files\Microsoft Shared\Windows Live
C:\Windows\system32
C:\Windows
C:\Windows\System32\Wbem
C:\Windows\System32\WindowsPowerShell\v1.0\
C:\Program Files (x86)\Windows Live\Shared
C:\Program Files (x86)\QuickTime\QTSystem\
C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.0\C\common\bin
C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\
C:\Program Files\Microsoft SQL Server\100\Tools\Binn\
C:\Program Files\Microsoft SQL Server\100\DTS\Binn\

C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.1\C\common\bin
C:\Program Files\TortoiseSVN\bin
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\libnvvp\
C:\Program Files\Perl\site\bin
C:\Program Files\Perl\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.0\bin\
C:\Program Files\Common Files\Microsoft Shared\Windows Live
C:\Program Files (x86)\Common Files\Microsoft Shared\Windows Live
C:\Windows\system32
C:\Windows
C:\Windows\System32\Wbem
C:\Windows\System32\WindowsPowerShell\v1.0\
C:\Program Files (x86)\Windows Live\Shared
C:\Program Files (x86)\QuickTime\QTSystem\
C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.0\C\common\bin
C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\
C:\Program Files\Microsoft SQL Server\100\Tools\Binn\
C:\Program Files\Microsoft SQL Server\100\DTS\Binn\

C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.1\C\common\bin
C:\Program Files\TortoiseSVN\bin
C:\Program Files (x86)\GnuWin32\bin
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin

And sorted to remove duplicates.
chris@dhcppc0:~/ggnfs/tests> tr <t ';' '\n' | sort -u

C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.0\C\common\bin
C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.1\C\common\bin
C:\Program Files\Common Files\Microsoft Shared\Windows Live
C:\Program Files\Microsoft SQL Server\100\DTS\Binn\
C:\Program Files\Microsoft SQL Server\100\Tools\Binn\
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.0\bin\
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.1\libnvvp\
C:\Program Files\Perl\bin
C:\Program Files\Perl\site\bin
C:\Program Files\TortoiseSVN\bin
C:\Program Files (x86)\Common Files\Microsoft Shared\Windows Live
C:\Program Files (x86)\GnuWin32\bin
C:\Program Files (x86)\HTML Help Workshop
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 Tools\x64
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\bin\x64
C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\
C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE
C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\Tools
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\BIN\amd64
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\VCPackages
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common
C:\Program Files (x86)\QuickTime\QTSystem\
C:\Program Files (x86)\Windows Live\Shared
C:\Windows
C:\Windows\Microsoft.NET\Framework64\v3.5
C:\Windows\Microsoft.NET\Framework64\v4.0.30319
C:\Windows\system32
C:\Windows\System32\Wbem
C:\Windows\System32\WindowsPowerShell\v1.0\
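
One caveat with "sort -u": it reorders the entries, and the order of a PATH matters because the first directory containing a matching executable wins. A minimal Python sketch of an order-preserving alternative, assuming entries that differ only in letter case or a trailing backslash name the same directory (normally true on Windows); the function name is made up for illustration:

```python
def dedupe_path(path_value, sep=";"):
    """Drop empty and repeated entries from a PATH string, keeping first-seen order."""
    seen = set()
    kept = []
    for entry in path_value.split(sep):
        entry = entry.strip()
        if not entry:
            continue  # skip empty entries produced by ";;"
        # Assumption: case and a trailing backslash do not change which
        # directory is meant, so both are ignored when comparing.
        key = entry.rstrip("\\").lower()
        if key not in seen:
            seen.add(key)
            kept.append(entry)
    return sep.join(kept)


if __name__ == "__main__":
    sample = r"C:\Windows\system32;C:\Windows;;C:\Program Files\Perl\bin;C:\Windows\system32;C:\WINDOWS"
    print(dedupe_path(sample).replace(";", "\n"))
```

The first spelling of each directory is the one that is kept, so the search order of the original path is unchanged.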

Chris K

flashjh 2012-02-27 18:58

[QUOTE=apsen;291072]Step 2 is not needed since my changes have been incorporated. That dependency has been removed by bringing a couple of macros from the SDK into cuda_safecall.h (thanks to Ethan for that file).

Also you could run "make -f MAKEFILENAME" instead of renaming...


I also could compile CUDA 4.0 Windows versions at the moment but I do not monitor this thread as closely...

Thanks,
Andriy[/QUOTE]

With this info I removed the Nvidia GPU SDK and installed CUDA Toolkit 4.0.

I can now compile 64-bit 4.0 / 4.1 | sm 1.3 / 2.0 / 2.1.

[Break]
I've been testing resume. It works for quite a while, and then I get a residue mismatch between 1.49 and 1.58. Both runs still need ~11 hours to finish before I can see which one is correct (hopefully 1.58). I'll post when I know.

I still haven't let 1.58 run all the way through by itself; that is next.
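
Finding where two runs diverge can be automated by comparing interim residues. A minimal Python sketch, assuming both versions print interim lines in the "Iteration ... M( p )C, 0x..., n = ..., CUDALucas vX.YY" format quoted in this thread (whether 1.49 uses exactly this format is an assumption); the function names are made up for illustration:

```python
import re

# Matches the interim lines CUDALucas prints, as quoted in this thread.
LINE_RE = re.compile(
    r"Iteration (\d+) M\( (\d+) \)C, (0x[0-9a-f]{16}), n = (\d+), CUDALucas v([\d.]+)"
)


def parse_residues(log_text):
    """Map (exponent, iteration) -> residue for every interim line in a log."""
    residues = {}
    for match in LINE_RE.finditer(log_text):
        iteration, exponent, residue = int(match[1]), int(match[2]), match[3]
        residues[(exponent, iteration)] = residue
    return residues


def first_mismatch(log_a, log_b):
    """Return the earliest (exponent, iteration) where the residues differ, or None."""
    a, b = parse_residues(log_a), parse_residues(log_b)
    for key in sorted(a.keys() & b.keys()):
        if a[key] != b[key]:
            return key
    return None


if __name__ == "__main__":
    old = "Iteration 10000 M( 15000000 )C, 0x7a34e75acea86da1, n = 1048576, CUDALucas v1.58 (0:45 real, 4.4715 ms/iter, ETA 18:36:22)"
    new = "Iteration 10000 M( 15000000 )C, 0x7a34e75acea86da1, n = 1048576, CUDALucas v1.61 (0:41 real, 4.0819 ms/iter, ETA 16:59:07)"
    print(first_mismatch(old, new))
```

Only iterations present in both logs are compared, so the two runs need to report at the same interval for this to be useful.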

EDIT: I temporarily cleared my path in Windows and set up the makefile to work independently of the Windows path. This should help for future installs/compiles as the path gets messy (as seen above, thanks Chris K).

msft 2012-02-28 03:05

1 Attachment(s)
Hi,
Ver 1.61
reduce memory access.

Ver1.58 on GTX-460:
[code]
Iteration 10000 M( 10000000 )C, 0x55318a84ffd14bc7, n = 786432, CUDALucas v1.58 (0:39 real, 3.9227 ms/iter, ETA 10:52:28)
Iteration 10000 M( 15000000 )C, 0x7a34e75acea86da1, n = 1048576, CUDALucas v1.58 (0:45 real, 4.4715 ms/iter, ETA 18:36:22)
Iteration 10000 M( 20000000 )C, 0xb6475f8cb0888740, n = 1310720, CUDALucas v1.58 (1:02 real, 6.2355 ms/iter, ETA 34:36:25)
Iteration 10000 M( 25000000 )C, 0x667565040b5b7aa3, n = 1572864, CUDALucas v1.58 (1:18 real, 7.7141 ms/iter, ETA 53:31:38)
Iteration 10000 M( 30000000 )C, 0xbf70feed29774eba, n = 1835008, CUDALucas v1.58 (1:27 real, 8.6816 ms/iter, ETA 72:17:54)
Iteration 10000 M( 35000000 )C, 0x1f1fb94d69da44f8, n = 2097152, CUDALucas v1.58 (1:29 real, 8.9629 ms/iter, ETA 87:05:21)
[/code]
Ver1.61 on GTX-460:
[code]
Iteration 10000 M( 15000000 )C, 0x7a34e75acea86da1, n = 1048576, CUDALucas v1.61 (0:41 real, 4.0819 ms/iter, ETA 16:59:07)
Iteration 10000 M( 20000000 )C, 0xb6475f8cb0888740, n = 1310720, CUDALucas v1.61 (0:59 real, 5.8862 ms/iter, ETA 32:40:05)
Iteration 10000 M( 25000000 )C, 0x667565040b5b7aa3, n = 1572864, CUDALucas v1.61 (1:13 real, 7.2828 ms/iter, ETA 50:32:03)
Iteration 10000 M( 30000000 )C, 0xbf70feed29774eba, n = 1835008, CUDALucas v1.61 (1:23 real, 8.2897 ms/iter, ETA 69:02:05)
Iteration 10000 M( 35000000 )C, 0x1f1fb94d69da44f8, n = 2097152, CUDALucas v1.61 (1:25 real, 8.5208 ms/iter, ETA 82:47:36)
[/code]
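
To put a number on the improvement, the ms/iter figures from the two runs can be compared directly. A minimal Python sketch using the timings posted above for the five exponents that appear in both runs; the helper name is made up for illustration:

```python
# ms/iter figures copied from the v1.58 and v1.61 runs above, keyed by exponent.
MS_PER_ITER = {
    15000000: (4.4715, 4.0819),
    20000000: (6.2355, 5.8862),
    25000000: (7.7141, 7.2828),
    30000000: (8.6816, 8.2897),
    35000000: (8.9629, 8.5208),
}


def percent_faster(old_ms, new_ms):
    """Percentage reduction in time per iteration from old_ms to new_ms."""
    return (old_ms - new_ms) / old_ms * 100.0


if __name__ == "__main__":
    for exponent, (v158, v161) in MS_PER_ITER.items():
        print(f"M({exponent}): {percent_faster(v158, v161):.1f}% less time per iteration")
```

For these figures the reduction lies between roughly 4.5% and 8.7%, with the largest gain at M(15000000).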

flashjh 2012-02-28 03:26

[QUOTE=msft;291133]Hi,
Ver 1.61
reduce memory access.

Ver1.58 on GTX-460:
[code]
Iteration 10000 M( 10000000 )C, 0x55318a84ffd14bc7, n = 786432, CUDALucas v1.58 (0:39 real, 3.9227 ms/iter, ETA 10:52:28)
Iteration 10000 M( 15000000 )C, 0x7a34e75acea86da1, n = 1048576, CUDALucas v1.58 (0:45 real, 4.4715 ms/iter, ETA 18:36:22)
Iteration 10000 M( 20000000 )C, 0xb6475f8cb0888740, n = 1310720, CUDALucas v1.58 (1:02 real, 6.2355 ms/iter, ETA 34:36:25)
Iteration 10000 M( 25000000 )C, 0x667565040b5b7aa3, n = 1572864, CUDALucas v1.58 (1:18 real, 7.7141 ms/iter, ETA 53:31:38)
Iteration 10000 M( 30000000 )C, 0xbf70feed29774eba, n = 1835008, CUDALucas v1.58 (1:27 real, 8.6816 ms/iter, ETA 72:17:54)
Iteration 10000 M( 35000000 )C, 0x1f1fb94d69da44f8, n = 2097152, CUDALucas v1.58 (1:29 real, 8.9629 ms/iter, ETA 87:05:21)
[/code]
Ver1.61 on GTX-460:
[code]
Iteration 10000 M( 15000000 )C, 0x7a34e75acea86da1, n = 1048576, CUDALucas v1.61 (0:41 real, 4.0819 ms/iter, ETA 16:59:07)
Iteration 10000 M( 20000000 )C, 0xb6475f8cb0888740, n = 1310720, CUDALucas v1.61 (0:59 real, 5.8862 ms/iter, ETA 32:40:05)
Iteration 10000 M( 25000000 )C, 0x667565040b5b7aa3, n = 1572864, CUDALucas v1.61 (1:13 real, 7.2828 ms/iter, ETA 50:32:03)
Iteration 10000 M( 30000000 )C, 0xbf70feed29774eba, n = 1835008, CUDALucas v1.61 (1:23 real, 8.2897 ms/iter, ETA 69:02:05)
Iteration 10000 M( 35000000 )C, 0x1f1fb94d69da44f8, n = 2097152, CUDALucas v1.61 (1:25 real, 8.5208 ms/iter, ETA 82:47:36)
[/code][/QUOTE]

I tried to compile but received some warnings that I need to fix before I post the binaries. Shouldn't be too long.

flashjh 2012-02-28 04:33

v1.61 binaries
 
1 Attachment(s)
The zip includes v1.61 Win64 builds:
[LIST]
[*]CUDA 4.0 / SM 2.0
[*]CUDA 4.1 / SM 2.0
[*]CUDA 4.1 / SM 2.1
[/LIST]
All untested.

I updated the source because msft changed s_inv from double to float, which produced warnings. I added (float) casts at lines 453, 460, 476, 481, 1013, 1020, 1036, and 1041. I don't know the code well enough to say whether precision could be lost going from double to float, but that's what testing is for. Also, does adding (float) negate the speedup gained from changing s_inv from double to float? I could just ignore the warnings instead.
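
On the precision question: a float carries a 24-bit significand (about 7 decimal digits) against 53 bits (about 16 digits) for a double, so a value stored as float is rounded. As far as the C language goes, an explicit (float) cast on a value being stored into a float performs the same conversion the compiler would otherwise apply implicitly, so it should only silence the warning; whether that holds for each edited line depends on the surrounding expressions, which are not shown here. A minimal Python sketch of the rounding itself, using an arbitrary sample value that is not taken from the CUDALucas source:

```python
import struct


def to_float32(value):
    """Round a Python float (a C double) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def relative_error(value):
    """Relative error introduced by storing value as a 32-bit float."""
    return abs(to_float32(value) - value) / abs(value)


if __name__ == "__main__":
    # Arbitrary sample value for illustration; not taken from the CUDALucas source.
    sample = 1.0 / 786432
    print(f"double : {sample:.17g}")
    print(f"float  : {to_float32(sample):.17g}")
    print(f"rel err: {relative_error(sample):.3g}")
```

Whether an error of this size matters for s_inv depends on how the value is used in the kernels, which only a full test run against known residues can confirm.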

Jerry

Brain 2012-02-28 05:40

[QUOTE=flashjh;291141]Zip includes v1.61 Win64[/QUOTE]
You are very fast (and night active) ;-).

