Newer X64 build needed
Hi,
The latest usable X64 binary on Jeff Gilchrist's page is SVN 883, and the latest one for Core 2 CPUs is 1.50. (There are SVN 939 and 942 binaries, but those are buggy and crash in the linear algebra.) On this forum, there exists a build from SVN 946, but it's built without ECM support. 1.52 is available on SourceForge, but only 32-bit. In my testing, 32-bit builds are a lot slower, especially the ECM. I don't know if Msieve's SF maintainer reads this forum, but if he does, please make a 64-bit build of the official 1.52! |
Msieve's maintainer does read this forum, and should release v1.53 since it's been languishing for a year. I also now have the capability to build x64 binaries so those should move to SF too.
The slower ECM does not surprise me, but in my experience latter-day GMP has a lot of difficulty building on 64-bit windows. |
[QUOTE=jasonp;420520]
The slower ECM does not surprise me, but in my experience latter-day GMP has a lot of difficulty building on 64-bit windows.[/QUOTE] It requires an up-to-date mingw-64/msys toolchain. ATH seems to know quite a bit about how to make it work (e.g. [url]http://www.mersenneforum.org/showthread.php?t=4087&page=36[/url]); maybe he'd be willing to take questions if something doesn't work for you. I don't know how this reconciles with the GPU portion of the msieve build, or whether there is a way to get latter-day GMP to compile with Microsoft Visual Studio. |
Yeah, GMP is not an issue with MinGW-64. I had a harder time building MSieve, especially in terms of the issues when using a CUDA version newer than 5.5.
|
Actually I learned all about 64-bit compiling with Mingw64 mainly from WraithX, but also from wombatman, so they know a lot more than me. I have never successfully compiled a CUDA application, though it has been a long time since I tried.
|
I installed CUDA 6.5 (since 7.0 and 7.5 do not work for mfaktc, I assumed it was best to avoid them for msieve as well).
I added the CUDA path and various flags to the makefile, but it fails when it reaches the nvcc step. Do I really need Visual Studio installed to compile a CUDA application when I'm trying to compile with Msys2+Mingw64? I have VS2015 but it needs 2010/2012/2013: [CODE]"/C/CUDA6.5/bin/nvcc" -arch sm_11 -ptx -o stage1_core_sm11.ptx gnfs/poly/stage1/stage1_core_gpu/stage1_core.cu
nvcc warning : The 'compute_11', 'compute_12', 'compute_13', 'sm_11', 'sm_12', and 'sm_13' architectures are deprecated, and may be removed in a future release.
nvcc fatal : nvcc cannot find a supported version of Microsoft Visual Studio. Only the versions 2010, 2012, and 2013 are supported
Makefile:307: recipe for target 'stage1_core_sm11.ptx' failed
make: *** [stage1_core_sm11.ptx] Error 1[/CODE] |
My problem is not that GMP doesn't work, it's that Windows 7 has some kind of background service that holds onto executables just long enough for the configure script to fail trying to delete temporary exe's that it generates. Thus configure doesn't work and I can't get to the next step.
People all over the internet have the same problem, and I've tried all of the solutions they propose and nothing works. I can compile a generic build of 64-bit MPIR using Visual Studio, but the last time I tried compiling the assembler code (late last year), YASM had some kind of problem. Getting something that can execute GMP function calls on 64-bit Windows has been incredibly frustrating for me; it has exceeded the time I could devote to it every single time. I could probably cross-compile a 64-bit build on a Windows XP system, but the latest CUDA doesn't work with MSVC2008 and MSVC2010 doesn't work on XP. Hence the year-long stalemate where I can't update the CUDA code to work with CUDA > 5.5 either. |
Have you tried MSYS2 with Mingw64?
[URL="https://sourceforge.net/p/msys2/wiki/MSYS2%20installation/"]https://sourceforge.net/p/msys2/wiki/MSYS2%20installation/[/URL]
I ran these commands to update it initially: [CODE]Run "msys2_shell.bat"
update-core
Restart MSYS2 (using msys2_shell.bat)
pacman -Su
Restart MSYS2 (using mingw64_shell.bat and use this to start MSYS2 from now on)
pacman -S mingw-w64-x86_64-gcc
pacman -S mingw-w64-x86_64-make
pacman -S mingw-w64-x86_64-libtool
pacman -S autoconf
pacman -S automake
pacman -S make[/CODE]
If this does not work, or if you just want to try it quickly, I saved my installation after the steps above: [URL="http://hoegge.dk/gmp/msys64.zip"]msys64.zip[/URL] (It's a 7-Zip file, but the site does not allow the .7z extension; rename it to msys64.7z.) It is simply the msys64 folder, which should be extracted to C:, after which you run mingw64_shell.bat.
I compile GMP 6.1.0 on a Haswell with: [CODE]./configure ABI=64 CC=gcc CFLAGS="-O3 -m64 -mavx -mavx2 -mfma -march=haswell -mtune=haswell" --build=haswell-w64-mingw32 --enable-static --disable-shared
make
make install
make check[/CODE]
It works fine with MSYS2 in both Windows 7 and 10. |
I see you're using Haswell-specific optimizations. Please also make a generic build, or at least something usable on Core 2. (Mine has SSE4.1, but not all Core 2s have it.)
|
I compiled the newest svn988 [B]without CUDA support[/B] with haswell, sandy bridge and core2 flags. I got a fair number of warnings; most are "unused-parameter", but could some of the others be serious?
I will not post any binaries until we check if the warnings are ok: [CODE]common/lanczos/lanczos.c: In function 'dump_lanczos_state': common/lanczos/lanczos.c:485:21: warning: unused parameter 'packed_matrix' [-Wunused-parameter] packed_matrix_t *packed_matrix, ^ common/lanczos/lanczos.c:488:11: warning: unused parameter 'n' [-Wunused-parameter] uint32 n, uint32 max_n, uint32 dim_solved, uint32 iter, ^ common/lanczos/lanczos.c: In function 'read_lanczos_state': common/lanczos/lanczos.c:643:21: warning: unused parameter 'packed_matrix' [-Wunused-parameter] packed_matrix_t *packed_matrix, ^ common/lanczos/lanczos.c:646:11: warning: unused parameter 'n' [-Wunused-parameter] uint32 n, uint32 max_n, uint32 *dim_solved, ^ common/lanczos/lanczos_io.c: In function 'dump_matrix': common/lanczos/lanczos_io.c:173:10: warning: unused parameter 'sparse_weight' [-Wunused-parameter] uint64 sparse_weight) { ^ common/lanczos/lanczos_io.c: In function 'file_cache_get_next': common/lanczos/lanczos_io.c:372:45: warning: unused parameter 'obj' [-Wunused-parameter] static void file_cache_get_next(msieve_obj *obj, FILE *fp, ^ common/lanczos/lanczos_io.c:375:12: warning: unused parameter 'read_submatrix' [-Wunused-parameter] uint32 read_submatrix) { ^ common/lanczos/lanczos_io.c: In function 'read_matrix': common/lanczos/lanczos_io.c:438:23: warning: variable 'mpi_nrows' set but not used [-Wunused-but-set-variable] uint32 mpi_resclass, mpi_nrows; ^ common/lanczos/lanczos_io.c:438:9: warning: variable 'mpi_resclass' set but not used [-Wunused-but-set-variable] uint32 mpi_resclass, mpi_nrows; ^ common/lanczos/lanczos_matmul0.c: In function 'packed_matrix_init': common/lanczos/lanczos_matmul0.c:419:12: warning: unused variable 'j' [-Wunused-variable] uint32 i, j; ^ common/lanczos/lanczos_matmul0.c: In function 'mul_MxN_Nx64': common/lanczos/lanczos_matmul0.c:619:23: warning: unused parameter 'scratch' [-Wunused-parameter] uint64 *b, uint64 *scratch) { ^ common/lanczos/lanczos_matmul1.c: In 
function 'mul_packed_core': common/lanczos/lanczos_matmul1.c:308:38: warning: unused parameter 'thread_num' [-Wunused-parameter] void mul_packed_core(void *data, int thread_num) ^ common/lanczos/lanczos_matmul1.c: In function 'mul_packed_small_core': common/lanczos/lanczos_matmul1.c:348:44: warning: unused parameter 'thread_num' [-Wunused-parameter] void mul_packed_small_core(void *data, int thread_num) ^ common/lanczos/lanczos_matmul2.c: In function 'mul_trans_packed_core': common/lanczos/lanczos_matmul2.c:319:44: warning: unused parameter 'thread_num' [-Wunused-parameter] void mul_trans_packed_core(void *data, int thread_num) ^ common/lanczos/lanczos_matmul2.c: In function 'mul_trans_packed_small_core': common/lanczos/lanczos_matmul2.c:358:50: warning: unused parameter 'thread_num' [-Wunused-parameter] void mul_trans_packed_small_core(void *data, int thread_num) ^ common/lanczos/lanczos_vv.c: In function 'mul_Nx64_64x64_acc': common/lanczos/lanczos_vv.c:201:9: warning: unused variable 'i' [-Wunused-variable] uint32 i; ^ common/lanczos/lanczos_vv.c: In function 'outer_thread_run': common/lanczos/lanczos_vv.c:210:46: warning: unused parameter 'thread_num' [-Wunused-parameter] static void outer_thread_run(void *data, int thread_num) ^ common/lanczos/lanczos_vv.c: In function 'inner_thread_run': common/lanczos/lanczos_vv.c:427:46: warning: unused parameter 'thread_num' [-Wunused-parameter] static void inner_thread_run(void *data, int thread_num) ^ common/lanczos/lanczos_vv.c: In function 'tmul_64xN_Nx64': common/lanczos/lanczos_vv.c:441:12: warning: unused variable 'j' [-Wunused-variable] uint32 i, j; ^ common/smallfact/smallfact.c: In function 'trial_factor': common/smallfact/smallfact.c:22:9: warning: variable 'factor_found' set but not used [-Wunused-but-set-variable] uint32 factor_found = 0; ^ common/minimize.c: In function 'solve_dmatrix': common/minimize.c:421:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i 
< n; i++) ^ common/minimize.c:424:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < n - 1; i++) { ^ common/minimize.c:431:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (j = i + 1; j < n; j++) { ^ common/minimize.c:444:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (j = i + 1; j < n; j++) { ^ common/minimize.c:448:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (k = i + 1; k < n; k++) { ^ common/minimize.c:460:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (j = i + 1; j < n; j++) { ^ common/savefile.c: In function 'savefile_open': common/savefile.c:131:9: warning: assignment from incompatible pointer type [-Wincompatible-pointer-types] s->fp = gzopen(name_gz, open_string); ^ common/savefile.c:157:10: warning: assignment from incompatible pointer type [-Wincompatible-pointer-types] s->fp = gzopen(s->name, "a"); ^ common/savefile.c:163:9: warning: assignment from incompatible pointer type [-Wincompatible-pointer-types] s->fp = gzopen(s->name, open_string); ^ common/savefile.c: In function 'savefile_close': common/savefile.c:182:49: warning: passing argument 1 of 'gzclose' from incompatible pointer type [-Wincompatible-pointer-types] s->is_a_FILE ? fclose((FILE *)s->fp) : gzclose(s->fp); ^ In file included from include/util.h:46:0, from include/msieve.h:24, from include/common.h:18, from common/savefile.c:15: C:/msys64/mingw64/include/zlib.h:1511:24: note: expected 'gzFile {aka struct gzFile_s *}' but argument is of type 'struct gzFile_s **' ZEXTERN int ZEXPORT gzclose OF((gzFile file)); ^ common/savefile.c: In function 'savefile_eof': common/savefile.c:193:53: warning: passing argument 1 of 'gzeof' from incompatible pointer type [-Wincompatible-pointer-types] return (s->is_a_FILE ? 
feof((FILE *)s->fp) : gzeof(s->fp)); ^ In file included from include/util.h:46:0, from include/msieve.h:24, from include/common.h:18, from common/savefile.c:15: C:/msys64/mingw64/include/zlib.h:1475:21: note: expected 'gzFile {aka struct gzFile_s *}' but argument is of type 'struct gzFile_s **' ZEXTERN int ZEXPORT gzeof OF((gzFile file)); ^ common/savefile.c: In function 'savefile_read_line': common/savefile.c:251:9: warning: passing argument 1 of 'gzgets' from incompatible pointer type [-Wincompatible-pointer-types] gzgets(s->fp, buf, (int)max_len); ^ In file included from include/util.h:46:0, from include/msieve.h:24, from include/common.h:18, from common/savefile.c:15: C:/msys64/mingw64/include/zlib.h:1372:24: note: expected 'gzFile {aka struct gzFile_s *}' but argument is of type 'struct gzFile_s **' ZEXTERN char * ZEXPORT gzgets OF((gzFile file, char *buf, int len)); ^ common/savefile.c: In function 'savefile_flush': common/savefile.c:279:10: warning: passing argument 1 of 'gzputs' from incompatible pointer type [-Wincompatible-pointer-types] gzputs(s->fp, s->buf); ^ In file included from include/util.h:46:0, from include/msieve.h:24, from include/common.h:18, from common/savefile.c:15: C:/msys64/mingw64/include/zlib.h:1364:21: note: expected 'gzFile {aka struct gzFile_s *}' but argument is of type 'struct gzFile_s **' ZEXTERN int ZEXPORT gzputs OF((gzFile file, const char *s)); ^ common/savefile.c: In function 'savefile_rewind': common/savefile.c:298:50: warning: passing argument 1 of 'gzrewind' from incompatible pointer type [-Wincompatible-pointer-types] s->is_a_FILE ? 
rewind((FILE *)s->fp) : gzrewind(s->fp); ^ In file included from include/util.h:46:0, from include/msieve.h:24, from include/common.h:18, from common/savefile.c:15: C:/msys64/mingw64/include/zlib.h:1447:24: note: expected 'gzFile {aka struct gzFile_s *}' but argument is of type 'struct gzFile_s **' ZEXTERN int ZEXPORT gzrewind OF((gzFile file)); ^ common/util.c: In function 'aligned_malloc': common/util.c:40:9: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] addr = (unsigned long)ptr; ^ common/util.c: In function 'get_cpu_type': common/util.c:456:17: warning: variable 'model' set but not used [-Wunused-but-set-variable] uint8 family, model; ^ mpqs/relation.c: In function 'qs_filter_relations': mpqs/relation.c:807:9: warning: variable 'poly_saved' set but not used [-Wunused-but-set-variable] uint32 poly_saved; ^ mpqs/relation.c:804:21: warning: variable 'curr_poly_idx' set but not used [-Wunused-but-set-variable] uint32 curr_a_idx, curr_poly_idx, curr_rel; ^ gnfs/poly/poly.c: In function 'read_poly': gnfs/poly/poly.c:28:8: warning: variable 'status' set but not used [-Wunused-but-set-variable] int32 status = 0; ^ gnfs/poly/poly_skew.c: In function 'sizeopt_callback': gnfs/poly/poly_skew.c:85:37: warning: unused parameter 'deg' [-Wunused-parameter] static void sizeopt_callback(uint32 deg, mpz_t *alg_coeffs, mpz_t *rat_coeffs, ^ gnfs/poly/stage2/optimize_deg6.c: In function 'poly_eval': gnfs/poly/stage2/optimize_deg6.c:208:19: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (k = 0; k < pow; k++) ^ gnfs/poly/stage2/optimize_deg6.c: In function 'fill_powers': gnfs/poly/stage2/optimize_deg6.c:406:17: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (j = 2; j <= max_pow; j++) ^ gnfs/poly/stage2/optimize_deg6.c: In function 'optimize_initial_deg6': gnfs/poly/stage2/optimize_deg6.c:722:16: warning: comparison between signed and unsigned integer expressions 
[-Wsign-compare] for (i = 0; i < (1 << (num_vars - 1)); i++) { ^ gnfs/ffpoly.c: In function 'get_zeros_rec': gnfs/ffpoly.c:594:9: warning: 'g[0u].degree' may be used uninitialized in this function [-Wmaybe-uninitialized] poly_t g, xpow; ^ gnfs/relation.c:23:8: warning: always_inline function might not be inlinable [-Wattributes] uint32 divide_factor_out(mpz_t polyval, uint64 p, ^ [/CODE] |
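Of the warnings above, the "cast from pointer to integer of different size" one in aligned_malloc is probably the one most worth a look on 64-bit Windows, where unsigned long is 32 bits but pointers are 64 bits. Whether the truncation actually matters depends on how the value is used afterwards, which the log does not show. Below is a minimal sketch of an aligned allocator that sidesteps the issue with uintptr_t. The names aligned_malloc_sketch and aligned_free_sketch are made up for illustration; this is not msieve's code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an aligned allocator; not msieve's actual code.
   Returns a pointer aligned to `align` bytes (align must be a power of two),
   or NULL if the underlying malloc fails. */
void *aligned_malloc_sketch(size_t len, size_t align) {
    /* Over-allocate: room for alignment slack plus the saved raw pointer. */
    void *raw = malloc(len + align + sizeof(void *));
    if (raw == NULL)
        return NULL;

    /* uintptr_t can hold a pointer on both LP64 (Linux) and LLP64 (Win64);
       unsigned long is only 32 bits on 64-bit Windows. */
    uintptr_t addr = (uintptr_t)raw + sizeof(void *);
    addr = (addr + align - 1) & ~(uintptr_t)(align - 1);

    /* Stash the raw pointer just below the aligned block for the free call. */
    ((void **)addr)[-1] = raw;
    return (void *)addr;
}

/* Releases a block obtained from aligned_malloc_sketch. */
void aligned_free_sketch(void *ptr) {
    if (ptr != NULL)
        free(((void **)ptr)[-1]);
}
```

The same uintptr_t cast works unchanged on Linux, so a fix along these lines would not need a Windows-only code path.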
In the attempt to compile Msieve with CUDA support I installed Visual Studio 2013 and CUDA 6.5 on my laptop, and it got a bit further than it did with VS2015 on my desktop.
But it fails when it reaches the "sm_10, compute_10" step. Can this be disabled? I believe it is deprecated. [CODE]make[1]: Entering directory '/home/ATH/msieve/b40c'
"C:\CUDA6.5/bin/nvcc" -gencode=arch=compute_10,code=\"sm_10,compute_10\" -o sort_engine_sm10.dll sort_engine.cu -Xptxas -v -Xcudafe -# -shared -Xptxas -abi=no -I"C:\CUDA6.5/include" -I. -O3
nvcc fatal : Unsupported gpu architecture 'compute_10'
Makefile:42: recipe for target 'sort_engine_sm10.dll' failed
make[1]: *** [sort_engine_sm10.dll] Error 1
make[1]: Leaving directory '/home/ATH/msieve/b40c'
Makefile:316: recipe for target 'b40c/built' failed
make: *** [b40c/built] Error 2[/CODE] |
It can be, but MSieve doesn't work with CUDA 6.5.
|
Ok, which version should I try then?
I read in the mfaktc thread that it does not work with CUDA 7.0 and 7.5. Are those ok for Msieve? |
I know 5.5 works. I don't think any version after that currently works. I believe I tried 6.0, 6.5, and 7.0RC. It may be worthwhile to try 7.0 (official release) and 7.5 to see if they're ok. I don't fully understand (or remember, frankly) exactly what the problem is, but it will build everything fine. When you run it, however, it immediately throws an error.
|
Damn, of course CUDA 5.5 does not work with VS2013, so I have to uninstall both CUDA 6.5 and VS2013 and install CUDA 5.5 and VS2012 :( VS and CUDA suck.
|
Yes, the jumping around of versions is very annoying.
I didn't know about MSYS2; it's nice. Configuring GMP has problems that look very much like I saw with Mingw64, but at least this distribution has a precompiled GMP available, and that's good enough. Msieve builds without issue. GMP-ECM's configure fails, but I can get it to succeed by running Sysinternals Process Monitor simultaneously; I think that slows configure down enough that the rapid delete-after-create doesn't happen. Maybe it's the AV software on the machine. Hopefully this will be enough to replace the b40c library with CUB, so latter-day Nvidia toolkits can work again. |
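The delete-after-create race described above can be illustrated with a small retry loop: if something else still holds the freshly written file, wait a bit and try again. This is only a sketch of the idea. It is not what configure does, and remove_with_retry and sleep_ms are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
static void sleep_ms(unsigned ms) { Sleep(ms); }
#else
#include <time.h>
static void sleep_ms(unsigned ms) {
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (long)(ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}
#endif

/* Hypothetical helper: try to delete `path`, sleeping between attempts and
   doubling the delay each time. Returns 0 on success, -1 if all attempts fail. */
int remove_with_retry(const char *path, int max_attempts, unsigned delay_ms) {
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (remove(path) == 0)
            return 0;
        sleep_ms(delay_ms);
        delay_ms *= 2;  /* exponential backoff */
    }
    return -1;
}
```

Running Process Monitor alongside configure presumably has a similar effect by accident: it adds enough delay that the other process has let go of the file by the time the delete happens.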
Sweet! :smile:
If you want some testing as you update, I'd be happy to help. |
[QUOTE=jasonp;420767]I didn't know about MSYS2; it's nice. Configuring GMP has problems that look very much like I saw with Mingw64, but at least this distribution has a precompiled GMP available, and that's good enough.[/QUOTE]
It is strange you have such problems compiling GMP and GMP-ECM. I think I saw the issue you have long ago in the original MSYS but MSYS2 has worked for me on like 4 different computers with Windows 7, 8.1 and 10. Maybe try and stop the antivirus software just while compiling GMP? Which CUDA and Visual Studio versions are you using to compile Msieve? Do you have time in the next days/week to check if the warnings I posted are ok on the non-gpu msieve versions I compiled? My CUDA compilation got even further now with CUDA5.5 and VS2012, it managed to build "stage1_core_sm11.ptx", "stage1_core_sm13.ptx" and "stage1_core_sm20.ptx" now. But it still failed eventually: [CODE]gcc -O3 -m64 -mavx -fomit-frame-pointer -march=sandybridge -mtune=sandybridge -D_FILE_OFFSET_BITS=64 -DNDEBUG -D_LARGEFILE64_SOURCE -Wall -W -DMSIEVE_SVN_VERSION="\"988\"" -I. -Iaprcl -Iinclude -Ignfs -Ignfs/poly -Ignfs/poly/stage1 -I/usr/local/include -I/c/CUDA5.5/include -DHAVE_GMP_ECM -I"/C/CUDA5.5/include" -Ib40c -DHAVE_CUDA demo.c -o msieve \ libmsieve.a -lecm "/C/CUDA5.5/lib/win32/cuda.lib" -lz -lgmp -lm -lpthread -L/usr/local/lib libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x45): undefined reference to `cuEventDestroy_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x5a): undefined reference to `cuEventDestroy_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x74): undefined reference to `cuMemFree_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x86): undefined reference to `cuMemFree_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x98): undefined reference to `cuMemFree_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xaa): undefined reference to `cuMemFree_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xd0): undefined reference to `cuStreamDestroy_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xec): undefined reference to `cuMemFree_v2' 
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x148): undefined reference to `cuMemFree_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x155): undefined reference to `cuMemFree_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x177): undefined reference to `cuMemFree_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x199): undefined reference to `cuCtxDestroy_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x4f7): undefined reference to `cuMemFree_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x508): undefined reference to `cuMemFree_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x51c): undefined reference to `cuMemAlloc_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x534): undefined reference to `cuMemAlloc_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x687): undefined reference to `cuMemFree_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x69f): undefined reference to `cuMemAlloc_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x743): undefined reference to `cuCtxCreate_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x769): undefined reference to `cuModuleLoad' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x818): undefined reference to `cuStreamCreate' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x82e): undefined reference to `cuMemAlloc_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x85e): undefined reference to `cuMemsetD8_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x966): undefined reference to `cuMemAlloc_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x981): undefined reference to `cuMemAlloc_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x9e0): undefined reference to `cuMemAlloc_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xa24): undefined reference to 
`cuMemAlloc_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xa38): undefined reference to `cuMemAlloc_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xa4e): more undefined references to `cuMemAlloc_v2' follow libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xa83): undefined reference to `cuEventCreate' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xa9c): undefined reference to `cuEventCreate' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xb0e): undefined reference to `cuModuleLoad' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xb7a): undefined reference to `cuModuleLoad' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xd6a): undefined reference to `cuFuncSetBlockShape' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1301): undefined reference to `cuMemcpyHtoDAsync_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1323): undefined reference to `cuMemcpyHtoDAsync_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x158a): undefined reference to `cuEventRecord' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x15bf): undefined reference to `cuMemcpyHtoDAsync_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x15ec): undefined reference to `cuMemsetD8Async' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1874): undefined reference to `cuFuncSetBlockShape' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1892): undefined reference to `cuLaunchGridAsync' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1a1a): undefined reference to `cuLaunchGridAsync' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1a32): undefined reference to `cuEventRecord' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1a46): undefined reference to `cuEventSynchronize' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1a69): undefined reference to `cuEventElapsedTime' 
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1aca): undefined reference to `cuMemcpyDtoHAsync_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1adb): undefined reference to `cuStreamSynchronize' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1d37): undefined reference to `cuMemsetD8Async' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x2347): undefined reference to `cuMemFree_v2' libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x236c): undefined reference to `cuMemAlloc_v2' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x22b): undefined reference to `cuInit' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x23b): undefined reference to `cuDeviceGetCount' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x268): undefined reference to `cuDeviceGet' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x287): undefined reference to `cuDeviceGetName' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x2a1): undefined reference to `cuDeviceComputeCapability' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x2b5): undefined reference to `cuDeviceGetProperties' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x31e): undefined reference to `cuDeviceTotalMem_v2' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x339): undefined reference to `cuDeviceGetAttribute' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x354): undefined reference to `cuDeviceGetAttribute' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x36a): undefined reference to `cuDeviceGetAttribute' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x527): undefined reference to `cuModuleGetFunction' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x53d): undefined reference to `cuFuncGetAttribute' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x5dd): undefined reference to `cuParamSetSize' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x6e0): undefined reference to `cuParamSetv' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x720): undefined reference to `cuParamSetv' 
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x75a): undefined reference to `cuParamSeti' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x79a): undefined reference to `cuParamSeti' libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x7e8): undefined reference to `cuParamSetv' collect2.exe: error: ld returned 1 exit status Makefile:266: recipe for target 'all' failed make: *** [all] Error 1[/CODE] |
Is there a 64-bit cuda.lib?
|
Yes, there is a "cuda.lib" in the "Cuda5.5\lib\x64" directory.
Edit: Thank you! The CUDA libs were set to win32 in the Makefile: CUDA_LIBS = "$(CUDA_ROOT)/lib/win32/cuda.lib" I changed win32 to x64 and now it compiles! |
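The undefined references came from linking 64-bit objects against the 32-bit import library. A quick way to check which library directory matches a given compiler is to look at its pointer size. A minimal sketch, where cuda_lib_subdir is a hypothetical helper and "x64" and "win32" are the directory names mentioned in this thread:

```c
#include <stddef.h>

/* Hypothetical helper: name of the CUDA import-library subdirectory that
   matches the bitness of the compiler building this file. */
const char *cuda_lib_subdir(void) {
    /* 8-byte pointers mean a 64-bit target, so the x64 cuda.lib is needed. */
    return sizeof(void *) == 8 ? "x64" : "win32";
}
```

Printing the result from a tiny program built with the same gcc as msieve shows which directory CUDA_LIBS should point at.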
So I got the CUDA version compiled and it works. Polynomial selection runs with 0% CPU load and 99% GPU load, but the build still produced all the same warnings I posted earlier.
|
With CUDA 5.5, right?
|
I assume so. The makefile would need to change to remove SM1.x compiling and link the 64-bit library.
ATH: this is a work machine and I don't have administrator access. If I could turn off the AV software, malware could as well :) |
Various classes of malware are known not to require administrator access to turn off so-called security software, that said :smile:
|
Here are the Msieve svn 988 versions I managed to compile for Haswell, Sandy Bridge and Core2.
Remember I got a lot of warnings during compiling, see post #10.
[URL="hoegge.dk/mersenne/msieve-svn988-haswell.zip"]msieve-svn988-haswell.zip[/URL]
[URL="hoegge.dk/mersenne/msieve-svn988-sandybridge.zip"]msieve-svn988-sandybridge.zip[/URL]
[URL="hoegge.dk/mersenne/msieve-svn988-core2.zip"]msieve-svn988-core2.zip[/URL] |
Tried the Core 2 version, it complains about missing zlib1.dll. I guess zlib should be compiled in statically.
|
I did not compile zlib, it was preinstalled in MSYS2. I added "zlib1.dll" from MSYS2 to the zip files. Try and download it again.
|
Thank you, it appears to work now.
However, is there a way to disable GPU poly selection? It's trying and failing because I have a CUDA 1.x card. |
I think as long as you don't use the "-g" parameter it should use CPU.
|
I tried that; it uses the GPU regardless. It doesn't even fall back to the CPU when the GPU fails.
Perhaps it would work properly on a computer with a non-Nvidia GPU, but it's completely unusable if you have an older Nvidia. |
Gotcha. If you go here: [url]http://gilchrist.ca/jeff/factoring/index.html[/url] you can find an i7/i5 version of MSieve 1.52. It's not the absolute latest version, but it should get the job done for you.
|
Yes, the SVN 883 version I mentioned in the 1st post (SVN 939 is good for poly selection only, it crashes in linear algebra). And it won't run on Core 2 - for that, the last version is 1.50 only.
|
[QUOTE=jasonp;420607]My problem is not that GMP doesn't work, it's that Windows 7 has some kind of background service that holds onto executables just long enough for the configure script to fail trying to delete temporary exe's that it generates. Thus configure doesn't work and I can't get to the next step.
[/QUOTE] I have a similar problem using MSDEV. I always suspected the Virus scanner is checking the new EXE for a virus. It doesn't know the difference between downloading an EXE from the internet and a tool chain writing an EXE file. |
Here are the Msieve svn988 builds I compiled a few days ago without CUDA support (before I figured out how to enable it):
[URL="hoegge.dk/mersenne/msieve-svn988-nogpu-haswell.zip"]msieve-svn988-nogpu-haswell.zip[/URL]
[URL="hoegge.dk/mersenne/msieve-svn988-nogpu-sandybridge.zip"]msieve-svn988-nogpu-sandybridge.zip[/URL]
[URL="hoegge.dk/mersenne/msieve-svn988-nogpu-core2.zip"]msieve-svn988-nogpu-core2.zip[/URL] |
This is embarrassing...
I'm trying to compile MSieve on Ubuntu 14.04 using the following (after doing make clean):
[CODE]make all ECM=1 CUDA=1[/CODE] But it's throwing errors related to omp_get_thread_limit and other "omp" related functions. Is it trying to compile with MPI or am I forgetting something else? |
Something in the compile chain wants OpenMP (probably GMP-ECM). Try adding -lgomp to the link line (there will probably be more dependencies) or compiling with -fopenmp.
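With GCC, the OpenMP switch is -fopenmp; it defines the _OPENMP macro and pulls in the runtime that provides functions such as omp_get_thread_limit. A minimal sketch for checking whether a given build line really has OpenMP enabled (thread_limit_or_one is a hypothetical name):

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Returns the OpenMP thread limit, or 1 when compiled without OpenMP. */
int thread_limit_or_one(void) {
#ifdef _OPENMP
    return omp_get_thread_limit();
#else
    return 1;
#endif
}
```

If this compiles and links with the same flags as the msieve build, the omp_* symbols are available; if it returns 1 on a multi-core box, the OpenMP flag is probably missing from the command line.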
|
Will do. I did notice it was something with GMP-ECM and compiled it with just CUDA. That worked fine, but when I try and actually run msieve with a C134, I get a CUDA_ERROR_OUT_OF_MEMORY (4x) followed by CUDA_ERROR_DEINITIALIZED immediately as stage 1 of poly selection starts. This is with CUDA 5.5. Any idea what that might be from?
|
Might have figured it out. I, by way of habit from Windows, ran Msieve with "-t 16" at first, which is when the error popped up. When I took the "-t" parameter out completely, it seems to be fine. I'm going to let it run overnight and confirm that that's the case.
|
Yes, stage 1 of poly selection is multithreaded, and the thread count needs to be kept to a number that makes sense. I would never use more than 4, and beyond about C140 there's basically no difference in stage 1 throughput compared to one thread.
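A wrapper script or front end could enforce that advice before passing "-t" along. A minimal sketch, where the cap of 4 comes from the rule of thumb above and not from msieve's source, and clamp_stage1_threads is a hypothetical name:

```c
/* Hypothetical cap taken from the advice above, not from msieve's source. */
#define MAX_STAGE1_THREADS 4u

/* Clamps a requested thread count into the range 1..MAX_STAGE1_THREADS. */
unsigned clamp_stage1_threads(unsigned requested) {
    if (requested == 0)
        return 1;
    return requested > MAX_STAGE1_THREADS ? MAX_STAGE1_THREADS : requested;
}
```

With this, a habitual "-t 16" would be reduced to 4 before the GPU run starts.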
|
Yeah, it's more that on Windows you can put something like 16, and it will work fine (or at least not throw any errors--maybe performance suffers).
|
The non-GPU version works fine, thanks.
|
Does anyone want to test SVN 991 with a CUDA toolkit > 5.5? This replaces the GPU sorting code with CUB and streamlines some of the GPU setup. 64-bit windows builds should compile with 'make all WIN=1 WIN64=1 CUDA=1 NO_ZLIB=1'.
Single-threaded and multithreaded runs work fine for me on a range of input sizes with the v6.5 toolkit. It would be nice if someone can also confirm that it builds and runs okay in linux. |
What about CUDA 7.0 and 7.5? Are they still not compatible?
|
Working on building it with 7.0, but running into some issues (not with compilation, but immediate crash when run). I'm going to try a few things first.
[STRIKE] Edit: Yeah, still an immediate crash when compiled in MSieve. I recall some sort of issue with crossing between 32-bit and 64-bit dlls or something. Also working on getting it compiled in VS2012, but I'm running into issues there with mpir.lib and libecm.lib throwing unresolved external symbol errors at the last step of linking.[/STRIKE] Edit 2: Alright, everything compiled fine in VS2012 and seems to work fine (I can start it and watch poly numbers go screaming by on the screen for a few seconds before I stop it). I'll do more testing in a bit, but it looks like you have successfully updated Msieve! Thanks a ton for taking the time to do it. |
MSieve 991M compiled with CUDA 7
1 Attachment(s)
Here's the exe along with the pthreads dll and sort engine dll and ptx files. I added a CC5.2 ptx compilation. Don't know which are needed or not, but everything is there except for the cudart dlls. They'd be too big for posting here, I think. CPU is an Ivy Bridge.
|
[QUOTE=ATH;421211]What about CUDA 7.0 and 7.5? Are they still not compatible?[/QUOTE]
I made it work with CUDA 7.5 by editing the makefiles to remove CC 1.1 and 1.3: [url]http://mersenneforum.org/showpost.php?p=416072&postcount=20[/url] Chris |
SVN991 compiled fine with CUDA 7.5 without tinkering with the Makefile beyond the usual parameters. It even worked with ZLIB on. What is the benefit of zlib: does it compress relations? Is it better to leave NO_ZLIB=1?
It seems to work, it searches for a poly at least. I tried to use parameters from this old RSA896 thread: [URL="http://www.mersenneforum.org/showthread.php?t=17460"]http://www.mersenneforum.org/showthread.php?t=17460[/URL] but it is not finding the same polynomials or I do not know what I'm doing, which is far more likely. Anyone have some more recent parameters that should find a polynomial just for a test? |
Compiling zlib in allows the binary to read and write compressed relation files.
I don't think there's a controlled test you can run that will find a known polynomial. There's a lot of checking in polyselect stage 2, so if you find any polynomials at all it's probably working fine. Of course with a hot modern GPU you will find stage 1 hits so fast that performing stage 2 will leave the GPU mostly idle. Sorry to everyone that it took so long to get back to a working state. |
[QUOTE=ATH;421344]SVN991 compiled fine with CUDA 7.5 without tinkering with the Makefile beyond the usual parameters. It even worked with ZLIB on. What is the benefit of zlib: does it compress relations? Is it better to leave NO_ZLIB=1?
[/QUOTE] If you have enough space, don't intend to move the files around much, and don't have a terribly slow HDD, then yes. The disadvantages, as the previous sentence implies, are that uncompressed files are big: they take a lot of space, are difficult to move between folders or share on the web, are slow to read when resuming work, etc. The advantage is robustness: if a compressed file gets corrupted (and the probability is not zero, because the files are big and slow to read and write), you can most probably say bye-bye to all your relations, whereas in a plain-text file the damaged lines are simply ignored and most of the relations are still recoverable. |
Here are the svn 991 files compiled without Zlib:
[URL="hoegge.dk/mersenne/msieve-svn991-cuda75-haswell.zip"]msieve-svn991-cuda75-haswell.zip[/URL] [URL="hoegge.dk/mersenne/msieve-svn991-cuda75-sandybridge.zip"]msieve-svn991-cuda75-sandybridge.zip[/URL] |
Slower
On my system (Linux, GeForce GTX 650 Ti) the new version (compiled with CUDA 6.5, using sm30) is about 20% slower than the old version (compiled with CUDA 6.0, using sm20).
For the comparison I disabled the randomization. The card isn't used for the display. I haven't had the time to try the old version with CUDA 6.5. |
I only use rotating HDDs, and using gzipped relations, as produced by NFS@Home clients for bandwidth reasons, usually saves filtering time compared to dealing with uncompressed relations. Possibly square root time as well, as the bottleneck when reading relations is on I/O.
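On the worry raised earlier in the thread about damaged compressed files: the part of a .gz file that precedes the damage can often still be decoded. Below is a minimal Python sketch of that idea using the standard zlib module. It is not part of msieve, the function name is made up, and it assumes one relation per text line.

```python
import zlib

def salvage_gzip_lines(path, chunk_size=1 << 20):
    """Yield complete text lines from a possibly damaged .gz file.

    Decoding stops at the first corrupt block or at a truncated end;
    every full line decoded before that point is still yielded.
    """
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    pending = b""
    corrupt = False
    with open(path, "rb") as f:
        while not corrupt:
            data = f.read(chunk_size)
            if not data:
                break
            try:
                while data:
                    pending += decomp.decompress(data)
                    if decomp.eof:
                        # Concatenated .gz members: restart on the leftover bytes.
                        data = decomp.unused_data
                        decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
                    else:
                        data = b""
            except zlib.error:
                corrupt = True  # keep what was decoded so far, then stop
            *lines, pending = pending.split(b"\n")
            for line in lines:
                yield line.decode("ascii", errors="replace")
    # A trailing partial line is dropped: it may be an incomplete relation.
```

The salvaged lines could be written to a plain-text file and handed to msieve as usual. Anything after the damaged spot is lost, so this is a rescue measure rather than a substitute for a backup.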
On modern zlib versions, the CPU cost of dealing with compressed data is minimal. |
1 Attachment(s)
[QUOTE=ATH;420975]Here are the Msieve svn988 I compiled a few days ago without CUDA support (before I figured out how to):
[URL="http://hoegge.dk/mersenne/msieve-svn988-nogpu-haswell.zip"]msieve-svn988-nogpu-haswell.zip[/URL] [URL="http://hoegge.dk/mersenne/msieve-svn988-nogpu-sandybridge.zip"]msieve-svn988-nogpu-sandybridge.zip[/URL] [URL="http://hoegge.dk/mersenne/msieve-svn988-nogpu-core2.zip"]msieve-svn988-nogpu-core2.zip[/URL][/QUOTE] These need libwinpthread-1.dll to work on a machine without minGW and Visual Studio installed. |
[QUOTE=VictordeHolland;421455]These need libwinpthread-1.dll to work on a machine without minGW and Visual Studio installed.[/QUOTE]
Thanks, I did not get a chance to test them on another computer. I added my "libwinpthread-1.dll" from MSYS2 to all the zip-files. I also added "nvcuda.dll" to the gpu versions just to be safe, although it might be present on all computers with Nvidia drivers installed? |
991-haswell doesn't work unless I delete nvcuda.dll - apparently that DLL is highly driver-specific and must be loaded from the system.
|
ATH, any possibility to compile msieve taking into consideration the large vectors addon presented here: [url]http://www.mersenneforum.org/showthread.php?t=22386[/url]
|
I do not think so, unfortunately. It seems you need MPI=1 multithreading to benefit from those large vectors, right?
And I cannot seem to compile OpenMPI in Msys2, and it cannot be downloaded as a package. OpenMP comes with the gcc package in Msys2, but I'm not sure what OpenMP is vs OpenMPI. Basically I have never compiled multithreaded applications. Here is a "normal" build of the latest svn 1018 with CUDA enabled. Compiled with: make all WIN=1 WIN64=1 ECM=1 CUDA=1 NO_ZLIB=1 [URL="http://hoegge.dk/mersenne/msieve-svn1018-cuda75-haswell.zip"]msieve-svn1018-cuda75-haswell.zip[/URL] [URL="http://hoegge.dk/mersenne/msieve-svn1018-cuda75-sandybridge.zip"]msieve-svn1018-cuda75-sandybridge.zip[/URL] |
I thought it was possible for both versions, with and without MPI capability.
Thank you. I only have an Ivy Bridge machine, but the binaries will be useful for the NFS@Home team. |
You can use the Sandy Bridge build on the Ivy Bridge. I doubt a dedicated Ivy Bridge build would be much faster.
|
OpenMP adds a thread pool to compiled programs, along with a set of pragma commands to perform task decomposition and feed the thread pool. It's totally different from MPI.
|
So apparently the large vectors make sense even without MPI=1:
make all WIN=1 WIN64=1 ECM=1 CUDA=0 NO_ZLIB=1 VBITS=64/128/256: [URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits64-haswell.zip"]msieve-svn1018-vbits64-haswell.zip[/URL] [URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits128-haswell.zip"]msieve-svn1018-vbits128-haswell.zip[/URL] [URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits256-haswell.zip"]msieve-svn1018-vbits256-haswell.zip[/URL] [URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits64-sandybridge.zip"]msieve-svn1018-vbits64-sandybridge.zip[/URL] [URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits128-sandybridge.zip"]msieve-svn1018-vbits128-sandybridge.zip[/URL] [URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits256-sandybridge.zip"]msieve-svn1018-vbits256-sandybridge.zip[/URL] |
[QUOTE=ATH;479393]You can use the Sandy Bridge build on the Ivy Bridge. I doubt a dedicated Ivy Bridge build would be much faster.[/QUOTE]
Apologies for only now coming back to you, it works perfectly fine on Ivy. Thank you so much. |
[QUOTE=ATH;479462]So apparently the large vectors make sense even without MPI=1:
make all WIN=1 WIN64=1 ECM=1 CUDA=0 NO_ZLIB=1 VBITS=64/128/256: [URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits64-haswell.zip"]msieve-svn1018-vbits64-haswell.zip[/URL] [URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits128-haswell.zip"]msieve-svn1018-vbits128-haswell.zip[/URL] [URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits256-haswell.zip"]msieve-svn1018-vbits256-haswell.zip[/URL] [URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits64-sandybridge.zip"]msieve-svn1018-vbits64-sandybridge.zip[/URL] [URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits128-sandybridge.zip"]msieve-svn1018-vbits128-sandybridge.zip[/URL] [URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits256-sandybridge.zip"]msieve-svn1018-vbits256-sandybridge.zip[/URL][/QUOTE] Hi, can you please compile it again for Revision 1028. Tia. Carlos |
SVN1028 Win64 builds
3 Attachment(s)
Windows 64bit builds using MSYS2/mingw64 of msieve SVN1028
[code]gcc version 8.2.0 (Rev3, Built by MSYS2 project)[/code]
Built on an i7 3770 (Ivy Bridge)
make all WIN=1 WIN64=1 ECM=1 CUDA=0 NO_ZLIB=0 VBITS=64/128/256
I think I've included all the necessary .dlls, but do reply if any are missing. Very little testing done, but enjoy. P.S. I've included zlib, since I like to keep the relations compressed (.gz) and don't want to unzip a few GBs of relations when I post-process NFS@home jobs. |
Does anyone have an msieve build prior to svn988 for Windows on Ivy Bridge?
|
Could someone share msieve v1028 compiled files for win64 and cuda 10 (or newest 10.1)?
I tried to compile it myself (VS2017) and got several errors in the gmp-ecm and msieve sources (mpir and pthreads compiled fine without errors). |
I tried to use SVN 1005 and got an error:
[code]
Msieve v. 1.53 (SVN 1005)
Fri Dec 06 00:25:41 2019
random seeds: 35e6e7a0 d0500745
factoring 2881039827457895971881627053137530734638790825166127496066674320241571446494762386620442953820735453 (100 digits)
searching for 15-digit factors
commencing number field sieve (100-digit input)
commencing number field sieve polynomial selection
polynomial degree: 4
max stage 1 norm: 1.58e+017
max stage 2 norm: 3.44e+015
min E-value: 8.85e-009
poly select deadline: 1317
time limit set to 0.37 CPU-hours
expecting poly E from 1.43e-008 to > 1.64e-008
searching leading coefficients from 1 to 4000
using GPU 0 (GeForce GTX 1050 Ti)
selected card has CUDA arch 6.1
deadline: 5 CPU-seconds per coefficient
error (line 1116): CUDA_ERROR_FILE_NOT_FOUND
Msieve Error: return value 4294967295. Is CUDA enabled? Terminating...
[/code] |
What card and CUDA version are you using? Do you run the binary from command line, from the directory above the one with the PTX files?
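One possible cause when msieve is started from a script is that the working directory is not the one holding the PTX files. The Python sketch below checks for that before launching. It assumes the PTX files sit next to the executable and are opened by bare filename, as the source excerpt quoted later in this thread suggests. `find_ptx_files` and `run_msieve` are made-up helper names, not part of msieve.

```python
import os
import subprocess

def find_ptx_files(directory):
    """Return the sorted names of stage1_core_sm*.ptx files in `directory`."""
    return sorted(
        name for name in os.listdir(directory)
        if name.startswith("stage1_core_sm") and name.endswith(".ptx")
    )

def run_msieve(exe_path, args):
    """Run msieve with its own directory as the working directory."""
    exe_path = os.path.abspath(exe_path)
    exe_dir = os.path.dirname(exe_path)
    if not find_ptx_files(exe_dir):
        raise FileNotFoundError("no stage1_core_sm*.ptx files next to " + exe_path)
    # cwd=exe_dir makes relative file names resolve against the PTX directory.
    return subprocess.run([exe_path] + list(args), cwd=exe_dir)
```

Because the working directory changes, any relative paths passed in `args` (input, log, or save files) would need to be made absolute first.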
|
using GPU 0 (GeForce GTX 1050 Ti) selected card has CUDA arch 6.1
win 8.1 x64; all PTX files are in the same directory; run from a Python script |
[QUOTE=BfoX;532553]using GPU 0 (GeForce GTX 1050 Ti) selected card has CUDA arch 6.1
win 8.1 x64; all PTX files are in the same directory; run from a Python script[/QUOTE] It sounds like the stage1_core_sm60.ptx file is missing. (or sm61 ?) You will have to add that to the "make" file and rebuild or find another binary which includes this later ptx file. |
[QUOTE=RichD;532780]It sounds like the stage1_core_sm60.ptx file is missing. (or sm61 ?).[/QUOTE]
The card supports a lower version of the PTX files. |
[QUOTE=BfoX;532784]The card supports a lower version of the PTX files.[/QUOTE]
I don't know how to tell msieve to use a lower version ptx-file. Perhaps this [url=https://www.mersenneforum.org/showthread.php?t=23685]thread[/url] is helpful or post your question/request there. |
[QUOTE=BfoX;532116]I tried to use SVN 1005 and got an error:
[code]
Msieve v. 1.53 (SVN 1005)
Fri Dec 06 00:25:41 2019
random seeds: 35e6e7a0 d0500745
factoring 2881039827457895971881627053137530734638790825166127496066674320241571446494762386620442953820735453 (100 digits)
searching for 15-digit factors
commencing number field sieve (100-digit input)
commencing number field sieve polynomial selection
polynomial degree: 4
max stage 1 norm: 1.58e+017
max stage 2 norm: 3.44e+015
min E-value: 8.85e-009
poly select deadline: 1317
time limit set to 0.37 CPU-hours
expecting poly E from 1.43e-008 to > 1.64e-008
searching leading coefficients from 1 to 4000
using GPU 0 (GeForce GTX 1050 Ti)
selected card has CUDA arch 6.1
deadline: 5 CPU-seconds per coefficient
error (line 1116): CUDA_ERROR_FILE_NOT_FOUND
Msieve Error: return value 4294967295. Is CUDA enabled? Terminating...
[/code][/QUOTE]
I had the same problem with an arch 7.5 card. In msieve's source file stage1_sieve_gpu.c, starting at line 1106, is this code:
[CODE]
if (d->gpu_info->compute_version_major == 2) {
	CUDA_TRY(cuModuleLoad(&t->gpu_module, "stage1_core_sm20.ptx"))
}
else if (d->gpu_info->compute_version_major == 3) {
	if (d->gpu_info->compute_version_minor < 5)
		CUDA_TRY(cuModuleLoad(&t->gpu_module, "stage1_core_sm30.ptx"))
	else
		CUDA_TRY(cuModuleLoad(&t->gpu_module, "stage1_core_sm35.ptx"))
}
else if (d->gpu_info->compute_version_major >= 5) {
	CUDA_TRY(cuModuleLoad(&t->gpu_module, "stage1_core_sm50.ptx"))
}
else {
	printf("sorry, Nvidia doesn't want to support your card\n");
	exit(-1);
}
[/CODE]
For anything at or above compute capability 5, this attempts to load stage1_core_sm50.ptx.
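To make the consequence concrete, here is a Python transcription of the branch quoted above. It is only a sketch for checking which file a given card will ask for; the function name is made up, and it mirrors nothing beyond the quoted excerpt.

```python
def expected_ptx(major, minor):
    """Return the PTX file name the quoted C branch would load, or None."""
    if major == 2:
        return "stage1_core_sm20.ptx"
    if major == 3:
        # Compute capability 3.0-3.4 gets sm30, 3.5 and later gets sm35.
        return "stage1_core_sm30.ptx" if minor < 5 else "stage1_core_sm35.ptx"
    if major >= 5:
        # Every newer architecture falls through to the sm50 file name.
        return "stage1_core_sm50.ptx"
    return None  # the C code prints an error and exits in this case
```

So a GTX 1050 Ti (6.1) and an arch 7.5 card both end up requesting stage1_core_sm50.ptx, and a file named stage1_core_sm61.ptx would never be opened by this code path.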
I tried the following, which gets past the error message but appears to hang once the card gets going:
1) edit gpu_sm.props in the build.cuda.vc15 directory to update the CC_major and CC_minor fields (mine were 7 and 5)
2) in Visual Studio, edit the stage1_core_sm property pages by adding compute_75,sm_75 to the Code Generation item of the CUDA C/C++ section (or in your case, I guess compute_61,sm_61)
3) rebuild
In your case this will make a stage1_core_sm61.ptx file in the bin/ directory alongside the msieve.exe executable. You'll then have to rename it to stage1_core_sm50.ptx for it to be loaded by the code above. But, like I said, some step or another is still missing, because the code now just hangs after starting the search. |
[QUOTE=VictordeHolland;499852]Windows 64bit builds using MSYS2/mingw64 of msieve SVN1028
[code]gcc version 8.2.0 (Rev3, Built by MSYS2 project)[/code]
Built on an i7 3770 (Ivy Bridge)
make all WIN=1 WIN64=1 ECM=1 CUDA=0 NO_ZLIB=0 VBITS=64/128/256
I think I've included all the necessary .dlls, but do reply if any are missing. Very little testing done, but enjoy. P.S. I've included zlib, since I like to keep the relations compressed (.gz) and don't want to unzip a few GBs of relations when I post-process NFS@home jobs.[/QUOTE] Is SVN 1028 still the latest version of Msieve for Windows 10 64-bit? Thanks, Jarod |