mersenneforum.org

Googulator 2015-12-30 12:28

Newer X64 build needed
 
Hi,

The latest usable X64 binary on Jeff Gilchrist's page is SVN 883, and the latest one for Core 2 CPUs is 1.50. (There are SVN 939 and 942 binaries, but those are buggy and crash in the linear algebra.) On this forum, there exists a build from SVN 946, but it's built without ECM support.

1.52 is available on SourceForge, but only 32-bit. In my testing, 32-bit builds are a lot slower, especially the ECM.

I don't know if Msieve's SF maintainer reads this forum, but if he does, please make a 64-bit build of the official 1.52!

jasonp 2015-12-30 13:44

Msieve's maintainer does read this forum, and should release v1.53 since it's been languishing for a year. I also now have the capability to build x64 binaries so those should move to SF too.

The slower ECM does not surprise me, but in my experience latter-day GMP has a lot of difficulty building on 64-bit windows.

bsquared 2015-12-30 14:55

[QUOTE=jasonp;420520]

The slower ECM does not surprise me, but in my experience latter-day GMP has a lot of difficulty building on 64-bit windows.[/QUOTE]

It requires an up-to-date mingw-64/msys toolchain. ATH seems to know quite a bit about how to make it work (e.g. [url]http://www.mersenneforum.org/showthread.php?t=4087&page=36[/url]); maybe he'd be willing to take questions if something doesn't work for you. I don't know how this reconciles with the GPU portion of the msieve build... or if there is a way to get latter-day GMP to compile with Microsoft Visual Studio...

wombatman 2015-12-30 15:09

Yeah, GMP is not an issue with MinGW-64. I had a harder time building MSieve, especially in terms of the issues when using a CUDA version newer than 5.5.

ATH 2015-12-30 19:36

Actually, I learned all about 64-bit compiling with Mingw64 mainly from WraithX but also from wombatman, so they know a lot more than me. I have never managed to compile a CUDA application, but it has been a long time since I tried.

ATH 2015-12-30 22:22

I installed CUDA 6.5 (since 7.0 and 7.5 do not work for mfaktc, I assumed it was best to avoid them for msieve as well).

I added the CUDA path and various flags to the makefile but it fails when it reaches the nvcc step. Do I really need Visual Studio installed to compile a CUDA application when I'm trying to compile with Msys2+Mingw64? I have VS2015 but it needs 2010/2012/2013:

[CODE]"/C/CUDA6.5/bin/nvcc" -arch sm_11 -ptx -o stage1_core_sm11.ptx gnfs/poly/stage1/stage1_core_gpu/stage1_core.cu
nvcc warning : The 'compute_11', 'compute_12', 'compute_13', 'sm_11', 'sm_12', and 'sm_13' architectures are deprecated, and may be removed in a future release.
nvcc fatal : nvcc cannot find a supported version of Microsoft Visual Studio. Only the versions 2010, 2012, and 2013 are supported
Makefile:307: recipe for target 'stage1_core_sm11.ptx' failed
make: *** [stage1_core_sm11.ptx] Error 1[/CODE]

jasonp 2015-12-31 00:05

My problem is not that GMP doesn't work, it's that Windows 7 has some kind of background service that holds onto executables just long enough for the configure script to fail trying to delete temporary exe's that it generates. Thus configure doesn't work and I can't get to the next step.

People all over the internet have the same problem, and I've tried all of the solutions they propose and nothing works.

I can compile a generic build of 64-bit MPIR using Visual Studio, but the last time I tried compiling the assembler code YASM had some kind of problem (late last year).

Getting something that can execute GMP function calls on 64-bit windows has been incredibly frustrating for me, it has exceeded the time limit I could devote to it every single time.

I could probably cross-compile a 64-bit build on a Windows XP system, but the latest CUDA doesn't work with MSVC2008 and MSVC2010 doesn't work on XP. Hence the year-long stalemate where I can't update the CUDA code to work with CUDA > 5.5 either.

ATH 2015-12-31 01:57

Have you tried MSYS2 with Mingw64?:
[URL="https://sourceforge.net/p/msys2/wiki/MSYS2%20installation/"]https://sourceforge.net/p/msys2/wiki/MSYS2%20installation/[/URL]

I ran these commands to update it initially:

[CODE]# Run "msys2_shell.bat", then:
update-core

# Restart MSYS2 (using msys2_shell.bat), then:
pacman -Su

# Restart MSYS2 (using mingw64_shell.bat, and use this to start MSYS2 from now on), then:
pacman -S mingw-w64-x86_64-gcc
pacman -S mingw-w64-x86_64-make
pacman -S mingw-w64-x86_64-libtool
pacman -S autoconf
pacman -S automake
pacman -S make[/CODE]

If this does not work or if you just want to try it quick, I saved my installation after these steps above: [URL="http://hoegge.dk/gmp/msys64.zip"]msys64.zip[/URL]
(It's a 7-zip file, but the site does not allow the .7z extension. Rename it to msys64.7z.)
It is simply the msys64 folder which should be extracted to C: and then run mingw64_shell.bat.

I compile GMP 6.1.0 on a Haswell with:
[CODE]./configure ABI=64 CC=gcc CFLAGS="-O3 -m64 -mavx -mavx2 -mfma -march=haswell -mtune=haswell" --build=haswell-w64-mingw32 --enable-static --disable-shared
make
make install
make check[/CODE]

It works fine with MSYS2 in both Windows 7 and 10.

Googulator 2015-12-31 18:52

I see you're using Haswell-specific optimizations. Please also make a generic build, or at least something usable on Core 2. (Mine has SSE4.1, but not all Core 2s have it.)
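
For reference, a Core 2 or generic variant of the GMP configure line quoted above might look like the sketch below. It only prints the command so it can be inspected before running. "gmp_configure_cmd" is a hypothetical helper, and the "core2" and "x86_64" names passed to --build are assumptions modeled on the Haswell line, so check them against GMP's own build documentation first.

```shell
# Print (do not run) a GMP configure line for a given CPU target.
# gmp_configure_cmd is a hypothetical helper; the flag pattern copies the
# Haswell line from this thread, and the core2/x86_64 variants are assumptions.
gmp_configure_cmd() {
    cpu="$1"
    case "$cpu" in
        haswell) cflags="-O3 -m64 -mavx -mavx2 -mfma -march=haswell -mtune=haswell" ;;
        core2)   cflags="-O3 -m64 -march=core2 -mtune=core2" ;;
        x86_64)  cflags="-O3 -m64" ;;   # generic 64-bit build, no CPU-specific tuning
        *) echo "unknown cpu: $cpu" >&2; return 1 ;;
    esac
    printf './configure ABI=64 CC=gcc CFLAGS="%s" --build=%s-w64-mingw32 --enable-static --disable-shared\n' "$cflags" "$cpu"
}
```

Calling "gmp_configure_cmd core2" prints the line to paste into the mingw64 shell; the make, make install and make check steps stay the same.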

ATH 2015-12-31 22:29

I compiled the newest svn988 [B]without CUDA support[/B] with Haswell, Sandy Bridge and Core 2 flags. I got a fair number of warnings. Most are "unused-parameter", but there are others that might be serious.

I will not post any binaries until we check if the warnings are ok:

[CODE]common/lanczos/lanczos.c: In function 'dump_lanczos_state':
common/lanczos/lanczos.c:485:21: warning: unused parameter 'packed_matrix' [-Wunused-parameter]
packed_matrix_t *packed_matrix,
^
common/lanczos/lanczos.c:488:11: warning: unused parameter 'n' [-Wunused-parameter]
uint32 n, uint32 max_n, uint32 dim_solved, uint32 iter,
^
common/lanczos/lanczos.c: In function 'read_lanczos_state':
common/lanczos/lanczos.c:643:21: warning: unused parameter 'packed_matrix' [-Wunused-parameter]
packed_matrix_t *packed_matrix,
^
common/lanczos/lanczos.c:646:11: warning: unused parameter 'n' [-Wunused-parameter]
uint32 n, uint32 max_n, uint32 *dim_solved,
^
common/lanczos/lanczos_io.c: In function 'dump_matrix':
common/lanczos/lanczos_io.c:173:10: warning: unused parameter 'sparse_weight' [-Wunused-parameter]
uint64 sparse_weight) {
^
common/lanczos/lanczos_io.c: In function 'file_cache_get_next':
common/lanczos/lanczos_io.c:372:45: warning: unused parameter 'obj' [-Wunused-parameter]
static void file_cache_get_next(msieve_obj *obj, FILE *fp,
^
common/lanczos/lanczos_io.c:375:12: warning: unused parameter 'read_submatrix' [-Wunused-parameter]
uint32 read_submatrix) {
^
common/lanczos/lanczos_io.c: In function 'read_matrix':
common/lanczos/lanczos_io.c:438:23: warning: variable 'mpi_nrows' set but not used [-Wunused-but-set-variable]
uint32 mpi_resclass, mpi_nrows;
^
common/lanczos/lanczos_io.c:438:9: warning: variable 'mpi_resclass' set but not used [-Wunused-but-set-variable]
uint32 mpi_resclass, mpi_nrows;
^
common/lanczos/lanczos_matmul0.c: In function 'packed_matrix_init':
common/lanczos/lanczos_matmul0.c:419:12: warning: unused variable 'j' [-Wunused-variable]
uint32 i, j;
^
common/lanczos/lanczos_matmul0.c: In function 'mul_MxN_Nx64':
common/lanczos/lanczos_matmul0.c:619:23: warning: unused parameter 'scratch' [-Wunused-parameter]
uint64 *b, uint64 *scratch) {
^
common/lanczos/lanczos_matmul1.c: In function 'mul_packed_core':
common/lanczos/lanczos_matmul1.c:308:38: warning: unused parameter 'thread_num' [-Wunused-parameter]
void mul_packed_core(void *data, int thread_num)
^
common/lanczos/lanczos_matmul1.c: In function 'mul_packed_small_core':
common/lanczos/lanczos_matmul1.c:348:44: warning: unused parameter 'thread_num' [-Wunused-parameter]
void mul_packed_small_core(void *data, int thread_num)
^
common/lanczos/lanczos_matmul2.c: In function 'mul_trans_packed_core':
common/lanczos/lanczos_matmul2.c:319:44: warning: unused parameter 'thread_num' [-Wunused-parameter]
void mul_trans_packed_core(void *data, int thread_num)
^
common/lanczos/lanczos_matmul2.c: In function 'mul_trans_packed_small_core':
common/lanczos/lanczos_matmul2.c:358:50: warning: unused parameter 'thread_num' [-Wunused-parameter]
void mul_trans_packed_small_core(void *data, int thread_num)
^
common/lanczos/lanczos_vv.c: In function 'mul_Nx64_64x64_acc':
common/lanczos/lanczos_vv.c:201:9: warning: unused variable 'i' [-Wunused-variable]
uint32 i;
^
common/lanczos/lanczos_vv.c: In function 'outer_thread_run':
common/lanczos/lanczos_vv.c:210:46: warning: unused parameter 'thread_num' [-Wunused-parameter]
static void outer_thread_run(void *data, int thread_num)
^
common/lanczos/lanczos_vv.c: In function 'inner_thread_run':
common/lanczos/lanczos_vv.c:427:46: warning: unused parameter 'thread_num' [-Wunused-parameter]
static void inner_thread_run(void *data, int thread_num)
^
common/lanczos/lanczos_vv.c: In function 'tmul_64xN_Nx64':
common/lanczos/lanczos_vv.c:441:12: warning: unused variable 'j' [-Wunused-variable]
uint32 i, j;
^
common/smallfact/smallfact.c: In function 'trial_factor':
common/smallfact/smallfact.c:22:9: warning: variable 'factor_found' set but not used [-Wunused-but-set-variable]
uint32 factor_found = 0;
^
common/minimize.c: In function 'solve_dmatrix':
common/minimize.c:421:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (i = 0; i < n; i++)
^
common/minimize.c:424:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (i = 0; i < n - 1; i++) {
^
common/minimize.c:431:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (j = i + 1; j < n; j++) {
^
common/minimize.c:444:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (j = i + 1; j < n; j++) {
^
common/minimize.c:448:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (k = i + 1; k < n; k++) {
^
common/minimize.c:460:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (j = i + 1; j < n; j++) {
^
common/savefile.c: In function 'savefile_open':
common/savefile.c:131:9: warning: assignment from incompatible pointer type [-Wincompatible-pointer-types]
s->fp = gzopen(name_gz, open_string);
^
common/savefile.c:157:10: warning: assignment from incompatible pointer type [-Wincompatible-pointer-types]
s->fp = gzopen(s->name, "a");
^
common/savefile.c:163:9: warning: assignment from incompatible pointer type [-Wincompatible-pointer-types]
s->fp = gzopen(s->name, open_string);
^
common/savefile.c: In function 'savefile_close':
common/savefile.c:182:49: warning: passing argument 1 of 'gzclose' from incompatible pointer type [-Wincompatible-pointer-types]
s->is_a_FILE ? fclose((FILE *)s->fp) : gzclose(s->fp);
^
In file included from include/util.h:46:0,
from include/msieve.h:24,
from include/common.h:18,
from common/savefile.c:15:
C:/msys64/mingw64/include/zlib.h:1511:24: note: expected 'gzFile {aka struct gzFile_s *}' but argument is of type 'struct gzFile_s **'
ZEXTERN int ZEXPORT gzclose OF((gzFile file));
^
common/savefile.c: In function 'savefile_eof':
common/savefile.c:193:53: warning: passing argument 1 of 'gzeof' from incompatible pointer type [-Wincompatible-pointer-types]
return (s->is_a_FILE ? feof((FILE *)s->fp) : gzeof(s->fp));
^
In file included from include/util.h:46:0,
from include/msieve.h:24,
from include/common.h:18,
from common/savefile.c:15:
C:/msys64/mingw64/include/zlib.h:1475:21: note: expected 'gzFile {aka struct gzFile_s *}' but argument is of type 'struct gzFile_s **'
ZEXTERN int ZEXPORT gzeof OF((gzFile file));
^
common/savefile.c: In function 'savefile_read_line':
common/savefile.c:251:9: warning: passing argument 1 of 'gzgets' from incompatible pointer type [-Wincompatible-pointer-types]
gzgets(s->fp, buf, (int)max_len);
^
In file included from include/util.h:46:0,
from include/msieve.h:24,
from include/common.h:18,
from common/savefile.c:15:
C:/msys64/mingw64/include/zlib.h:1372:24: note: expected 'gzFile {aka struct gzFile_s *}' but argument is of type 'struct gzFile_s **'
ZEXTERN char * ZEXPORT gzgets OF((gzFile file, char *buf, int len));
^
common/savefile.c: In function 'savefile_flush':
common/savefile.c:279:10: warning: passing argument 1 of 'gzputs' from incompatible pointer type [-Wincompatible-pointer-types]
gzputs(s->fp, s->buf);
^
In file included from include/util.h:46:0,
from include/msieve.h:24,
from include/common.h:18,
from common/savefile.c:15:
C:/msys64/mingw64/include/zlib.h:1364:21: note: expected 'gzFile {aka struct gzFile_s *}' but argument is of type 'struct gzFile_s **'
ZEXTERN int ZEXPORT gzputs OF((gzFile file, const char *s));
^
common/savefile.c: In function 'savefile_rewind':
common/savefile.c:298:50: warning: passing argument 1 of 'gzrewind' from incompatible pointer type [-Wincompatible-pointer-types]
s->is_a_FILE ? rewind((FILE *)s->fp) : gzrewind(s->fp);
^
In file included from include/util.h:46:0,
from include/msieve.h:24,
from include/common.h:18,
from common/savefile.c:15:
C:/msys64/mingw64/include/zlib.h:1447:24: note: expected 'gzFile {aka struct gzFile_s *}' but argument is of type 'struct gzFile_s **'
ZEXTERN int ZEXPORT gzrewind OF((gzFile file));
^
common/util.c: In function 'aligned_malloc':
common/util.c:40:9: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
addr = (unsigned long)ptr;
^
common/util.c: In function 'get_cpu_type':
common/util.c:456:17: warning: variable 'model' set but not used [-Wunused-but-set-variable]
uint8 family, model;
^
mpqs/relation.c: In function 'qs_filter_relations':
mpqs/relation.c:807:9: warning: variable 'poly_saved' set but not used [-Wunused-but-set-variable]
uint32 poly_saved;
^
mpqs/relation.c:804:21: warning: variable 'curr_poly_idx' set but not used [-Wunused-but-set-variable]
uint32 curr_a_idx, curr_poly_idx, curr_rel;
^
gnfs/poly/poly.c: In function 'read_poly':
gnfs/poly/poly.c:28:8: warning: variable 'status' set but not used [-Wunused-but-set-variable]
int32 status = 0;
^
gnfs/poly/poly_skew.c: In function 'sizeopt_callback':
gnfs/poly/poly_skew.c:85:37: warning: unused parameter 'deg' [-Wunused-parameter]
static void sizeopt_callback(uint32 deg, mpz_t *alg_coeffs, mpz_t *rat_coeffs,
^
gnfs/poly/stage2/optimize_deg6.c: In function 'poly_eval':
gnfs/poly/stage2/optimize_deg6.c:208:19: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (k = 0; k < pow; k++)
^
gnfs/poly/stage2/optimize_deg6.c: In function 'fill_powers':
gnfs/poly/stage2/optimize_deg6.c:406:17: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (j = 2; j <= max_pow; j++)
^
gnfs/poly/stage2/optimize_deg6.c: In function 'optimize_initial_deg6':
gnfs/poly/stage2/optimize_deg6.c:722:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (i = 0; i < (1 << (num_vars - 1)); i++) {
^
gnfs/ffpoly.c: In function 'get_zeros_rec':
gnfs/ffpoly.c:594:9: warning: 'g[0u].degree' may be used uninitialized in this function [-Wmaybe-uninitialized]
poly_t g, xpow;
^
gnfs/relation.c:23:8: warning: always_inline function might not be inlinable [-Wattributes]
uint32 divide_factor_out(mpz_t polyval, uint64 p,
^
[/CODE]
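
To separate the routine warnings from the ones worth reading first, a log like the one above can be tallied by warning flag. A minimal sketch, assuming the make output was saved to a file; "tally_warnings" and "build.log" are hypothetical names.

```shell
# Count GCC warnings in a saved build log, grouped by their [-W...] flag.
# Reads the log on stdin and prints "count flag" lines, most frequent first.
# Lines without a [-W...] tag (e.g. "note:" lines) are ignored.
tally_warnings() {
    grep -o '\[-W[^]]*\]' | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}
# Usage (build.log is a hypothetical file name):
#   make all 2>&1 | tee build.log
#   tally_warnings < build.log
```

Flags outside the -Wunused-* family, such as -Wpointer-to-int-cast and -Wincompatible-pointer-types, are probably the ones to look at first. On 64-bit Windows "unsigned long" is 32 bits wide, so the cast flagged in common/util.c can drop the upper half of a pointer.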

ATH 2015-12-31 22:34

In the attempt to compile Msieve with CUDA support I installed Visual Studio 2013 and CUDA 6.5 on my laptop, and it got a bit further than it did with VS2015 on my desktop.

But it fails when reaching the "sm_10, compute_10" step. Can this be disabled? I believe it is deprecated.

[CODE]
make[1]: Entering directory '/home/ATH/msieve/b40c'
"C:\CUDA6.5/bin/nvcc" -gencode=arch=compute_10,code=\"sm_10,compute_10\" -o sort_engine_sm10.dll sort_engine.cu -Xptxas -v -Xcudafe -# -shared -Xptxas -abi=no -I"C:\CUDA6.5/include" -I. -O3
nvcc fatal : Unsupported gpu architecture 'compute_10'
Makefile:42: recipe for target 'sort_engine_sm10.dll' failed
make[1]: *** [sort_engine_sm10.dll] Error 1
make[1]: Leaving directory '/home/ATH/msieve/b40c'
Makefile:316: recipe for target 'b40c/built' failed
make: *** [b40c/built] Error 2[/CODE]

wombatman 2015-12-31 23:12

It can be, but MSieve doesn't work with CUDA 6.5.

ATH 2015-12-31 23:36

Ok, which version should I try then?

I read in the mfaktc thread that it does not work with CUDA 7.0 and 7.5. Are those ok for Msieve?

wombatman 2015-12-31 23:43

I know 5.5 works. I don't think any version after that currently works. I believe I tried 6.0, 6.5, and 7.0RC. It may be worthwhile to try 7.0 (official release) and 7.5 to see if they're ok. I don't fully understand (or remember, frankly) exactly what the problem is, but it will build everything fine. When you run it, however, it immediately throws an error.

ATH 2016-01-01 01:05

Damn, of course CUDA 5.5 does not work with VS2013, so I have to uninstall both CUDA 6.5 and VS2013 and install CUDA 5.5 and VS2012 :( VS and CUDA suck.

jasonp 2016-01-01 02:55

Yes, the jumping around of versions is very annoying.

I didn't know about MSYS2; it's nice. Configuring GMP has problems that look very much like what I saw with Mingw64, but at least this distribution has a precompiled GMP available, and that's good enough.

Msieve builds without issue. GMP-ECM's configure fails, but I can get it to succeed by running Sysinternals Process Monitor simultaneously, I think because it slows down the configure so the rapid delete-after-create doesn't happen. Maybe it's the AV software on the machine.

Hopefully this will be enough to replace the b40c library with CUB, so latter-day Nvidia toolkits can work again.

wombatman 2016-01-01 02:59

Sweet! :smile:

If you want some testing as you update, I'd be happy to help.

ATH 2016-01-01 03:20

[QUOTE=jasonp;420767]I didn't know about MSYS2; it's nice. Configuring GMP has problems that look very much like I saw with Mingw64, but at least this distribution has a precompiled GMP available, and that's good enough.[/QUOTE]

It is strange you have such problems compiling GMP and GMP-ECM. I think I saw the issue you have long ago in the original MSYS but MSYS2 has worked for me on like 4 different computers with Windows 7, 8.1 and 10.

Maybe try and stop the antivirus software just while compiling GMP?

Which CUDA and Visual Studio versions are you using to compile Msieve? Do you have time in the next days/week to check if the warnings I posted are ok on the non-gpu msieve versions I compiled?

My CUDA compilation got even further with CUDA 5.5 and VS2012: it managed to build "stage1_core_sm11.ptx", "stage1_core_sm13.ptx" and "stage1_core_sm20.ptx". But it still failed eventually:

[CODE]gcc -O3 -m64 -mavx -fomit-frame-pointer -march=sandybridge -mtune=sandybridge -D_FILE_OFFSET_BITS=64 -DNDEBUG -D_LARGEFILE64_SOURCE -Wall -W -DMSIEVE_SVN_VERSION="\"988\"" -I. -Iaprcl -Iinclude -Ignfs -Ignfs/poly -Ignfs/poly/stage1 -I/usr/local/include -I/c/CUDA5.5/include -DHAVE_GMP_ECM -I"/C/CUDA5.5/include" -Ib40c -DHAVE_CUDA demo.c -o msieve \
libmsieve.a -lecm "/C/CUDA5.5/lib/win32/cuda.lib" -lz -lgmp -lm -lpthread -L/usr/local/lib
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x45): undefined reference to `cuEventDestroy_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x5a): undefined reference to `cuEventDestroy_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x74): undefined reference to `cuMemFree_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x86): undefined reference to `cuMemFree_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x98): undefined reference to `cuMemFree_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xaa): undefined reference to `cuMemFree_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xd0): undefined reference to `cuStreamDestroy_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xec): undefined reference to `cuMemFree_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x148): undefined reference to `cuMemFree_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x155): undefined reference to `cuMemFree_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x177): undefined reference to `cuMemFree_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x199): undefined reference to `cuCtxDestroy_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x4f7): undefined reference to `cuMemFree_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x508): undefined reference to `cuMemFree_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x51c): undefined reference to `cuMemAlloc_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x534): undefined reference to `cuMemAlloc_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x687): undefined reference to `cuMemFree_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x69f): undefined reference to `cuMemAlloc_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x743): undefined reference to `cuCtxCreate_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x769): undefined reference to `cuModuleLoad'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x818): undefined reference to `cuStreamCreate'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x82e): undefined reference to `cuMemAlloc_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x85e): undefined reference to `cuMemsetD8_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x966): undefined reference to `cuMemAlloc_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x981): undefined reference to `cuMemAlloc_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x9e0): undefined reference to `cuMemAlloc_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xa24): undefined reference to `cuMemAlloc_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xa38): undefined reference to `cuMemAlloc_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xa4e): more undefined references to `cuMemAlloc_v2' follow
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xa83): undefined reference to `cuEventCreate'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xa9c): undefined reference to `cuEventCreate'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xb0e): undefined reference to `cuModuleLoad'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xb7a): undefined reference to `cuModuleLoad'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0xd6a): undefined reference to `cuFuncSetBlockShape'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1301): undefined reference to `cuMemcpyHtoDAsync_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1323): undefined reference to `cuMemcpyHtoDAsync_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x158a): undefined reference to `cuEventRecord'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x15bf): undefined reference to `cuMemcpyHtoDAsync_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x15ec): undefined reference to `cuMemsetD8Async'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1874): undefined reference to `cuFuncSetBlockShape'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1892): undefined reference to `cuLaunchGridAsync'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1a1a): undefined reference to `cuLaunchGridAsync'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1a32): undefined reference to `cuEventRecord'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1a46): undefined reference to `cuEventSynchronize'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1a69): undefined reference to `cuEventElapsedTime'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1aca): undefined reference to `cuMemcpyDtoHAsync_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1adb): undefined reference to `cuStreamSynchronize'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x1d37): undefined reference to `cuMemsetD8Async'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x2347): undefined reference to `cuMemFree_v2'
libmsieve.a(stage1_sieve_gpu.no):stage1_sieve_gpu.c:(.text+0x236c): undefined reference to `cuMemAlloc_v2'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x22b): undefined reference to `cuInit'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x23b): undefined reference to `cuDeviceGetCount'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x268): undefined reference to `cuDeviceGet'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x287): undefined reference to `cuDeviceGetName'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x2a1): undefined reference to `cuDeviceComputeCapability'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x2b5): undefined reference to `cuDeviceGetProperties'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x31e): undefined reference to `cuDeviceTotalMem_v2'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x339): undefined reference to `cuDeviceGetAttribute'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x354): undefined reference to `cuDeviceGetAttribute'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x36a): undefined reference to `cuDeviceGetAttribute'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x527): undefined reference to `cuModuleGetFunction'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x53d): undefined reference to `cuFuncGetAttribute'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x5dd): undefined reference to `cuParamSetSize'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x6e0): undefined reference to `cuParamSetv'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x720): undefined reference to `cuParamSetv'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x75a): undefined reference to `cuParamSeti'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x79a): undefined reference to `cuParamSeti'
libmsieve.a(cuda_xface.o):cuda_xface.c:(.text+0x7e8): undefined reference to `cuParamSetv'
collect2.exe: error: ld returned 1 exit status
Makefile:266: recipe for target 'all' failed
make: *** [all] Error 1[/CODE]

jasonp 2016-01-01 03:30

Is there a 64-bit cuda.lib?

ATH 2016-01-01 03:32

Yes, there is a "cuda.lib" in the "Cuda5.5\lib\x64" directory.

Edit: Thank you! The CUDA libs were set to win32 in the Makefile:

CUDA_LIBS = "$(CUDA_ROOT)/lib/win32/cuda.lib"

I changed win32 to x64 and now it compiled!!!!!
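
The one-line Makefile change described above can also be scripted. A minimal sketch, assuming the Makefile contains the CUDA_LIBS line exactly as quoted; "use_x64_cuda_lib" is a hypothetical helper name.

```shell
# Point CUDA_LIBS at the 64-bit import library instead of the 32-bit one.
# Rewrites the given Makefile and keeps the original as <file>.bak.
# Assumes the path appears as ".../lib/win32/cuda.lib", as quoted in this thread.
use_x64_cuda_lib() {
    makefile="$1"
    cp "$makefile" "$makefile.bak" || return 1
    sed 's#/lib/win32/cuda\.lib#/lib/x64/cuda.lib#' "$makefile.bak" > "$makefile"
}
# Usage: use_x64_cuda_lib Makefile
```

Linking a 64-bit msieve against the 32-bit cuda.lib is what produced the wall of "undefined reference" errors, since the linker cannot use the 32-bit import symbols.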

ATH 2016-01-01 05:23

So I got the CUDA version compiled and it works. During polynomial selection it runs with 0% CPU load and 99% GPU load, but the build still produced all the same warnings as posted earlier.

wombatman 2016-01-01 05:56

With CUDA 5.5, right?

jasonp 2016-01-01 12:16

I assume so. The makefile would need to change to remove SM1.x compiling and link the 64-bit library.

ATH: this is a work machine and I don't have administrator access. If I could turn off the AV software, malware could as well :)

debrouxl 2016-01-01 16:56

Various classes of malware are known not to require administrator access to turn off so-called security software, that said :smile:

ATH 2016-01-01 20:15

Here are the Msieve svn 988 versions I managed to compile for Haswell, Sandy Bridge and Core2.

Remember I got a lot of warnings during compiling, see post #10.

[URL="hoegge.dk/mersenne/msieve-svn988-haswell.zip"]msieve-svn988-haswell.zip[/URL]
[URL="hoegge.dk/mersenne/msieve-svn988-sandybridge.zip"]msieve-svn988-sandybridge.zip[/URL]
[URL="hoegge.dk/mersenne/msieve-svn988-core2.zip"]msieve-svn988-core2.zip[/URL]

Googulator 2016-01-01 23:07

Tried the Core 2 version, it complains about missing zlib1.dll. I guess zlib should be compiled in statically.

ATH 2016-01-01 23:24

I did not compile zlib, it was preinstalled in MSYS2. I added "zlib1.dll" from MSYS2 to the zip files. Try and download it again.
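
A missing-DLL report like this can be caught before zipping by listing the DLLs the executable imports. A minimal sketch, assuming objdump from the MinGW-w64 binutils is on the PATH; "list_dlls" is a hypothetical helper that only picks the "DLL Name:" lines out of objdump's output.

```shell
# Print the DLL names found in `objdump -p` output read on stdin.
# Only the "DLL Name:" lines of the PE import table are used.
list_dlls() {
    sed -n 's/^[[:space:]]*DLL Name:[[:space:]]*//p'
}
# Usage: objdump -p msieve.exe | list_dlls
```

Any name in the list that is not a standard Windows system DLL (zlib1.dll in this case) has to be shipped next to the exe or linked statically.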

Googulator 2016-01-02 01:34

Thank you, it appears to work now.

However, is there a way to disable GPU poly selection? It's trying and failing because I have a CUDA 1.x card.

wombatman 2016-01-02 04:47

I think as long as you don't use the "-g" parameter it should use CPU.

Googulator 2016-01-02 05:49

I tried that; it uses the GPU regardless. It doesn't even fall back to the CPU when the GPU fails.

Perhaps it would work properly on a computer with a non-Nvidia GPU, but it's completely unusable if you have an older Nvidia.

wombatman 2016-01-02 06:41

Gotcha. If you go here: [url]http://gilchrist.ca/jeff/factoring/index.html[/url] you can find an i7/i5 version of MSieve 1.52. It's not the absolute latest version, but it should get the job done for you.

Googulator 2016-01-02 07:31

Yes, that is the SVN 883 version I mentioned in the first post (SVN 939 is good for poly selection only; it crashes in linear algebra). And it won't run on Core 2. For that, the last version is 1.50.

bgbeuning 2016-01-02 12:09

[QUOTE=jasonp;420607]My problem is not that GMP doesn't work, it's that Windows 7 has some kind of background service that holds onto executables just long enough for the configure script to fail trying to delete temporary exe's that it generates. Thus configure doesn't work and I can't get to the next step.
[/QUOTE]

I have a similar problem using MSDEV. I always suspected the virus scanner is checking the new EXE for a virus. It doesn't know the difference between downloading an EXE from the internet and a toolchain writing an EXE file.

ATH 2016-01-02 12:59

Here are the Msieve svn988 builds I compiled a few days ago without CUDA support (before I figured out how to enable it):

[URL="hoegge.dk/mersenne/msieve-svn988-nogpu-haswell.zip"]msieve-svn988-nogpu-haswell.zip[/URL]
[URL="hoegge.dk/mersenne/msieve-svn988-nogpu-sandybridge.zip"]msieve-svn988-nogpu-sandybridge.zip[/URL]
[URL="hoegge.dk/mersenne/msieve-svn988-nogpu-core2.zip"]msieve-svn988-nogpu-core2.zip[/URL]

wombatman 2016-01-02 20:46

This is embarrassing...
 
I'm trying to compile MSieve on Ubuntu 14.04 using the following (after doing make clean):

[CODE]make all ECM=1 CUDA=1[/CODE]

But it's throwing errors related to omp_get_thread_limit and other "omp" related functions. Is it trying to compile with MPI or am I forgetting something else?

jasonp 2016-01-02 23:55

Something in the compile chain wants OpenMP (probably GMP-ECM). Try adding -lgomp to the link line (there will probably be more dependencies) or compiling with -fopenmp.
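
Whether a static library such as libecm.a really pulls in OpenMP can be checked from its undefined symbols. A minimal sketch, assuming GNU nm and its usual "U symbol" output for undefined references; "needs_openmp" is a hypothetical helper, and the libecm.a path is up to the reader.

```shell
# Succeeds if `nm` output on stdin lists undefined OpenMP symbols:
# omp_* API calls or the GOMP_* runtime entry points that GCC emits.
needs_openmp() {
    grep -E -q '[[:space:]]U[[:space:]]+(omp_|GOMP_)'
}
# Usage: nm libecm.a | needs_openmp && echo "link with -fopenmp"
```

If the check succeeds, the library was built with OpenMP enabled, and everything that links against it needs GCC's OpenMP runtime as well.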

wombatman 2016-01-03 03:53

Will do. I did notice it was something with GMP-ECM and compiled it with just CUDA. That worked fine, but when I try and actually run msieve with a C134, I get a CUDA_ERROR_OUT_OF_MEMORY (4x) followed by CUDA_ERROR_DEINITIALIZED immediately as stage 1 of poly selection starts. This is with CUDA 5.5. Any idea what that might be from?

wombatman 2016-01-03 07:54

Might have figured it out. I, by way of habit from Windows, ran Msieve with "-t 16" at first, which is when the error popped up. When I took the "-t" parameter out completely, it seems to be fine. I'm going to let it run overnight and confirm that that's the case.

jasonp 2016-01-03 13:30

Yes, stage 1 of poly selection is multithreaded and needs a limit on the number of threads that make sense. I would never use more than 4, and beyond about C140 there's basically no difference in stage 1 throughput compared to one thread.
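
In practice this amounts to capping the -t value before launching Msieve. A minimal sketch; "clamp_threads" is a hypothetical helper, and the default cap of 4 is taken from the advice above, not from Msieve itself.

```shell
# Clamp a requested thread count to the range 1..cap (cap defaults to 4).
# Both arguments are expected to be integers.
clamp_threads() {
    requested="$1"
    cap="${2:-4}"
    if [ "$requested" -gt "$cap" ]; then
        echo "$cap"
    elif [ "$requested" -lt 1 ]; then
        echo 1
    else
        echo "$requested"
    fi
}
# Usage: pass "$(clamp_threads 16)" as the value of msieve's -t option.
```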

wombatman 2016-01-03 15:01

Yeah, it's more that on Windows you can put something like 16, and it will work fine (or at least not throw any errors--maybe performance suffers).

Googulator 2016-01-04 10:54

The non-GPU version works fine, thanks.

jasonp 2016-01-04 15:24

Does anyone want to test SVN 991 with a CUDA toolkit > 5.5? This replaces the GPU sorting code with CUB and streamlines some of the GPU setup. 64-bit windows builds should compile with 'make all WIN=1 WIN64=1 CUDA=1 NO_ZLIB=1'.

Single-threaded and multithreaded runs work fine for me on a range of input sizes with the v6.5 toolkit. It would be nice if someone can also confirm that it builds and runs okay in linux.

ATH 2016-01-04 18:37

What about CUDA 7.0 and 7.5? Are they still not compatible?

wombatman 2016-01-04 19:29

Working on building it with 7.0, but running into some issues (not with compilation, but immediate crash when run). I'm going to try a few things first.
[STRIKE]
Edit: Yeah, still an immediate crash when compiled in MSieve. I recall some sort of issue with crossing between 32-bit and 64-bit dlls or something. Also working on getting it compiled in VS2012, but I'm running into issues there with mpir.lib and libecm.lib throwing unresolved external symbol errors at the last step of linking.[/STRIKE]

Edit 2: Alright, everything compiled fine in VS2012 and seems to work fine (I can start it and watch poly numbers go screaming by on the screen for a few seconds before I stop it). I'll do more testing in a bit, but it looks like you have successfully updated Msieve! Thanks a ton for taking the time to do it.

wombatman 2016-01-05 01:15

MSieve 991M compiled with CUDA 7
 
1 Attachment(s)
Here's the exe along with the pthreads dll and sort engine dll and ptx files. I added a CC5.2 ptx compilation. Don't know which are needed or not, but everything is there except for the cudart dlls. They'd be too big for posting here, I think. CPU is an Ivy Bridge.

chris2be8 2016-01-05 17:29

[QUOTE=ATH;421211]What about CUDA 7.0 and 7.5? Are they still not compatible?[/QUOTE]

I made it work with CUDA 7.5 by editing the makefiles to remove CC 1.1 and 1.3:
[url]http://mersenneforum.org/showpost.php?p=416072&postcount=20[/url]

Chris

ATH 2016-01-06 00:47

SVN991 compiled fine with CUDA 7.5 without tinkering with the Makefile beyond the usual parameters. It even worked with zlib on. What is the benefit of zlib: does it compress relations? Is it better to leave NO_ZLIB=1?

It seems to work, it searches for a poly at least. I tried to use parameters from this old RSA896 thread:
[URL="http://www.mersenneforum.org/showthread.php?t=17460"]http://www.mersenneforum.org/showthread.php?t=17460[/URL]
but it is not finding the same polynomials or I do not know what I'm doing, which is far more likely.

Anyone have some more recent parameters that should find a polynomial just for a test?

jasonp 2016-01-06 01:55

Compiling zlib in allows the binary to read and write compressed relation files.

I don't think there's a controlled test you can run that will find a known polynomial. There's a lot of checking in polyselect stage 2, so if you find any polynomials at all it's probably working fine. Of course with a hot modern GPU you will find stage 1 hits so fast that performing stage 2 will leave the GPU mostly idle.

Sorry to everyone that it took so long to get back to a working state.

LaurV 2016-01-06 02:41

[QUOTE=ATH;421344]SVN991 compiled fine with CUDA 7.5 without tinkering with the Makefile beyond the usual parameters. It even worked with zlib on. What is the benefit of zlib: does it compress relations? Is it better to leave NO_ZLIB=1?
[/QUOTE]
If you have enough space, don't intend to move the files around much, and don't have a terribly slow HDD, then yes. The disadvantages, as the previous sentence implies, are that the uncompressed files are big: they take a lot of space, are difficult to move between folders or share on the web, are slow to read when resuming work, etc. The advantage is robustness: if a zipped file gets corrupted (and the probability is not zero, because the files are big and slow to read and write), you can most probably say bye-bye to all your relations, whereas when the file is in plain text the corrupted lines are simply ignored and most of the relations are still recoverable.
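
To illustrate the "corrupted lines are simply ignored" point, here is a rough sketch of a per-line shape check. It is not msieve's actual parser. It assumes a plain-text relation looks like `a,b:hex,hex,...:hex,hex,...` (my assumption about the layout, not something stated in this thread), and the function names are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Skip a run of hex digits and commas; returns a pointer just past the run. */
static const char *skip_hex_list(const char *p) {
    while (isxdigit((unsigned char)*p) || *p == ',') p++;
    return p;
}

/* Returns 1 if the line has the assumed shape "a,b:hexlist:hexlist", else 0. */
int looks_like_relation(const char *line) {
    const char *p = line;
    if (*p == '-') p++;                          /* a may be negative */
    if (!isdigit((unsigned char)*p)) return 0;
    while (isdigit((unsigned char)*p)) p++;
    if (*p++ != ',') return 0;
    if (!isdigit((unsigned char)*p)) return 0;   /* b is unsigned */
    while (isdigit((unsigned char)*p)) p++;
    if (*p++ != ':') return 0;
    p = skip_hex_list(p);                        /* first factor list */
    if (*p++ != ':') return 0;
    p = skip_hex_list(p);                        /* second factor list */
    while (*p == '\r' || *p == '\n') p++;        /* tolerate a line ending */
    return *p == '\0';
}

/* Counts the lines that pass the check; damaged lines are skipped, not fatal. */
size_t count_usable_relations(const char *lines[], size_t n) {
    size_t i, good = 0;
    for (i = 0; i < n; i++)
        if (looks_like_relation(lines[i])) good++;
    return good;
}
```

The point of the design is that each line is judged on its own, so one damaged line costs one relation. A damaged compressed stream, by contrast, can make everything after the damage unreadable.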

ATH 2016-01-06 14:21

Here are the svn 991 files compiled without Zlib:

[URL="hoegge.dk/mersenne/msieve-svn991-cuda75-haswell.zip"]msieve-svn991-cuda75-haswell.zip[/URL]
[URL="hoegge.dk/mersenne/msieve-svn991-cuda75-sandybridge.zip"]msieve-svn991-cuda75-sandybridge.zip[/URL]

Gimarel 2016-01-07 07:03

Slower
 
On my system (Linux, GeForce GTX 650 Ti) the new version (compiled with CUDA 6.5, using sm30) is about 20% slower than the old version (compiled with CUDA 6.0, using sm20).

For the comparison I disabled the randomization. The card isn't used for the display.

I haven't had the time to try the old version with CUDA 6.5.

debrouxl 2016-01-07 07:15

I only use rotating HDDs, and using gzipped relations, as produced by NFS@Home clients for bandwidth reasons, usually saves filtering time compared to dealing with uncompressed relations. Possibly square root time as well, as the bottleneck when reading relations is on I/O.

With modern zlib versions, the CPU cost of dealing with compressed relations is minimal.

VictordeHolland 2016-01-07 12:24

1 Attachment(s)
[QUOTE=ATH;420975]Here are the Msieve svn988 I compiled a few days ago without CUDA support (before I figured out how to):

[URL="http://hoegge.dk/mersenne/msieve-svn988-nogpu-haswell.zip"]msieve-svn988-nogpu-haswell.zip[/URL]
[URL="http://hoegge.dk/mersenne/msieve-svn988-nogpu-sandybridge.zip"]msieve-svn988-nogpu-sandybridge.zip[/URL]
[URL="http://hoegge.dk/mersenne/msieve-svn988-nogpu-core2.zip"]msieve-svn988-nogpu-core2.zip[/URL][/QUOTE]
These need libwinpthread-1.dll to work on a machine without minGW and Visual Studio installed.

ATH 2016-01-07 15:38

[QUOTE=VictordeHolland;421455]These need libwinpthread-1.dll to work on a machine without minGW and Visual Studio installed.[/QUOTE]

Thanks, I did not get a chance to test them on another computer. I added my "libwinpthread-1.dll" from MSYS2 to all the zip-files.

I also added "nvcuda.dll" to the gpu versions just to be safe, although it might be present on all computers with Nvidia drivers installed?

Googulator 2016-01-09 05:29

991-haswell doesn't work unless I delete nvcuda.dll - apparently that DLL is highly driver-specific and must be loaded from the system.

pinhodecarlos 2018-02-03 23:21

ATH, any possibility to compile msieve taking into consideration the large vectors addon presented here: [url]http://www.mersenneforum.org/showthread.php?t=22386[/url]

ATH 2018-02-05 07:15

I do not think so, unfortunately. It seems you need multithreading (MPI=1) to benefit from those large vectors, right?

And I cannot seem to compile OpenMPI in MSYS2, and it is not available as a package. OpenMP comes with the gcc package in MSYS2, but I'm not sure what OpenMP is vs OpenMPI. Basically, I have never compiled multithreaded applications.


Here is a "normal" build of the latest svn 1018 with CUDA enabled. Compiled with:
make all WIN=1 WIN64=1 ECM=1 CUDA=1 NO_ZLIB=1

[URL="http://hoegge.dk/mersenne/msieve-svn1018-cuda75-haswell.zip"]msieve-svn1018-cuda75-haswell.zip[/URL]
[URL="http://hoegge.dk/mersenne/msieve-svn1018-cuda75-sandybridge.zip"]msieve-svn1018-cuda75-sandybridge.zip[/URL]

pinhodecarlos 2018-02-05 21:23

I thought it was applicable to both versions, with and without MPI capability.

Thank you. I only have an Ivy Bridge machine, but the binaries will be useful for the NFS@Home team.

ATH 2018-02-05 22:56

You can use the Sandy Bridge build on the Ivy Bridge. I doubt a dedicated Ivy Bridge build would be much faster.

jasonp 2018-02-06 00:32

OpenMP adds a thread pool to compiled programs, along with a set of pragma commands to perform task decomposition and feed the thread pool. It's totally different from MPI.
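
As a small illustration of the pragma style (a generic sketch, unrelated to msieve's own code; the function name is made up), the loop below is split across the OpenMP thread pool when the file is built with OpenMP enabled (`-fopenmp` with GCC). When OpenMP is not enabled the pragma is ignored and the loop simply runs serially with the same result.

```c
#include <stdio.h>

/* Sum of squares 1..n. The pragma asks OpenMP to divide the iterations
 * among the pool's threads and to combine the per-thread partial sums. */
long long sum_of_squares(int n) {
    long long total = 0;
    int i;
    #pragma omp parallel for reduction(+:total)
    for (i = 1; i <= n; i++) {
        total += (long long)i * i;
    }
    return total;
}
```

MPI, by contrast, runs separate processes that exchange messages explicitly, which is why it needs its own library and launcher rather than just a compiler switch.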

ATH 2018-02-06 19:42

So apparently the large vectors make sense even without MPI=1:


make all WIN=1 WIN64=1 ECM=1 CUDA=0 NO_ZLIB=1 VBITS=64/128/256:


[URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits64-haswell.zip"]msieve-svn1018-vbits64-haswell.zip[/URL]
[URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits128-haswell.zip"]msieve-svn1018-vbits128-haswell.zip[/URL]
[URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits256-haswell.zip"]msieve-svn1018-vbits256-haswell.zip[/URL]

[URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits64-sandybridge.zip"]msieve-svn1018-vbits64-sandybridge.zip[/URL]
[URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits128-sandybridge.zip"]msieve-svn1018-vbits128-sandybridge.zip[/URL]
[URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits256-sandybridge.zip"]msieve-svn1018-vbits256-sandybridge.zip[/URL]

pinhodecarlos 2018-04-17 20:35

[QUOTE=ATH;479393]You can use the Sandy Bridge build on the Ivy Bridge. I doubt a dedicated Ivy Bridge build would be much faster.[/QUOTE]
Apologies for only now coming back to you, it works perfectly fine on Ivy. Thank you so much.

pinhodecarlos 2018-11-07 22:33

[QUOTE=ATH;479462]So apparently the large vectors make sense even without MPI=1:


make all WIN=1 WIN64=1 ECM=1 CUDA=0 NO_ZLIB=1 VBITS=64/128/256:


[URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits64-haswell.zip"]msieve-svn1018-vbits64-haswell.zip[/URL]
[URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits128-haswell.zip"]msieve-svn1018-vbits128-haswell.zip[/URL]
[URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits256-haswell.zip"]msieve-svn1018-vbits256-haswell.zip[/URL]

[URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits64-sandybridge.zip"]msieve-svn1018-vbits64-sandybridge.zip[/URL]
[URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits128-sandybridge.zip"]msieve-svn1018-vbits128-sandybridge.zip[/URL]
[URL="http://hoegge.dk/mersenne/msieve-svn1018-vbits256-sandybridge.zip"]msieve-svn1018-vbits256-sandybridge.zip[/URL][/QUOTE]

Hi, can you please compile it again for revision 1028? TIA. Carlos

VictordeHolland 2018-11-07 23:41

SVN1028 Win64 builds
 
3 Attachment(s)
Windows 64bit builds using MSYS2/mingw64 of msieve SVN1028

[code]gcc version 8.2.0 (Rev3, Built by MSYS2 project)[/code]
Built on an i7 3770 (Ivy Bridge)

make all WIN=1 WIN64=1 ECM=1 CUDA=0 NO_ZLIB=0 VBITS=64/128/256

I think I've included all the necessary .dlls, but do reply if you miss any. Very little testing done, but enjoy.

P.S. I've included zlib, since I like to keep the relations compressed (.gz) and don't want to unzip a few GBs of relations when I post-process NFS@home jobs.

pinhodecarlos 2019-02-19 07:24

Does anyone have an msieve version prior to svn988 for Windows on Ivy Bridge?

Garfield 2019-03-31 23:48

Could someone share msieve v1028 compiled files for win64 and CUDA 10 (or the newest, 10.1)?

I tried to compile it myself (VS2017) and got several errors in the gmp-ecm and msieve sources. (mpir and pthreads compiled fine without errors.)

BfoX 2019-12-05 19:27

I tried to use SVN 1005 and got this error:
[code]
Msieve v. 1.53 (SVN 1005)
Fri Dec 06 00:25:41 2019
random seeds: 35e6e7a0 d0500745
factoring 2881039827457895971881627053137530734638790825166127496066674320241571
446494762386620442953820735453 (100 digits)
searching for 15-digit factors
commencing number field sieve (100-digit input)
commencing number field sieve polynomial selection
polynomial degree: 4
max stage 1 norm: 1.58e+017
max stage 2 norm: 3.44e+015
min E-value: 8.85e-009
poly select deadline: 1317
time limit set to 0.37 CPU-hours
expecting poly E from 1.43e-008 to > 1.64e-008
searching leading coefficients from 1 to 4000
using GPU 0 (GeForce GTX 1050 Ti)
selected card has CUDA arch 6.1
deadline: 5 CPU-seconds per coefficient
error (line 1116): CUDA_ERROR_FILE_NOT_FOUND
Msieve Error: return value 4294967295. Is CUDA enabled? Terminating...
[/code]

jasonp 2019-12-09 04:20

What card and CUDA version are you using? Do you run the binary from command line, from the directory above the one with the PTX files?

BfoX 2019-12-10 15:58

using GPU 0 (GeForce GTX 1050 Ti), selected card has CUDA arch 6.1
Win 8.1 x64
All PTX files are in the same directory
Run from a Python script

RichD 2019-12-13 09:50

[QUOTE=BfoX;532553]using GPU 0 (GeForce GTX 1050 Ti), selected card has CUDA arch 6.1
Win 8.1 x64
All PTX files are in the same directory
Run from a Python script[/QUOTE]

It sounds like the stage1_core_sm60.ptx file is missing. (or sm61 ?)

You will have to add that to the "make" file and rebuild or find another binary which includes this later ptx file.

BfoX 2019-12-13 12:58

[QUOTE=RichD;532780]It sounds like the stage1_core_sm60.ptx file is missing. (or sm61 ?).[/QUOTE]


The card supports a lower version of the PTX files.

RichD 2019-12-13 14:27

[QUOTE=BfoX;532784]The card supports a lower version of the PTX files.[/QUOTE]

I don't know how to tell msieve to use a lower version ptx-file.

Perhaps this [url=https://www.mersenneforum.org/showthread.php?t=23685]thread[/url] is helpful or post your question/request there.

bsquared 2019-12-17 20:59

[QUOTE=BfoX;532116]I tried to use SVN 1005 and got this error:
[code]
Msieve v. 1.53 (SVN 1005)
Fri Dec 06 00:25:41 2019
random seeds: 35e6e7a0 d0500745
factoring 2881039827457895971881627053137530734638790825166127496066674320241571
446494762386620442953820735453 (100 digits)
searching for 15-digit factors
commencing number field sieve (100-digit input)
commencing number field sieve polynomial selection
polynomial degree: 4
max stage 1 norm: 1.58e+017
max stage 2 norm: 3.44e+015
min E-value: 8.85e-009
poly select deadline: 1317
time limit set to 0.37 CPU-hours
expecting poly E from 1.43e-008 to > 1.64e-008
searching leading coefficients from 1 to 4000
using GPU 0 (GeForce GTX 1050 Ti)
selected card has CUDA arch 6.1
deadline: 5 CPU-seconds per coefficient
error (line 1116): CUDA_ERROR_FILE_NOT_FOUND
Msieve Error: return value 4294967295. Is CUDA enabled? Terminating...
[/code][/QUOTE]

I had the same problem with an arch 7.5 card. In msieve's source file stage1_sieve_gpu.c, starting at line 1106, is this code:

[CODE]if (d->gpu_info->compute_version_major == 2) {
    CUDA_TRY(cuModuleLoad(&t->gpu_module, "stage1_core_sm20.ptx"))
}
else if (d->gpu_info->compute_version_major == 3) {
    if (d->gpu_info->compute_version_minor < 5)
        CUDA_TRY(cuModuleLoad(&t->gpu_module, "stage1_core_sm30.ptx"))
    else
        CUDA_TRY(cuModuleLoad(&t->gpu_module, "stage1_core_sm35.ptx"))
}
else if (d->gpu_info->compute_version_major >= 5) {
    CUDA_TRY(cuModuleLoad(&t->gpu_module, "stage1_core_sm50.ptx"))
}
else
{
    printf("sorry, Nvidia doesn't want to support your card\n");
    exit(-1);
}
[/CODE]

For anything at or above compute capability 5, this attempts to load stage1_core_sm50.ptx.

I tried the following, which gets past the error message but appears to hang once the card gets going:

1) edit gpu_sm.props in the build.cuda.vc15 directory to update the CC_major and CC_minor fields (mine were 7 and 5)
2) in visual studio, edit the stage1_core_sm property pages by adding compute_75,sm_75 to the Code Generation item of the CUDA C/C++ section (or in your case, I guess compute_61,sm_61)
3) rebuild

This will make a stage1_core_sm61.ptx file in the bin/ directory alongside the msieve.exe executable. You'll then have to rename it to stage1_core_sm50.ptx for it to be loaded by the code above.

But, like I said, some step or another is still missing because the code now just hangs after starting the search.
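
For anyone debugging the CUDA_ERROR_FILE_NOT_FOUND case, the branch logic quoted above can be restated as a small stand-alone function that says which file name would be requested for a given compute capability. This is a sketch for diagnosis only, not msieve code: `stage1_ptx_name` and `ptx_file_present` are hypothetical names. The existence check assumes the bare file name is looked up relative to the current working directory, which seems likely given that the name is passed without a path.

```c
#include <stdio.h>
#include <stddef.h>

/* Mirrors the quoted branches: returns the PTX file name that would be
 * requested for this compute capability, or NULL for unsupported cards. */
const char *stage1_ptx_name(int major, int minor) {
    if (major == 2)
        return "stage1_core_sm20.ptx";
    if (major == 3)
        return (minor < 5) ? "stage1_core_sm30.ptx" : "stage1_core_sm35.ptx";
    if (major >= 5)
        return "stage1_core_sm50.ptx";
    return NULL;
}

/* Hypothetical pre-flight check: 1 if the file can be opened for reading
 * from the current working directory, 0 otherwise. */
int ptx_file_present(const char *name) {
    FILE *f;
    if (name == NULL) return 0;
    f = fopen(name, "rb");
    if (f == NULL) return 0;
    fclose(f);
    return 1;
}
```

For an arch 6.1 or 7.5 card this yields stage1_core_sm50.ptx. So if `ptx_file_present(stage1_ptx_name(6, 1))` returns 0 when run from the same directory the binary is launched from (for example by a script), the file is either missing or the working directory is not the one holding the PTX files.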

Jarod 2020-08-30 07:47

[QUOTE=VictordeHolland;499852]Windows 64bit builds using MSYS2/mingw64 of msieve SVN1028

[code]gcc version 8.2.0 (Rev3, Built by MSYS2 project)[/code]
Built on an i7 3770 (Ivy Bridge)

make all WIN=1 WIN64=1 ECM=1 CUDA=0 NO_ZLIB=0 VBITS=64/128/256

I think I've included all the necessary .dlls, but do reply if you miss any. Very little testing done, but enjoy.

P.S. I've included zlib, since I like to keep the relations compressed (.gz) and don't want to unzip a few GBs of relations when I post-process NFS@home jobs.[/QUOTE]
Is SVN 1028 still the latest version of Msieve for Windows 10 64-bit?
Thanks
Jarod


All times are UTC.