2023-02-25, 17:53  #122
Jul 2003
So Cal
Ugh. I wish NVIDIA would stop tweaking the CUB implementation with every CUDA release! Anyway, in cub/Makefile, line 87, try replacing
Code:
INC = -I"$(CUDA_ROOT)/include" -I.
with
Code:
INC = -I. -I"$(CUDA_ROOT)/include"
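For anyone who prefers to script the change, here is a minimal sketch of the include-order swap using sed. It works on a scratch copy of the line held in a shell variable, not on the real cub/Makefile; the variable names `original` and `fixed` are illustrative only, and the exact text of line 87 in your checkout is an assumption you should verify first.

```shell
# Scratch copy of the line as quoted in the post; the real target would be
# line 87 of cub/Makefile (check your checkout before editing it).
original='INC = -I"$(CUDA_ROOT)/include" -I.'

# Swap the two -I entries so that -I. comes first.
fixed=$(printf '%s\n' "$original" |
    sed 's|^INC = -I"\$(CUDA_ROOT)/include" -I\.$|INC = -I. -I"$(CUDA_ROOT)/include"|')

printf '%s\n' "$fixed"
```

The pattern is anchored at both ends, so a line that has already been changed, or that differs from the quoted one, passes through untouched.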
2023-02-25, 19:03  #123
"Oliver"
Sep 2017
Porta Westfalica, DE
Wonderful, that worked, thank you! For everyone else: this works without the file I had changed with sed.

2023-02-26, 18:36  #124
I moo ablest echo power!
May 2013
Quick question for GPU LA. Does it cause screen slowdown if the GPU is also powering monitors? Just trying to get a handle on whether I could still do simple tasks (web browsing, word processing, etc.) without hitches.

2023-02-27, 00:36  #125
Aug 2002
Simple tasks work fine.

2023-02-27, 16:27  #126
Sep 2008
Kansas
Another minor issue with this branch. After finding the first factor of a 3-way (multi-way) split, it starts over at -nc2.
Code:
linear algebra completed 7718825 of 7719307 dimensions (100.0%, ETA 0h 0m)
lanczos halted after 30244 iterations (dim = 7718903)
recovered 40 nontrivial dependencies
BLanczosTime: 14106
commencing square root phase
handling dependencies 1 to 1
reading relations for dependency 1
read 3859455 cycles
cycles contain 12436602 unique relations
read 12436602 relations
multiplying 12436602 relations
multiply complete, coefficients have about 306.54 million bits
initial square root is modulo 316501
found factor: 2233445501375914764336106503459352175207245068609895555827367775642317
sqrtTime: 944
commencing number field sieve (152-digit input)
warning: NFS input not found in factor base file
integrator failed nan inf
R0: 0
A0: 0
skew 1.00, size 0.000e+00, alpha 0.000, combined = 0.000e+00 rroots = 0
commencing linear algebra
using VBITS=256
skipping matrix build
matrix starts at (0, 0)
matrix is 7719143 x 7719307 (3356.4 MB) with weight 956519666 (123.91/col)
sparse part has weight 794939492 (102.98/col)
saving the first 240 matrix rows for later
matrix includes 256 packed rows
matrix is 7718903 x 7719307 (3084.2 MB) with weight 723462162 (93.72/col)
sparse part has weight 685003660 (88.74/col)
using GPU 0 (NVIDIA GeForce RTX 3060)
selected card has CUDA arch 8.6
Nonzeros per block: 250000000
converting matrix to CSR and copying it onto the GPU
Killed

Last fiddled with by RichD on 2023-02-27 at 16:29
2023-02-27, 19:59  #127
Jul 2003
So Cal
Quote:
Code:
./msieve -nc2 "skip_matbuild=1" -nc3 -g 0 -t 23 -v

2023-02-27, 21:59  #128
Sep 2008
Kansas
Quote:
I've since rebooted and the log file doesn't help much. 

2023-03-31, 20:51  #129
"Curtis"
Feb 2005
Riverside, CA
Quote:
I updated to Ubuntu 22.04, wiped the NVIDIA drivers and tools, and installed the toolkit and drivers. nvidia-smi reports CUDA version 11.6 (driver version 510); nvcc -V reports 11.5. Seth's ecm=cgbn compiled successfully last night under this environment, but following the directions from Plutie's post #116 gets me:
Code:
/usr/include/c++/11/bits/std_function.h:435:145: error: parameter packs not expanded with ‘...’:
  435 | function(_Functor&& __f)
      | ^
/usr/include/c++/11/bits/std_function.h:435:145: note: ‘_ArgTypes’
/usr/include/c++/11/bits/std_function.h:530:146: error: parameter packs not expanded with ‘...’:
  530 | operator=(_Functor&& __f)
      | ^
/usr/include/c++/11/bits/std_function.h:530:146: note: ‘_ArgTypes’
make[1]: *** [Makefile:106: sort_engine.so] Error 1
make[1]: Leaving directory '/home/vbcurtis/math/msieve/msieve_nfsathome/cub'
make: *** [Makefile:364: cub/built] Error 2

2023-04-01, 03:25  #130
Jul 2003
So Cal
You'll get this error when using CUB with nvcc 11.5 or 11.6 together with gcc 11+. nvcc 11.5 and 11.6 require gcc 10.
To answer your question, though, replace "-b msieve-lacuda-nfsathome" with "-b msieve-lacuda-nfsathome-cuda11.5" in the git clone command. Or, after it's downloaded, in the msieve_nfsathome directory use
Code:
git checkout msieve-lacuda-nfsathome-cuda11.5
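The version rule stated in this post can be checked before starting a build. The sketch below is a hypothetical helper, not part of msieve or CUDA: `cub_combo_ok` is an invented name, and treating every nvcc release other than 11.5 and 11.6 as fine is an assumption made only to keep the example short.

```shell
# Hypothetical helper (not part of msieve): encodes the rule from the post,
# i.e. nvcc 11.5 or 11.6 combined with gcc 11 or newer is the failing case.
# Assumption: any other nvcc version is reported as fine.
cub_combo_ok() {
    nvcc_ver=$1    # e.g. "11.5"
    gcc_major=$2   # e.g. "10"
    case "$nvcc_ver" in
        11.5|11.6) [ "$gcc_major" -lt 11 ] ;;   # succeeds only below gcc 11
        *)         return 0 ;;
    esac
}

if cub_combo_ok 11.5 11; then
    echo "nvcc 11.5 with gcc 11: looks fine"
else
    echo "nvcc 11.5 with gcc 11: expect the std_function.h error, use gcc 10"
fi
```

On a real system the two arguments could be filled in from `nvcc --version` and `gcc -dumpversion`; how you parse their output depends on the installed versions, so that part is left out here.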
2023-04-01, 04:59  #131
"Curtis"
Feb 2005
Riverside, CA
OK, I now have gcc-10 installed, and selected as the default gcc (but gcc-11 is still on the system).
Compilation gets further, but fails:
Code:
"/usr/bin/nvcc" -arch sm_50 -ptx -DVBITS=128 -o lanczos_kernel.ptx common/lanczos/gpu/lanczos_kernel.cu
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
/usr/include/c++/11/bits/stl_pair.h(460): error: argument list for class template "std::pair" is missing
/usr/include/c++/11/bits/stl_pair.h(460): error: expected a ")"
/usr/include/c++/11/bits/stl_pair.h(460): error: template parameter "_T1" may not be redeclared in this scope
/usr/include/c++/11/bits/stl_pair.h(460): error: expected a ";"
4 errors detected in the compilation of "common/lanczos/gpu/lanczos_kernel.cu".
make: *** [Makefile:357: lanczos_kernel.ptx] Error 1

So, I tried editing the Makefile to set the compiler to gcc-10 instead of gcc; same error at the same point of compilation. Should I uninstall gcc-11? I rarely build software other than that used on this forum (cado, msieve, ecm, gmp).
Edit: here are lines 459-461 of stl_pair.h:
Code:
#if __cpp_deduction_guides >= 201606
template<typename _T1, typename _T2> pair(_T1, _T2) -> pair<_T1, _T2>;
#endif

Last fiddled with by VBCurtis on 2023-04-01 at 05:01 Reason: added quote from stl_pair.h
2023-05-18, 13:21  #132
"Oliver"
Sep 2017
Porta Westfalica, DE
Finally, I got everything together for my single-machine multi-GPU setup. I downloaded the latest stable release of OpenMPI and configured it with CUDA, then downloaded the latest Git commit of Greg's LACUDA branch and compiled everything.
I tried the data from the benchmark thread. After filtering, I ran mpirun -np 2 ../../msievempi/msieve -nc2 -g 2 -t 16 -v (on a machine with 32 threads). It aborts almost immediately after starting to solve the matrix with:
Code:
...
commencing Lanczos iteration
vector memory use: 695.3 MB
dense rows memory use: 69.5 MB
sparse matrix memory use: 1857.8 MB
memory use: 2622.6 MB
Allocated 175.8 MB for SpMV library
[debian:250736] Read -1, expected 72901376, errno = 14
[debian:250737] Read -1, expected 72901376, errno = 14
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 0 on node debian exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
