mersenneforum.org  

2023-02-25, 17:53   #122
frmky
Jul 2003
So Cal
2·3³·7² Posts

Ugh. I wish nVidia would stop tweaking the CUB implementation with every CUDA release! Anyway, in cub/Makefile, line 87, try replacing
Code:
INC = -I"$(CUDA_ROOT)/include" -I.
with
Code:
INC = -I. -I"$(CUDA_ROOT)/include"
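
The same edit can be made non-interactively. This is a minimal sketch, assuming the INC line in cub/Makefile matches the text quoted above exactly; it operates on a scratch copy rather than the real Makefile. Include directories are searched in the order given, so putting -I. first presumably lets headers under the cub directory win over same-named headers in the toolkit's include directory.

```shell
#!/bin/sh
# Sketch: swap the two -I entries of the INC line on a scratch copy.
# Assumption: the line matches the one quoted in the post exactly.
scratch=$(mktemp)
printf '%s\n' 'INC = -I"$(CUDA_ROOT)/include" -I.' > "$scratch"

# \$ and \. match a literal dollar sign and dot; | is the s-command delimiter.
swapped=$(sed 's|-I"\$(CUDA_ROOT)/include" -I\.|-I. -I"$(CUDA_ROOT)/include"|' "$scratch")

printf '%s\n' "$swapped"
rm -f "$scratch"
```

To apply it for real, the same sed expression can be run against cub/Makefile after making a backup copy.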

2023-02-25, 19:03   #123
kruoli
"Oliver"
Sep 2017
Porta Westfalica, DE
1,559 Posts

Wonderful, that worked, thank you! For everyone else: this was with the original file, without my sed changes.

2023-02-26, 18:36   #124
wombatman
I moo ablest echo power!
May 2013
740₁₆ Posts

Quick question for GPU LA. Does it cause screen slowdown if the GPU is also powering monitors? Just trying to get a handle on whether I could still do simple tasks (web browsing, word processing, etc.) without hitches.

2023-02-27, 00:36   #125
Xyzzy
Aug 2002
3·43·67 Posts

Simple tasks work fine.

2023-02-27, 16:27   #126
RichD
Sep 2008
Kansas
3923₁₀ Posts

Another minor issue with this branch. After finding the first factor of a 3-way (multi-way) split, it starts over at -nc2.
Code:
linear algebra completed 7718825 of 7719307 dimensions (100.0%, ETA 0h 0m)    
lanczos halted after 30244 iterations (dim = 7718903)
recovered 40 nontrivial dependencies
BLanczosTime: 14106

commencing square root phase
handling dependencies 1 to 1
reading relations for dependency 1
read 3859455 cycles
cycles contain 12436602 unique relations
read 12436602 relations
multiplying 12436602 relations
multiply complete, coefficients have about 306.54 million bits
initial square root is modulo 316501
found factor: 2233445501375914764336106503459352175207245068609895555827367775642317
sqrtTime: 944
commencing number field sieve (152-digit input)
warning: NFS input not found in factor base file
integrator failed nan inf
R0: 0
A0: 0
skew 1.00, size 0.000e+00, alpha 0.000, combined = 0.000e+00 rroots = 0

commencing linear algebra
using VBITS=256
skipping matrix build
matrix starts at (0, 0)
matrix is 7719143 x 7719307 (3356.4 MB) with weight 956519666 (123.91/col)
sparse part has weight 794939492 (102.98/col)
saving the first 240 matrix rows for later
matrix includes 256 packed rows
matrix is 7718903 x 7719307 (3084.2 MB) with weight 723462162 (93.72/col)
sparse part has weight 685003660 (88.74/col)
using GPU 0 (NVIDIA GeForce RTX 3060)
selected card has CUDA arch 8.6
Nonzeros per block: 250000000
converting matrix to CSR and copying it onto the GPU
Killed
Edit: Or maybe it starts over with the command line. I specified "./msieve -nc2 -nc3".

Last fiddled with by RichD on 2023-02-27 at 16:29

2023-02-27, 19:59   #127
frmky
Jul 2003
So Cal
2·3³·7² Posts

Quote:
Originally Posted by RichD
Code:
handling dependencies 1 to 1
Edit: Or maybe it starts over with the command line. I specified "./msieve -nc2 -nc3".
I'm not sure why it only tried one dependency. I used
Code:
./msieve -nc2 "skip_matbuild=1" -nc3 -g 0 -t 23 -v
and it reported handling all dependencies.

2023-02-27, 21:59   #128
RichD
Sep 2008
Kansas
3,923 Posts

Quote:
Originally Posted by RichD
Code:
handling dependencies 1 to 1

warning: NFS input not found in factor base file
integrator failed nan inf
R0: 0
A0: 0
skew 1.00, size 0.000e+00, alpha 0.000, combined = 0.000e+00 rroots = 0
Something went badly wrong. I didn't specify anything after -nc3 and it couldn't find a good factor base file.
I've since rebooted, and the log file doesn't help much.

2023-03-31, 20:51   #129
VBCurtis
"Curtis"
Feb 2005
Riverside, CA
37×157 Posts

Quote:
Originally Posted by frmky
Use the msieve-lacuda-nfsathome-cuda11.5 branch. Or upgrade your CUDA toolkit. CUDA 11.6 introduced breaking changes to CUB.
How do I find this branch?

I updated to Ubuntu 22.04, wiped the nvidia drivers & tools, and installed the toolkit and drivers. nvidia-smi reports CUDA version 11.6 (driver version 510), nvcc -V reports 11.5.

Seth's ecm=cgbn compiled successfully last night under this environment, but following directions from Plutie's post #116 gets me:

Code:
/usr/include/c++/11/bits/std_function.h:435:145: error: parameter packs not expanded with ‘...’:
  435 |         function(_Functor&& __f)
      |                                                                                                                                                 ^ 
/usr/include/c++/11/bits/std_function.h:435:145: note:         ‘_ArgTypes’
/usr/include/c++/11/bits/std_function.h:530:146: error: parameter packs not expanded with ‘...’:
  530 |         operator=(_Functor&& __f)
      |                                                                                                                                                  ^ 
/usr/include/c++/11/bits/std_function.h:530:146: note:         ‘_ArgTypes’
make[1]: *** [Makefile:106: sort_engine.so] Error 1
make[1]: Leaving directory '/home/vbcurtis/math/msieve/msieve_nfsathome/cub'
make: *** [Makefile:364: cub/built] Error 2
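
The two version numbers reported above measure different things: nvidia-smi shows the highest CUDA version the installed driver supports, while nvcc -V shows the version of the installed toolkit, which is the one relevant to the CUB changes. A minimal sketch for pulling the toolkit release out of nvcc -V output follows. The sample text is an illustrative stand-in for real output, and the function name is hypothetical.

```shell
#!/bin/sh
# toolkit_release (hypothetical helper): read `nvcc -V` text on stdin and
# print the toolkit release, e.g. "11.5".
toolkit_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Illustrative sample of the relevant line, not captured from the poster's machine.
sample='Cuda compilation tools, release 11.5, V11.5.119'
release=$(printf '%s\n' "$sample" | toolkit_release)
printf 'toolkit release: %s\n' "$release"

# With a real toolkit installed: nvcc -V | toolkit_release
```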

2023-04-01, 03:25   #130
frmky
Jul 2003
So Cal
2×3³×7² Posts

You'll get this error when using CUB with nvcc 11.5 or 11.6 with gcc 11+. nvcc 11.5 and 11.6 require gcc 10.

To answer your question, though, replace "-b msieve-lacuda-nfsathome" with "-b msieve-lacuda-nfsathome-cuda11.5" in the git clone command. Or after it's downloaded in the msieve_nfsathome directory use
Code:
git checkout msieve-lacuda-nfsathome-cuda11.5
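
The compatibility rule in this post can be turned into a quick pre-build check. This is a sketch only: the helper names are hypothetical, and treating every gcc major version of 10 or lower as acceptable is an assumption, since the post names gcc 10 specifically.

```shell
#!/bin/sh
# major_of (hypothetical helper): print the major part of a version like "11.3.0".
major_of() {
    printf '%s\n' "${1%%.*}"
}

# host_gcc_ok (hypothetical helper): succeed if the gcc major version is <= 10.
# Assumption: anything at or below 10 works with nvcc 11.5/11.6.
host_gcc_ok() {
    [ "$(major_of "$1")" -le 10 ]
}

for v in 10.4.0 11.3.0; do
    if host_gcc_ok "$v"; then
        echo "gcc $v: ok for nvcc 11.5/11.6"
    else
        echo "gcc $v: too new for nvcc 11.5/11.6"
    fi
done

# With a real compiler: host_gcc_ok "$(gcc -dumpversion)"
```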

2023-04-01, 04:59   #131
VBCurtis
"Curtis"
Feb 2005
Riverside, CA
37·157 Posts

OK, I now have gcc-10 installed, and selected as the default gcc (but gcc-11 is still on the system).
Compilation gets further, but fails:
Code:
"/usr/bin/nvcc" -arch sm_50 -ptx -DVBITS=128 -o lanczos_kernel.ptx common/lanczos/gpu/lanczos_kernel.cu
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
/usr/include/c++/11/bits/stl_pair.h(460): error: argument list for class template "std::pair" is missing

/usr/include/c++/11/bits/stl_pair.h(460): error: expected a ")"

/usr/include/c++/11/bits/stl_pair.h(460): error: template parameter "_T1" may not be redeclared in this scope

/usr/include/c++/11/bits/stl_pair.h(460): error: expected a ";"

4 errors detected in the compilation of "common/lanczos/gpu/lanczos_kernel.cu".
make: *** [Makefile:357: lanczos_kernel.ptx] Error 1
This suggests that nvcc is using the g++-11 include folder, even though g++-10 is set as the compiler.
So I tried editing the Makefile to set the compiler to gcc-10 instead of gcc; same error at the same point of compilation.

Should I uninstall gcc-11? I rarely build software other than that used on this forum (cado, msieve, ecm, gmp).

Edit: here are lines 459-461 of stl_pair.h:
Code:
#if __cpp_deduction_guides >= 201606
  template<typename _T1, typename _T2> pair(_T1, _T2) -> pair<_T1, _T2>;
#endif

Last fiddled with by VBCurtis on 2023-04-01 at 05:01 Reason: added quote from stl_pair.h
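
One thing that may be worth trying in this situation: nvcc's -ccbin (--compiler-bindir) option names the host compiler explicitly, separately from whatever the Makefile uses for the C sources. The sketch below, with a hypothetical helper name, finds the first installed candidate and prints the nvcc command from the log above with -ccbin added. Whether this clears the stl_pair.h errors on this system is untested.

```shell
#!/bin/sh
# first_available (hypothetical helper): print the first argument that is an
# executable on PATH; fail if none is.
first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Assumption: g++-10 (or an older g++) is the host compiler wanted here.
if host_cxx=$(first_available g++-10 g++-9); then
    echo "nvcc -ccbin $host_cxx -arch sm_50 -ptx -DVBITS=128" \
         "-o lanczos_kernel.ptx common/lanczos/gpu/lanczos_kernel.cu"
else
    echo "no g++-10 or g++-9 on PATH; installing g++-10 may be needed" >&2
fi
```

If no candidate is found, that in itself is informative: having gcc-10 installed does not guarantee that g++-10 and its C++ headers are present.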

2023-05-18, 13:21   #132
kruoli
"Oliver"
Sep 2017
Porta Westfalica, DE
617₁₆ Posts

Finally, I got everything together for my single-machine multi-GPU setup. I downloaded the latest stable release of OpenMPI and configured it with CUDA, then downloaded the latest Git commit of Greg's LA-CUDA branch and compiled everything.

I tried the data from the benchmark thread. After filtering, I ran mpirun -np 2 ../../msieve-mpi/msieve -nc2 -g 2 -t 16 -v (on a machine with 32 threads). It aborts almost immediately after starting to solve the matrix with:
Code:
...
commencing Lanczos iteration
vector memory use: 695.3 MB
dense rows memory use: 69.5 MB
sparse matrix memory use: 1857.8 MB
memory use: 2622.6 MB
Allocated 175.8 MB for SpMV library
[debian:250736] Read -1, expected 72901376, errno = 14
[debian:250737] Read -1, expected 72901376, errno = 14
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 0 on node debian exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
I have installed my NVLink bridge. What could be wrong?
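
errno 14 is EFAULT (bad address) on Linux, which is what a host-side read returns when it is handed a pointer it cannot access. One possible cause, offered as a guess rather than a diagnosis, is that the Open MPI runtime actually in use was not built with CUDA support. Open MPI reports this through ompi_info; the sketch below parses that output, using an illustrative sample line and a hypothetical helper name.

```shell
#!/bin/sh
# cuda_support_value (hypothetical helper): read `ompi_info --parsable --all`
# text on stdin and print the value of mpi_built_with_cuda_support.
cuda_support_value() {
    sed -n 's/^.*mpi_built_with_cuda_support:value:\(.*\)$/\1/p'
}

# Illustrative sample line, not taken from the poster's machine.
sample='mca:mpi:base:param:mpi_built_with_cuda_support:value:true'
value=$(printf '%s\n' "$sample" | cuda_support_value)
echo "built with CUDA support: $value"

# With Open MPI installed: ompi_info --parsable --all | cuda_support_value
```

If several MPI installations are present, checking that mpirun and ompi_info both come from the CUDA-enabled build (for example with command -v mpirun) is a cheap first step.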

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
