#1 |
"Curtis"
Feb 2005
Riverside, CA
2×3×953 Posts |
#2 | |
Aug 2002
8576₁₀ Posts |
We were told:
Quote:
|
|
#3 |
Jul 2003
So Cal
3³×97 Posts |
#4 | |
Bamboozled!
"๐บ๐๐ท๐ท๐ญ"
May 2003
Down not across
11,719 Posts |
Quote:
The three systems still in use have a 460, a 970, and a 1060 with drivers 390.138, 390.144 and 390.141 respectively. Do you think your new code might run on any of those? If so, I will try again to get CUDA installed and working. Thanks. |
|
#5 |
Jul 2003
So Cal
A3B₁₆ Posts |
Technically yes, but consumer cards don't have enough memory to store interesting matrices. If the GTX 1060 has 6GB, it could run matrices up to about 5Mx5M. The problem is that block Lanczos requires multiplying by both the matrix and its transpose, but GPUs only seem to work well with the matrix stored in CSR, and CSR doesn't allow the product with the transpose to be computed efficiently. So we load both the matrix and its transpose onto the card.
It would be possible to create a version that stores the matrices in system memory and loads the next matrix block into GPU memory while calculating the product with the current block. The block size is adjustable, but I don't know how performant that would be. |
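To illustrate the CSR point above, here is a minimal sketch in plain Python, not the msieve CUDA code. It assumes a 0/1 matrix over GF(2) stored as CSR row pointers and column indices, with each vector entry being an integer that holds a block of bits. The function names are hypothetical. The product with the matrix gathers per row, while the product with the transpose from the same CSR has to scatter across the output, which is the access pattern that tends to be slow on a GPU. Building a second CSR for the transpose turns that scatter back into a gather, at the cost of storing the matrix twice.

```python
from typing import List, Tuple

# (row_ptr, col_idx); every stored entry is 1, arithmetic is over GF(2)
Csr = Tuple[List[int], List[int]]


def csr_mul(csr: Csr, x: List[int], n_rows: int) -> List[int]:
    """y = A x: each row gathers the words of x it touches and XORs them."""
    row_ptr, col_idx = csr
    y = [0] * n_rows
    for r in range(n_rows):
        acc = 0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc ^= x[col_idx[k]]
        y[r] = acc
    return y


def csr_mul_transpose(csr: Csr, x: List[int], n_cols: int) -> List[int]:
    """y = A^T x from the CSR of A: each row scatters x[r] into many y entries."""
    row_ptr, col_idx = csr
    y = [0] * n_cols
    for r in range(len(row_ptr) - 1):
        for k in range(row_ptr[r], row_ptr[r + 1]):
            y[col_idx[k]] ^= x[r]  # scattered writes; would need atomics in parallel
    return y


def transpose_csr(csr: Csr, n_cols: int) -> Csr:
    """Build the CSR of A^T so the transpose product becomes a gather again."""
    row_ptr, col_idx = csr
    t_ptr = [0] * (n_cols + 1)
    for c in col_idx:
        t_ptr[c + 1] += 1
    for c in range(n_cols):
        t_ptr[c + 1] += t_ptr[c]
    next_slot = t_ptr[:-1]  # next free position in each row of A^T
    t_idx = [0] * len(col_idx)
    for r in range(len(row_ptr) - 1):
        for k in range(row_ptr[r], row_ptr[r + 1]):
            c = col_idx[k]
            t_idx[next_slot[c]] = r
            next_slot[c] += 1
    return t_ptr, t_idx


# A = [[1, 0, 1],
#      [0, 1, 0]]
A: Csr = ([0, 2, 3], [0, 2, 1])
At = transpose_csr(A, n_cols=3)
v = [0b01, 0b10]
same = csr_mul(At, v, n_rows=3) == csr_mul_transpose(A, v, n_cols=3)
```

Both routes give the same vector; the difference is only in the memory access pattern and in having to hold two copies of the matrix.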
#6 |
Aug 2002
2180₁₆ Posts |
How important is ECC on a video card? (Most consumer cards don't have that, right?)
Our card has it, and we have it enabled, but it runs faster without. We haven't logged an ECC error yet. Note the "aggregate" counter described below.
Code:
ECC Errors
NVIDIA GPUs can provide error counts for various types of ECC errors. Some ECC errors are either single or double bit, where single bit errors are corrected and double bit errors are uncorrectable. Texture memory errors may be correctable via resend or uncorrectable if the resend fails. These errors are available across two timescales (volatile and aggregate).
Single bit ECC errors are automatically corrected by the HW and do not result in data corruption. Double bit errors are detected but not corrected. Please see the ECC documents on the web for information on compute application behavior when double bit errors occur.
Volatile error counters track the number of errors detected since the last driver load. Aggregate error counts persist indefinitely and thus act as a lifetime counter. |
#7 | |
Jul 2003
So Cal
5073₈ Posts |
Quote:
Code:
using VBITS=512
matrix is 42100909 x 42101088 (20033.9 MB) with weight 6102777434 (144.96/col)
...
using GPU 0 (Tesla V100-SXM2-32GB) <-------- 32 GB card
...
vector memory use: 17987.6 MB <-- 7 x matrix columns x VBITS / 8 bytes on card, adjust VBITS as needed
dense rows memory use: 2569.6 MB <-- on card but could be moved to cpu memory
sparse matrix memory use: 30997.3 MB <-- Hosted in cpu memory, transferred on card as needed
memory use: 51554.6 MB <-- significantly exceeds 32 GB
Allocated 357.7 MB for SpMV library
...
linear algebra completed 33737 of 42101088 dimensions (0.1%, ETA 133h21m) |
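The vector memory annotation in the log can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name is hypothetical, and it assumes "MB" in the log means 2^20 bytes. Under that assumption the formula lands within about 0.1 MB of the logged 17987.6 MB; the small gap is presumably padding or rounding that the one-line formula does not capture.

```python
def vector_memory_mb(n_cols: int, vbits: int, n_vectors: int = 7) -> float:
    """Estimate on-card vector memory: n_vectors x columns x VBITS / 8 bytes.

    Assumes MB = 2**20 bytes; n_vectors=7 is taken from the log annotation.
    """
    return n_vectors * n_cols * vbits / 8 / 2**20


N_COLS = 42101088  # matrix columns from the log above

estimates = {vbits: vector_memory_mb(N_COLS, vbits) for vbits in (512, 256, 128)}
for vbits, mb in estimates.items():
    print(f"VBITS={vbits}: about {mb:.1f} MB of vector memory")
```

Since the estimate is linear in VBITS, halving VBITS halves the vector memory, which is why it is the knob to adjust when the vectors do not fit on the card.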
|
#8 |
Jul 2003
So Cal
A3B₁₆ Posts |
What's your risk tolerance? msieve has robust error detection, so ECC is not as important here. But the slowdown from enabling it is usually a small price to pay to rule out memory faults.
|
#9 |
Mar 2019
14A₁₆ Posts |
Are there instructions on how to check out and build the msieve GPU LA code? Is it in trunk or a separate branch?
|
#10 | |
"Curtis"
Feb 2005
Riverside, CA
2·3·953 Posts |
Quote:
I hope this means you'll be digging out of your matrix backlog from the big siever queue. |
|
#11 | |
Jul 2003
So Cal
3³·97 Posts |
Quote:
git clone https://github.com/gchilders/msieve_nfsathome.git -b msieve-lacuda-nfsathome
cd msieve_nfsathome
make all VBITS=128 CUDA=XX
where XX is the two-digit CUDA compute capability of your GPU. Specifying CUDA=1 defaults to a compute capability of 60. You may want to experiment with both VBITS=128 and VBITS=256 to see which is best on your GPU.
If you want to copy msieve to another directory, you need the msieve binary, both *.ptx files, and in the cub directory both *.so files. Or just run it from the build directory.
Last fiddled with by frmky on 2021-08-12 at 08:17 Reason: Add specifying the compute capability on the make command line. |
|