#145
Jul 2003
So Cal
7×383 Posts
Lambda Labs now has H100s available for $2.40/hr. One will also solve a 29M matrix for about $30, without the complexity of OpenMPI or shutdowns.
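As a sanity check on that figure (my arithmetic, not from the post: $30 at $2.40/hr implies roughly 12.5 hours of solve time):

```python
# Back-of-the-envelope: implied solve time for a 29M matrix on a
# $2.40/hr H100, given the quoted ~$30 total cost.
rate_per_hour = 2.40   # Lambda H100 on-demand price quoted above
total_cost = 30.00     # quoted cost to solve a 29M matrix
hours = total_cost / rate_per_hour
print(f"Implied solve time: {hours:.1f} hours")  # 12.5 hours
```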
#146
Sep 2008
Kansas
4017 Posts

Quote:
On-Demand Cloud: Spin up on-demand GPUs billed by the hour. H100 instances starting at $1.99/hr.
Cloud Clusters: Reserve thousands of NVIDIA H100s with 3200 Gbps InfiniBand. Starting at $1.89/hr.
#147
Jul 2003
So Cal
2681 Posts
I see that too, but when I logged in and tried to launch an H100 instance, they don't seem to be available right now. However, they did have a 40GB PCIe A100 for $1.10/hr. That's a much better deal for GPU LA; it should solve a 29M matrix for a bit under $20.
Edit: Wow, they even have 8× A100 SXM4 40GB for $8.80/hr. That'll solve a ~160M matrix (SNFS difficulty around 335 digits) for about $750.

Last fiddled with by frmky on 2023-06-16 at 03:02
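For comparison's sake, here are the solve times implied by those price estimates (my arithmetic, derived from the quoted costs and hourly rates):

```python
# Implied runtimes from the quoted figures: $20 at $1.10/hr for the 29M
# matrix, $750 at $8.80/hr for the ~160M matrix.
jobs = {
    "29M matrix on 1x A100 40GB": (20.00, 1.10),
    "~160M matrix on 8x A100 SXM4 40GB": (750.00, 8.80),
}
for name, (cost, rate) in jobs.items():
    print(f"{name}: ~{cost / rate:.0f} hours (${cost:.0f} at ${rate:.2f}/hr)")
# ~18 hours for the single A100, ~85 hours for the 8-GPU node
```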
#148
"Serge"
Mar 2008
San Diego, Calif.
2·3·1,733 Posts
I decided to try Lambda; they offer very limited user options. (And you have to wait for any A100 or H100 instance to become available, for hours if not days.)
You get a system of any color as long as it is black (in their case, "Driver Version: 525.85.12, CUDA Version: 12.0"). The msieve-lacuda-nfsathome-cuda11.5 branch doesn't build in cub/ --
Code:
cd cub && make WIN=0 WIN64=0 VBITS=256 sm=800 && cd ..
make[1]: Entering directory '/home/ubuntu/G/msieve_nfsathome/cub'
"/usr/bin/nvcc" -gencode=arch=compute_80,code=\"sm_80,compute_80\" -DSM800 -o sort_engine.so sort_engine.cu -Xptxas -v -Xcudafe -# -shared -Xcompiler -ffloat-store -Xcompiler -fPIC -Xcompiler -fvisibility=hidden -I. -I"/usr/bin/..//include" -O3 -DTHRUST_IGNORE_CUB_VERSION_CHECK
/usr/include/cub/detail/device_synchronize.cuh(27): error: expected a ";"
/usr/include/cub/detail/device_synchronize.cuh(34): error: this pragma must immediately precede a declaration
/usr/include/cub/detail/device_synchronize.cuh(66): error: expected a declaration
/usr/include/thrust/system/cuda/detail/util.h(61): error: expected a declaration
/usr/include/thrust/system/cuda/detail/util.h(149): error: expected a ";"
/usr/include/thrust/system/cuda/detail/util.h(151): error: expected a declaration
/usr/include/thrust/system/cuda/detail/util.h(182): error: variable "cuda_cub" has already been defined

...downgrading to 11.5...
...compiles.
#149
Jul 2003
So Cal
2681 Posts
It should compile on CUDA 12. Is that a fresh clone of msieve-lacuda-nfsathome from GitHub? If so, try changing line 91 of cub/Makefile from
Code:
INC = -I"$(CUDA_ROOT)/include" -I.
to
Code:
INC = -I. -I"$(CUDA_ROOT)/include"

Last fiddled with by frmky on 2023-08-19 at 16:15
#150
"Serge"
Mar 2008
San Diego, Calif.
10398 Posts
I used that cub/Makefile line-91 change, but on that system it didn't change the outcome (similar to the above, which suggests some macros were refactored so that somewhere the generated code loses a ';' and then everything barfs out).
Perhaps it has to do with their OS choice: they use an "Ubuntu 20"-based Linux. Maybe kernel code is not easily pluggable on Ubuntu, or maybe I needed to get what on RHEL comes with "sudo yum -y install kernel-devel-`uname -r`".

On AWS I always choose Amazon's AMIs (RHEL-based, with all the dialects that go with it). Previously I also did everything from scratch (default AMI, then install CUDA following NVIDIA's procedures). Even earlier, I built everything on SLES-based instances (because I had worked in SLES-flavored environments for decades), but now I juggle SLES, RHEL, and (for Lambda's use) Ubuntu (zypper vs. yum vs. apt-get, with memorized package lists and names). Doable. On AWS, I now pick a "Deep Learning"-flavored AMI so as not to waste time on that low-level stuff.

This time it was too late into the night, so I tried to install cuda-11.5, but it conflicted with the system driver (and I had to cleanly nuke it). While installing cuda-11.5 it swapped kernel POST modules, so with a bit of hesitation I ran 'sudo reboot', and nicely, Lambda doesn't take the node away while it reboots. Then everything compiled. Some next time I will try to hack on a node with cuda-12.0 some more.

Lambda has even less available attached storage than nodes, so I am crossing my fingers and expecting the node to stay up for ~3.5 days (that's what I ended up needing for the 45M matrix). On AWS I use a mounted drive to keep state. On Lambda? Nothing. I could scp the .chk files out somewhere, but for now I've decided to trust them to keep the node mine.

The node is decent: AMD EPYC 7J13 64-core processor (lscpu shows 30 cores, so probably virtualized), tons of RAM (200 GB), and an A100-SXM with 40 GB. For $1.10/hr.

P.S. A tropical storm is coming, so both the gas & electric company and the internet provider have already robo-texted me that they might have outages. But I nohup'd and disown'd the process, so maybe it will fly solo even if my shell disconnects for a few hours. We'll see!
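Since Lambda offers no mounted persistent drive, one way to hedge against losing a node mid-solve is to ship the .chk checkpoint files off-node on a timer. A minimal local sketch (hypothetical helper, copying to a local directory; in practice you would scp/rsync to another host):

```python
# Hypothetical sketch: periodically offload msieve checkpoint (.chk) files,
# since Lambda instances have no persistent attached storage.
import shutil
from pathlib import Path

def offload_checkpoints(workdir, backup_dir, seen=None):
    """Copy any new or updated .chk files from workdir to backup_dir.

    `seen` maps filename -> last-copied mtime, so unchanged files are skipped.
    Returns the list of filenames copied this round.
    """
    seen = {} if seen is None else seen
    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)
    copied = []
    for chk in Path(workdir).glob("*.chk"):
        mtime = chk.stat().st_mtime
        if seen.get(chk.name) != mtime:
            # In practice: replace with scp/rsync to a remote host.
            shutil.copy2(chk, backup / chk.name)
            seen[chk.name] = mtime
            copied.append(chk.name)
    return copied
```

Run it from cron or a loop alongside the nohup'd solver; if the node vanishes, the latest checkpoint survives elsewhere.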
#151
"Serge"
Mar 2008
San Diego, Calif.
2·3·1,733 Posts
On the separate topic of linear algebra (not to lump it in with the Lambda stuff), maybe this will be useful for someone:
I ran a few tests on the 45M matrix that I have for 6,505- c202 (a good size for this project; no need to oversieve), and it just barely doesn't fit into the 40GB A100 card. So I arrived at a simple mnemonic rule: "if your matrix doesn't fit (even at VBITS=64) and you have to resort to use_managed=1, then go as high in VBITS as possible." My ETAs were
Code:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.29.05    Driver Version: 495.29.05    CUDA Version: 11.5     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-SXM...  On   | 00000000:06:00.0 Off |                    0 |
| N/A   50C    P0   273W / 400W | 40534MiB / 40536MiB  |      97%     Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1293      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      5297      C   ./msieve                        21815MiB |
+-----------------------------------------------------------------------------+
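A back-of-the-envelope rationale for the VBITS rule, under my own assumption (not a statement about msieve's internals) that block Lanczos needs roughly N/VBITS iterations and that, with use_managed=1, each iteration streams the oversized matrix from host memory:

```python
# Hypothetical cost model, NOT msieve's actual behavior: iteration count
# scales as N / VBITS, and with use_managed=1 each iteration re-streams
# the whole matrix over the managed-memory path, so larger VBITS means
# proportionally fewer full passes over the matrix.

def relative_matrix_traffic(n_rows, vbits):
    """Total matrix data streamed, in units of one full pass over the matrix."""
    iterations = n_rows / vbits  # rough block-Lanczos iteration count
    return iterations            # one matrix pass per iteration

base = relative_matrix_traffic(45_000_000, 64)
for vbits in (64, 128, 256, 512):
    ratio = relative_matrix_traffic(45_000_000, vbits) / base
    print(f"VBITS={vbits:4d}: {ratio:.3f}x the VBITS=64 matrix traffic")
```

Under this model, doubling VBITS halves the number of passes over the matrix, which is exactly where a managed-memory (host-resident) matrix hurts most.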
#152
"Oliver"
Sep 2017
Porta Westfalica, DE
5·359 Posts
A suggestion for those who would like to tinker with their GPUs for power efficiency: do not use nvidia-smi -pl xxx if your card supports nvidia-smi -lgc 0,xxxx --mode=1 instead! The former lowers both core and memory clocks; the latter only the core clocks. With this, I was able to reduce power consumption to 55-60% while increasing the LA time by less than 5%.
This of course makes less sense when using the cloud…
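In energy terms those numbers work out to a sizeable saving; a quick check, using the midpoint of the reported power range as my own assumption:

```python
# Rough energy-per-solve check from the reported figures:
# ~55-60% power draw, at most ~5% longer linear-algebra runtime.
power_fraction = 0.575   # assumed midpoint of the 55-60% range
time_fraction = 1.05     # LA takes up to 5% longer
energy_fraction = power_fraction * time_fraction
print(f"Energy per solve: {energy_fraction:.0%} of stock")  # ~60% of stock
```

So clock-locking trades a few percent of wall time for roughly 40% less energy per solved matrix, which is why it pays off on owned hardware but not on per-hour cloud billing.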
#153
"Oliver"
Sep 2017
Porta Westfalica, DE
5×359 Posts
Since I started doing GPU-LA, I always get
Code:
The call to cuIpcCloseMemHandle failed. This is a warning and the program will continue to run.
  cuIpcCloseMemHandle return value:   201
  address: 0x7feab4000000

It is obvious that this is not a high-priority issue, but I thought I should mention it nonetheless.