mersenneforum.org Tesla K80 trial-factoring setup

2022-09-16, 04:09   #1
AwesomeMachine

Apr 2018
USA

17 Posts

Tesla K80 trial-factoring setup

I put together a trial-factoring machine with parts from my basement and eBay. It's a mid tower with a 1250 W PSU, (2) Tesla K80 GPU boards, an Asus ATX mobo, a 4th-gen. Intel i7 CPU, 32 GB DDR3-3200, NVMe storage, and I'll throw in a DVD-RW for at least the OS install (Linux Debian 10 or 11). I've got the coolers for the Tesla cards, with 2.1 A 40 mm fans, and 2 case fans, 2.4 A 120 mm. It sounds like a jet taking off. I can only imagine what crypto-mining farms must sound like!

I've got plenty of power and PCIe bandwidth, but I'm not totally clear on how to install the Tesla driver or what version of CUDA I should go with (presently planning on mfaktc/mfakto). The K80 appeared to be the biggest bang for the buck in trial-factoring work, ~700 GHz-days/day per board for $65 plus PC mounting brackets, although a bit pricey on power consumption. If it turns out I'm satisfied with this setup and all that goes along with manually procuring assignments and turning in work, I'll upgrade to more efficient GPUs. This is a first, and I'm unable to locate info on configuring 2 Tesla boards in a PC chassis. The plan is to use the onboard Intel video for display when a monitor is connected, but mostly headless, and the Tesla boards (4 GPUs total) for trial factoring. And yes, I'll join 'GPU to 72' with this chassis. If there is any advice on configuring the NVIDIA driver and what CUDA components to install, I'm all ears. Thanks for all the help.

2022-09-16, 05:30   #2
frmky

Jul 2003
So Cal

9A8₁₆ Posts

That card works but is deprecated in CUDA 11, so I recommend using CUDA 10.2. If you use Ubuntu 18.04 LTS, it's a straightforward install. Note, though, that standard support for Ubuntu 18.04 LTS ends in April.
Code:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo apt-get update
sudo apt-get -y install cuda

2022-09-16, 07:33   #3
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

3·43·53 Posts

You definitely want mfaktc, not mfakto, for the K80 Teslas, which are NVIDIA cards; CUDA is better supported on NVIDIA than OpenCL is. The Tesla K80 is CUDA compute capability 3.7, and https://en.wikipedia.org/wiki/CUDA indicates support in CUDA SDKs 6.5 to 11.7. The K80s are not very power efficient, but probably economical to learn on before moving on to something more current and efficient. nvidia-smi has power and clock control and monitoring, for tuning to optimal cost effectiveness. Have fun!

Last fiddled with by kriesel on 2022-09-16 at 08:26

2022-09-16, 07:53   #4
LaurV
Romulan Interpreter

"name field"
Jun 2011
Thailand

23572₈ Posts

My take:

1. Do you physically have 2 (two) K80 cards inside? The K80 is a dual card (it has 2 GPU chips), so it may appear to the user as two GPUs (depending on your system) while being only one physical card. And if you have two, do you see 2 GPUs or 4 GPUs?

2. These cards are wasted on TF. They are mostly FP64 cards; you should do PRP with them if you want to use them for GIMPS and don't have another purpose in mind. If you are in the game for factors, P-1 is the way to go. You can try running gpuOwl, newer or older version (v6 could run P-1 stand-alone; I understand the newer versions cannot, but other people should weigh in here, I've lost contact a bit).

3. However, you can settle a dispute here, related to TF.
Can you run one/two/more copies of mfaktc on each card, add up the output, and post here? This is because our "feeling" is that the K80s offered on Colab accounts are only a "half card" and not a "full card". The question is: can you get (about) double the performance, for either TF or PRP, from one card compared with Colab?

We can help with the setup if you use windoze (other people can help you with Linux, etc.). Also, the Nvidia Control Panel app should have an option to switch between FP64 and FP32; for each type of work, the performance doubles if the right mode is selected. TF needs FP32 cores, while PRP/P-1 need a lot of FP64 cores. If you use windoze and decide on TF, then mfaktc (as kriesel said) and MISFIT are the way to go. MISFIT is a tool to automate the work (run mfaktc on all cards, get assignments from the server or from other sources like gpu72 or mersenne.ca, distribute them to the cards as needed, and report the results to PrimeNet when the work is done) - it helps a lot. Thanks in advance.

Last fiddled with by LaurV on 2022-09-16 at 08:01

2022-09-16, 09:14   #5
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

3×43×53 Posts

@LaurV, it seems clear the OP bought and installed two dual-GPU K80 cards; note the redundant plural "(2) Tesla K80 GPU boards" in post 1. Tesla add-on blower assemblies tend to be one duct and one high-pressure fan per card, judging by eBay listings.

Quote:
 Originally Posted by AwesomeMachine I put together a trial-factoring machine with parts from my basement and eBay. It's a mid tower with 1250W PSU, (2) Tesla K80 GPU boards, Asus ATX mobo, 4th gen. Intel i7 CPU, 32GB DDR3-3200, NVMe storage, and I'll throw in a DVD-RW for at least the OS install (Linux Debian 10 or 11). I've got the coolers for the Tesla cards, with 2.1A 40mm fans, and 2 case fans 2.4A 120mm. It sounds like a jet taking off. I can only imagine what crypto-mining farms must sound like!
I've got plenty of power and PCIe bandwidth, but I'm not totally clear on how to install the Tesla driver, what version of CUDA I should go with (presently planning on mfaktc/mfakto). The K80 appeared the biggest bang for the buck in trial-factoring work, ~700 GHz-days/day per board for $65 plus PC mounting brackets, although a bit pricey on power consumption.
K80s don't make sense to me even at low utility rates for production running, although they can provide a low-cost lab setup for multi-GPU running, especially if they can be resold later. At 300 W/board, that's only 2.33 GHD/day/watt, and twice as many GPU units per board to manage. An RTX 2080 can do 2600 GHD/day at 215 W TDP, 12.1 GHD/day/watt, in a single GPU unit per board. Both productivity figures will improve with power-reduction tuning. Running 24/7, the K80s will be _expensive_. At current local utility rates a watt-year is ~US$1.25. Two K80 modules running 24/7/365 would cost ~$875/year, and that's without the air conditioning to dump the heat,
and would be matched in output by one RTX 2080 running 6.5 months/year for ~$145 in utility cost; ~$270 for 12 months and nearly twice the TF output. A used RTX 2080 can be found on eBay for $500, less than the annual local utility-cost difference for equal annual output or 12-month running.
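As a rough cross-check on utility figures like these: one watt-year is 8.766 kWh, so ~US$1.25/watt-year corresponds to roughly $0.14/kWh. A minimal sketch of the arithmetic (the wattage and rate below are example values, and PSU/cooling overhead is ignored):

```shell
# Annual electricity cost ~= watts * 8.766 kWh/watt-year * $/kWh
watts=600            # two K80 boards at their ~300 W TDP each
rate_per_kwh=0.1426  # example utility rate in $/kWh (~$1.25/watt-year)
awk -v w="$watts" -v r="$rate_per_kwh" \
  'BEGIN { printf "$%.0f/year\n", w * 8.766 * r }'
```

At these example numbers the result lands near $750/year; a somewhat higher actual draw or rate closes the gap to figures like the ~$875 above.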
For DP running (gpuowl etc.) the RTX 2080 is not as strong, but it still beats a single K80 GPU (half a K80 module) for throughput.

Colab free is definitely one GPU (half a K80 module), not two GPU devices (a whole K80 module). But K80s are no longer appearing; it's all T4 now, at least in the US midwest. (Strong TF, weak DP.)
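One quick way to settle the 2-vs-4 question from post 4 is to list what the driver actually exposes; each K80 board should enumerate as two separate devices:

```shell
# List every CUDA device the driver exposes, one CSV row per GPU.
# Two K80 boards installed correctly should print four rows
# (index 0-3, each named "Tesla K80").
nvidia-smi --query-gpu=index,name,memory.total --format=csv
```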

Quote:
If it turns out I'm satisfied with this setup and all that goes along with manually procuring assignments and turning in work, I'll upgrade to more efficient GPUs. This is a first, and I'm unable to locate info on configuring 2 Tesla boards in a PC chassis. The plan is to use the onboard Intel video for display when a monitor is connected, but mostly headless, and the Tesla boards (4 GPUs total) for trial factoring. And yes, I'll join 'GPU to 72' with this chassis. If there is any advice on configuring the nVidia driver and what CUDA components to install, I'm all ears. Thanks for all the help.
Tesla GPUs don't have display output, so it's other video or remote access. A USB external DVD drive or memory sticks are handy for installs, with one shared as needed among many systems.

Consider turning off unneeded hardware in system BIOS for slight power savings, a miners' trick.
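On the GPU side, the power and clock control in nvidia-smi (mentioned in post 3) is the usual knob. A hedged sketch, where the device index and the 120 W cap are example values; check your card's supported range first:

```shell
# Show default, current, and min/max enforceable power limits for GPU 0.
nvidia-smi -i 0 -q -d POWER
# Enable persistence mode so the driver stays loaded and settings stick.
sudo nvidia-smi -pm 1
# Cap GPU 0 at 120 W (example value; confirm your card's allowed range
# from the query above before setting).
sudo nvidia-smi -i 0 -pl 120
```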
Multiple GPUs per system is straightforward on Windows. I understand it's also feasible on Linux, but I haven't attempted that lately. That's what server farms do, IIUC, but unlike me, their admins know Linux.
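For the multi-GPU mfaktc case specifically, the common approach is one instance per CUDA device via mfaktc's -d flag, each in its own working directory. A minimal sketch; the directory layout is an example, and the binary name depends on your build:

```shell
# One mfaktc instance per CUDA device (two K80 boards = devices 0-3).
# Each directory is assumed to hold its own mfaktc.ini and worktodo.txt.
for d in 0 1 2 3; do
  (cd "$HOME/mfaktc/gpu$d" && ./mfaktc -d "$d" >> mfaktc.log 2>&1 &)
done
```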
Cheap old server-grade hardware is cheap for a reason or more: probably uneconomic to continue operating compared to the cost of an upgrade plus n years of operation going forward, or declining reliability, or clients wanting faster, cheaper-per-instruction-executed gear to rent.

Last fiddled with by kriesel on 2022-09-16 at 09:36

2022-09-16, 09:28   #6
LaurV
Romulan Interpreter

"name field"
Jun 2011
Thailand

2·31·163 Posts

Well, we are in violent agreement, and I am sorry I didn't read the OP's post very carefully. Humm....

Quote:
 Originally Posted by kriesel But K80s are no longer appearing; it's all T4 now at least in US midwest. Multiple GPUs per system is straightforward on Windows. I understand it's also feasible on Linux, but haven't attempted that lately.
K80 (halves) still appear, even on Pro (non-Plus) accounts. But of course, it depends on your geographic area. The Singapore server never had them, it seems, but I am roaming.... (at least virtually). However, you are right that they are on the verge of extinction, and for home use, well, unless you use them to heat the house in winter.... (I still have 6 classic Titan cards which are doing nothing - too expensive to run, and too hot here the whole year).
Related to the second part, I am mining with a combination of 2x Radeon VII and 2x 2080 Ti on an Ubuntu system, and I don't have any issue with it recognizing many GPUs, in spite of the fact that my Linux skills suck (but yeah, I grew old before I was able to make it work, mostly via web search and help from people who knew, including this forum).

