2018-05-28, 18:38   #1
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

38 Posts
Mfaktc-specific reference material

This thread is intended as a home for reference material specific to the mfaktc program.
(Suggestions are welcome. Discussion posts in this thread are not encouraged; please use the reference material discussion thread at http://www.mersenneforum.org/showthread.php?t=23383 instead. Off-topic posts may be moved or removed, to keep the reference threads clean, tidy, and useful.)


Mfaktc howto


The following assumes you've already read the generic "How to get started in gpu computing for GIMPS" portion of https://www.mersenneforum.org/showpo...89&postcount=1, and duplicates little of that here.

Download a suitable version of mfaktc from https://download.mersenne.ca/mfaktc/mfaktc-0.21
If one is available, you might as well get one of the larger-GpuSieveSize-enabled builds, which allow up to 2047 Mbits.
You'll also need the matching CUDA runtime (cudart) library.
CUDA DLLs are in the CUDA_dlls folder of https://download.mersenne.ca/
If you can't find a suitable version there, you may be able to get one built by one of the participants in the mfaktc thread.

(If all else fails, CUDA DLLs or .so files can be obtained by downloading and installing the CUDA SDK for the applicable OS and CUDA version, then locating the few files needed and adding them to your path, or copying them to the working directory on other systems. Download the current version from https://developer.nvidia.com/cuda-downloads
Older versions can be obtained from the download archive, at https://developer.nvidia.com/cuda-toolkit-archive
The newer versions are typically gigabytes in size; CUDA v11.3 is 2.8 GB.
Get the installation guide and other documentation PDFs while you're there.

Note, not all OS version and CUDA version combinations are supported! See CUDA Toolkit compatibility vs CUDA level https://www.mersenneforum.org/showpo...1&postcount=11 and good luck!)

Install in a suitable user-account-owned subfolder.
Set the needed folder permissions so suitable permissions are inherited by files created there.

Modify the mfaktc.ini file to customize it for your GPU model, PrimeNet account name, and system name.
(I usually combine system name, GPU identification, and instance number, in the form systemname/gpuid-wn; for example, condor/gtx1060-w2.)
If you are using a larger-GpuSieveSize build, correct the comments in the standard ini file to reflect the increased GpuSieveSize maximum.
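
As a small illustration of the naming convention above, this shell sketch assembles a label of the form systemname/gpuid-wn. The function name instance_name and its three arguments are made up for this example; nothing in mfaktc requires this format.

```shell
# Build a label of the form systemname/gpuid-wN.
# instance_name is a hypothetical helper, not part of mfaktc.
instance_name() {
    # $1 = system name, $2 = GPU identification, $3 = instance number
    printf '%s/%s-w%s\n' "$1" "$2" "$3"
}
```

For example, instance_name condor gtx1060 2 prints condor/gtx1060-w2, which can then be pasted into the system name entry of mfaktc.ini.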

You may want to try GPU-Z as a utility on your Windows system to see what the computer thinks is installed for your GPU (CUDA, OpenCL, OpenGL, etc.), graphically monitor GPU parameters, and maybe even log them if you want. It is one of many utilities listed in https://www.mersenneforum.org/showpo...74&postcount=6 which also lists some Linux alternatives. It can be handy while getting a GPU application going. When it's not needed, shut it down along with other idle applications, to reduce overhead that costs performance.

Now is a good time to run mfaktc with -h >>help.txt in your working directory. Run once, refer to as often as needed.

Run the self test: mfaktc -st >>selftest.txt

Or the longer one: mfaktc -st2 >>selftest.txt

Create a Windows batch file or Linux shell script with a short name. I sometimes use something similar to the following, named mf.bat. Adjust device name, device number, mfaktc version, etc. for your situation.

Code:
title %cd% mfaktc gtx1080Ti
if not exist help.txt mfaktc-more-cuda80-64 -h >help.txt
if not exist selftest.txt mfaktc-more-cuda80-64 -d 0 -st >selftest.txt
mfaktc-more-cuda80-64 -d 0 >>mfaktc-run.txt
Set the device number in the batch file (the -d argument).
Consider redirecting console output to files, as in the batch file example above, or employing a good tee program.
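
For Linux, a rough equivalent of the mf.bat example might look like the following. It is only a sketch: the executable path ./mfaktc.exe, the MFAKTC and DEVICE variable names, and the two function names are assumptions to adjust for your build, while -h, -st and -d are the same options used in the batch file above. It uses tee so that output is both shown on the console and appended to the log.

```shell
#!/bin/sh
# Linux sketch of mf.bat. Adjust the two settings below for your system.
MFAKTC="${MFAKTC:-./mfaktc.exe}"   # assumed path to the mfaktc executable
DEVICE="${DEVICE:-0}"              # device number passed to -d

mf_prepare() {
    # Write the help text and the self-test output once; later runs reuse the files.
    [ -e help.txt ] || "$MFAKTC" -h > help.txt
    [ -e selftest.txt ] || "$MFAKTC" -d "$DEVICE" -st > selftest.txt
}

mf_run() {
    # Show output on the console and append it to the run log.
    "$MFAKTC" -d "$DEVICE" 2>&1 | tee -a mfaktc-run.txt
}
```

Placed in a script in the working directory, the last line would be mf_prepare && mf_run.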

Create a desktop shortcut for easy launch of the batch file or script. On Windows, the shortcut command line should use the cmd /k form: <fullsystempath>cmd.exe /k <full mfaktc folder path>\<batchfilename>, with the directory in which it runs set to <full mfaktc folder path>. (Eventually, for multiple instances or multiple GPUs, this could launch a routine that invokes the individual-instance files with short time delays between them, so you have a few seconds to see whether each launched correctly or a bug occurred.)
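
A launcher of the kind described in the parenthetical could be sketched in shell as follows. It assumes each instance lives in its own directory with an executable start script named mf.sh; that name, the function name, and the delay value are all hypothetical.

```shell
# Start each instance's script in the background, pausing between launches
# so there is time to see whether each one came up correctly.
launch_instances() {
    # $1 = delay in seconds, remaining arguments = instance directories
    delay="$1"
    shift
    for dir in "$@"; do
        ( cd "$dir" && ./mf.sh ) &
        echo "launched $dir (pid $!)"
        sleep "$delay"
    done
}
```

A call such as launch_instances 5 gpu0-w1 gpu0-w2 followed by wait would start two instances five seconds apart and then block until both exit.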

Check the results. Resolve any reliability issues before proceeding to real GIMPS work.

Create a worktodo file and put some assignments in it. Start with a few, in case your GPU or IGP does not work out. Get the type you plan to run the most, from https://www.mersenne.org/manual_gpu_assignment/
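
To keep an eye on how much work remains, a sketch like the following counts the trial factoring lines left in the work file. It assumes that assignments are lines beginning with Factor= and that the file is named worktodo.txt; check both against your own file before relying on it. The function name is hypothetical.

```shell
# Count the assignment lines remaining in a work file.
# Assumes each trial factoring assignment is a line starting with "Factor=".
count_assignments() {
    # $1 = path to the work file, for example worktodo.txt
    grep -c '^Factor=' "$1"
}
```

Running count_assignments worktodo.txt before and after a session gives a quick idea of when it is time to fetch more work.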

Results are reported manually at https://www.mersenne.org/manual_result/

Run one instance with default settings. Then modify tuning in the mfaktc.ini file, stopping and restarting the program to make each change take effect, and document the performance for each modification. Tune one variable at a time. For best accuracy and reproducibility, minimize interactive use during a benchmark period. Averaging many timing samples for the same set of tuning parameters in a spreadsheet improves timing accuracy. See Mfaktc tuning for optimal throughput (item 8 in the table of contents below) for tuning advice.
An example of tuning for an RTX 2080 Super is here. Tuning for other GPU models is also shown near it in the same thread.
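
If a spreadsheet is not handy, the averaging step can be done with awk. This sketch assumes you have saved the samples for one tuning setting in a text file, one number per line; that file layout and the function name avg_samples are assumptions, not something mfaktc produces by itself.

```shell
# Average the numeric samples in a file, one sample per line.
# Blank lines are skipped; the input file layout is an assumption.
avg_samples() {
    # $1 = file holding the samples recorded for one tuning setting
    awk 'NF { sum += $1; n++ }
         END { if (n) printf "%.3f\n", sum / n; else print "no samples" }' "$1"
}
```

Keeping one such file per tuning setting makes it easy to compare averages instead of single readings.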

When substantially changing the type of work, such as switching between 100M tasks and 100Mdigit tasks, or significantly changing bit levels, retuning is suggested, especially when the change results in a different kernel being used. (You may want to save different tunes for different exponent and bit level ranges, for later reuse.)
For best performance use an SSD, tune for single-instance first, and tune for maximum throughput by experimenting with multiple instances last. (I suggest keeping the work of simultaneous instances similar. If they are too different, different kernels may reduce throughput. Test for that.)
Multiple instances sometimes give slightly higher sustained throughput, and keep the GPU working if one instance has a problem, runs out of work, or is stopped briefly to replenish work and report results. If a single instance is optimal, chain the batch file to another to keep the GPU productive when one worktodo file empties or a bug terminates the program.
Slow GPUs benefit less from large GpuSieveSize and multiple instances; fast GPUs benefit more.
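
Chaining one work queue to the next, as suggested above for the single-instance case, could be sketched in shell like this. It assumes each queue has its own directory containing an executable start script named mf.sh; the script name and the function name are hypothetical.

```shell
# Run the start script in each directory in turn. When one exits, whether
# because its work ran out or because of an error, move on to the next.
run_chain() {
    for dir in "$@"; do
        echo "starting $dir"
        ( cd "$dir" && ./mf.sh ) || echo "$dir exited with status $?"
    done
}
```

With run_chain queue1 queue2, the GPU starts on queue2 as soon as the program in queue1 stops for any reason.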

Get familiar with the nvidia-smi command line tool at some point. It is more efficient once you get into production mode. It's probably hiding somewhere like the location shown below. I usually make a batch file with the short name nv.bat (and suggest doing the Linux equivalent).
Code:
:loop
"c:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe"
pause
goto loop
Nvidia-smi has less overhead than graphical monitoring utilities such as GPU-Z. Used as above, it will list all CUDA GPUs working with the driver, whatever executables are using them, memory usage, power, temperature, etc. One keystroke in that command prompt window will refresh it.
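
A Linux counterpart of nv.bat might look like the following sketch. It assumes nvidia-smi is on the PATH; the NVSMI variable and the function name nv_loop are made up for this example.

```shell
# Print GPU status, wait for Enter, and repeat, like nv.bat above.
NVSMI="${NVSMI:-nvidia-smi}"   # assumes nvidia-smi is on the PATH

nv_loop() {
    while :; do
        "$NVSMI"
        printf 'Press Enter to refresh, Ctrl-C to quit: '
        read -r reply || break   # stop when input ends
    done
}
```

If a timed refresh is preferred over a keystroke, nvidia-smi -l 5 repeats the report every five seconds.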

Nvidia-smi can also control power and other parameters. Power savings can be quite considerable; 50% reduction in power from the nominal maximum seems to still provide better than 80% of maximum throughput.
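
The figures above can be turned into a quick ratio: keeping 80% of the throughput at 50% of the power is 0.80 / 0.50 = 1.6 times the work per watt. The sketch below does that arithmetic and shows, as comments, how a power limit might be inspected and set. The GPU index 0 and the 125 W value are purely illustrative, and the helper name is hypothetical; setting a limit normally needs administrator rights and must stay within the range the card allows.

```shell
# Work per watt relative to running at full power.
efficiency_gain() {
    # $1 = fraction of maximum throughput kept, $2 = fraction of nominal power used
    awk -v t="$1" -v p="$2" 'BEGIN { printf "%.2f\n", t / p }'
}

# Inspect the current and maximum power limits of GPU 0:
#   nvidia-smi -i 0 --query-gpu=power.limit,power.max_limit --format=csv
# Set a lower limit in watts (illustrative value):
#   sudo nvidia-smi -i 0 -pl 125
```

Calling efficiency_gain 0.80 0.50 prints 1.60. Measure the actual throughput on your own GPU at each limit rather than relying on the rule of thumb.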

Beware: the nvidia-smi GPU number may not match the mfaktc device number when a system has multiple GPUs.
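
One way to investigate a numbering mismatch is sketched below. nvidia-smi -L lists the GPUs as the driver numbers them, and the CUDA environment variable CUDA_DEVICE_ORDER=PCI_BUS_ID asks the CUDA runtime to number devices in PCI bus order, which should generally line up with nvidia-smi. Whether that resolves the mismatch on a given system is an assumption to verify, for instance by comparing GPU names. The wrapper name run_pci_order is hypothetical.

```shell
# List GPUs with the driver's numbering:
#   nvidia-smi -L

# Run a CUDA program with devices numbered in PCI bus order.
run_pci_order() {
    # $1 = device number, remaining arguments = program and its options
    dev="$1"
    shift
    CUDA_DEVICE_ORDER=PCI_BUS_ID "$@" -d "$dev"
}
```

For example, run_pci_order 1 ./mfaktc.exe would start the program on device 1 under that ordering, where ./mfaktc.exe stands for whatever your executable is called.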

After getting the program functioning manually, you can continue to operate it that way, or try one of the client management programs described in "Available Mersenne prime hunting client management software", each of which has its own install requirements, or GPU to 72 https://mersenneforum.org/forumdisplay.php?f=95

(Note much of the preceding was derived from or copied from https://mersenneforum.org/showthread.php?t=25673)


Table of Contents:
  1. This post
  2. Run time versus exponent and bit level http://www.mersenneforum.org/showpos...19&postcount=2
  3. Bug and wish list http://www.mersenneforum.org/showpos...21&postcount=3
  4. Mfaktc throughput variables http://www.mersenneforum.org/showpos...76&postcount=4
  5. Volta and Turing support https://www.mersenneforum.org/showpo...29&postcount=5
  6. Concepts in GIMPS trial factoring https://www.mersenneforum.org/showpo...23&postcount=6
  7. Mfaktc -h help output https://www.mersenneforum.org/showpo...90&postcount=7
  8. Mfaktc tuning for optimal throughput https://www.mersenneforum.org/showpo...99&postcount=8
  9. gr-mfaktc https://www.mersenneforum.org/showpo...37&postcount=9
  10. Mfaktc tuning for best compute efficiency https://www.mersenneforum.org/showpo...4&postcount=10
  11. etc tbd

Top of reference tree: https://www.mersenneforum.org/showpo...22&postcount=1

Last fiddled with by kriesel on 2022-05-15 at 13:26 Reason: add Mfaktc tuning for best compute efficiency