mersenneforum.org

mfaktc: a CUDA program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=12827)

ATH 2018-09-29 11:43

[QUOTE=James Heinrich;496947]The [i]LessClasses[/i] version should only be used for extremely-fast-running assignments (where each assignment only takes a few seconds).

mfaktc can be downloaded from [url=https://mersenneforum.org/mfaktc/mfaktc-0.21/]here[/url] or [url=https://download.mersenne.ca/mfaktc/mfaktc-0.21]here[/url].[/QUOTE]

The CUDA 10 binary does not work with just the .dll files; it wants CUDA 10 installed (I have CUDA 9.2 installed):
"ERROR: current CUDA driver version is lower than the CUDA toolkit version used during compile!
Please update your graphics driver."

I think the other binaries work with just the dll files.

TheJudger 2018-09-29 11:54

[QUOTE=VictordeHolland;497052]Still, I find it a bit confusing that you need to compile for different architectures, right? An mfaktc build compiled with CUDA SDK 10 for a GTX980 won't work on a GTX1080, right? Because the architecture/CUDA capability of the GTX1080 is higher (and somehow not backwards compatible?) Or am I just being ignorant?[/QUOTE]

It is not that bad.
[LIST]
[*]The CUDA [U]runtime DLL[/U] must be [U]exactly the same version[/U] as the one used to compile mfaktc.
[*]The [U]driver[/U] on your system must support the [U]same or a newer version[/U] of CUDA than the one used to compile mfaktc.
[*]The binary (mfaktc.exe) must include support for your GPU. A single binary can support multiple GPU architectures, e.g. the CUDA 8 binaries found [URL="https://mersenneforum.org/mfaktc/mfaktc-0.21/"]here[/URL] are compiled for Fermi, Kepler, Kepler "update", Maxwell and Pascal.
[/LIST]
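The three rules above can be condensed into a small check. This is only a sketch, not mfaktc's own code: the function name `can_run` and all of its parameters are hypothetical. Versions use the integer encoding that CUDA reports (1000 × major + 10 × minor, so 9.2 is 9020 and 10.0 is 10000), and the GPU in the usage line is an assumed example.

```python
def can_run(build_cuda, runtime_dll, driver_cuda, binary_archs, gpu_cc):
    """Return the reasons a binary would not start (empty list means OK).

    All names are hypothetical. Versions are integers in CUDA's
    1000*major + 10*minor encoding; compute capabilities are (major, minor).
    """
    problems = []
    # Rule 1: the runtime DLL must match the toolkit used for the build exactly.
    if runtime_dll != build_cuda:
        problems.append("runtime DLL version differs from build toolkit")
    # Rule 2: the driver must support the same or a newer CUDA version.
    if driver_cuda < build_cuda:
        problems.append("driver is older than build toolkit")
    # Rule 3: the binary must contain code usable by this GPU. Simplification:
    # code built for X.y is assumed to run on X.z with z >= y (same major only).
    usable = [a for a in binary_archs if a[0] == gpu_cc[0] and a[1] <= gpu_cc[1]]
    if not usable:
        problems.append("no code for this GPU architecture")
    return problems


# A CUDA 10.0 binary and DLL on a driver that only supports CUDA 9.2,
# with an assumed CC 6.1 GPU: only rule 2 is violated.
print(can_run(10000, 10000, 9020, [(3, 0), (3, 5), (5, 0), (6, 0), (7, 0)], (6, 1)))
```

Returning a list of reasons rather than a single boolean makes it easier to see which of the three rules a given setup breaks.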
Oliver

TheJudger 2018-09-30 12:27

JFYI: just built some Windows binaries using CUDA Toolkit 10.0. Will do some testing and provide binaries after successful testing.

Oliver

ATH 2018-09-30 14:57

If you have time could you please also post a guide describing how you compile it for Windows.

kriesel 2018-09-30 15:29

[QUOTE=ATH;497142]If you have time could you please also post a guide describing how you compile it for Windows.[/QUOTE]
The readme.txt for mfaktc v0.21 CUDA 8.0 says
[CODE]#############################
# 2.2 Compilation (Windows) #
#############################

The following instructions have been tested on Windows 7 64bit using Visual
Studio 2012 Professional. A GNU compatible version of make is also required
as the Makefile is not compatible with nmake. GNU Make for Win32 can be
downloaded from http://gnuwin32.sourceforge.net/packages/make.htm.

Run the Visual Studio 2012 x64 Win64 Command Prompt for x64 or
Run the Visual Studio 2012 x86 Native Tools Command Prompt for x86 (32 bit)

and change into the "\src" subdirectory.

Run 'make -f Makefile.win' for a 64bit built (recommended on 64bit systems)
or 'make -f Makefile.win32' for a 32bit built.

You will have to adjust the paths to your CUDA installation and the
Microsoft Visual Studio binaries in the makefiles if you have something
other than CUDA 8.0 and MSVS 2012. The binaries "mfaktc-win-64.exe" or
"mfaktc-win-32.exe" are placed in the parent directory.[/CODE]
Presumably you're asking for an update or more detail.

TheJudger 2018-09-30 17:11

Hello!

[QUOTE=ATH;497142]If you have time could you please also post a guide describing how you compile it for Windows.[/QUOTE]
[LIST=1]
[*]Installed [URL="https://visualstudio.microsoft.com/de/downloads/"]Visual Studio 2017.8 "Community"[/URL]
[*]Installed [URL="https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64"]CUDA Toolkit 10 for Windows[/URL]
[*]Installed [URL="http://www.mingw.org/"]MinGW[/URL] as one of many options for [I]GNU Make[/I] on Windows. In the MinGW folder I copied [I]bin/mingw32-make.exe[/I] to [I]bin/make.exe[/I] because I'm lazy. Be careful when updating [I]mingw32-make.exe[/I]...
[*]Configured the environment for the [I]"x64 Native Tools Command Prompt"[/I]: added MinGW/bin and CUDA/bin to the PATH variable.
[/LIST]
Then just open the [I]"x64 Native Tools Command Prompt"[/I], change into the directory with the mfaktc source files and run[CODE]
make -f Makefile.win[/CODE]
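Before running make it can help to confirm that the PATH setup from step 4 actually took effect. The following is a minimal sketch, assuming the build needs `make`, `cl` and `nvcc` to be reachable from the command prompt; the helper name `missing_tools` is hypothetical and not part of mfaktc.

```python
import shutil

# Tools the steps above rely on: GNU Make, the MSVC compiler and the CUDA compiler.
REQUIRED_TOOLS = ("make", "cl", "nvcc")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH.

    `which` is injectable so the check can be exercised without a real toolchain.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("not found on PATH:", ", ".join(missing))
    else:
        print("make, cl and nvcc are all on PATH")
```

Run it from the same command prompt that will be used for the build, since that prompt sets up its own environment.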

I had to adjust some settings in Makefile.win:[CODE]
CUDA_DIR = "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0"

CC = cl
CFLAGS = /Ox /Oy /GL /W2 /fp:fast /I$(CUDA_DIR)\include /I$(CUDA_DIR)\include\cudart /nologo

NVCCFLAGS = --ptxas-options=-v
CUFLAGS = -ccbin "C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.15.26726\bin\Hostx86\x64" -x cu -I$(CUDA_DIR)\/include --machine 64 --compile -Xcompiler "/wd 4819" -DWIN64 -Xcompiler "/EHsc /W3 /nologo /O2 /FS" $(NVCCFLAGS)

# generate code for various compute capabilities
# NVCCFLAGS += --generate-code arch=compute_11,code=sm_11 # CC 1.1, 1.2 and 1.3 GPUs will use this code (1.0 is not possible for mfaktc)
# NVCCFLAGS += --generate-code arch=compute_20,code=sm_20 # CC 2.x GPUs will use this code, one code fits all!
NVCCFLAGS += --generate-code arch=compute_30,code=sm_30 # all CC 3.x GPUs _COULD_ use this code
NVCCFLAGS += --generate-code arch=compute_35,code=sm_35 # but CC 3.5 (3.2?) _CAN_ use funnel shift which is useful for mfaktc
NVCCFLAGS += --generate-code arch=compute_50,code=sm_50 # CC 5.x GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_60,code=sm_60 # CC 6.x GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_70,code=sm_70 # CC 7.x GPUs will use this code
# NVCCFLAGS += --generate-code arch=compute_75,code=sm_75 # CC 7.5 GPUs will use this code[/CODE]
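The comments in the Makefile excerpt describe which GPUs end up using which generated code. A rough sketch of that selection is shown below, under the assumption that the usual binary-compatibility rule applies: code built for sm_Xy runs on compute capability X.z with z >= y, but not across major versions. The function name `pick_target` is hypothetical.

```python
# Targets enabled (not commented out) in the Makefile.win excerpt, as (major, minor).
BUILT_TARGETS = [(3, 0), (3, 5), (5, 0), (6, 0), (7, 0)]


def pick_target(gpu_cc, targets=BUILT_TARGETS):
    """Return the best matching sm_XX target for a GPU, or None if there is none.

    Assumption: same major version required, highest minor <= the GPU's minor wins.
    """
    candidates = [t for t in targets if t[0] == gpu_cc[0] and t[1] <= gpu_cc[1]]
    return max(candidates) if candidates else None


for name, cc in [("GTX 980", (5, 2)), ("GTX 1080", (6, 1)), ("RTX 2080 Ti", (7, 5))]:
    print(name, cc, "->", pick_target(cc))
# GTX 980 (5, 2) -> (5, 0)
# GTX 1080 (6, 1) -> (6, 0)
# RTX 2080 Ti (7, 5) -> (7, 0)
```

Under this rule a CC 2.x card gets `None` with these settings, which matches the compute_20 line being commented out. Because every entry uses `code=sm_XX`, the build presumably carries no PTX that a future major architecture could compile at load time.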

Oliver

TheJudger 2018-09-30 17:13

Wanted: Testrun on Turing card!
 
Hello,

anyone with a Turing GPU (e.g. GeForce RTX 2080 or 2080 Ti) willing to run some tests? Honza can't right now, I've asked him already.

Oliver

TheJudger 2018-10-05 20:23

Initial Turing benchmarks (RTX 2080Ti)
 
Hello,

finally I was able to get my hands on a Turing (RTX 20x0 series) card. Because of [URL="https://docs.nvidia.com/cuda/turing-tuning-guide/index.html#turing-tuning"]this[/URL] I was excited, and I was right: Turing is a beast for mfaktc.

Unmodified mfaktc 0.21 sources (just adjusted the Makefile) + CUDA 10.0.130 on Linux:
[CODE]# ./mfaktc.exe -tf 66362159 73 74
mfaktc v0.21 (64bit built)
[...]
CUDA device info
name [B][COLOR="Red"]GeForce RTX 2080 Ti[/COLOR][/B]
compute capability 7.5
max threads per block 1024
max shared memory per MP 65536 byte
number of multiprocessors 68
clock rate (CUDA cores) 1635MHz
memory clock rate: 7000MHz
memory bus width: 352 bit
[...]
got assignment: exp=66362159 bit_min=73 bit_max=74 (28.83 GHz-days)
Starting trial factoring M66362159 from 2^73 to 2^74 (28.83 GHz-days)
k_min = 71160531149400
k_max = 142321062305090
Using GPU kernel "barrett76_mul32_gs"
Date Time | class Pct | time ETA | GHz-d/day Sieve Wait
Oct 05 22:12 | 0 0.1% | 0.630 10m04s | 4118.14 82485 n.a.%
Oct 05 22:12 | 4 0.2% | 0.563 8m59s | 4608.22 82485 n.a.%
Oct 05 22:12 | 9 0.3% | 0.562 8m58s | 4616.42 82485 n.a.%
[...]
Oct 05 22:21 | 4612 99.9% | 0.599 0m01s | 4331.27 82485 n.a.%
Oct 05 22:21 | 4617 100.0% | 0.600 0m00s | 4324.05 82485 n.a.%
no factor for [B][COLOR="red"]M66362159 from 2^73 to 2^74[/COLOR][/B] [mfaktc 0.21 barrett76_mul32_gs]
tf(): total time spent: [B][COLOR="red"]9m 30.800s[/COLOR][/B]

[/CODE]
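The k range and the class numbers in the log can be reproduced with a few lines of integer arithmetic. This is a sketch, not mfaktc's code: it assumes candidate factors have the form q = 2kp + 1 with q ≡ ±1 (mod 8), that k is split into 4620 = 4 · 3 · 5 · 7 · 11 classes, and that classes where q is always divisible by 3, 5, 7 or 11 are skipped. The function names are hypothetical.

```python
P = 66362159          # exponent from the log
NUM_CLASSES = 4620    # 4 * 3 * 5 * 7 * 11, assumed from the class numbers in the log


def k_limit(p, bits):
    """Largest k with 2*k*p + 1 < 2**bits (for odd p)."""
    return (1 << bits) // (2 * p)


def candidate_classes(p, num_classes=NUM_CLASSES):
    """Classes k mod num_classes whose q = 2*k*p + 1 can still be a factor."""
    keep = []
    for c in range(num_classes):
        q = 2 * c * p + 1
        # Factors of Mersenne numbers are 1 or 7 mod 8; drop small-prime multiples.
        if q % 8 in (1, 7) and all(q % s for s in (3, 5, 7, 11)):
            keep.append(c)
    return keep


k_min = k_limit(P, 73)
k_min -= k_min % NUM_CLASSES   # align down to a class boundary
k_max = k_limit(P, 74)
classes = candidate_classes(P)
print(k_min, k_max)                              # 71160531149400 142321062305090
print(len(classes), classes[:3], classes[-2:])   # 960 [0, 4, 9] [4612, 4617]
```

With 960 classes left, the ETA column is roughly the per-class time multiplied by the number of remaining classes, e.g. 958 × 0.563 s is about 8m59s.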

This is a Founders Edition card, and the run started with a cold card. Power draw is ~260W on average, so it is limited by the power target. Temperature is a bit below 80°C and the average clock is about 1680MHz once the card is "hot".

[U]New performance king[/U] but slightly behind [URL="https://mersenneforum.org/showpost.php?p=490784&postcount=2819"]Tesla V100[/URL] in terms of energy efficiency.
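For reference, the sustained throughput and a rough efficiency figure for this run follow directly from the numbers posted above. This is a back-of-the-envelope sketch: the 260 W value is the approximate average stated in the post, not a measurement made by the script, and no V100 figures are assumed here.

```python
GHZ_DAYS = 28.83                 # credit for the assignment, from the log
TOTAL_SECONDS = 9 * 60 + 30.8    # "total time spent: 9m 30.800s"
AVG_POWER_W = 260                # approximate average power draw stated in the post

per_day = GHZ_DAYS * 86400 / TOTAL_SECONDS   # sustained GHz-days per day
per_watt = per_day / AVG_POWER_W             # GHz-days per day per watt

print(f"{per_day:.0f} GHz-d/day, {per_watt:.1f} GHz-d/day per watt")
# 4364 GHz-d/day, 16.8 GHz-d/day per watt
```

The sustained figure sits between the per-class values in the log (about 4118 to 4616 GHz-d/day), as expected for a card that slows down once it warms up.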

Oliver

petrw1 2018-10-05 20:37

[QUOTE=TheJudger;497430][...] Turing is a beast for mfaktc. [...] Unmodified mfaktc 0.21 sources (just adjusted the Makefile) + CUDA 10.0.130 on Linux [...][/QUOTE]

Wow...I can hardly wait. Mine is supposed to arrive Tuesday...hopefully alive within a week. Good to hear it works without any mfaktc compatibility issues.

James Heinrich 2018-10-05 20:42

[url]https://www.mersenne.ca/mfaktc.php[/url] has been updated with the new benchmark.

ET_ 2018-10-05 20:50

What is the street price of this monster?

