mersenneforum.org  

Old 2018-10-05, 21:16   #2916
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

110101010110₂ Posts

Quote:
Originally Posted by ET_ View Post
What is the Street Price of this monster?
Seems to be hovering around US$1800 right now, which is getting close to double MSRP, due to high demand and low supply. Things should calm down in a little while when there's more supply.
Old 2018-10-05, 21:54   #2917
chalsall
If I May

"Chris Halsall"
Sep 2002
Barbados

3³×19² Posts

Quote:
Originally Posted by TheJudger View Post
New performance king, but slightly behind the Tesla V100 in terms of energy efficiency.
Like, um, wow!!!

Despite the capex, the investment might make sense over the life of the kit based on the TDP.
Old 2018-10-05, 22:11   #2918
petrw1
1976 Toyota Corona years forever!

"Wayne"
Nov 2006
Saskatchewan, Canada

124A₁₆ Posts

Quote:
Originally Posted by James Heinrich View Post
Seems to be hovering around US$1800 right now, which is getting close to double MSRP, due to high demand and low supply. Things should calm down in a little while when there's more supply.
I pre-ordered my 2080Ti right after the announcement for $1200US.
Old 2018-10-06, 10:24   #2919
ET_
Banned

"Luigi"
Aug 2002
Team Italia

2⁴×7×43 Posts

Quote:
Originally Posted by petrw1 View Post
I pre-ordered my 2080Ti right after the announcement for $1200US.
Still expensive, having a GTX 980 idle at home...
Old 2018-10-06, 15:12   #2920
ATH
Einyen

Dec 2003
Denmark

C55₁₆ Posts

Quote:
Originally Posted by TheJudger View Post
Hello!


  1. Installed Visual Studio 2017.8 "Community"
  2. Installed CUDA Toolkit 10 for Windows
  3. Installed MinGW as one of many options for GNU Make on Windows. In the MinGW folder I copied bin/mingw32-make.exe to bin/make.exe because I'm lazy. Careful when updating mingw32-make.exe...
  4. Configured the environment for the "x64 Native Tools Command Prompt" - added MinGW/bin and CUDA/bin to the PATH variable.

Then just open the "x64 Native Tools Command Prompt", change into the directory with the mfaktc source files, and run
Code:
make -f Makefile.win
I had to adjust some settings in Makefile.win:
Code:
CUDA_DIR = "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0"

CC = cl
CFLAGS = /Ox /Oy /GL /W2 /fp:fast /I$(CUDA_DIR)\include /I$(CUDA_DIR)\include\cudart /nologo

NVCCFLAGS = --ptxas-options=-v
CUFLAGS = -ccbin "C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.15.26726\bin\Hostx86\x64" -x cu -I$(CUDA_DIR)\include --machine 64 --compile -Xcompiler "/wd 4819" -DWIN64 -Xcompiler "/EHsc /W3 /nologo /O2 /FS" $(NVCCFLAGS)

# generate code for various compute capabilities
# NVCCFLAGS += --generate-code arch=compute_11,code=sm_11 # CC 1.1, 1.2 and 1.3 GPUs will use this code (1.0 is not possible for mfaktc)
# NVCCFLAGS += --generate-code arch=compute_20,code=sm_20 # CC 2.x GPUs will use this code, one code fits all!
NVCCFLAGS += --generate-code arch=compute_30,code=sm_30 # all CC 3.x GPUs _COULD_ use this code 
NVCCFLAGS += --generate-code arch=compute_35,code=sm_35 # but CC 3.5 (3.2?) _CAN_ use funnel shift which is useful for mfaktc
NVCCFLAGS += --generate-code arch=compute_50,code=sm_50 # CC 5.x GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_60,code=sm_60 # CC 6.x GPUs will use this code
NVCCFLAGS += --generate-code arch=compute_70,code=sm_70 # CC 7.x GPUs will use this code
# NVCCFLAGS += --generate-code arch=compute_75,code=sm_75 # CC 7.5 GPUs will use this code
Oliver

I compiled mfaktc this way without any errors, but running it fails:
ERROR: cudaGetLastError() returned 8: invalid device function


Is CUDA 10 too advanced for compute capability 3.5 (Titan Black)? I did add 3.5 in the Makefile and, as I said, there was no compile error:
NVCCFLAGS += --generate-code arch=compute_35,code=sm_35 # but CC 3.5 (3.2?) _CAN_ use funnel shift which is useful for mfaktc
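One thing worth checking when that error appears (a suggested diagnostic, not a step from the build instructions above): cuobjdump, which ships in the CUDA toolkit's bin directory, can list which device-code targets actually ended up embedded in the binary. "invalid device function" typically means the fat binary contains neither SASS for the device's exact compute capability nor PTX that the driver can JIT for it.

```shell
# Assumes the CUDA toolkit's bin directory is on PATH and the binary
# is named mfaktc.exe (adjust to your build output).
cuobjdump --list-elf mfaktc.exe   # one line per embedded sm_xx cubin
cuobjdump --list-ptx mfaktc.exe   # embedded PTX the driver could JIT-compile
```

If sm_35 does not show up in that listing on a Titan Black build, the --generate-code flag did not take effect.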

There were some warnings during compilation:
Code:
c:\msys64\home\ath\mfaktc-0.21\src\tf_common.cu(242): warning C4996: 'cudaThreadSynchronize': was declared deprecated
c:\cuda10\include\cuda_runtime_api.h(947): note: see declaration of 'cudaThreadSynchronize'
c:\msys64\home\ath\mfaktc-0.21\src\tf_common.cu(242): warning C4996: 'cudaThreadSynchronize': was declared deprecated
c:\cuda10\include\cuda_runtime_api.h(947): note: see declaration of 'cudaThreadSynchronize'
c:\msys64\home\ath\mfaktc-0.21\src\tf_common_gs.cu(169): warning C4996: 'cudaThreadSynchronize': was declared deprecated
c:\cuda10\include\cuda_runtime_api.h(947): note: see declaration of 'cudaThreadSynchronize'
c:\msys64\home\ath\mfaktc-0.21\src\tf_common.cu(242): warning C4996: 'cudaThreadSynchronize': was declared deprecated
c:\cuda10\include\cuda_runtime_api.h(947): note: see declaration of 'cudaThreadSynchronize'
c:\msys64\home\ath\mfaktc-0.21\src\tf_common_gs.cu(169): warning C4996: 'cudaThreadSynchronize': was declared deprecated
c:\cuda10\include\cuda_runtime_api.h(947): note: see declaration of 'cudaThreadSynchronize'
gpusieve.cu(1371): warning C4244: '=': conversion from '__int64' to 'uint32', possible loss of data
gpusieve.cu(1385): warning C4244: '=': conversion from '__int64' to 'uint32', possible loss of data
gpusieve.cu(1400): warning C4244: '=': conversion from '__int64' to 'uint32', possible loss of data
gpusieve.cu(1416): warning C4244: '=': conversion from '__int64' to 'uint32', possible loss of data
gpusieve.cu(1450): warning C4244: '=': conversion from '__int64' to 'uint32', possible loss of data
gpusieve.cu(1466): warning C4244: '=': conversion from '__int64' to 'uint32', possible loss of data
gpusieve.cu(1506): warning C4244: '=': conversion from '__int64' to 'uint32', possible loss of data
gpusieve.cu(1522): warning C4244: '=': conversion from '__int64' to 'uint32', possible loss of data
gpusieve.cu(1558): warning C4244: '=': conversion from '__int64' to 'uint32', possible loss of data
gpusieve.cu(1273): warning C4996: 'cudaThreadSetCacheConfig': was declared deprecated
c:\cuda10\include\cuda_runtime_api.h(1112): note: see declaration of 'cudaThreadSetCacheConfig'
gpusieve.cu(1599): warning C4996: 'cudaThreadSynchronize': was declared deprecated
c:\cuda10\include\cuda_runtime_api.h(947): note: see declaration of 'cudaThreadSynchronize'
gpusieve.cu(1621): warning C4996: 'cudaThreadSynchronize': was declared deprecated
c:\cuda10\include\cuda_runtime_api.h(947): note: see declaration of 'cudaThreadSynchronize'
gpusieve.cu(1645): warning C4996: 'cudaThreadSynchronize': was declared deprecated
c:\cuda10\include\cuda_runtime_api.h(947): note: see declaration of 'cudaThreadSynchronize'
c:\msys64\home\ath\mfaktc-0.21\src\tf_common.cu(242): warning C4996: 'cudaThreadSynchronize': was declared deprecated
c:\cuda10\include\cuda_runtime_api.h(947): note: see declaration of 'cudaThreadSynchronize'
c:\msys64\home\ath\mfaktc-0.21\src\tf_common_gs.cu(169): warning C4996: 'cudaThreadSynchronize': was declared deprecated
c:\cuda10\include\cuda_runtime_api.h(947): note: see declaration of 'cudaThreadSynchronize'

Last fiddled with by ATH on 2018-10-06 at 15:30
Old 2018-10-06, 17:18   #2921
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5397₁₀ Posts

Quote:
Originally Posted by ATH View Post
I compiled mfaktc this way without any errors, but running it fails:
ERROR: cudaGetLastError() returned 8: invalid device function
...
Is CUDA10 too advanced for compute capability 3.5 (Titan Black) ?
OK, you compiled for CUDA 10. Have you confirmed the installed NVIDIA driver supports CUDA 10? And that the CUDArt....dll is CUDA 10 as well (via file names, or separate tools)? My recollection is that the "invalid device function" error 8 shows up when there's a mismatch.

Have you tried compiling a smaller, simpler sample project? How did that go? One that prints out the driver-supported CUDA version, the runtime DLL's supported version, and the GPU model and CUDA CC level would be good. Such a project could be quickly created from a copy of mfaktc by deleting most of it: keep just the parts necessary for producing the following output, and either accept a device number as input or spin through integers starting from zero until there's no GPU there:
Code:
CUDA version info
  binary compiled for CUDA  6.50
  CUDA runtime version      6.50
  CUDA driver version       9.10

CUDA device info
  name                      GeForce GTX 480
  compute capability        2.0
  maximum threads per block 1024
  number of multiprocessors 15 (480 shader cores)
  clock rate                1451MHz
I take https://docs.nvidia.com/cuda/pdf/CUD...river_NVCC.pdf page 23 to mean the CUDA 10 toolkit supports CC 3.0 and up.

Also https://en.wikipedia.org/wiki/CUDA#GPUs_supported.
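For reference, a minimal standalone version of the check described above might look like this (a sketch using only standard CUDA runtime API calls, not code taken from mfaktc; the output format is modeled on mfaktc's):

```cuda
// build (assuming nvcc is on PATH):  nvcc -o cudacheck cudacheck.cu
#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int driverVer = 0, runtimeVer = 0, count = 0;

    cudaDriverGetVersion(&driverVer);   // highest CUDA version the installed driver supports
    cudaRuntimeGetVersion(&runtimeVer); // version of the runtime library actually loaded
    printf("CUDA version info\n");
    printf("  binary compiled for CUDA  %d.%d\n",
           CUDART_VERSION / 1000, (CUDART_VERSION % 1000) / 10);
    printf("  CUDA runtime version      %d.%d\n",
           runtimeVer / 1000, (runtimeVer % 1000) / 10);
    printf("  CUDA driver version       %d.%d\n",
           driverVer / 1000, (driverVer % 1000) / 10);

    if (cudaGetDeviceCount(&count) != cudaSuccess)
        count = 0;
    // spin through devices from zero until there are none left
    for (int i = 0; i < count; i++) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, i) != cudaSuccess)
            break;
        printf("\nCUDA device %d info\n", i);
        printf("  name                      %s\n", prop.name);
        printf("  compute capability        %d.%d\n", prop.major, prop.minor);
        printf("  maximum threads per block %d\n", prop.maxThreadsPerBlock);
        printf("  number of multiprocessors %d\n", prop.multiProcessorCount);
        printf("  clock rate                %dMHz\n", prop.clockRate / 1000);
    }
    return 0;
}
```

If the three version numbers disagree (driver older than runtime, or runtime older than what the binary was built for), that mismatch is a prime suspect for error 8.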

Last fiddled with by kriesel on 2018-10-06 at 17:28
Old 2018-10-06, 19:19   #2922
ATH
Einyen

Dec 2003
Denmark

7·11·41 Posts

Yes, I installed the drivers that came with CUDA 10 and added the CUDA 10 dll files:
Code:
CUDA version info
  binary compiled for CUDA  10.0
  CUDA runtime version      10.0
  CUDA driver version       10.0
I also tried compiling with CUDA 9.2, which I still have installed, and copied the CUDA 9.2 dll files to the folder. Again it compiled without error, but gave the same error message when running:
Code:
CUDA version info
  binary compiled for CUDA  9.20
  CUDA runtime version      9.20
  CUDA driver version       10.0

I'll just wait for Oliver's CUDA 10 Windows binaries and test whether those work on my card. Otherwise the old binaries I have are good enough; I just wanted to compile them myself if I could.

Last fiddled with by ATH on 2018-10-06 at 19:22
Old 2018-10-07, 14:35   #2923
bayanne

"Tony Gott"
Aug 2002
Yell, Shetland, UK

2²×83 Posts

Quote:
Originally Posted by TheJudger View Post
Hello,

finally I was able to get my hands on a Turing (RTX 20x0 series) card. I was excited about this, and rightly so: Turing is a beast for mfaktc.

Unmodified mfaktc 0.21 sources (just adjusted the Makefile) + CUDA 10.0.130 on Linux:
Code:
# ./mfaktc.exe -tf 66362159 73 74
mfaktc v0.21 (64bit built)
[...]
CUDA device info
  name                      GeForce RTX 2080 Ti
  compute capability        7.5
  max threads per block     1024
  max shared memory per MP  65536 byte
  number of multiprocessors 68
  clock rate (CUDA cores)   1635MHz
  memory clock rate:        7000MHz
  memory bus width:         352 bit
[...]
got assignment: exp=66362159 bit_min=73 bit_max=74 (28.83 GHz-days)
Starting trial factoring M66362159 from 2^73 to 2^74 (28.83 GHz-days)
 k_min =  71160531149400
 k_max =  142321062305090
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Oct 05 22:12 |    0   0.1% |  0.630  10m04s |   4118.14    82485    n.a.%
Oct 05 22:12 |    4   0.2% |  0.563   8m59s |   4608.22    82485    n.a.%
Oct 05 22:12 |    9   0.3% |  0.562   8m58s |   4616.42    82485    n.a.%
[...]
Oct 05 22:21 | 4612  99.9% |  0.599   0m01s |   4331.27    82485    n.a.%
Oct 05 22:21 | 4617 100.0% |  0.600   0m00s |   4324.05    82485    n.a.%
no factor for M66362159 from 2^73 to 2^74 [mfaktc 0.21 barrett76_mul32_gs]
tf(): total time spent:  9m 30.800s
This is a Founders Edition card, starting from cold. Power draw is ~260W on average, so the card is limited by its power target. Temperature is a bit below 80°C and the average clock is about 1680MHz once the card is "hot".

New performance king, but slightly behind the Tesla V100 in terms of energy efficiency.

Oliver
Wow, look at the GHz-d/day figures!
Old 2018-10-07, 15:42   #2924
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

3×7×257 Posts

Quote:
Originally Posted by James Heinrich View Post
Seems to be hovering around US$1800 right now, which is getting close to double MSRP, due to high demand and low supply. Things should calm down in a little while when there's more supply.
Today's eBay listings have the RTX 2080 Ti ranging from $1200-1400 in ongoing auctions, and $1500 and up (way up) for buy it now.
Old 2018-10-07, 15:47   #2925
James Heinrich

"James Heinrich"
May 2004
ex-Northern Ontario

2·3·569 Posts

Quote:
Originally Posted by kriesel View Post
Today's eBay listings have the RTX 2080 Ti ranging from $1200-1400 in ongoing auctions, and $1500 and up (way up) for buy it now.
Buy It Now prices are useful; ongoing auction prices are mostly irrelevant. Most useful is the price of recently sold items, a version of which I use to update the price listing on my site.
Old 2018-10-07, 19:33   #2926
xx005fs

"Eric"
Jan 2018
USA

324₈ Posts

Quote:
Originally Posted by TheJudger View Post
Hello,

finally I was able to get my hands on a Turing (RTX 20x0 series) card. I was excited about this, and rightly so: Turing is a beast for mfaktc.

Unmodified mfaktc 0.21 sources (just adjusted the Makefile) + CUDA 10.0.130 on Linux:
Code:
# ./mfaktc.exe -tf 66362159 73 74
mfaktc v0.21 (64bit built)
[...]
CUDA device info
  name                      GeForce RTX 2080 Ti
  compute capability        7.5
  max threads per block     1024
  max shared memory per MP  65536 byte
  number of multiprocessors 68
  clock rate (CUDA cores)   1635MHz
  memory clock rate:        7000MHz
  memory bus width:         352 bit
[...]
got assignment: exp=66362159 bit_min=73 bit_max=74 (28.83 GHz-days)
Starting trial factoring M66362159 from 2^73 to 2^74 (28.83 GHz-days)
 k_min =  71160531149400
 k_max =  142321062305090
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Oct 05 22:12 |    0   0.1% |  0.630  10m04s |   4118.14    82485    n.a.%
Oct 05 22:12 |    4   0.2% |  0.563   8m59s |   4608.22    82485    n.a.%
Oct 05 22:12 |    9   0.3% |  0.562   8m58s |   4616.42    82485    n.a.%
[...]
Oct 05 22:21 | 4612  99.9% |  0.599   0m01s |   4331.27    82485    n.a.%
Oct 05 22:21 | 4617 100.0% |  0.600   0m00s |   4324.05    82485    n.a.%
no factor for M66362159 from 2^73 to 2^74 [mfaktc 0.21 barrett76_mul32_gs]
tf(): total time spent:  9m 30.800s
This is a Founders Edition card, starting from cold. Power draw is ~260W on average, so the card is limited by its power target. Temperature is a bit below 80°C and the average clock is about 1680MHz once the card is "hot".

New performance king, but slightly behind the Tesla V100 in terms of energy efficiency.

Oliver
This is honestly really impressive performance. For $1200 you get about the same trial-factoring throughput as a Titan V. Now I'm just hoping the next generation of cards brings an LL improvement as incredible as this Pascal-to-Turing speedup in trial factoring.
