mersenneforum.org  

Old 2019-09-30, 03:08   #3213
petrw1
1976 Toyota Corona years forever!
 
"Wayne"
Nov 2006
Saskatchewan, Canada

2²×3×17×23 Posts

Quote:
Originally Posted by petrw1
Wow too bad I didn't see this months ago.
Throughput changed from 3,900 to 4,500 on my 2080Ti doing TF to 74 in the 4xM ranges. 15% improvement.

AND....running 8 cores of P-1 on the CPU does NOT slow down the GPU.
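The quoted 15% figure can be checked directly from the two throughput numbers. A minimal sketch; the units are whatever the post's throughput figures use, which the post does not state:

```python
def percent_gain(old: float, new: float) -> float:
    """Return the relative improvement of new over old, in percent."""
    return (new - old) / old * 100.0

before = 3900  # throughput before the change, as reported in the post
after = 4500   # throughput after the change, as reported in the post

gain = percent_gain(before, after)
print(f"{gain:.1f}% improvement")  # prints "15.4% improvement"
```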
Old 2019-10-01, 00:56   #3214
storm5510
Random Account
 
Aug 2009

2²×3×163 Posts

Quote:
Originally Posted by petrw1
AND....running 8 cores of P-1 on the CPU does NOT slow down the GPU.
In my experience, both flavors of mfaktc run faster if the CPU is busy. You can also raise the CPU's minimum speed in the power options: the default is 5%, and I set it to 60%. This helps too.

A side note: I never run my GPU at full throttle now. I use Afterburner to slow it down; 75% is typical. The performance loss is negligible, while power consumption and heat output are considerably lower: 70°C at full, 55°C at 75%.
Old 2019-10-04, 01:04   #3215
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2²·5·271 Posts
Anyone? Bueller?

Is there a CUDA 8 version Windows build of the 2047M gpusievesize mfaktc mod somewhere?
Old 2019-10-06, 16:54   #3216
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2²×5×271 Posts

Quote:
Originally Posted by kriesel
Is there a CUDA 8 version Windows build of the 2047M gpusievesize mfaktc mod somewhere?
nomead posted one on the RTX2080 thread at https://www.mersenneforum.org/showpo...&postcount=139. I've run and tuned it on my GTX1080Ti. Results are posted at https://www.mersenneforum.org/showpo...99&postcount=8. It looks like the GTX1080Ti would benefit slightly from an even further increase in GPUSieveSize. Whenever I get around to trying and tuning it on a GTX1060, those results will be added.
To try it on my older CC2.x gpus, it would require a CUDA6.5 Windows 7 X64 build.
Old 2019-10-06, 18:48   #3217
nomead
 
"Sam Laur"
Dec 2018
Turku, Finland

317 Posts

Quote:
Originally Posted by kriesel
To try it on my older CC2.x gpus, it would require a CUDA6.5 Windows 7 X64 build.
I took that as a hint and compiled it again under CUDA6.5 ... so, it complains, but still compiles:
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release
(Use -Wno-deprecated-gpu-targets to suppress warning).


I also managed to recompile under CUDA 8.0, but for compute 2.0. I didn't read the error messages correctly the last time around; it complained about 2.1 but not 2.0... go figure. Do you need that combination as well?

Anyway the attached binaries are for 64-bit CUDA 6.5, compute versions 20 30 35 37 50 and 52.
Attached Files
File Type: zip mfaktc-cuda65-2047.zip (764.4 KB, 107 views)
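
For readers unfamiliar with how a list of compute versions like "20 30 35 37 50 and 52" reaches the compiler: nvcc takes one `-gencode` option per target. The sketch below only builds those option strings. The Makefile actually used for the attached build is not shown in the post, so the exact flags (and whether PTX was also embedded) are an assumption.

```python
def gencode_flags(compute_versions):
    """Build one nvcc -gencode option per compute capability (device code only, no PTX)."""
    return [f"-gencode arch=compute_{cc},code=sm_{cc}" for cc in compute_versions]

# Compute versions listed in the post for the CUDA 6.5 build.
versions = [20, 30, 35, 37, 50, 52]
flags = gencode_flags(versions)
print(" ".join(flags))
```

Each resulting option has the form `-gencode arch=compute_20,code=sm_20`, which asks nvcc to emit device code for that one architecture.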
Old 2019-10-12, 19:32   #3218
Fan Ming
 
Oct 2019

5·19 Posts

Quote:
Originally Posted by MrRepunit
I modified class_needed: 10 has to be exponentiated instead of 2. I also added a 64 bit shortcut. I guess most of the 64 bit stuff can be used for Mersenne numbers, but it might need some changes or be missing some minor stuff. I think I removed some bit shift methods since they were not used for base 10.
Also a snippet from the readme:
- Removed Barrett and 72 bit kernels
- Removed Wagstaff related stuff
- Added 64 bit kernels
- Implemented repunit factorization (hardcoded)
- Improved performance compared to older version (0.18-repunit) by about 30%
- Notes
  - Compiling with the more-classes flag seems to be slightly faster, thus it is switched on
  - Not tested on Windows yet
  - GPU sieving utilizes 100% of the GPU, so 1 mfaktc instance is enough
  - GPU sieving makes the system response slow (tested on Ubuntu 14.04 64 bit with Geforce 460 GTS).
    Setting GPUSieveSize in mfaktc.ini to 8 or lower makes the system more responsive



I did not remove the git directory, so if anybody is interested in the single commits feel free to take a closer look. I also added the linux executable. Not sure if I can quickly provide a windows variant...
Hi!
I tried to compile this code with CUDA Toolkit v10.1 and Visual Studio 2019 on my Windows 10 computer, and the compilation succeeded. But when I ran the compiled executable on my other Windows 7 computer (the Windows 10 computer only has the CUDA 8 driver installed, so it can't run the binary), the exe exited with this error output:
Code:
CUDA version info
  binary compiled for CUDA  10.10
  CUDA runtime version      10.10
  CUDA driver version       10.10

CUDA device info
  name                      GeForce GTX 1660
  compute capability        7.5
  max threads per block     1024
  max shared memory per MP  65536 byte
  number of multiprocessors 22
  clock rate (CUDA cores)   1800MHz
  memory clock rate:        4001MHz
  memory bus width:         192 bit

Automatic parameters
  threads per grid          720896
  GPUSievePrimes (adjusted) 82486
  GPUsieve minimum exponent 1055144

running a simple selftest...
ERROR: cudaGetLastError() returned 98: invalid device function
It's really frustrating, because I know nothing about how to deal with this error.
Attached file contains Makefile.win I used (and modified) and the compiled executable.
Attached Files
File Type: zip mfaktc-0.21-repunit (1).zip (408.7 KB, 116 views)
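
One common cause of "invalid device function" is a binary that contains no device code usable on the GPU's compute capability. Whether that is what happened here is not confirmed by the thread. The sketch below encodes the general CUDA compatibility rules, assuming targets are given as (major, minor) tuples. The target lists are hypothetical, since the Makefile.win in the attachment is not reproduced in the post.

```python
def can_run(device_cc, sass_targets, ptx_targets):
    """Return True if a binary built for these targets has code usable on device_cc.

    All values are (major, minor) tuples.
    - Device code (sm_XY) runs only on GPUs with the same major version and minor >= Y.
    - PTX (compute_XY) can be JIT-compiled by the driver for any GPU with CC >= X.Y.
    """
    sass_ok = any(
        major == device_cc[0] and minor <= device_cc[1]
        for major, minor in sass_targets
    )
    ptx_ok = any(target <= device_cc for target in ptx_targets)
    return sass_ok or ptx_ok

# Hypothetical build targets, for illustration only.
sass_targets = [(3, 0), (3, 5), (5, 0), (5, 2), (6, 1)]
ptx_targets = []

gtx_1660 = (7, 5)  # compute capability reported in the log above
print(can_run(gtx_1660, sass_targets, ptx_targets))             # False
print(can_run(gtx_1660, sass_targets + [(7, 5)], ptx_targets))  # True
print(can_run(gtx_1660, sass_targets, [(6, 1)]))                # True
```

If the build has neither device code for major version 7 nor any embedded PTX, a CC 7.5 card has nothing to execute, which would be consistent with the error shown.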
Old 2019-10-12, 19:59   #3219
nomead
 
"Sam Laur"
Dec 2018
Turku, Finland

317 Posts

Quote:
Originally Posted by Fan Ming
Hi!
I tried to compile this code with CUDA Toolkit v10.1 and Visual Studio 2019 on my Windows 10 computer.
Back in February when I set up my Windows compile environment for mfaktc, I apparently ran into the same problem, or something very similar. Compilation on VS2017 worked, but the resulting binary exited with the error
ERROR: cudaGetLastError() returned 8: invalid device function
So almost the same, but not quite. I used a Windows 7 computer for both compiling and running, but that shouldn't make a difference (then again, who knows?). Anyway, compiling on VS2012 worked, and then the binary also started working.

This was for the plain unmodified mfaktc 0.21 though, but maybe the code is still similar enough so that the errors are the same. No idea why it happens...
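
On the 8 versus 98 difference: to the best of my knowledge, CUDA 10.1 renumbered the runtime API error codes, so `cudaErrorInvalidDeviceFunction` is 8 in earlier toolkits and 98 from 10.1 onward. If so, both numbers name the same error. The lookup below is a minimal sketch of that assumption and should be checked against the `cudaError_t` enum of the toolkit actually in use.

```python
# Assumed numeric values of cudaErrorInvalidDeviceFunction by runtime generation.
INVALID_DEVICE_FUNCTION_CODES = {
    8: "CUDA runtimes before 10.1",
    98: "CUDA runtime 10.1 and later",
}

def explain(code):
    """Describe a numeric code if it matches a known value of the error."""
    era = INVALID_DEVICE_FUNCTION_CODES.get(code)
    if era is None:
        return f"{code}: not a known value of cudaErrorInvalidDeviceFunction"
    return f"{code}: cudaErrorInvalidDeviceFunction ({era})"

for code in (8, 98):
    print(explain(code))
```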
Old 2019-10-13, 05:20   #3220
Fan Ming
 
Oct 2019

5×19 Posts

Quote:
Originally Posted by nomead
This was for the plain unmodified mfaktc 0.21 though, but maybe the code is still similar enough so that the errors are the same. No idea why it happens...
That's frustrating... It may be a problem related to Makefile.win or cl.exe in VS, but I know nothing about them (I only learned the names of these things by Googling yesterday).

Can anyone test compiling in a Windows environment?
Old 2019-10-19, 11:13   #3221
Fan Ming
 
Oct 2019

5·19 Posts

The attached file is the CUDA 10.1 build of mfaktc (repunit version) for 64-bit Windows.
I compiled it using the CUDA 10.1 toolkit and Microsoft VS 2012 on Windows 10.
All selftest cases failed on my GTX 1660 GPU (CC 7.5, a Turing card) with the following output:
Code:
CUDA version info
  binary compiled for CUDA  10.10
  CUDA runtime version      10.10
  CUDA driver version       10.10
CUDA device info
  name                      GeForce GTX 1660
  compute capability        7.5
  max threads per block     1024
  max shared memory per MP  65536 byte
  number of multiprocessors 22
  clock rate (CUDA cores)   1800MHz
  memory clock rate:        4001MHz
  memory bus width:         192 bit
Automatic parameters
  threads per grid          720896
  GPUSievePrimes (adjusted) 82486
  GPUsieve minimum exponent 1055144
running a simple selftest...
ERROR: selftest failed for R2866499
  no factor found
ERROR: selftest failed for R2866499
  no factor found
ERROR: selftest failed for R2866499
  no factor found
ERROR: selftest failed for R2866499
  no factor found
ERROR: selftest failed for R2866499
  no factor found
ERROR: selftest failed for R2866499
  no factor found
ERROR: selftest failed for R1729921
  no factor found
ERROR: selftest failed for R1729921
  no factor found
ERROR: selftest failed for R1729921
  no factor found
ERROR: selftest failed for R1729921
  no factor found
ERROR: selftest failed for R1729921
  no factor found
ERROR: selftest failed for R1729921
  no factor found
ERROR: selftest failed for R2482747
  no factor found
ERROR: selftest failed for R2482747
  no factor found
ERROR: selftest failed for R2482747
  no factor found
ERROR: selftest failed for R2482747
  no factor found
ERROR: selftest failed for R2482747
  no factor found
ERROR: selftest failed for R2482747
  no factor found
ERROR: selftest failed for R403433
  no factor found
ERROR: selftest failed for R403433
  no factor found
ERROR: selftest failed for R422057
  no factor found
ERROR: selftest failed for R422057
  no factor found
ERROR: selftest failed for R554959
  no factor found
ERROR: selftest failed for R554959
  no factor found
ERROR: selftest failed for R575173
  no factor found
ERROR: selftest failed for R444113
  no factor found
ERROR: selftest failed for R442487
  no factor found
Selftest statistics
  number of tests           27
  successfull tests         0
  no factor found           27
selftest FAILED!
  random selftest offset was: 29242
However, the compiled executable seems to work well on some GPUs.
My compiled mmff seems to work well on my GPU, and I haven't tested compiling the original mfaktc version. I don't know why this happens :(
Can anyone test it to see if it works well? Thanks!
Attached Files
File Type: zip mfaktc-0.21-repunit (2).zip (339.1 KB, 116 views)

Last fiddled with by Fan Ming on 2019-10-19 at 11:26
Old 2019-10-19, 12:08   #3222
mnd9
 
Jun 2019
Boston, MA

478 Posts

Sorry to derail the thread but random question:

Are mfaktc and mfakto checkpoint files cross compatible or not?
Old 2019-10-19, 14:44   #3223
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2²×5×271 Posts

Quote:
Originally Posted by mnd9
Sorry to derail the thread but random question:

Are mfaktc and mfakto checkpoint files cross compatible or not?
Unlikely. Mfaktc isn't even cross compatible with itself, v0.20 vs. v0.21, or less-classes vs. more-classes.
See also points 38 and 39 in https://www.mersenneforum.org/showpo...23&postcount=6
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.