mersenneforum.org > Great Internet Mersenne Prime Search > Software

2019-07-30, 23:39   #34
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×3²×7×43 Posts

Quote:
Originally Posted by dominicanpapi82 View Post
this is what I get (see attached)
Well, that's inconvenient (OpenCL error -5, out of resources), but we're getting somewhere in that the device is identified as an NVIDIA GeForce 940M (not an Intel IGP).
Maybe this is on Windows and Windows is detecting a slow response and interfering? See
https://www.mersenneforum.org/showpo...3&postcount=10 for symptoms and possible solutions; check your event logs.

https://www.techpowerup.com/gpu-spec...rce-940m.c2643 says the GeForce 940M supports CUDA compute capability 5.0. It should be able to run mfaktc. Nominally 146 GHzD/day TF per https://www.mersenne.ca/mfaktc.php, which is about 45% of a GTX 1050 Ti.
It should also be able to run CUDALucas. The 2 GB memory is small for CUDAPm1, but it might be usable up to p~140M. With sufficient system RAM, suitable prime95 settings, and patience, prime95 can do P-1 on exponents up to ~600M, or on some processors up to ~900M.

Last fiddled with by kriesel on 2019-07-30 at 23:50
2019-07-30, 23:57   #35
JuanTutors
Mar 2004
2²·3³·5 Posts

Quote:
Originally Posted by kriesel View Post
Well that's inconvenient. (opencl error -5 out of resources) but we're getting somewhere in that the device is identified, as an NVIDIA Geforce 940M. (Not an Intel igp.)
Maybe this is on Windows and Windows is detecting a slow response and interfering? See
https://www.mersenneforum.org/showpo...3&postcount=10 for symptoms and possible solutions; check your event logs.

https://www.techpowerup.com/gpu-spec...rce-940m.c2643 says the GeForce 940M is CUDA 5 capable. It should be able to run mfaktc. Nominally 146GhzD/day TF per https://www.mersenne.ca/mfaktc.php which is about 45% of a GTX 1050 Ti.
It should also be able to run CUDALucas. The 2GB memory is small for CUDAPm1, but it might be usable up to p~140M. With sufficient system ram, prime95 settings, and patience, prime95 can do P-1 to exponents up to ~600M or on some processors to ~900M.
I'll have to explore. I really don't know what to do at this point since my LL tests are supposed to run in 2 to 3 days. Right now an mfaktc executable is running but I'm pretty sure it's not using my GPU at all, as my task manager says my GPU is at 1% and prime95 is slowed by a corresponding amount. Also my laptop proc's temps are up from 65C to 80C so I might not be able to do anything about it on this computer. Would love to solve the problem on my desktop still, and probably need someone to do the factoring for me.

Last fiddled with by JuanTutors on 2019-07-30 at 23:58
2019-07-31, 00:44   #36
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2·3²·7·43 Posts

Quote:
Originally Posted by dominicanpapi82 View Post
I'll have to explore. I really don't know what to do at this point since my LL tests are supposed to run in 2 to 3 days. Right now an mfaktc executable is running but I'm pretty sure it's not using my GPU at all, as my task manager says my GPU is at 1% and prime95 is slowed by a corresponding amount. Also my laptop proc's temps are up from 65C to 80C so I might not be able to do anything about it on this computer. Would love to solve the problem on my desktop still, and probably need someone to do the factoring for me.
Is your mfaktc making reasonable headway for a 940M, showing ~146 GHzD/day?
Load and run GPU-Z (https://www.techpowerup.com/gpuz/) and see what it tells you about the utilization of various parts of your 940M. I suspect Windows 10 Task Manager is paying attention to the screen/video load rather than the GPU compute load, and should not be believed about GIMPS loads, at least for CUDA. Or maybe it's only paying attention to the display-designated IGP or the first possible display device.

For the system on which the attached GPU-Z images were captured, Win10 Task Manager claims the mfakto process for the IGP is utilizing the GPU 100% and the mfaktc process for the GTX 1050 Ti is utilizing 0% GPU. The GTX 1050 Ti in that laptop was set to be a non-display device; the display is run by the UHD 630 IGP to keep the 1050 Ti free for compute. GPU-Z shows both devices seeing a GPU load of ~100%.
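For NVIDIA cards, a scriptable cross-check of the same two numbers (compute load and temperature) is the nvidia-smi tool that ships with the NVIDIA driver. This is a swapped-in alternative to GPU-Z, not what was used for the screenshots, and it will not report an Intel IGP. A minimal Python sketch, assuming nvidia-smi is on the PATH; the helper names are made up for illustration, and some mobile GPUs report a bracketed placeholder instead of a number, which the sketch maps to None:

```python
import subprocess

# Fields requested from nvidia-smi; all three are standard --query-gpu properties.
QUERY = "name,utilization.gpu,temperature.gpu"


def to_int(field):
    """Return the field as an int, or None when nvidia-smi reports no number."""
    field = field.strip()
    return int(field) if field.isdigit() else None


def parse_gpu_line(line):
    """Split one CSV row into (name, load percent, temperature in C)."""
    name, load, temp = line.rsplit(",", 2)
    return name.strip(), to_int(load), to_int(temp)


def read_gpus():
    """Query nvidia-smi once and return one parsed tuple per NVIDIA GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_gpu_line(row) for row in out.splitlines() if row.strip()]
```

Calling read_gpus() while mfaktc is running should show a load near 100 on the card doing the work, whatever Task Manager claims.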
Attached Thumbnails: peregrine-tf.png

Last fiddled with by kriesel on 2019-07-31 at 00:47
2019-07-31, 03:36   #37
LaurV
Romulan Interpreter
Jun 2011
Thailand
7²×197 Posts

Quote:
Originally Posted by ATH View Post
Onboard gpus are pretty slow compared to dedicated gpus and I'm not sure if they run mfaktc
They do, if they are Nvidia.


My argument here is about the word "onboard", which means "soldered on the PCB" (printed circuit board), as opposed to what you (and other people) really mean: the internal GPU integrated into the processor. Many tiny laptops nowadays have onboard GPUs, which means that the VGA card, with the GPU and all the circuitry, is soldered on the main board (these laptops are single-board, or some "all in one" stuff, for reasons of price and miniaturization). If such a GPU supports Cuda, then yes, it will run mfaktc. This implicitly means it is produced by Nvidia, as Cuda is their proprietary thing and, as far as I know, it has not been licensed to any third party so far.

Talking about integrated GPUs (inside the CPU): no, they do NOT run Cuda, but many of them run OpenCL (which is the "Cuda from Apple": they started it, but it is not proprietary; it is an open standard now maintained by the Khronos Group, and anybody can put it into their silicon if they have the guts and the money), so, as already pointed out by the colleagues, one may be able to run mfakto on them. But they are not so fast.

There may be some confusion from the fact that Cuda-enabled hardware can run OpenCL (but lousy, slow, emulated, not native). However, not the other way around. You can see the OpenCL more like a "language", and Cuda more like a "technology".

Last fiddled with by LaurV on 2019-07-31 at 03:36
2019-07-31, 03:55   #38
LaurV
Romulan Interpreter
Jun 2011
Thailand
7²·197 Posts

@papi/@dcheuk (now I don't know anymore which of you posted what, sorry):

- a window appearing and disappearing fast is a sign of a missing library. Something like cufft_x_x.dll or cudart_x_x.dll may be missing. Can you run the same mfaktc or cudapm1 or whatever you were running, in the command prompt or in a batch file with a "pause" command at the end, and see what the error message is? Then the library can be downloaded from the same place you got the program.
- the GeForce 940M should be very well able to run mfaktc. If your CPU slows down and you do not see any occupancy of the GPU, maybe your real problem is that you are not using the -d switch properly? Try using "-d 0" or "-d 1" or "-d 2" on the command line (see the help of mfaktc when launched with the -h switch; launch it in a command prompt to avoid the window closing). This selects the right GPU to run mfaktc on.
- if this solves the problem, then add the right number of the card in the ini file (Device_something line, see the comments in the ini file). The command line takes priority over the ini file.

- the "...fft missing" error is due to the fact that the benchmark was never run. You have to run cudalucas or cudapm1 with the -cufftbench and then with the -threadbench switches. This will create two files, where the best (fastest) ffts for your GPU card and system are stored (they are different for each computer, as there are other things that matter besides the GPU type). Then, for future LL/PM1 jobs, cudapm1 or cudalucas will refer to these files to select the best/fastest FFT for the respective job. This is a must before starting long LL/PM1 jobs on a GPU. Not relevant for TF jobs.
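The device-number trial described in the list above can be written down as a tiny helper that prints the command lines to try, one per candidate device. A minimal Python sketch: only the -d switch comes from the post; the executable name is taken from a later post in this thread and may differ for your build, and the range 0 to 2 is simply the three values suggested above.

```python
def device_probe_commands(exe="mfaktc-win-64.exe", devices=(0, 1, 2)):
    """Return one argument list per candidate device number, using the -d switch."""
    return [[exe, "-d", str(n)] for n in devices]


# Print the command lines so they can be pasted into an open command prompt.
for cmd in device_probe_commands():
    print(" ".join(cmd))
```

Run each printed line from a command prompt that stays open, and note which device number makes mfaktc report the NVIDIA card rather than failing.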

Last fiddled with by LaurV on 2019-07-31 at 04:19
2019-07-31, 04:44   #39
JuanTutors
Mar 2004
2²·3³·5 Posts

Quote:
Originally Posted by kriesel View Post
Is your mfaktc making reasonable headway for a 940M, showing ~146GhzD/day?
Load and run GPU-Z. https://www.techpowerup.com/gpuz/ See what it tells you about the utilization of various parts of your 940M. I suspect Windows 10 Task Manager is paying attention to the screen/video load, on CUDA, not the gpu compute load and should not be believed about GIMPS loads, at least for CUDA. Or maybe it's only paying attention to the display-designated igp or first possible display device.

For the system on which the attached GPU-Z images were captured, Win10 Task Manager claims the mfakto process for the igp is utilizing gpu 100% and the mfaktc process for the GTX1050Ti is utilizing 0% gpu. The GTX1050Ti in that laptop was set to be a nondisplay device; the display is run by the UHD630 IGP to keep the 1050Ti free for compute. GPU-Z shows both devices are seeing gpu load ~100%.
@kriesel I did as you said, and it looks like the NVIDIA is in fact at 100%! I believe you asked about the GHz days/day, and it's been about 72 GHz days/day for the last few hours with multiple programs open. It was faster earlier today with my browser and Ps closed. Unfortunately the chip is at 80C, and I'm pretty sure that's causing Prime95 to throttle, so I'm probably going to stop at 2^78 and ask someone else if they are willing to do the factoring for me. If it takes a day to get to 2^78 at 80C, I'm not sure it's worth it. That sounds like it's too hot.
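To put the two rates side by side, here is a small Python sketch of the arithmetic. The 146 GHz-days/day figure is the nominal value quoted earlier in the thread and 72 is the rate observed above; the 100 GHz-days workload is a made-up number used only to show the conversion from work to elapsed time.

```python
NOMINAL_GHZD_PER_DAY = 146.0   # nominal 940M rate quoted earlier in the thread
OBSERVED_GHZD_PER_DAY = 72.0   # rate seen with other programs open


def fraction_of_nominal(observed, nominal):
    """Observed throughput as a fraction of the nominal figure."""
    return observed / nominal


def days_for(work_ghz_days, rate_ghzd_per_day):
    """Elapsed days needed to finish a given amount of work at a given rate."""
    return work_ghz_days / rate_ghzd_per_day


share = fraction_of_nominal(OBSERVED_GHZD_PER_DAY, NOMINAL_GHZD_PER_DAY)
print(f"observed rate is {share:.0%} of nominal")

hypothetical_work = 100.0  # GHz-days, illustrative only
print(f"{days_for(hypothetical_work, OBSERVED_GHZD_PER_DAY):.2f} days at the observed rate")
print(f"{days_for(hypothetical_work, NOMINAL_GHZD_PER_DAY):.2f} days at the nominal rate")
```

So the card is running at roughly half its nominal speed under these conditions, which doubles the wall-clock time for any given assignment.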

Quote:
Originally Posted by LaurV View Post
@papi/@dcheuk (now I don't know anymore which of you posted what, sorry):

- a window appearing and disappearing fast is a sign of a missing library. Something like cufft_x_x.dll or cudart_x_x.dll may be missing. Can you run the same mfaktc or cudapm1 or whatever you were running, in the command prompt or in a batch file with a "pause" command at the end, and see what the error message is? Then the library can be downloaded from the same place you got the program.
- the GeForce 940M should be very well able to run mfaktc. If your CPU slows down and you do not see any occupancy of the GPU, maybe your real problem is that you are not using the -d switch properly? Try using "-d 0" or "-d 1" or "-d 2" on the command line (see the help of mfaktc when launched with the -h switch; launch it in a command prompt to avoid the window closing). This selects the right GPU to run mfaktc on.
- if this solves the problem, then add the right number of the card in the ini file (Device_something line, see the comments in the ini file). The command line takes priority over the ini file.

- the "...fft missing" error is due to the fact that the benchmark was never run. You have to run cudalucas or cudapm1 with the -cufftbench and then with the -threadbench switches. This will create two files, where the best (fastest) ffts for your GPU card and system are stored (they are different for each computer, as there are other things that matter besides the GPU type). Then, for future LL/PM1 jobs, cudapm1 or cudalucas will refer to these files to select the best/fastest FFT for the respective job. This is a must before starting long LL/PM1 jobs on a GPU. Not relevant for TF jobs.
That's a lot of words for 12am ... Can I check in with you tomorrow? In the meantime, one thing I did on my laptop that I didn't do on my PC is run AMD-APP-SDKInstaller-v3.0.130.135-GA-windows-F-x86.exe . Not sure if that makes a difference.
2019-07-31, 06:40   #40
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×3²×7×43 Posts

Quote:
Originally Posted by LaurV View Post
They do, if they are Nvidia.


My argument here is about the word "onboard", which means "soldered on the PCB" (printed circuit board) -
That's apparently not what he has; an NVIDIA 940M is an MXM module with gold plated contacts for connector insertion. See the photo and data at https://www.techpowerup.com/gpu-spec...rce-940m.c2643 previously posted in #34. MXM is a socket specification. https://en.wikipedia.org/wiki/Mobile_PCI_Express_Module

Quote:
as opposite to what you (and other people) really mean: the internal GPU integrated into the processor....
Talking about integrated GPUs (inside of the CPU), no, they do NOT run Cuda, but many of them run OpenCL
which Intel calls an integrated graphics processor, or IGP.
Quote:
There may be some confusion from the fact that Cuda-enabled hardware can run OpenCL (but lousy, slow, emulated, not native).
Not really; Gpuowl 6.5 benchmarks on NVIDIA via OpenCL quite close to CUDALucas 2.06 via CUDA on the same gpu models and individual units, on Windows. (Won't run on ancient NVIDIA models, but runs on GTX1050Ti to 1080Ti as I recall. Seems to be related to opencl level supported by the ancients.)
2019-07-31, 06:54   #41
LaurV
Romulan Interpreter
Jun 2011
Thailand
7²×197 Posts

Quote:
Originally Posted by kriesel View Post
That's apparently not what he has
Who? ATH?
2019-07-31, 16:03   #42
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
12452₈ Posts

Quote:
Originally Posted by LaurV View Post
Who? ATH?
dominicanpapi82 aka JuanTutors
see the screen shot at https://www.mersenneforum.org/showpo...7&postcount=33
2019-07-31, 16:06   #43
kriesel
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×3²×7×43 Posts

Quote:
Originally Posted by LaurV View Post
@papi/@dcheuk (now I don't know anymore which of you posted what, sorry):

- window appearing and disappearing fast is a sign of
...something quickly fatally wrong, and the user neglecting to run via cmd /k commandline so any error message will stick around long enough to read. (/k = keep the command prompt window after termination) The error could be any of a number of things, like a mis-typed executable name, wrong dll for the exe, missing dll, permissions problem, device number mismatch, program bug that crashes CUDAPm1, gpu with a driver problem or that has gone to sleep from a thermal limit or turned off because of inadequate total system power, etc. Using a batch file with pause may work, but on some Windows versions batch files run with different permissions than the interactive session of the user who launches it. That permission difference can add confusion that's not needed. That difference can be handled with a shortcut that specifies explicitly what user to run it as. (see https://www.mersenneforum.org/showpo...3&postcount=10)
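Another way to keep a fleeting error message around is to launch the program from a small script that captures whatever it prints. This is a swapped-in alternative to cmd /k, not something either poster used. A minimal Python sketch: the helper name is made up, and the demo child process is the Python interpreter itself, printing a fake error, so the sketch runs anywhere; substitute the real executable and its arguments when trying it.

```python
import subprocess
import sys


def run_and_capture(args, timeout=60):
    """Run a program; return (exit code or None, combined stdout/stderr text)."""
    try:
        done = subprocess.run(
            args,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # fold error output into the same text
            text=True,
            timeout=timeout,
        )
    except FileNotFoundError as exc:
        # Covers a mistyped executable name or a wrong working directory.
        return None, f"could not start {args[0]!r}: {exc}"
    except subprocess.TimeoutExpired:
        # The program was still running, which for a GIMPS app is usually good news.
        return None, f"still running after {timeout} s (stopped by this script)"
    return done.returncode, done.stdout


# Demo: a child process that fails immediately with a message on stderr.
demo = [sys.executable, "-c", "import sys; sys.exit('missing library (demo)')"]
code, text = run_and_capture(demo)
print("exit code:", code)
print("output:", text.strip())
```

A nonzero exit code with no text at all is still useful information, since it rules out the cases that do print a message.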
Quote:
- the "...fft missing" error is due to the fact that the benchmark was never run. You have to run cudalucas or cudapm1 with the -cufftbench and then with the -threadbench switches. This will create two files, where the best (fastest) ffts for your GPU card and system are stored (they are different for each computer, as there are other things that matter besides the GPU type). Then, for future LL/PM1 jobs, cudapm1 or cudalucas will refer to these files to select the best/fastest FFT for the respective job. This is a must before starting long LL/PM1 jobs on a GPU. Not relevant for TF jobs.
CUDALucas has the -threadbench option, but CUDAPm1 has no -threadbench option. A CUDAPm1 threadbench is performed for a single fft length by specifying -cufftbench (fftlength) (fftlength) (repetitions) (mask). Repetitions and mask are optional. Same fftlength given twice is what tells CUDAPm1 to do a threadbench instead of an fftbench. The fft and threads files produced by CUDAPm1 will differ from those produced by CUDALucas for the same gpu and should not be used for CUDALucas. Nor should CUDALucas fft or threads files be used in CUDAPm1. CUDALucas should be run to produce its own fft and threads files. Keep these apps and related files in separate directories from each other. In my experience (based on lots of deep testing and benchmarking on numerous gpu models and software versions), the benchmark result files differ when any of the following differ:
  • Software application
  • application version
  • CUDA level of the application
  • GPU model
  • GPU unit
  • variations in other system activity, especially affecting the display gpu
  • one run to the next, everything else held constant (minor)
So set other things running in the case you want to tune for, such as prime95 running on the cpu, other gpus busy with their GIMPS work, no interactive use, then do your fft and threads benchmarking on a gpu at a time in that context that's reflective of running GIMPS work undisturbed. I saw no appreciable effect of widely varying CUDA driver version, in CUDALucas on a GTX480. However, I saw reductions in gpuowl performance on AMD of up to 5% from upgrading the Windows Adrenalin driver. Which CUDA level performs best on a given gpu can vary with the fft length as well as application etc.; performance can fluctuate several percent versus CUDA level, other things held constant. It's not always the latest that's fastest.

TF applications don't use ffts at all, so have no use for such files.
For CUDAPm1 threads benchmarking, I use batch files with for loops to spin cudapm1 through the list of fft lengths that it keeps in the fft file.
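The loop over fft lengths might look like the following, written here as a Python sketch that only builds the command lines rather than as the batch files mentioned above. The argument order (-cufftbench, the same length twice, then optional repetitions) is the one described in this post; the executable name and the example lengths are placeholders, and reading the lengths out of the fft file is left out because its format is not shown in this thread.

```python
def threadbench_commands(fft_lengths, exe="CUDAPm1.exe", repetitions=None):
    """Build one CUDAPm1 threadbench command per fft length.

    Giving the same length twice after -cufftbench is what requests a
    threadbench instead of an fftbench; repetitions is optional.
    """
    commands = []
    for length in fft_lengths:
        cmd = [exe, "-cufftbench", str(length), str(length)]
        if repetitions is not None:
            cmd.append(str(repetitions))
        commands.append(cmd)
    return commands


# Placeholder lengths; in practice take them from the fft file CUDAPm1 wrote.
for cmd in threadbench_commands([1024, 2048], repetitions=5):
    print(" ".join(cmd))
```

Printing the commands first makes it easy to review the list before committing the GPU to a long benchmark run.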
2019-07-31, 17:02   #44
JuanTutors
Mar 2004
1034₈ Posts

Sorry for not quoting the big message, and also for causing confusion about the name change. I cd'd to the folder containing mfaktc-0.21 and tried the following variations:
Code:
start mfaktc-win-64.exe /k
start mfaktc-win-64.exe /?
The same basic thing happened: a quick 1-frame pop up and go-away.

//EDIT: I should also say, I am not actually sure what video card I have on my desktop. The NVIDIA was on my laptop. I downloaded GPU-Z and the only option on the dropdown is Intel HD Graphics 630.

Last fiddled with by JuanTutors on 2019-07-31 at 17:11

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.