#12
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2⁴·3·163 Posts
Quote:
Does GpuOwl confirm, by probing the device for identifying characteristics at program start or when a result is generated, that the GPU actually selected by device number is the one intended?

I ran into the issue of changing device numbers, described below, on Windows with CUDALucas. It seems to me that some version of the issue might also occur on Linux, and in applications other than CUDALucas. I've proposed verifying that certain device characteristics match as an approach. Otherwise results may get misattributed to a different GPU name and physical GPU, or be the result of running alternately on multiple physical GPUs, possibly without the user's knowledge.

In multiple-GPU systems, an NVIDIA driver timeout, thermal limits, or a combination of the two may cause a device to disappear from the device count, even though Windows Device Manager still shows it, and an already running instance of GPU-Z lists it and can display its parameters but not its sensor readings. When one GPU drops out, the mapping from device number to physical GPU changes, without user action or knowledge. That means the specific device numbers written into application ini files change meaning. As a result:

- User action or batch wrappers may restart a run on a different device than intended.
- Tests intended for a specified GPU may run for a time on a different GPU.
- The logging and results land in the expected directory and file, which helps mask the occurrence.
- If the models or speeds are the same, the switch may go undetected.

I have observed the remap cause two applications to run on the same GPU at the same time, even though their ini files or batch files specify separate GPU device numbers. The remap may affect the execution timing of two sessions sharing one GPU. It may also cause a restarted session to fail, either because it requires more resources than are available on a dissimilar card, or because it requests a device number higher than the reduced Windows GPU count allows. In the latter case CUDALucas and CUDAPm1 generate the error message:

    device_number >= device_count ... exiting (This is probably a driver problem)

Confirming unique device characteristics could allow greater confidence in execution. Depending on system configuration, the GPU BIOS string, the model name, or the combination of the two may or may not be unique enough for device confirmation, but they are relatively permanent. The PCI Express bus and ID number combination is, I think, certain to be unique, but it only identifies the GPU at its current location, which may change as possibilities for resolving thermal issues get explored. These parameters also have the advantage that they can be easily obtained through utilities such as GPU-Z. Another identifier that has been proposed is the UUID, available at least in 64-bit Windows.

The CUDALucas v2.06beta 64-bit May 5 2017 build outputs, can log, and could be modified to check at least the following:

    CUDALucas v2.06beta 64-bit build, compiled May 5 2017 @ 13:00:15
    binary compiled for CUDA 6.50
    CUDA runtime version 6.50
    CUDA driver version 8.0
    ------- DEVICE 0 -------
    name                 GeForce GTX 1060 3GB
    UUID                 GPU-5e2c5531-4684-57ec-6393-8b762f286c70
    ECC Support?         Disabled
    Compatibility        6.1
    clockRate (MHz)      1771
    memClockRate (MHz)   4004
    totalGlobalMem       3221225472
    totalConstMem        65536
    l2CacheSize          1572864
    sharedMemPerBlock    49152
    regsPerBlock         65536
    warpSize             32
    memPitch             2147483647
    maxThreadsPerBlock   1024
    maxThreadsPerMP      2048
    multiProcessorCount  9
    maxThreadsDim[3]     1024,1024,64
    maxGridSize[3]       2147483647,65535,65535
    textureAlignment     512
    deviceOverlap        1
    pciDeviceID          0
    pciBusID             40

A manufacturer serial number would seem ideal, but at least for some cards it apparently cannot be queried. It costs more to put that in a ROM somewhere, so presumably it is not done for consumer-grade GPUs. https://superuser.com/questions/4692...he-case#469220
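The device check proposed in this post could be sketched roughly as follows. This is a minimal Python illustration, not code from CUDALucas or GpuOwl. The helper names `parse_gpu_listing` and `resolve_device_index` are hypothetical, and the sample listings only imitate the CSV rows that `nvidia-smi --query-gpu=index,name,uuid,pci.bus_id --format=csv,noheader` prints. The GTX 1070 row and its UUID are invented for the example. Note also that the index reported by `nvidia-smi` need not equal the CUDA device number unless `CUDA_DEVICE_ORDER=PCI_BUS_ID` is set. The idea is that a wrapper stores the UUID rather than the device number, and looks up whichever index currently carries that UUID.

```python
import csv
import io
from typing import Dict, NamedTuple


class Gpu(NamedTuple):
    index: int
    name: str
    uuid: str
    pci_bus_id: str


def parse_gpu_listing(text: str) -> Dict[str, Gpu]:
    """Parse 'index, name, uuid, pci.bus_id' CSV rows into a dict keyed by UUID."""
    gpus = {}
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:  # skip blank lines
            continue
        index, name, uuid, bus = (field.strip() for field in row)
        gpus[uuid] = Gpu(int(index), name, uuid, bus)
    return gpus


def resolve_device_index(listing: str, expected_uuid: str, expected_name: str) -> int:
    """Return the current index of the expected GPU, or raise if it cannot be confirmed."""
    gpu = parse_gpu_listing(listing).get(expected_uuid)
    if gpu is None:
        raise LookupError(f"GPU {expected_uuid} is not in the current device list")
    if gpu.name != expected_name:
        raise LookupError(f"UUID matches but name is {gpu.name!r}, expected {expected_name!r}")
    return gpu.index


EXPECTED_UUID = "GPU-5e2c5531-4684-57ec-6393-8b762f286c70"
EXPECTED_NAME = "GeForce GTX 1060 3GB"

# Hypothetical listing with both cards present (the first row is invented).
LISTING_BEFORE = """\
0, GeForce GTX 1070, GPU-00000000-0000-0000-0000-000000000001, 00000000:03:00.0
1, GeForce GTX 1060 3GB, GPU-5e2c5531-4684-57ec-6393-8b762f286c70, 00000000:28:00.0
"""

# Hypothetical listing after the first card has dropped out of the device count.
LISTING_AFTER = """\
0, GeForce GTX 1060 3GB, GPU-5e2c5531-4684-57ec-6393-8b762f286c70, 00000000:28:00.0
"""

print(resolve_device_index(LISTING_BEFORE, EXPECTED_UUID, EXPECTED_NAME))  # 1
print(resolve_device_index(LISTING_AFTER, EXPECTED_UUID, EXPECTED_NAME))   # 0
```

In the second listing a stored device number of 1 would be out of range, which corresponds to the `device_number >= device_count` case, while the UUID lookup still finds the intended card at index 0.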

#13
"Mihai Preda"
Apr 2015
2²·3·11² Posts
Right now gpuOwL does not attempt to fill in the GPU name or id automatically. By default it produces *no* UID:, but if the user specifies -uid foo/bar on the command line, it will just use that string (UID: foo/bar) as given, without validation or transformation.

To prevent user error, the only safeguard at the moment is logging some basic info about the card at startup, e.g. "44x1080MHz Hawaii", but that's all.

First, I don't know exactly how to get a better ID of the card using OpenCL. If such an ID could be obtained, I would at least print it on startup as well.

The second point is: should the software generate UID: automatically? The software still needs the user name from the user. So maybe the hardware id could be filled in automatically, but I don't know exactly what info would be good to put there. Would "Hawaii-44x1080" automatically be a better string than what the user inputs, e.g. I put "390x"?

I'm open to improving things in this area (it's not something difficult), but it's not clear to me yet what the solution is.
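One way the automatic hardware id discussed here might look is sketched below in Python. It is not gpuOwL code. The functions `hardware_tag` and `make_uid_line`, the optional `label` argument, and the user name `someuser` are all hypothetical. In an OpenCL program the three inputs would presumably come from `clGetDeviceInfo` with `CL_DEVICE_NAME`, `CL_DEVICE_MAX_COMPUTE_UNITS` and `CL_DEVICE_MAX_CLOCK_FREQUENCY`; here they are passed in as plain values so the sketch runs without a GPU.

```python
import re
from typing import Optional


def hardware_tag(device_name: str, compute_units: int, max_clock_mhz: int) -> str:
    """Build a tag such as 'Hawaii-44x1080' from basic device properties."""
    # Replace runs of characters other than letters, digits, '.' and '_' with '-'
    # so that the tag stays a single token.
    clean = re.sub(r"[^A-Za-z0-9._]+", "-", device_name.strip()).strip("-")
    return f"{clean}-{compute_units}x{max_clock_mhz}"


def make_uid_line(user: str, device_name: str, compute_units: int,
                  max_clock_mhz: int, label: Optional[str] = None) -> str:
    """Combine the user-supplied name (and optional label) with the generated tag."""
    tag = hardware_tag(device_name, compute_units, max_clock_mhz)
    if label:
        # Keep the user's own label, e.g. '390x', in front of the generated part.
        tag = f"{label}-{tag}"
    return f"UID: {user}/{tag}"


print(make_uid_line("someuser", "Hawaii", 44, 1080))
print(make_uid_line("someuser", "Hawaii", 44, 1080, label="390x"))
```

Keeping the user's label next to the generated tag is only one possible design. It preserves the familiar short name while adding a part the user cannot mistype, but the tag alone does not distinguish two identical cards in one system.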

#14
"Forget I exist"
Jul 2009
Dartmouth NS
20415₈ Posts
Quote:
Quote:
Last fiddled with by science_man_88 on 2017-08-13 at 13:30

#15
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2⁴×3×163 Posts
Quote:
Last fiddled with by kriesel on 2017-08-14 at 04:51

#16
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1111010010000₂ Posts
On Linux or Windows you can retrieve a UUID per GPU, which is highly likely, though not guaranteed, to be unique. It is not invariant over time for a particular piece of hardware. For example, if a GPU is removed from one system and installed in another, that may result in a different UUID for the same GPU. OS upgrades or reinstalls, and driver upgrades or reinstalls, are other occurrences that may create new UUIDs for the same hardware. https://en.wikipedia.org/wiki/Univer...que_identifier
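If the UUID can change for the same hardware, as described here, a check might fall back to more permanent properties when the UUID no longer matches. The following Python sketch illustrates that idea under stated assumptions: `match_device`, the verdict strings, and the record layout are hypothetical, the field names are borrowed from the CUDALucas device dump earlier in the thread, and the second and third UUIDs are invented.

```python
from typing import Dict, List, Optional, Tuple

Device = Dict[str, object]


def match_device(stored: Device, current: List[Device]) -> Tuple[str, Optional[Device]]:
    """Compare a stored device record against the devices currently visible.

    Returns a verdict and the matching device, if one can be singled out:
      'exact'     - the UUID matches
      'probable'  - the UUID differs, but exactly one device has the same name and memory size
      'ambiguous' - several devices share the name and memory size
      'missing'   - nothing resembles the stored record
    """
    for dev in current:
        if dev["uuid"] == stored["uuid"]:
            return "exact", dev
    candidates = [
        dev for dev in current
        if dev["name"] == stored["name"]
        and dev["totalGlobalMem"] == stored["totalGlobalMem"]
    ]
    if len(candidates) == 1:
        return "probable", candidates[0]
    if candidates:
        return "ambiguous", None
    return "missing", None


stored = {
    "uuid": "GPU-5e2c5531-4684-57ec-6393-8b762f286c70",
    "name": "GeForce GTX 1060 3GB",
    "totalGlobalMem": 3221225472,
}

# Same card, but reporting an invented new UUID (e.g. after a reinstall).
changed = dict(stored, uuid="GPU-00000000-0000-0000-0000-0000000000aa")
# A second, identical card with another invented UUID.
twin = dict(stored, uuid="GPU-00000000-0000-0000-0000-0000000000bb")

print(match_device(stored, [stored])[0])         # exact
print(match_device(stored, [changed])[0])        # probable
print(match_device(stored, [changed, twin])[0])  # ambiguous
print(match_device(stored, [])[0])               # missing
```

A program could proceed silently on `exact`, log a warning on `probable`, and refuse to start on `ambiguous` or `missing`. Where to draw those lines is a policy choice that this sketch does not settle.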