mersenneforum.org  

Old 2019-12-04, 00:52   #221
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

7,823 Posts

Quote:
Originally Posted by Prime95
A bit bummed. Today I plugged my Windows box into Kill-A-Watt.

Best I can tell Kill-A-Watt shows gpuowl drawing an extra 240 watts over an idling Radeon VII. Wattman shows the card drawing only 190 watts.

The box has a Platinum power supply.

Anyone else observing such a discrepancy between Wattman readings and at-the-wall power draw?
That is disappointing. How fully loaded is the supply?
https://www.corsair.com/us/en/blog/8...-benefit-to-me
Where do you find the gpu power draw in WattMan?
Here, GPU-Z reports the Radeon VII at 170 W average while running one gpuowl v6.11 PRP; meanwhile the power meter at the plug read 667 W.
With the GPU idle, GPU-Z reports 20 W and the plug meter reads 456 W system input.
Ratio of deltas: (170-20)/(667-456) = 150/211 = 0.71,
on a Lenovo ThinkStation D30 with a power supply labeled 80 Plus Gold, 1120 W.
Of the 456 W, CPUID HWMonitor claims the two Xeon E5-2697 v2 packages are drawing 108 W and 115 W respectively;
there are also 2 hard drives, a NIC, and the usual ancillaries;
GPU-Z reports the RX 550 in the system is pulling 29 W running mfakto.

Perhaps gpu-z or Wattman is giving watts after the on-gpu-card regulation?
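
For anyone wanting to repeat the comparison with their own readings, here is a minimal sketch of the ratio-of-deltas arithmetic. The function name `delta_ratio` is made up for illustration; the inputs are the four readings quoted above (software-reported GPU watts and wall-meter watts, each under load and at idle).

```python
def delta_ratio(gpu_load_w, gpu_idle_w, wall_load_w, wall_idle_w):
    """Fraction of the extra at-the-wall power that the GPU sensor accounts for."""
    gpu_delta = gpu_load_w - gpu_idle_w      # change reported by GPU-Z / Wattman
    wall_delta = wall_load_w - wall_idle_w   # change seen by the plug meter
    if wall_delta <= 0:
        raise ValueError("wall power under load must exceed idle wall power")
    return gpu_delta / wall_delta


# Readings from this post: 170 W / 20 W at GPU-Z, 667 W / 456 W at the plug.
ratio = delta_ratio(170, 20, 667, 456)
print(f"{ratio:.2f}")  # prints 0.71
```

A ratio well below 1 only says the software figure and the wall figure disagree; it does not by itself say whether the gap is PSU loss, on-card regulation, or extra CPU and fan load.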

Last fiddled with by kriesel on 2019-12-04 at 01:01
Old 2019-12-04, 01:04   #222
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

2057₁₆ Posts

I also think Wattman and GPU-Z are under-reporting the GPU watts. I assume they are getting the value from the AMD driver.
Old 2019-12-04, 01:14   #223
mrh
 
"mrh"
Oct 2018
Temecula, ca

2⁴×3² Posts

I wonder how much is used just spinning the fans?
Old 2019-12-04, 01:26   #224
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

17217₈ Posts

Quote:
Originally Posted by Prime95
I also think Wattman and GPU-Z are under-reporting the GPU watts. I assume they are getting the value from the AMD driver.
Wattman on my installation is definitely underreporting GPU power; it's not reporting it at all.
https://www.amd.com/en/support/kb/faq/dh-020 shows "Power: 22 watts" as the last entry, in blue, of the histogram, but no such item appears in my histogram for either the Radeon VII (6 sensors) or the RX 550 (5 sensors), with Radeon Software version 19.20 on Windows 10 Pro.

Last fiddled with by kriesel on 2019-12-04 at 01:27
Old 2019-12-04, 07:44   #225
xx005fs
 
"Eric"
Jan 2018
USA

223 Posts

Quote:
Originally Posted by Prime95
I also think Wattman and GPU-Z are under-reporting the GPU watts. I assume they are getting the value from the AMD driver.
AMD monitors the power of the GPU package itself and does not account for VRM losses and the like. (For Vega, at least, the HBM power should be included, but for GDDR GPUs such as the 5700 XT and the Polaris series, the GDDR power is NOT included in the GPU power reported by software.) I don't recall any tool that can report total board power for AMD GPUs the way Nvidia's does, but I generally assume the VRMs are 85% efficient and calculate the total power draw from there, which seems relatively accurate.
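
As a rough sketch of that rule of thumb: divide the software-reported package power by the assumed VRM efficiency. The function name `estimate_board_power` is hypothetical, and the 0.85 default is just the assumption stated above, not a measured value.

```python
def estimate_board_power(reported_w, vrm_efficiency=0.85):
    """Estimate total board power from the software-reported package power.

    vrm_efficiency is an assumed figure (85% by default), not a measurement.
    """
    if not 0 < vrm_efficiency <= 1:
        raise ValueError("vrm_efficiency must be in the range (0, 1]")
    return reported_w / vrm_efficiency


# 170 W reported by GPU-Z for a Radeon VII earlier in the thread.
print(round(estimate_board_power(170)))  # prints 200
```

This covers only the VRM loss. On GDDR cards the memory power would have to be added separately, and PSU loss still sits between the board figure and what a wall meter shows.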

On the other hand, Nvidia does measure total board power through a shunt resistor. I really hope AMD finds a way to implement such measurements, so that software alone gives an accurate power reading without needing to buy a current clamp or a Kill-A-Watt.
Old 2019-12-05, 05:03   #226
dcheuk
 
Jan 2019
Florida

35 Posts

Hello guys. I just got my two Radeon VII cards and also had to swap power cables, since they both use 2x8-pin connectors instead of the 8+6-pin on the 2080s. I knew it was a smart decision to buy a 1200 W PSU last year.

Anyway, I cannot get past this error and am almost pulling my hair out after hours of trial and error lol.

Code:
2019-12-04 22:55:34 config.txt: 
2019-12-04 22:55:34 config.txt: -use NO_ASM
2019-12-04 22:55:34 config.txt: -device 0
2019-12-04 22:55:34 91076731 FFT 5120K: Width 256x4, Height 64x4, Middle 10; 17.37 bits/word
2019-12-04 22:55:34 using short carry kernels
2019-12-04 22:55:34 OpenCL args "-DEXP=91076731u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=10u -DWEIGHT_STEP=0xc.5e1a319820cbp-3 -DIWEIGHT_STEP=0xa.59819c05a66dp-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DNO_ASM=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-12-04 22:55:34 OpenCL compilation error -11 (args -DEXP=91076731u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=10u -DWEIGHT_STEP=0xc.5e1a319820cbp-3 -DIWEIGHT_STEP=0xa.59819c05a66dp-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DNO_ASM=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0)
2019-12-04 22:55:35 "C:\Users\DANNYC~1\AppData\Local\Temp\OCL24EF.tmp.cl", line 12: warning: 
          OpenCL extension is now part of core
  #pragma OPENCL EXTENSION cl_khr_fp64 : enable
                           ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL24EF.tmp.cl", line 1428: error: 
          invalid type conversion
      atomic_store_explicit((atomic_uint *) &ready[gr], 1, memory_order_release, memory_scope_device); 
                            ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL24EF.tmp.cl", line 1437: error: 
          invalid type conversion
      while(!atomic_load_explicit((atomic_uint *) &ready[gr - 1], memory_order_acquire, memory_scope_device));
                                  ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL24EF.tmp.cl", line 1496: error: 
          invalid type conversion
      atomic_store_explicit((atomic_uint *) &ready[gr], 1, memory_order_release, memory_scope_device); 
                            ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL24EF.tmp.cl", line 1504: error: 
          invalid type conversion
      while(!atomic_load_explicit((atomic_uint *) &ready[gr - 1], memory_order_acquire, memory_scope_device));
                                  ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL24EF.tmp.cl", line 1525: error: 
          argument of type "const __global double2 *" is incompatible with
          parameter of type "const double2 *"
    transpose(WIDTH, BIG_HEIGHT, lds, in, out);
                                      ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL24EF.tmp.cl", line 1525: error: 
          argument of type "__global double2 *" is incompatible with parameter
          of type "double2 *"
    transpose(WIDTH, BIG_HEIGHT, lds, in, out);
                                          ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL24EF.tmp.cl", line 1530: error: 
          argument of type "const __global double2 *" is incompatible with
          parameter of type "const double2 *"
    transpose(BIG_HEIGHT, WIDTH, lds, in, out);
                                      ^

"C:\Us2019-12-04 22:55:35 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:216 build
2019-12-04 22:55:35 Bye
2019-12-04 22:56:36 config.txt: -carry short -fft +0 -use ORIG_X2
2019-12-04 22:56:36 config.txt: -use NO_ASM
2019-12-04 22:56:36 config.txt: -device 0
2019-12-04 22:56:36 91076731 FFT 5120K: Width 256x4, Height 64x4, Middle 10; 17.37 bits/word
2019-12-04 22:56:36 using short carry kernels
2019-12-04 22:56:36 OpenCL args "-DEXP=91076731u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=10u -DWEIGHT_STEP=0xc.5e1a319820cbp-3 -DIWEIGHT_STEP=0xa.59819c05a66dp-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DORIG_X2=1 -DNO_ASM=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2019-12-04 22:56:36 OpenCL compilation error -11 (args -DEXP=91076731u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=10u -DWEIGHT_STEP=0xc.5e1a319820cbp-3 -DIWEIGHT_STEP=0xa.59819c05a66dp-4 -DWEIGHT_BIGSTEP=0x9.837f0518db8a8p-3 -DIWEIGHT_BIGSTEP=0xd.744fccad69d68p-4 -DORIG_X2=1 -DNO_ASM=1  -I. -cl-fast-relaxed-math -cl-std=CL2.0)
2019-12-04 22:56:36 "C:\Users\DANNYC~1\AppData\Local\Temp\OCL179C.tmp.cl", line 12: warning: 
          OpenCL extension is now part of core
  #pragma OPENCL EXTENSION cl_khr_fp64 : enable
                           ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL179C.tmp.cl", line 1428: error: 
          invalid type conversion
      atomic_store_explicit((atomic_uint *) &ready[gr], 1, memory_order_release, memory_scope_device); 
                            ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL179C.tmp.cl", line 1437: error: 
          invalid type conversion
      while(!atomic_load_explicit((atomic_uint *) &ready[gr - 1], memory_order_acquire, memory_scope_device));
                                  ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL179C.tmp.cl", line 1496: error: 
          invalid type conversion
      atomic_store_explicit((atomic_uint *) &ready[gr], 1, memory_order_release, memory_scope_device); 
                            ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL179C.tmp.cl", line 1504: error: 
          invalid type conversion
      while(!atomic_load_explicit((atomic_uint *) &ready[gr - 1], memory_order_acquire, memory_scope_device));
                                  ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL179C.tmp.cl", line 1525: error: 
          argument of type "const __global double2 *" is incompatible with
          parameter of type "const double2 *"
    transpose(WIDTH, BIG_HEIGHT, lds, in, out);
                                      ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL179C.tmp.cl", line 1525: error: 
          argument of type "__global double2 *" is incompatible with parameter
          of type "double2 *"
    transpose(WIDTH, BIG_HEIGHT, lds, in, out);
                                          ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL179C.tmp.cl", line 1530: error: 
          argument of type "const __global double2 *" is incompatible with
          parameter of type "const double2 *"
    transpose(BIG_HEIGHT, WIDTH, lds, in, out);
                                      ^

"C:\Us2019-12-04 22:56:36 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:216 build
2019-12-04 22:56:36 Bye
I got a couple of other errors too. I was able to get around this one by using -device 0:
Code:
Exception gpu_error: DEVICE_NOT_FOUND clGetDeviceIDs(platforms[i], kind, 64, devices, &n) at clwrap.cpp:69 getDeviceIDs
and this one with -use NO_ASM:
Code:
2019-12-04 23:02:16 "C:\Users\DANNYC~1\AppData\Local\Temp\OCL46E1.tmp.cl", line 12: warning: 
          OpenCL extension is now part of core
  #pragma OPENCL EXTENSION cl_khr_fp64 : enable
                           ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL46E1.tmp.cl", line 197: error: an
          "asm" declaration is not allowed here
    X2(u[0], u[2]);
    ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL46E1.tmp.cl", line 197: error: an
          "asm" declaration is not allowed here
    X2(u[0], u[2]);
    ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL46E1.tmp.cl", line 198: error: an
          "asm" declaration is not allowed here
    X2_mul_t4(u[1], u[3]);
    ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL46E1.tmp.cl", line 198: error: an
          "asm" declaration is not allowed here
    X2_mul_t4(u[1], u[3]);
    ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL46E1.tmp.cl", line 199: error: an
          "asm" declaration is not allowed here
    X2(u[0], u[1]);
    ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL46E1.tmp.cl", line 199: error: an
          "asm" declaration is not allowed here
    X2(u[0], u[1]);
    ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL46E1.tmp.cl", line 200: error: an
          "asm" declaration is not allowed here
    X2(u[2], u[3]);
    ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL46E1.tmp.cl", line 200: error: an
          "asm" declaration is not allowed here
    X2(u[2], u[3]);
    ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL46E1.tmp.cl", line 266: error: an
          "asm" declaration is not allowed here
    X2(u[0], u[4]);
    ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL46E1.tmp.cl", line 266: error: an
          "asm" declaration is not allowed here
    X2(u[0], u[4]);
    ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL46E1.tmp.cl", line 267: error: an
          "asm" declaration is not allowed here
    X2(u[1], u[5]);   u[5] = mul_t8(u[5]);
    ^

"C:\Users\DANNYC~1\AppData\Local\Temp\OCL46E1.tmp.cl", line 267: error: an
          "asm" declaration is not allowed here
    X2(u[1], u[5]);   u[5] = mul_t8(u[2019-12-04 23:02:16 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:216 build
2019-12-04 23:02:16 Bye
Thanks.
Old 2019-12-05, 08:20   #227
preda
 
"Mihai Preda"
Apr 2015

10110101100₂ Posts

Quote:
Originally Posted by dcheuk
Hello guys. I just got my two Radeon VII cards and also had to swap power cables, since they both use 2x8-pin connectors instead of the 8+6-pin on the 2080s. I knew it was a smart decision to buy a 1200 W PSU last year.

Anyway, I cannot get past this error and am almost pulling my hair out after hours of trial and error lol.
We'd need a bit more info:
- what OS (Linux or Windows)? what GPU driver (e.g. ROCm, amdgpu-pro, etc)?
- how many "devices" does clinfo report? (maybe attach the clinfo output)
- "gpuowl -h": what devices does it report (towards the end)
Old 2019-12-05, 08:51   #228
M344587487
 
"Composite as Heck"
Oct 2017

2×5²×19 Posts

As a debugging step try just one card plugged in at a time. Then try uninstalling everything nvidia, then try reinstalling everything AMD and OpenCL. Then try a fresh OS.

Last fiddled with by M344587487 on 2019-12-05 at 08:52
Old 2019-12-05, 09:52   #229
dcheuk
 
Jan 2019
Florida

35 Posts

Quote:
Originally Posted by preda
We'd need a bit more info:
- what OS (Linux or Windows)? what GPU driver (e.g. ROCm, amdgpu-pro, etc)?
- how many "devices" does clinfo report? (maybe attach the clinfo output)
- "gpuowl -h": what devices does it report (towards the end)
Oh, that's right. Sorry, epic fail.

Windows 10 64 bit.
Radeon Adrenalin 2019 19.9.2.
OpenCL API 2.0

gpuowl -h reports the following
Code:
-device <N>        : select a specific device:
 0 : Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz-8x3600-@0:0.0
Clinfo (from oblomov/clinfo on github, hopefully it's clean) returned:
Code:
Number of platforms:                             1
  Platform Profile:                              FULL_PROFILE
  Platform Version:                              OpenCL 2.1 AMD-APP (2906.10)
  Platform Name:                                 AMD Accelerated Parallel Processing
  Platform Vendor:                               Advanced Micro Devices, Inc.
  Platform Extensions:                           cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices


  Platform Name:                                 AMD Accelerated Parallel Processing
Number of devices:                               2
  Device Type:                                   CL_DEVICE_TYPE_GPU
  Vendor ID:                                     1002h
  Board name:                                    AMD Radeon VII
  Device Topology:                               PCI[ B#3, D#0, F#0 ]
  Max compute units:                             60
  Max work items dimensions:                     3
    Max work items[0]:                           1024
    Max work items[1]:                           1024
    Max work items[2]:                           1024
  Max work group size:                           256
  Preferred vector width char:                   4
  Preferred vector width short:                  2
  Preferred vector width int:                    1
  Preferred vector width long:                   1
  Preferred vector width float:                  1
  Preferred vector width double:                 1
  Native vector width char:                      4
  Native vector width short:                     2
  Native vector width int:                       1
  Native vector width long:                      1
  Native vector width float:                     1
  Native vector width double:                    1
  Max clock frequency:                           1801Mhz
  Address bits:                                  64
  Max memory allocation:                         4244635648
  Image support:                                 Yes
  Max number of images read arguments:           128
  Max number of images write arguments:          64
  Max image 2D width:                            16384
  Max image 2D height:                           16384
  Max image 3D width:                            2048
  Max image 3D height:                           2048
  Max image 3D depth:                            2048
  Max samplers within kernel:                    16
  Max size of kernel argument:                   1024
  Alignment (bits) of base address:              2048
  Minimum alignment (bytes) for any datatype:    128
  Single precision floating point capability
    Denorms:                                     No
    Quiet NaNs:                                  Yes
    Round to nearest even:                       Yes
    Round to zero:                               Yes
    Round to +ve and infinity:                   Yes
    IEEE754-2008 fused multiply-add:             Yes
  Cache type:                                    Read/Write
  Cache line size:                               64
  Cache size:                                    16384
  Global memory size:                            16978542592
  Constant buffer size:                          4244635648
  Max number of constant args:                   8
  Local memory type:                             Scratchpad
  Local memory size:                             32768
  Max pipe arguments:                            16
  Max pipe active reservations:                  16
  Max pipe packet size:                          4244635648
  Max global variable size:                      3820172032
  Max global variable preferred total size:      16978542592
  Max read/write image args:                     64
  Max on device events:                          1024
  Queue on device max size:                      8388608
  Max on device queues:                          1
  Queue on device preferred size:                262144
  SVM capabilities:
    Coarse grain buffer:                         Yes
    Fine grain buffer:                           Yes
    Fine grain system:                           No
    Atomics:                                     No
  Preferred platform atomic alignment:           0
  Preferred global atomic alignment:             0
  Preferred local atomic alignment:              0
  Kernel Preferred work group size multiple:     64
  Error correction support:                      0
  Unified memory for Host and Device:            0
  Profiling timer resolution:                    1
  Device endianess:                              Little
  Available:                                     Yes
  Compiler available:                            Yes
  Execution capabilities:
    Execute OpenCL kernels:                      Yes
    Execute native function:                     No
  Queue on Host properties:
    Out-of-Order:                                No
    Profiling :                                  Yes
  Queue on Device properties:
    Out-of-Order:                                Yes
    Profiling :                                  Yes
  Platform ID:                                   00007FFD97789FD0
  Name:                                          gfx906
  Vendor:                                        Advanced Micro Devices, Inc.
  Device OpenCL C version:                       OpenCL C 2.0
  Driver version:                                2906.10 (PAL,HSAIL)
  Profile:                                       FULL_PROFILE
  Version:                                       OpenCL 2.0 AMD-APP (2906.10)
  Extensions:                                    cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_khr_gl_depth_images cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_amd_liquid_flash cl_amd_copy_buffer_p2p cl_amd_planar_yuv


  Device Type:                                   CL_DEVICE_TYPE_GPU
  Vendor ID:                                     1002h
  Board name:                                    AMD Radeon VII
  Device Topology:                               PCI[ B#6, D#0, F#0 ]
  Max compute units:                             60
  Max work items dimensions:                     3
    Max work items[0]:                           1024
    Max work items[1]:                           1024
    Max work items[2]:                           1024
  Max work group size:                           256
  Preferred vector width char:                   4
  Preferred vector width short:                  2
  Preferred vector width int:                    1
  Preferred vector width long:                   1
  Preferred vector width float:                  1
  Preferred vector width double:                 1
  Native vector width char:                      4
  Native vector width short:                     2
  Native vector width int:                       1
  Native vector width long:                      1
  Native vector width float:                     1
  Native vector width double:                    1
  Max clock frequency:                           1801Mhz
  Address bits:                                  64
  Max memory allocation:                         4244635648
  Image support:                                 Yes
  Max number of images read arguments:           128
  Max number of images write arguments:          64
  Max image 2D width:                            16384
  Max image 2D height:                           16384
  Max image 3D width:                            2048
  Max image 3D height:                           2048
  Max image 3D depth:                            2048
  Max samplers within kernel:                    16
  Max size of kernel argument:                   1024
  Alignment (bits) of base address:              2048
  Minimum alignment (bytes) for any datatype:    128
  Single precision floating point capability
    Denorms:                                     No
    Quiet NaNs:                                  Yes
    Round to nearest even:                       Yes
    Round to zero:                               Yes
    Round to +ve and infinity:                   Yes
    IEEE754-2008 fused multiply-add:             Yes
  Cache type:                                    Read/Write
  Cache line size:                               64
  Cache size:                                    16384
  Global memory size:                            16978542592
  Constant buffer size:                          4244635648
  Max number of constant args:                   8
  Local memory type:                             Scratchpad
  Local memory size:                             32768
  Max pipe arguments:                            16
  Max pipe active reservations:                  16
  Max pipe packet size:                          4244635648
  Max global variable size:                      3820172032
  Max global variable preferred total size:      16978542592
  Max read/write image args:                     64
  Max on device events:                          1024
  Queue on device max size:                      8388608
  Max on device queues:                          1
  Queue on device preferred size:                262144
  SVM capabilities:
    Coarse grain buffer:                         Yes
    Fine grain buffer:                           Yes
    Fine grain system:                           No
    Atomics:                                     No
  Preferred platform atomic alignment:           0
  Preferred global atomic alignment:             0
  Preferred local atomic alignment:              0
  Kernel Preferred work group size multiple:     64
  Error correction support:                      0
  Unified memory for Host and Device:            0
  Profiling timer resolution:                    1
  Device endianess:                              Little
  Available:                                     Yes
  Compiler available:                            Yes
  Execution capabilities:
    Execute OpenCL kernels:                      Yes
    Execute native function:                     No
  Queue on Host properties:
    Out-of-Order:                                No
    Profiling :                                  Yes
  Queue on Device properties:
    Out-of-Order:                                Yes
    Profiling :                                  Yes
  Platform ID:                                   00007FFD97789FD0
  Name:                                          gfx906
  Vendor:                                        Advanced Micro Devices, Inc.
  Device OpenCL C version:                       OpenCL C 2.0
  Driver version:                                2906.10 (PAL,HSAIL)
  Profile:                                       FULL_PROFILE
  Version:                                       OpenCL 2.0 AMD-APP (2906.10)
  Extensions:                                    cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_khr_gl_depth_images cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_amd_liquid_flash cl_amd_copy_buffer_p2p cl_amd_planar_yuv
I guess that means OpenCL cannot find either GPU. However, mfakto was able to find the GPU, except that -d doesn't seem to work.

Quote:
Originally Posted by M344587487
As a debugging step try just one card plugged in at a time. Then try uninstalling everything nvidia, then try reinstalling everything AMD and OpenCL. Then try a fresh OS.
Uninstalled all nvidia stuff. I would prefer not to refresh the OS lol.
Old 2019-12-05, 10:06   #230
preda
 
"Mihai Preda"
Apr 2015

2²·3·11² Posts

It seems that GpuOwl only sees the CPU as an OpenCL device, while clinfo only sees the two GPUs. I don't know why; maybe it has something to do with the OpenCL library that each one links, perhaps static vs. dynamic lib. Somebody with a similar setup may know more.
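
To make the mismatch easier to see at a glance, here is a minimal sketch that pulls the device type and board name out of saved clinfo text. It assumes the "Device Type:" and "Board name:" labels exactly as they appear in the clinfo output quoted in this thread; the function name `summarize_clinfo` is made up, and devices that print no "Board name:" line (a CPU device might not) would simply be skipped.

```python
def summarize_clinfo(text):
    """Return a list of (device_type, board_name) pairs found in clinfo text."""
    devices = []
    current_type = None
    for line in text.splitlines():
        # Split on the first colon only, so values such as "PCI[ B#3, ... ]" survive.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Device Type":
            current_type = value
        elif key == "Board name" and current_type is not None:
            devices.append((current_type, value))
            current_type = None
    return devices


# A few lines copied from the clinfo output posted in this thread.
sample = """\
Number of devices:                               2
  Device Type:                                   CL_DEVICE_TYPE_GPU
  Vendor ID:                                     1002h
  Board name:                                    AMD Radeon VII
  Device Type:                                   CL_DEVICE_TYPE_GPU
  Vendor ID:                                     1002h
  Board name:                                    AMD Radeon VII
"""

for device_type, board_name in summarize_clinfo(sample):
    print(device_type, board_name)
```

Run against the full clinfo dump, this lists two CL_DEVICE_TYPE_GPU entries and no CPU entry, which is the opposite of what "gpuowl -h" shows.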

Quote:
Originally Posted by dcheuk
Oh, that's right. Sorry, epic fail.

Windows 10 64 bit.
Radeon Adrenalin 2019 19.9.2.
OpenCL API 2.0

gpuowl -h reports the following
Code:
-device <N>        : select a specific device:
 0 : Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz-8x3600-@0:0.0
Clinfo (from oblomov/clinfo on github, hopefully it's clean) returned:
Code:
Number of platforms:                             1
  Platform Profile:                              FULL_PROFILE
  Platform Version:                              OpenCL 2.1 AMD-APP (2906.10)
  Platform Name:                                 AMD Accelerated Parallel Processing
  Platform Vendor:                               Advanced Micro Devices, Inc.
  Platform Extensions:                           cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices


  Platform Name:                                 AMD Accelerated Parallel Processing
Number of devices:                               2
  Device Type:                                   CL_DEVICE_TYPE_GPU
  Vendor ID:                                     1002h
  Board name:                                    AMD Radeon VII
  Device Topology:                               PCI[ B#3, D#0, F#0 ]
  Max compute units:                             60
  Max work items dimensions:                     3
    Max work items[0]:                           1024
    Max work items[1]:                           1024
    Max work items[2]:                           1024
  Max work group size:                           256
  Preferred vector width char:                   4
  Preferred vector width short:                  2
  Preferred vector width int:                    1
  Preferred vector width long:                   1
  Preferred vector width float:                  1
  Preferred vector width double:                 1
  Native vector width char:                      4
  Native vector width short:                     2
  Native vector width int:                       1
  Native vector width long:                      1
  Native vector width float:                     1
  Native vector width double:                    1
  Max clock frequency:                           1801Mhz
  Address bits:                                  64
  Max memory allocation:                         4244635648
  Image support:                                 Yes
  Max number of images read arguments:           128
  Max number of images write arguments:          64
  Max image 2D width:                            16384
  Max image 2D height:                           16384
  Max image 3D width:                            2048
  Max image 3D height:                           2048
  Max image 3D depth:                            2048
  Max samplers within kernel:                    16
  Max size of kernel argument:                   1024
  Alignment (bits) of base address:              2048
  Minimum alignment (bytes) for any datatype:    128
  Single precision floating point capability
    Denorms:                                     No
    Quiet NaNs:                                  Yes
    Round to nearest even:                       Yes
    Round to zero:                               Yes
    Round to +ve and infinity:                   Yes
    IEEE754-2008 fused multiply-add:             Yes
  Cache type:                                    Read/Write
  Cache line size:                               64
  Cache size:                                    16384
  Global memory size:                            16978542592
  Constant buffer size:                          4244635648
  Max number of constant args:                   8
  Local memory type:                             Scratchpad
  Local memory size:                             32768
  Max pipe arguments:                            16
  Max pipe active reservations:                  16
  Max pipe packet size:                          4244635648
  Max global variable size:                      3820172032
  Max global variable preferred total size:      16978542592
  Max read/write image args:                     64
  Max on device events:                          1024
  Queue on device max size:                      8388608
  Max on device queues:                          1
  Queue on device preferred size:                262144
  SVM capabilities:
    Coarse grain buffer:                         Yes
    Fine grain buffer:                           Yes
    Fine grain system:                           No
    Atomics:                                     No
  Preferred platform atomic alignment:           0
  Preferred global atomic alignment:             0
  Preferred local atomic alignment:              0
  Kernel Preferred work group size multiple:     64
  Error correction support:                      0
  Unified memory for Host and Device:            0
  Profiling timer resolution:                    1
  Device endianess:                              Little
  Available:                                     Yes
  Compiler available:                            Yes
  Execution capabilities:
    Execute OpenCL kernels:                      Yes
    Execute native function:                     No
  Queue on Host properties:
    Out-of-Order:                                No
    Profiling :                                  Yes
  Queue on Device properties:
    Out-of-Order:                                Yes
    Profiling :                                  Yes
  Platform ID:                                   00007FFD97789FD0
  Name:                                          gfx906
  Vendor:                                        Advanced Micro Devices, Inc.
  Device OpenCL C version:                       OpenCL C 2.0
  Driver version:                                2906.10 (PAL,HSAIL)
  Profile:                                       FULL_PROFILE
  Version:                                       OpenCL 2.0 AMD-APP (2906.10)
  Extensions:                                    cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_khr_gl_depth_images cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_amd_liquid_flash cl_amd_copy_buffer_p2p cl_amd_planar_yuv


  Device Type:                                   CL_DEVICE_TYPE_GPU
  Vendor ID:                                     1002h
  Board name:                                    AMD Radeon VII
  Device Topology:                               PCI[ B#6, D#0, F#0 ]
  Max compute units:                             60
  Max work items dimensions:                     3
    Max work items[0]:                           1024
    Max work items[1]:                           1024
    Max work items[2]:                           1024
  Max work group size:                           256
  Preferred vector width char:                   4
  Preferred vector width short:                  2
  Preferred vector width int:                    1
  Preferred vector width long:                   1
  Preferred vector width float:                  1
  Preferred vector width double:                 1
  Native vector width char:                      4
  Native vector width short:                     2
  Native vector width int:                       1
  Native vector width long:                      1
  Native vector width float:                     1
  Native vector width double:                    1
  Max clock frequency:                           1801Mhz
  Address bits:                                  64
  Max memory allocation:                         4244635648
  Image support:                                 Yes
  Max number of images read arguments:           128
  Max number of images write arguments:          64
  Max image 2D width:                            16384
  Max image 2D height:                           16384
  Max image 3D width:                            2048
  Max image 3D height:                           2048
  Max image 3D depth:                            2048
  Max samplers within kernel:                    16
  Max size of kernel argument:                   1024
  Alignment (bits) of base address:              2048
  Minimum alignment (bytes) for any datatype:    128
  Single precision floating point capability
    Denorms:                                     No
    Quiet NaNs:                                  Yes
    Round to nearest even:                       Yes
    Round to zero:                               Yes
    Round to +ve and infinity:                   Yes
    IEEE754-2008 fused multiply-add:             Yes
  Cache type:                                    Read/Write
  Cache line size:                               64
  Cache size:                                    16384
  Global memory size:                            16978542592
  Constant buffer size:                          4244635648
  Max number of constant args:                   8
  Local memory type:                             Scratchpad
  Local memory size:                             32768
  Max pipe arguments:                            16
  Max pipe active reservations:                  16
  Max pipe packet size:                          4244635648
  Max global variable size:                      3820172032
  Max global variable preferred total size:      16978542592
  Max read/write image args:                     64
  Max on device events:                          1024
  Queue on device max size:                      8388608
  Max on device queues:                          1
  Queue on device preferred size:                262144
  SVM capabilities:
    Coarse grain buffer:                         Yes
    Fine grain buffer:                           Yes
    Fine grain system:                           No
    Atomics:                                     No
  Preferred platform atomic alignment:           0
  Preferred global atomic alignment:             0
  Preferred local atomic alignment:              0
  Kernel Preferred work group size multiple:     64
  Error correction support:                      0
  Unified memory for Host and Device:            0
  Profiling timer resolution:                    1
  Device endianess:                              Little
  Available:                                     Yes
  Compiler available:                            Yes
  Execution capabilities:
    Execute OpenCL kernels:                      Yes
    Execute native function:                     No
  Queue on Host properties:
    Out-of-Order:                                No
    Profiling :                                  Yes
  Queue on Device properties:
    Out-of-Order:                                Yes
    Profiling :                                  Yes
  Platform ID:                                   00007FFD97789FD0
  Name:                                          gfx906
  Vendor:                                        Advanced Micro Devices, Inc.
  Device OpenCL C version:                       OpenCL C 2.0
  Driver version:                                2906.10 (PAL,HSAIL)
  Profile:                                       FULL_PROFILE
  Version:                                       OpenCL 2.0 AMD-APP (2906.10)
  Extensions:                                    cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_khr_gl_depth_images cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_amd_liquid_flash cl_amd_copy_buffer_p2p cl_amd_planar_yuv
I guess that means OpenCL cannot find either GPU. However, mfakto was able to find the GPU, except that -d doesn't seem to work.



Uninstalled all NVIDIA stuff. I would prefer not to refresh the OS lol.
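
When comparing what different tools see, it can help to boil a long clinfo dump like the one above down to a few fields per device. The following is a minimal sketch, not part of GpuOwl, mfakto, or clinfo: the function names `parse_clinfo` and `summarize` are hypothetical, and it assumes the AMD-style "key: value" layout shown above, where each device block starts with a "Device Type" line. The sample text is copied from the dump.

```python
import re

GIB = 1024 ** 3  # bytes per GiB

# A few lines copied from the clinfo dump above (second device).
SAMPLE = """\
  Device Type:                                   CL_DEVICE_TYPE_GPU
  Vendor ID:                                     1002h
  Board name:                                    AMD Radeon VII
  Device Topology:                               PCI[ B#6, D#0, F#0 ]
  Max compute units:                             60
  Max memory allocation:                         4244635648
  Global memory size:                            16978542592
  Name:                                          gfx906
"""

# "key: value" with at least one space after the colon; section headers
# such as "SVM capabilities:" have no value and are skipped.
LINE_RE = re.compile(r"^\s*([^:]+?)\s*:\s+(\S.*)$")


def parse_clinfo(text):
    """Split clinfo text into one dict per device.

    Assumption: a "Device Type" line marks the start of a new device.
    Keys that repeat inside a device (e.g. "Profiling") keep only the
    last value, which is fine for the fields summarized below.
    """
    devices = []
    current = None
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        key, value = match.group(1), match.group(2).strip()
        if key == "Device Type":
            current = {}
            devices.append(current)
        if current is not None:
            current[key] = value
    return devices


def summarize(device):
    """Pick out the fields most useful when checking device detection."""
    return {
        "board": device.get("Board name", "?"),
        "is_gpu": "GPU" in device.get("Device Type", ""),
        "compute_units": int(device["Max compute units"]),
        "global_mem_gib": int(device["Global memory size"]) / GIB,
        "max_alloc_gib": int(device["Max memory allocation"]) / GIB,
    }


devices = parse_clinfo(SAMPLE)
summary = summarize(devices[0])
print(summary)
```

For the values in the dump, 16978542592 bytes is 15.8125 GiB of global memory, and the maximum single allocation of 4244635648 bytes is exactly one quarter of that. Running the same summary on the full output of each tool would show at a glance which tool reports a GPU device and which does not.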
Old 2019-12-05, 10:11   #231
dcheuk
 
Jan 2019
Florida

35 Posts
Default

Quote:
Originally Posted by preda View Post
It seems that GpuOwl only sees the CPU as an OpenCL device, while clinfo only sees the two GPUs. I don't know why; maybe it has something to do with the OpenCL library that each one links, perhaps static vs. dynamic. Somebody with a similar setup may know more.
Well, thanks for the suggestion and help. Tomorrow I'll poke around, and if nothing works I'll pull one card out, as M344587487 suggested, and see what happens.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.