Old 2021-09-13, 16:45   #9
kriesel
 
Intel i7-1165G7/Iris Xe

This is an 11th-generation, laptop-oriented CPU and IGP combination. Testing was performed on a single sample with 16 GB RAM in a Dell Inspiron 3501.



lsgpu output is:
Code:
lsgpu, derived/modified from https://gist.github.com/CptFoobar/bcb513d87e574e69c2db
1 Platform found.

Platform 0
1 Device: Intel(R) Iris(R) Xe Graphics
  1.1 Vendor: Intel(R) Corporation
  1.2 Type: CL_DEVICE_TYPE_GPU
  1.3 Hardware version: OpenCL 3.0 NEO
  1.4 Software version: 27.20.100.9365
  1.5 OpenCL version: OpenCL C 1.2
  1.6 Little Endian: Yes
  1.7 Max Clock frequency: 1300 MHz
  1.8 Image support available: Yes
  1.9 Parallel compute units: 96
  1.10 OpenCL Device Availability: Yes
  1.11 OpenCL Compiler Availability: Yes
  1.12 OpenCL Linker Availability: Yes
Mfakto:
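Complete failure so far on this IGP on Windows. I tried my usual configuration with mfakto v0.15pre7; then v0.15pre6 with the usual configuration (GPUType auto); then VectorSize=1 with GPUType set to Auto, Intel, and GCN in turn. I suspect the issue is that OpenCL v3.0 is not recognized or not handled by mfakto. The build output below also reports "Unrecognized build options: -O3" immediately before the kernel build fails, which may be the more direct cause.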
The IGP driver is 27.20.100.9365 DCH/Win10 64 (per GPU-Z v2.40.0)
Code:
mfakto 0.15pre6-Win (64bit build)


Runtime options
  Inifile                   mfakto.ini
  Verbosity                 1
  SieveOnGPU                yes
  MoreClasses               yes
  GPUSievePrimes            81157
  GPUSieveProcessSize       24Ki bits
  GPUSieveSize              96Mi bits
  FlushInterval             0
  WorkFile                  worktodo.txt
  ResultsFile               results.txt
  Checkpoints               enabled
  CheckpointDelay           300s
  Stages                    enabled
  StopAfterFactor           class
  PrintMode                 full
  V5UserID                  kriesel
  ComputerID                martinella-IrisXeIGP
  TimeStampInResults        yes
  VectorSize                1
WARNING: Unknown setting "Intel" for GPUType, using default (AUTO)
  GPUType                   AUTO
  SmallExp                  no
  UseBinfile                mfakto_Kernels.elf
Compiletime options

Select device - Get device info:
WARNING: Unknown GPU name, assuming GCN. Please post the device name "Intel(R) Iris(R) Xe Graphics (Intel(R) Corporation)" to http://www.mersenneforum.org/showthread.php?t=15646 to have it added to mfakto. Set GPUType in mfakto.ini to select a GPU type yourself to avoid this warning.
WARNING: VectorSize=1 is known to fail on AMD GPUs and drivers. If the selftest fails, please increase VectorSize to 2 at least. See http://devgurus.amd.com/thread/167571 for latest news about this issue.
OpenCL device info
  name                      Intel(R) Iris(R) Xe Graphics (Intel(R) Corporation)
  device (driver) version   OpenCL 3.0 NEO  (27.20.100.9365)
  maximum threads per block 256
  maximum threads per grid  16777216
  number of multiprocessors 96 (6144 compute elements)
  clock rate                1300MHz

Automatic parameters
  threads per grid          0
  optimizing kernels for    GCN

Compiling kernels.
 
    BUILD OUTPUT

Unrecognized build options: -O3
     END OF BUILD OUTPUT
ERROR: load_kernels(0) failed
mfakto -d 01 --CLtest output:
Code:
mfakto -d 01 --CLtest
mfakto 0.15pre6-Win (64bit build)


Runtime options
  Inifile                   mfakto.ini
  Verbosity                 1
  SieveOnGPU                yes
  MoreClasses               yes
  GPUSievePrimes            81157
  GPUSieveProcessSize       24Ki bits
  GPUSieveSize              96Mi bits
  FlushInterval             0
  WorkFile                  worktodo.txt
  ResultsFile               results.txt
  Checkpoints               enabled
  CheckpointDelay           300s
  Stages                    enabled
  StopAfterFactor           class
  PrintMode                 full
  V5UserID                  kriesel
  ComputerID                martinella-IrisXeIGP
  TimeStampInResults        yes
  VectorSize                1
  GPUType                   GCN
  SmallExp                  no
  UseBinfile                mfakto_Kernels.elf
OpenCL Platform 1/1: Intel(R) Corporation, Version: OpenCL 3.0
Error: No platform found
Error -32 (Invalid platform): clCreateContextFromType(GPU)
Error -34 (Invalid context): clGetContextInfo(CL_CONTEXT_NUM_DEVICES) - assuming one device
Error -34 (Invalid context): clGetContextInfo(numdevs)
Error: Out of memory.
Error -34 (Invalid context): clGetContextInfo(devices)
mfakto -d 11 --CLtest output:
Code:
mfakto -d 11 --CLtest
mfakto 0.15pre6-Win (64bit build)


Runtime options
  Inifile                   mfakto.ini
  Verbosity                 1
  SieveOnGPU                yes
  MoreClasses               yes
  GPUSievePrimes            81157
  GPUSieveProcessSize       24Ki bits
  GPUSieveSize              96Mi bits
  FlushInterval             0
  WorkFile                  worktodo.txt
  ResultsFile               results.txt
  Checkpoints               enabled
  CheckpointDelay           300s
  Stages                    enabled
  StopAfterFactor           class
  PrintMode                 full
  V5UserID                  kriesel
  ComputerID                martinella-IrisXeIGP
  TimeStampInResults        yes
  VectorSize                1
  GPUType                   GCN
  SmallExp                  no
  UseBinfile                mfakto_Kernels.elf
OpenCL Platform 1/1: Intel(R) Corporation, Version: OpenCL 3.0
Device 1/1: Intel(R) Iris(R) Xe Graphics (Intel(R) Corporation),
device version: OpenCL 3.0 NEO , driver version: 27.20.100.9365
Extensions: cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_icd cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_intel_command_queue_families cl_intel_subgroups cl_intel_required_subgroup_size cl_intel_subgroups_short cl_khr_spir cl_intel_accelerator cl_intel_driver_diagnostics cl_khr_priority_hints cl_khr_throttle_hints cl_khr_create_command_queue cl_intel_subgroups_char cl_intel_subgroups_long cl_khr_il_program cl_intel_mem_force_host_memory cl_khr_subgroup_extended_types cl_khr_subgroup_non_uniform_vote cl_khr_subgroup_ballot cl_khr_subgroup_non_uniform_arithmetic cl_khr_subgroup_shuffle cl_khr_subgroup_shuffle_relative cl_khr_subgroup_clustered_reduce cl_intel_spirv_media_block_io cl_intel_spirv_subgroups cl_khr_spirv_no_integer_wrap_decoration cl_intel_unified_shared_memory_preview cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_intel_planar_yuv cl_intel_packed_yuv cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_image2d_from_buffer cl_khr_depth_images cl_khr_3d_image_writes cl_intel_media_block_io cl_khr_gl_sharing cl_khr_gl_depth_images cl_khr_gl_event cl_khr_gl_msaa_sharing cl_intel_dx9_media_sharing cl_khr_dx9_media_sharing cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_intel_d3d11_nv12_media_sharing cl_intel_unified_sharing cl_intel_subgroup_local_block_io cl_intel_simultaneous_sharing
Global memory:6755676160, Global memory cache: 1048576, local memory: 65536, workgroup size: 256, Work dimensions: 3[256, 256, 256, 0, 0] , Max clock speed:1300, compute units:96
Compiling kernels (build options: "-I. -DVECTOR_SIZE=1 -O3 -DMORE_CLASSES -DCL_GPU_SIEVE").
        BUILD OUTPUT

Unrecognized build options: -O3
        END OF BUILD OUTPUT
Error -11 (Build program failure): clBuildProgram
.Error -45 (Invalid program executable): Creating Kernel test_k from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel mfakto_cl_71 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel mfakto_cl_63 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel cl_barrett32_79 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel cl_barrett32_77 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel cl_barrett32_76 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel cl_barrett32_92 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel cl_barrett32_88 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel cl_barrett32_87 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel cl_barrett15_73 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel cl_barrett15_69 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel cl_barrett15_70 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel cl_barrett15_71 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel cl_barrett15_88 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel cl_barrett15_83 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel cl_barrett15_82 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel cl_barrett15_74 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel cl_mg62 from program. (clCreateKernel)
.Error -45 (Invalid program executable): Creating Kernel cl_mg88 from program. (clCreateKernel)
Error -48 (Invalid kernel): Setting kernel argument. (hi)
Error -48 (Invalid kernel): Setting kernel argument. (lo)
Error -48 (Invalid kernel): Setting kernel argument. (q)
Error -48 (Invalid kernel): Setting kernel argument. (qr)
Error -48 (Invalid kernel): Setting kernel argument. (RES)
loop 1:
Error -48 (Invalid kernel): Enqueuing kernel(clEnqueueNDRangeKernel)
Gpuowl:
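Gpuowl v1.9 with the DP (double precision) transform fails verbosely, emitting 2124 lines of text. The compiler reports that the cl_khr_fp64 extension is unsupported on this IGP, so every use of the double type is an error. The startup, the unceremonious exit, and the first 10 and last 10 errors are shown below, in 67 lines of output: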
Code:
gpuOwL v1.9- GPU Mersenne primality checker
Intel(R) Iris(R) Xe Graphics, 96x1300MHz
OpenCL compilation error -11 (args -I. -cl-fast-relaxed-math -cl-std=CL2.0  -DEXP=77936867u -DWIDTH=1024u -DHEIGHT=2048u -DLOG_NWORDS=22u -DFP_DP=1 )
In file included from 1:1:
./gpuowl.cl:67:26: warning: unsupported OpenCL extension 'cl_khr_fp64' - ignoring
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
                         ^
./gpuowl.cl:68:9: error: use of type 'double' requires cl_khr_fp64 extension to be enabled
typedef double T;
        ^
./gpuowl.cl:69:9: error: unknown type name 'double2'; did you mean 'double'?
typedef double2 T2;
        ^~~~~~~
        double
./gpuowl.cl:69:9: error: use of type 'double' requires cl_khr_fp64 extension to be enabled
./gpuowl.cl:84:7: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
T2 U2(T a, T b) { return (T2)(a, b); }
      ^
./gpuowl.cl:84:12: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
T2 U2(T a, T b) { return (T2)(a, b); }
           ^
./gpuowl.cl:84:1: error: use of type 'T2' (aka 'double') requires cl_khr_fp64 extension to be enabled
T2 U2(T a, T b) { return (T2)(a, b); }
^
./gpuowl.cl:84:27: error: use of type 'T2' (aka 'double') requires cl_khr_fp64 extension to be enabled
T2 U2(T a, T b) { return (T2)(a, b); }
                          ^
./gpuowl.cl:117:7: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
T neg(T x) { return -x; }
      ^
./gpuowl.cl:117:1: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
T neg(T x) { return -x; }
^
...
./gpuowl.cl:789:46: error: use of type 'T2' (aka 'double') requires cl_khr_fp64 extension to be enabled
KERNEL(256) tail(P(T2) io, Trig smallTrig, P(T2) bigTrig) {
                                             ^
./gpuowl.cl:790:9: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
  local T lds[HEIGHT];
        ^
./gpuowl.cl:791:3: error: use of type 'T2' (aka 'double') requires cl_khr_fp64 extension to be enabled
  T2 u[N_HEIGHT];
  ^
./gpuowl.cl:792:3: error: use of type 'T2' (aka 'double') requires cl_khr_fp64 extension to be enabled
  T2 v[N_HEIGHT];
  ^
./gpuowl.cl:796:27: error: use of type 'T2' (aka 'double') requires cl_khr_fp64 extension to be enabled
KERNEL(256) transposeW(CP(T2) in, P(T2) out, Trig bigTrig) {
                          ^
./gpuowl.cl:796:37: error: use of type 'T2' (aka 'double') requires cl_khr_fp64 extension to be enabled
KERNEL(256) transposeW(CP(T2) in, P(T2) out, Trig bigTrig) {
                                    ^
./gpuowl.cl:797:9: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
  local T lds[4096];
        ^
./gpuowl.cl:801:27: error: use of type 'T2' (aka 'double') requires cl_khr_fp64 extension to be enabled
KERNEL(256) transposeH(CP(T2) in, P(T2) out, Trig bigTrig) {
                          ^
./gpuowl.cl:801:37: error: use of type 'T2' (aka 'double') requires cl_khr_fp64 extension to be enabled
KERNEL(256) transposeH(CP(T2) in, P(T2) out, Trig bigTrig) {
                                    ^
./gpuowl.cl:802:9: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
  local T lds[4096];
        ^


Bye
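The DP failure is consistent with the extension list in the mfakto --CLtest output above, which includes cl_khr_fp16 but not cl_khr_fp64. As a minimal sketch of that check (the helper name has_extension is hypothetical, and the string is only an excerpt of the extension list from this post), one could test an extension string like this:

```python
# Excerpt of the "Extensions:" line reported for the Iris Xe IGP in this post.
# This is only a subset, copied for illustration; the full list is in the
# mfakto -d 11 --CLtest output.
EXTENSIONS_EXCERPT = (
    "cl_khr_byte_addressable_store cl_khr_fp16 "
    "cl_khr_global_int32_base_atomics cl_khr_int64_base_atomics "
    "cl_khr_int64_extended_atomics cl_intel_subgroups"
)


def has_extension(extension_string, name):
    """Return True if `name` appears as a whole token in the
    space-separated OpenCL extension string."""
    return name in set(extension_string.split())


# Double precision needs cl_khr_fp64; half precision needs cl_khr_fp16.
fp64_supported = has_extension(EXTENSIONS_EXCERPT, "cl_khr_fp64")
fp16_supported = has_extension(EXTENSIONS_EXCERPT, "cl_khr_fp16")
print("fp64:", fp64_supported, "fp16:", fp16_supported)
```

Splitting on whitespace and comparing whole tokens avoids false matches on extension names that share a prefix.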
A retry with -fft M61 appears to work, so it may be usable for PRP DC (double check), or for PRP DC of an LL first test, although v1.9 lacks proof generation, so that would require two PRP tests. Run time estimates are long:
Code:
\gpuowl-v1.9-74f1a38>gpuowl -device 0 -user kriesel -cpu martinella-IrisXeIGP -fft M61
gpuOwL v1.9- GPU Mersenne primality checker
Intel(R) Iris(R) Xe Graphics, 96x1300MHz

OpenCL compilation in 45840 ms, with "-I. -cl-fast-relaxed-math -cl-std=CL2.0  -DEXP=77936867u -DWIDTH=1024u -DHEIGHT=2048u -DLOG_NWORDS=22u -DFGT_61=1 -DLOG_ROOT2=49u "
Note: using long carry kernels
PRP-3: FFT 4M (1024 * 2048 * 2) of 77936867 (18.58 bits/word) [2021-09-13 12:35:42 Central Daylight Time]
Starting at iteration 0
OK        0 / 77936867 [ 0.00%], 0.00 ms/it; ETA 0d 00:00; 0000000000000003 [12:36:04]
OK     1000 / 77936867 [ 0.00%], 40.50 ms/it; ETA 36d 12:52; 9711fce020e74461 [12:37:07]
OK     5000 / 77936867 [ 0.01%], 40.58 ms/it; ETA 36d 14:22; 31d8d3401e6fe48d [12:40:11]
OK    10000 / 77936867 [ 0.01%], 39.37 ms/it; ETA 35d 12:13; fc4f135f7cf4ad29 [12:43:49]
OK    20000 / 77936867 [ 0.03%], 38.88 ms/it; ETA 35d 01:36; 3cd1bd9d5e09cbc5 [12:50:40]
OK    40000 / 77936867 [ 0.05%], 38.25 ms/it; ETA 34d 11:34; dffe1b1b0d748128 [13:03:46]
OK    60000 / 77936867 [ 0.08%], 38.51 ms/it; ETA 34d 17:01; 0945da4dc08bdd95 [13:16:56]
Averaging the last two lines gives 38.38 ms/it, which corresponds to ~7.03 GHzD/day while running alongside prime95. Running gpuowl on the IGP also severely reduces prime95 throughput, to ~31% of its standalone rate. The prime95-only benchmark at the 4200K FFT length required for that same exponent was 7 ms/iter (142.84 iter/sec) with 4 cores and 1 worker, corresponding to 38.51 GHzD/day. So running both is a net loss of 38.51 * 0.69 - 7.03 = 19.54 GHzD/day, about 51% of the prime95-only throughput.
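The arithmetic can be reproduced with a short sketch. All figures are taken from this post; the variable and function names are only illustrative, and the ETA assumes the averaged iteration time holds for the whole run.

```python
# Figures from the post; names are illustrative.
EXPONENT = 77936867                      # iterations needed for the PRP test
gpuowl_ms_per_it = (38.25 + 38.51) / 2   # average of the last two log lines
gpuowl_ghzd_day = 7.03                   # IGP throughput at that speed
prime95_alone_ghzd_day = 38.51           # prime95-only benchmark
prime95_retained = 0.31                  # prime95 keeps ~31% while gpuowl runs


def net_change(cpu_alone, cpu_retained, gpu_gain):
    """Combined CPU+IGP throughput minus CPU-only throughput (GHzD/day).
    A negative result means running both is a net loss."""
    combined = cpu_alone * cpu_retained + gpu_gain
    return combined - cpu_alone


delta = net_change(prime95_alone_ghzd_day, prime95_retained, gpuowl_ghzd_day)
loss_fraction = -delta / prime95_alone_ghzd_day

# Estimated run time of the PRP test on the IGP, in days.
eta_days = EXPONENT * gpuowl_ms_per_it / 1000 / 86400

print(f"average iteration time: {gpuowl_ms_per_it:.2f} ms/it")
print(f"net change: {delta:.2f} GHzD/day ({loss_fraction:.0%} loss)")
print(f"estimated run time: {eta_days:.1f} days")
```

The run time estimate lands between 34 and 35 days, in line with the ETA column of the gpuowl output.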

Last fiddled with by kriesel on 2021-09-13 at 19:10