mersenneforum.org  

Old 2022-10-22, 22:42   #2817
DrobinsonPE
 
Aug 2020

2²×41 Posts

RX 6600, AMD Adrenalin Software reported 100W, Kill-A-Watt reported 174W at the plug.

V7
Code:
C:\Users\User\GPUOWL\v72112>gpuowl-win -log 10000 -iters 200000 -prp 77936867
20221022 15:12:06 GpuOwl VERSION v7.2-112-gd6ad1e0-dirty
20221022 15:12:06 GpuOwl VERSION v7.2-112-gd6ad1e0-dirty
20221022 15:12:06 config: -log 20000 -device 0 -maxAlloc 6G
20221022 15:12:06 config: -log 10000 -iters 200000 -prp 77936867
20221022 15:12:06 device 0, unique id ''
20221022 15:12:07 gfx1032-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
20221022 15:12:07 gfx1032-0 77936867 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DAMDGPU=1 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DWEIGHT_STEP=0.33644726404543274 -DIWEIGHT_STEP=-0.25174750481886216 -DIWEIGHTS={0,-0.44011820345520131,-0.37306474779553728,-0.29798072935699788,-0.21390437908665341,-0.11975874301407295,-0.014337887291734644,-0.44814572555075455,} -DFWEIGHTS={0,0.78609128957452257,0.5950610473469905,0.42446232150303748,0.2721098723818392,0.1360521812214803,0.014546452690911484,0.81207258201996746,}  -cl-std=CL2.0 -cl-finite-math-only "
20221022 15:12:09 gfx1032-0 77936867 OpenCL compilation in 2.12 s
20221022 15:12:09 gfx1032-0 77936867 maxAlloc: 6.0 GB
20221022 15:12:09 gfx1032-0 77936867 P1(0) 0 bits
20221022 15:12:09 gfx1032-0 77936867 PRP starting from beginning
20221022 15:12:10 gfx1032-0 77936867 OK         0 on-load: blockSize 400, 0000000000000003
20221022 15:12:10 gfx1032-0 77936867 validating proof residues for power 8
20221022 15:12:10 gfx1032-0 77936867 Proof using power 8
20221022 15:12:12 gfx1032-0 77936867 OK       800   0.00% 1579c241dc63eca6 1831 us/it + check 0.77s + save 0.12s; ETA 1d 15:39
20221022 15:12:30 gfx1032-0 77936867 OK     10000   0.01% fc4f135f7cf4ad29 1836 us/it + check 0.77s + save 0.12s; ETA 1d 15:44
20221022 15:12:49 gfx1032-0 77936867 OK     20000   0.03% 3cd1bd9d5e09cbc5 1842 us/it + check 0.77s + save 0.12s; ETA 1d 15:52
20221022 15:13:09 gfx1032-0 77936867 OK     30000   0.04% c4e0ff35e3290d98 1844 us/it + check 0.77s + save 0.12s; ETA 1d 15:55
20221022 15:13:28 gfx1032-0 77936867 OK     40000   0.05% dffe1b1b0d748128 1845 us/it + check 0.77s + save 0.12s; ETA 1d 15:55
20221022 15:13:47 gfx1032-0 77936867 OK     50000   0.06% 52e286945371ed29 1846 us/it + check 0.77s + save 0.12s; ETA 1d 15:57
20221022 15:14:07 gfx1032-0 77936867 OK     60000   0.08% 0945da4dc08bdd95 1846 us/it + check 0.77s + save 0.17s; ETA 1d 15:55
20221022 15:14:26 gfx1032-0 77936867 OK     70000   0.09% 7131fa4eb77f4bb2 1846 us/it + check 0.77s + save 0.12s; ETA 1d 15:56
20221022 15:14:45 gfx1032-0 77936867 OK     80000   0.10% 8d76071d27ee4221 1846 us/it + check 0.77s + save 0.11s; ETA 1d 15:56
20221022 15:15:05 gfx1032-0 77936867 OK     90000   0.12% 0bacff453b2f470e 1847 us/it + check 0.77s + save 0.11s; ETA 1d 15:56
20221022 15:15:24 gfx1032-0 77936867 OK    100000   0.13% 6d7296b9e2830f50 1847 us/it + check 0.77s + save 0.12s; ETA 1d 15:56
20221022 15:15:44 gfx1032-0 77936867 OK    110000   0.14% 8cbfd4435622bda7 1847 us/it + check 0.77s + save 0.15s; ETA 1d 15:56
20221022 15:16:03 gfx1032-0 77936867 OK    120000   0.15% 79ae5dad855057ad 1847 us/it + check 0.77s + save 0.18s; ETA 1d 15:55
20221022 15:16:22 gfx1032-0 77936867 OK    130000   0.17% 50c97bcbf876231f 1847 us/it + check 0.77s + save 0.12s; ETA 1d 15:55
20221022 15:16:42 gfx1032-0 77936867 OK    140000   0.18% e1db15f897271496 1847 us/it + check 0.77s + save 0.18s; ETA 1d 15:55
20221022 15:17:01 gfx1032-0 77936867 OK    150000   0.19% 127631386c6a9b17 1847 us/it + check 0.77s + save 0.12s; ETA 1d 15:55
20221022 15:17:20 gfx1032-0 77936867 OK    160000   0.21% 25b7b6206fc6f085 1847 us/it + check 0.77s + save 0.11s; ETA 1d 15:55
20221022 15:17:40 gfx1032-0 77936867 OK    170000   0.22% 416816b0d9f4bba8 1847 us/it + check 0.77s + save 0.12s; ETA 1d 15:54
20221022 15:17:59 gfx1032-0 77936867 OK    180000   0.23% 6bee5d054f770861 1847 us/it + check 0.77s + save 0.12s; ETA 1d 15:54
20221022 15:18:19 gfx1032-0 77936867 OK    190000   0.24% f37f068f014b18a0 1848 us/it + check 0.77s + save 0.11s; ETA 1d 15:54
20221022 15:18:37 gfx1032-0 77936867 Stopping, please wait..
20221022 15:18:38 gfx1032-0 77936867 OK    200000   0.26% f0b04b45b0855bd2 1848 us/it + check 0.77s + save 0.12s; ETA 1d 15:54
20221022 15:18:38 gfx1032-0 Exiting because "stop requested"
20221022 15:18:38 gfx1032-0 Bye
V6
Code:
C:\Users\User\GPUOWL\v611380>gpuowl-win -log 10000 -iters 200000 -prp 77936867
2022-10-22 15:22:14 gpuowl v6.11-380-g79ea0cc
2022-10-22 15:22:14 config: -log 20000 -device 0
2022-10-22 15:22:14 config: -log 10000 -iters 200000 -prp 77936867
2022-10-22 15:22:14 device 0, unique id ''
2022-10-22 15:22:14 gfx1032-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2022-10-22 15:22:14 gfx1032-0 Expected maximum carry32: 583B0000
2022-10-22 15:22:14 gfx1032-0 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DPM1=0 -DAMDGPU=1 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0xa.c42d0d7cec038p-5 -DIWEIGHT_STEP_MINUS_1=-0x8.0e50c8817ddf8p-5  -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2022-10-22 15:22:17 gfx1032-0 OpenCL compilation in 2.85 s
2022-10-22 15:22:18 gfx1032-0 77936867 OK        0 loaded: blockSize 400, 0000000000000003
2022-10-22 15:22:18 gfx1032-0 validating proof residues for power 8
2022-10-22 15:22:18 gfx1032-0 Proof using power 8
2022-10-22 15:22:20 gfx1032-0 77936867 OK      800   0.00%; 1853 us/it; ETA 1d 16:07; 1579c241dc63eca6 (check 0.78s)
2022-10-22 15:22:38 gfx1032-0 77936867 OK    10000   0.01%; 1857 us/it; ETA 1d 16:12; fc4f135f7cf4ad29 (check 0.78s)
2022-10-22 15:22:58 gfx1032-0 77936867 OK    20000   0.03%; 1861 us/it; ETA 1d 16:17; 3cd1bd9d5e09cbc5 (check 0.78s)
2022-10-22 15:23:17 gfx1032-0 77936867 OK    30000   0.04%; 1861 us/it; ETA 1d 16:17; c4e0ff35e3290d98 (check 0.78s)
2022-10-22 15:23:36 gfx1032-0 77936867 OK    40000   0.05%; 1862 us/it; ETA 1d 16:18; dffe1b1b0d748128 (check 0.78s)
2022-10-22 15:23:56 gfx1032-0 77936867 OK    50000   0.06%; 1863 us/it; ETA 1d 16:19; 52e286945371ed29 (check 0.78s)
2022-10-22 15:24:15 gfx1032-0 77936867 OK    60000   0.08%; 1863 us/it; ETA 1d 16:18; 0945da4dc08bdd95 (check 0.78s)
2022-10-22 15:24:35 gfx1032-0 77936867 OK    70000   0.09%; 1864 us/it; ETA 1d 16:19; 7131fa4eb77f4bb2 (check 0.78s)
2022-10-22 15:24:54 gfx1032-0 77936867 OK    80000   0.10%; 1864 us/it; ETA 1d 16:19; 8d76071d27ee4221 (check 0.78s)
2022-10-22 15:25:14 gfx1032-0 77936867 OK    90000   0.12%; 1864 us/it; ETA 1d 16:19; 0bacff453b2f470e (check 0.78s)
2022-10-22 15:25:33 gfx1032-0 77936867 OK   100000   0.13%; 1865 us/it; ETA 1d 16:19; 6d7296b9e2830f50 (check 0.78s)
2022-10-22 15:25:52 gfx1032-0 77936867 OK   110000   0.14%; 1864 us/it; ETA 1d 16:18; 8cbfd4435622bda7 (check 0.78s)
2022-10-22 15:26:12 gfx1032-0 77936867 OK   120000   0.15%; 1865 us/it; ETA 1d 16:18; 79ae5dad855057ad (check 0.78s)
2022-10-22 15:26:31 gfx1032-0 77936867 OK   130000   0.17%; 1865 us/it; ETA 1d 16:18; 50c97bcbf876231f (check 0.78s)
2022-10-22 15:26:51 gfx1032-0 77936867 OK   140000   0.18%; 1864 us/it; ETA 1d 16:17; e1db15f897271496 (check 0.78s)
2022-10-22 15:27:10 gfx1032-0 77936867 OK   150000   0.19%; 1865 us/it; ETA 1d 16:17; 127631386c6a9b17 (check 0.78s)
2022-10-22 15:27:30 gfx1032-0 77936867 OK   160000   0.21%; 1864 us/it; ETA 1d 16:17; 25b7b6206fc6f085 (check 0.78s)
2022-10-22 15:27:49 gfx1032-0 77936867 OK   170000   0.22%; 1864 us/it; ETA 1d 16:16; 416816b0d9f4bba8 (check 0.78s)
2022-10-22 15:28:08 gfx1032-0 77936867 OK   180000   0.23%; 1865 us/it; ETA 1d 16:16; 6bee5d054f770861 (check 0.78s)
2022-10-22 15:28:28 gfx1032-0 77936867 OK   190000   0.24%; 1864 us/it; ETA 1d 16:16; f37f068f014b18a0 (check 0.78s)
2022-10-22 15:28:46 gfx1032-0 Stopping, please wait..
2022-10-22 15:28:47 gfx1032-0 77936867 OK   200000   0.26%; 1865 us/it; ETA 1d 16:16; f0b04b45b0855bd2 (check 0.78s)
2022-10-22 15:28:47 gfx1032-0 Exiting because "stop requested"
2022-10-22 15:28:47 gfx1032-0 Bye
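
The two logs above use different line layouts, so comparing them by eye is error-prone. A minimal sketch of one way to pull the `us/it` figures out of either layout and compare the two versions follows; the function name `mean_us_per_it` and the choice of sample lines are assumptions made for illustration, not part of gpuowl.

```python
import re
from statistics import mean

# Matches the per-iteration time in both layouts shown above,
# e.g. "1848 us/it + check" (v7) and "1865 us/it;" (v6).
US_PER_IT = re.compile(r"\b(\d+) us/it\b")


def mean_us_per_it(log_text: str) -> float:
    """Average the 'us/it' figures found on the 'OK' progress lines of a log."""
    values = [
        int(match.group(1))
        for line in log_text.splitlines()
        if " OK " in line
        for match in US_PER_IT.finditer(line)
    ]
    if not values:
        raise ValueError("no 'us/it' figures found")
    return mean(values)


# The last two progress lines of each run, copied from the logs above.
v7_sample = """\
20221022 15:18:19 gfx1032-0 77936867 OK    190000   0.24% f37f068f014b18a0 1848 us/it + check 0.77s + save 0.11s; ETA 1d 15:54
20221022 15:18:38 gfx1032-0 77936867 OK    200000   0.26% f0b04b45b0855bd2 1848 us/it + check 0.77s + save 0.12s; ETA 1d 15:54
"""
v6_sample = """\
2022-10-22 15:28:28 gfx1032-0 77936867 OK   190000   0.24%; 1864 us/it; ETA 1d 16:16; f37f068f014b18a0 (check 0.78s)
2022-10-22 15:28:47 gfx1032-0 77936867 OK   200000   0.26%; 1865 us/it; ETA 1d 16:16; f0b04b45b0855bd2 (check 0.78s)
"""

v7 = mean_us_per_it(v7_sample)
v6 = mean_us_per_it(v6_sample)
speedup_percent = 100.0 * (v6 - v7) / v6
print(f"v7: {v7:.1f} us/it, v6: {v6:.1f} us/it, v7 faster by {speedup_percent:.2f}%")
```

On these sample lines the difference between the two versions is under one percent on the RX 6600.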
Old 2022-10-23, 02:47   #2818
DrobinsonPE
 
Aug 2020

2²·41 Posts

RX 6600 Efficiency running gpuowl. See attached picture.
Attached image: RX 6600 GPUOWL Efficiency.png
Old 2022-10-23, 21:41   #2819
DrobinsonPE
 
Aug 2020

2²×41 Posts

As a follow-up to this post https://www.mersenneforum.org/showpo...postcount=2763, attached is a picture of the gpuowl efficiency for an RX 6500 XT.
Attached image: RX 6500XT GPUOWL Efficiency.png
Old 2022-11-07, 19:45   #2820
DrobinsonPE
 
Aug 2020

2²×41 Posts

Ryzen 7-5700G APU, GPUOWL versions 7 and 6.
Code:
Ryzen 7-5700G GPUOWL testing.
Monitor GPU(SOC) power use using Ryzen Master
Monitor plug power use using a power monitor

Idle
Plug Power = 31W, SOC Power = 5W

v7.2-112
Plug Power = 70W, SOC Power = 30W, GFX Clock = 2000MHz

C:\Users\User\GPUOWL\v72112>gpuowl-win -iters 200000 -prp 77936867
20221106 20:02:28 GpuOwl VERSION v7.2-112-gd6ad1e0-dirty
20221106 20:02:28 GpuOwl VERSION v7.2-112-gd6ad1e0-dirty
20221106 20:02:28 config: -log 20000
20221106 20:02:28 config: -iters 200000 -prp 77936867
20221106 20:02:28 device 0, unique id ''
20221106 20:02:28 gfx90c-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
20221106 20:02:28 gfx90c-0 77936867 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DAMDGPU=1 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DWEIGHT_STEP=0.33644726404543274 -DIWEIGHT_STEP=-0.25174750481886216 -DIWEIGHTS={0,-0.44011820345520131,-0.37306474779553728,-0.29798072935699788,-0.21390437908665341,-0.11975874301407295,-0.014337887291734644,-0.44814572555075455,} -DFWEIGHTS={0,0.78609128957452257,0.5950610473469905,0.42446232150303748,0.2721098723818392,0.1360521812214803,0.014546452690911484,0.81207258201996746,}  -cl-std=CL2.0 -cl-finite-math-only "
20221106 20:02:28 gfx90c-0 77936867 ASM compilation failed, retrying compilation using NO_ASM
20221106 20:02:30 gfx90c-0 77936867 OpenCL compilation in 1.74 s
20221106 20:02:30 gfx90c-0 77936867 maxAlloc: 0.0 GB
20221106 20:02:30 gfx90c-0 77936867 You should use -maxAlloc if your GPU has more than 4GB memory. See help '-h'
20221106 20:02:30 gfx90c-0 77936867 P1(0) 0 bits
20221106 20:02:30 gfx90c-0 77936867 PRP starting from beginning
20221106 20:02:33 gfx90c-0 77936867 OK         0 on-load: blockSize 400, 0000000000000003
20221106 20:02:33 gfx90c-0 77936867 validating proof residues for power 8
20221106 20:02:33 gfx90c-0 77936867 Proof using power 8
20221106 20:02:43 gfx90c-0 77936867 OK       800   0.00% 1579c241dc63eca6 7620 us/it + check 3.08s + save 0.12s; ETA 6d 20:57
20221106 20:03:53 gfx90c-0 77936867     10000 fc4f135f7cf4ad29 7606
20221106 20:05:12 gfx90c-0 77936867 OK     20000   0.03% 3cd1bd9d5e09cbc5 7607 us/it + check 3.08s + save 0.12s; ETA 6d 20:39
20221106 20:06:28 gfx90c-0 77936867     30000 c4e0ff35e3290d98 7605
20221106 20:07:47 gfx90c-0 77936867 OK     40000   0.05% dffe1b1b0d748128 7608 us/it + check 3.08s + save 0.12s; ETA 6d 20:37
20221106 20:09:03 gfx90c-0 77936867     50000 52e286945371ed29 7605
20221106 20:10:23 gfx90c-0 77936867 OK     60000   0.08% 0945da4dc08bdd95 7608 us/it + check 3.08s + save 0.18s; ETA 6d 20:35
20221106 20:11:39 gfx90c-0 77936867     70000 7131fa4eb77f4bb2 7606
20221106 20:12:58 gfx90c-0 77936867 OK     80000   0.10% 8d76071d27ee4221 7609 us/it + check 3.08s + save 0.12s; ETA 6d 20:34
20221106 20:14:14 gfx90c-0 77936867     90000 0bacff453b2f470e 7591
20221106 20:15:33 gfx90c-0 77936867 OK    100000   0.13% 6d7296b9e2830f50 7593 us/it + check 3.08s + save 0.12s; ETA 6d 20:10
20221106 20:16:49 gfx90c-0 77936867    110000 8cbfd4435622bda7 7590
20221106 20:18:08 gfx90c-0 77936867 OK    120000   0.15% 79ae5dad855057ad 7592 us/it + check 3.07s + save 0.18s; ETA 6d 20:07
20221106 20:19:24 gfx90c-0 77936867    130000 50c97bcbf876231f 7590
20221106 20:20:43 gfx90c-0 77936867 OK    140000   0.18% e1db15f897271496 7592 us/it + check 3.08s + save 0.18s; ETA 6d 20:03
20221106 20:21:59 gfx90c-0 77936867    150000 127631386c6a9b17 7590
20221106 20:23:18 gfx90c-0 77936867 OK    160000   0.21% 25b7b6206fc6f085 7592 us/it + check 3.08s + save 0.12s; ETA 6d 20:01
20221106 20:24:34 gfx90c-0 77936867    170000 416816b0d9f4bba8 7591
20221106 20:25:53 gfx90c-0 77936867 OK    180000   0.23% 6bee5d054f770861 7592 us/it + check 3.08s + save 0.12s; ETA 6d 19:59
20221106 20:27:09 gfx90c-0 77936867    190000 f37f068f014b18a0 7590
20221106 20:28:25 gfx90c-0 77936867 Stopping, please wait..
20221106 20:28:28 gfx90c-0 77936867 OK    200000   0.26% f0b04b45b0855bd2 7592 us/it + check 3.07s + save 0.12s; ETA 6d 19:56
20221106 20:28:28 gfx90c-0 Exiting because "stop requested"
20221106 20:28:28 gfx90c-0 Bye

v6.11-380
Plug Power = 70W, SOC Power = 30W, GFX Clock = 2000MHz

C:\Users\User\GPUOWL\v611380>gpuowl-win -iters 200000 -prp 77936867
2022-11-06 20:36:33 gpuowl v6.11-380-g79ea0cc
2022-11-06 20:36:33 config: -log 20000
2022-11-06 20:36:33 config: -iters 200000 -prp 77936867
2022-11-06 20:36:33 device 0, unique id ''
2022-11-06 20:36:33 gfx90c-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2022-11-06 20:36:33 gfx90c-0 Expected maximum carry32: 583B0000
2022-11-06 20:36:34 gfx90c-0 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DPM1=0 -DAMDGPU=1 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0xa.c42d0d7cec038p-5 -DIWEIGHT_STEP_MINUS_1=-0x8.0e50c8817ddf8p-5  -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2022-11-06 20:36:34 gfx90c-0 ASM compilation failed, retrying compilation using NO_ASM
2022-11-06 20:36:36 gfx90c-0 OpenCL compilation in 2.06 s
2022-11-06 20:36:39 gfx90c-0 77936867 OK        0 loaded: blockSize 400, 0000000000000003
2022-11-06 20:36:39 gfx90c-0 validating proof residues for power 8
2022-11-06 20:36:39 gfx90c-0 Proof using power 8
2022-11-06 20:36:49 gfx90c-0 77936867 OK      800   0.00%; 7718 us/it; ETA 6d 23:05; 1579c241dc63eca6 (check 3.12s)
2022-11-06 20:39:20 gfx90c-0 77936867 OK    20000   0.03%; 7708 us/it; ETA 6d 22:50; 3cd1bd9d5e09cbc5 (check 3.12s)
2022-11-06 20:41:57 gfx90c-0 77936867 OK    40000   0.05%; 7714 us/it; ETA 6d 22:55; dffe1b1b0d748128 (check 3.12s)
2022-11-06 20:44:35 gfx90c-0 77936867 OK    60000   0.08%; 7709 us/it; ETA 6d 22:46; 0945da4dc08bdd95 (check 3.13s)
2022-11-06 20:47:12 gfx90c-0 77936867 OK    80000   0.10%; 7710 us/it; ETA 6d 22:45; 8d76071d27ee4221 (check 3.12s)
2022-11-06 20:49:49 gfx90c-0 77936867 OK   100000   0.13%; 7692 us/it; ETA 6d 22:18; 6d7296b9e2830f50 (check 3.12s)
2022-11-06 20:52:26 gfx90c-0 77936867 OK   120000   0.15%; 7690 us/it; ETA 6d 22:13; 79ae5dad855057ad (check 3.12s)
2022-11-06 20:55:03 gfx90c-0 77936867 OK   140000   0.18%; 7691 us/it; ETA 6d 22:12; e1db15f897271496 (check 3.12s)
2022-11-06 20:57:40 gfx90c-0 77936867 OK   160000   0.21%; 7690 us/it; ETA 6d 22:08; 25b7b6206fc6f085 (check 3.11s)
2022-11-06 21:00:17 gfx90c-0 77936867 OK   180000   0.23%; 7692 us/it; ETA 6d 22:08; 6bee5d054f770861 (check 3.12s)
2022-11-06 21:02:47 gfx90c-0 Stopping, please wait..
2022-11-06 21:02:53 gfx90c-0 77936867 OK   200000   0.26%; 7690 us/it; ETA 6d 22:03; f0b04b45b0855bd2 (check 3.12s)
2022-11-06 21:02:53 gfx90c-0 Exiting because "stop requested"
2022-11-06 21:02:53 gfx90c-0 Bye
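
The power readings and timings above can be combined into an energy-per-iteration figure. The sketch below is a rough estimate only: it assumes the quoted power readings stayed constant during the run, that a full PRP test takes about as many iterations as the exponent, and it ignores check and save overhead. The function names are hypothetical.

```python
def joules_per_iteration(power_watts: float, us_per_it: float) -> float:
    """Energy per iteration in joules: watts times seconds per iteration."""
    return power_watts * us_per_it * 1e-6


def kwh_for_full_test(power_watts: float, us_per_it: float, exponent: int) -> float:
    """Energy for a whole PRP test in kWh, assuming about `exponent` iterations
    at a constant rate and ignoring check and save overhead."""
    seconds = exponent * us_per_it * 1e-6
    return power_watts * seconds / 3.6e6  # 1 kWh = 3.6e6 J


EXPONENT = 77936867
V7_US_PER_IT = 7592                      # final v7.2-112 progress line above
PLUG_W, IDLE_PLUG_W, SOC_W = 70, 31, 30  # readings quoted in the post

at_plug = joules_per_iteration(PLUG_W, V7_US_PER_IT)
above_idle = joules_per_iteration(PLUG_W - IDLE_PLUG_W, V7_US_PER_IT)
soc_only = joules_per_iteration(SOC_W, V7_US_PER_IT)
total_kwh = kwh_for_full_test(PLUG_W, V7_US_PER_IT, EXPONENT)

print(f"{at_plug:.3f} J/it at the plug, {above_idle:.3f} J/it above idle, "
      f"{soc_only:.3f} J/it SOC only, {total_kwh:.1f} kWh for the full test")
```

Whether to count plug power, power above idle, or SOC power depends on whether the machine would be running anyway, so all three are shown.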
Old 2022-11-10, 10:00   #2821
moebius
 
Jul 2009
Germany

11×61 Posts

Update: gpuOwl benchmarks online (new link)
There are now 78 GPUs/iGPUs in the list.
Old 2022-11-12, 14:36   #2822
moebius
 
Jul 2009
Germany

11·61 Posts
Intel Arc A770 .log file

To Mihai: The log is from a user of the guru3d forums. Is there no way to emulate FP64 double precision?

https://www.techpowerup.com/gpu-specs/arc-a770.c3914
Code:
2022-11-12 12:36:27 Note: not found 'config.txt'
2022-11-12 12:36:27 config: -iters 30000 -prp 77936867 
2022-11-12 12:36:27 device 0, unique id ''
2022-11-12 12:36:27 Intel(R) Arc(TM) A770 Graphics-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2022-11-12 12:36:27 Intel(R) Arc(TM) A770 Graphics-0 Expected maximum carry32: 583B0000
2022-11-12 12:36:28 Intel(R) Arc(TM) A770 Graphics-0 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DPM1=0 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0xa.c42d0d7cec038p-5 -DIWEIGHT_STEP_MINUS_1=-0x8.0e50c8817ddf8p-5  -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2022-11-12 12:36:28 Intel(R) Arc(TM) A770 Graphics-0 ASM compilation failed, retrying compilation using NO_ASM
2022-11-12 12:36:28 Intel(R) Arc(TM) A770 Graphics-0 OpenCL compilation error -11 (args -DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DPM1=0 -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0xa.c42d0d7cec038p-5 -DIWEIGHT_STEP_MINUS_1=-0x8.0e50c8817ddf8p-5  -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only  -DNO_ASM=1)
2022-11-12 12:36:28 Intel(R) Arc(TM) A770 Graphics-0 1:190:9: error: use of type 'double' requires cl_khr_fp64 extension to be enabled
typedef double T;
        ^
1:191:9: error: unknown type name 'double2'; did you mean 'double'?
typedef double2 T2;
        ^~~~~~~
        double
1:191:9: error: use of type 'double' requires cl_khr_fp64 extension to be enabled
1:201:7: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
T2 U2(T a, T b) { return (T2)(a, b); }
      ^
1:201:12: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
T2 U2(T a, T b) { return (T2)(a, b); }
           ^
1:201:1: error: use of type 'T2' (aka 'double') requires cl_khr_fp64 extension to be enabled
T2 U2(T a, T b) { return (T2)(a, b); }
^
1:201:27: error: use of type 'T2' (aka 'double') requires cl_khr_fp64 extension to be enabled
T2 U2(T a, T b) { return (T2)(a, b); }
                          ^
1:207:11: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
T add1_m2(T x, T y) {
          ^
1:207:16: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
T add1_m2(T x, T y) {
               ^
1:207:1: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
T add1_m2(T x, T y) {
^
1:216:11: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
T sub1_m2(T x, T y) {
          ^
1:216:16: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
T sub1_m2(T x, T y) {
               ^
1:216:1: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
T sub1_m2(T x, T y) {
^
1:226:11: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
T mul1_m2(T x, T y) {
          ^
1:226:16: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
T mul1_m2(T x, T y) {
               ^
1:226:1: error: use of type 'T' (aka 'double') requires cl_khr_fp64 extension to be enabled
T mul1_m2(T x, T y) {
^
1:237:23: error: use of type 'T'
2022-11-12 12:36:28 Intel(R) Arc(TM) A770 Graphics-0 Exception gpu_error: BUILD_PROGRAM_FAILURE clBuildProgram at clwrap.cpp:246 build
2022-11-12 12:36:28 Intel(R) Arc(TM) A770 Graphics-0 Bye
Attached image: OlZ2Kyh.gif

Last fiddled with by moebius on 2022-11-12 at 14:42
Old 2022-11-12, 15:10   #2823
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2×29×127 Posts

Quote:
Originally Posted by moebius
To Mihai: The log is from a user of the guru3d forums. Is there no way to emulate FP64 double precision?

https://www.techpowerup.com/gpu-specs/arc-a770.c3914
Possibly related: note "OpenCL v3.0", which is a subset of OpenCL v2.0, not an enhancement, so it might not run gpuowl anyway. Also, its memory bandwidth falls short of the Radeon VII's.

https://www.tomshardware.com/news/in...-fp64-hardware confirms that the absence of FP64 in Intel consumer GPUs is by design, so that GPU is a candidate for mfakto, not gpuowl.
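
The compiler errors in the A770 log all point at the missing `cl_khr_fp64` extension. An OpenCL device reports its extensions as a space-separated string (`CL_DEVICE_EXTENSIONS` from `clGetDeviceInfo`, also shown by tools such as `clinfo`), so a quick pre-check is possible before trying to build gpuowl kernels. The sketch below only parses such a string; the helper name `supports_fp64` and both sample strings are hypothetical and were not captured from real hardware.

```python
def supports_fp64(extensions: str) -> bool:
    """True if a space-separated OpenCL extension string lists cl_khr_fp64."""
    return "cl_khr_fp64" in extensions.split()


# Hypothetical extension strings, for illustration only.
with_fp64 = "cl_khr_fp16 cl_khr_fp64 cl_khr_byte_addressable_store"
without_fp64 = "cl_khr_fp16 cl_khr_byte_addressable_store"

for name, exts in (("device A", with_fp64), ("device B", without_fp64)):
    if supports_fp64(exts):
        verdict = "advertises FP64, gpuowl kernels may build"
    else:
        verdict = "no FP64, consider mfakto instead"
    print(f"{name}: {verdict}")
```

Splitting on whitespace before the membership test avoids false positives from longer extension names that merely contain the same substring.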

Last fiddled with by kriesel on 2022-11-12 at 15:18
Old 2022-11-12, 16:17   #2824
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1CC6₁₆ Posts
Performance running dual instances on a GPU can be quite low

Two instances of PRP/GEC/proof, on different exponents, fft lengths, gpuowl versions, & folders, on the same Radeon VII GPU, hard drive, & Windows 10 system.

Instance one, gpuowl v6.11-357, 24M fft
2022-11-12 08:55:48 asr2/radeonii3 441001669 OK 370000 0.08%; 5199 us/it; ETA 26d 12:24; ec331713dc7bebd0 (check 4.99s)
2022-11-12 08:56:45 asr2/radeonii3 441001669 OK 380000 0.09%; 5225 us/it; ETA 26d 15:27; 172512b889f17371 (check 3.76s)
above is solo, average 5212 us/it = 191.86 iter/sec

below is tandem run, average 23227.5 us/it = 43.05 iter/sec, 22.44% of solo rate
2022-11-12 09:00:43 asr2/radeonii3 441001669 OK 390000 0.09%; 22837 us/it; ETA 116d 11:08; 7e3fdd9c4387b9bb (check 10.00s)
2022-11-12 09:04:50 asr2/radeonii3 441001669 OK 400000 0.09%; 23618 us/it; ETA 120d 10:35; 4e624b7b64e47ca0 (check 10.39s)

Instance two, v6.11-380, 40M fft length
2022-11-12 08:12:25 asr2/radeonvii3 730000031 OK 504100000 69.05%; 8874 us/it; ETA 23d 04:49; 2eaaed73ed1ae87b (check 11.07s)
2022-11-12 08:27:23 asr2/radeonvii3 730000031 OK 504200000 69.07%; 8877 us/it; ETA 23d 04:49; 471c1329830f6f1c (check 11.08s)
above is solo, 8875.5 us/it average, 112.67 iter/sec

below is tandem run, 64141 us/it, 15.59 iter/sec, 13.84% of solo rate
2022-11-12 09:07:49 asr2/radeonvii3 730000031 OK 504300000 69.08%; 64141 us/it; ETA 167d 13:17; 3545113094a2a77d (check 74.50s)

Combining the rates of the two tandem instances, 22.44% + 13.84% = 36.28% of solo rate, less than 3/8!
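
The arithmetic above can be reproduced with a few lines. This is a sketch using only the `us/it` values from the log lines quoted in this post; the variable names are illustrative.

```python
from statistics import mean


def iters_per_sec(us_per_it: float) -> float:
    """Convert microseconds per iteration to iterations per second."""
    return 1e6 / us_per_it


# us/it values taken from the log lines above.
solo_1 = mean([5199, 5225])      # instance one, running alone
tandem_1 = mean([22837, 23618])  # instance one, sharing the GPU
solo_2 = mean([8874, 8877])      # instance two, running alone
tandem_2 = 64141                 # instance two, sharing the GPU

# Tandem rate as a fraction of the solo rate for each instance.
frac_1 = iters_per_sec(tandem_1) / iters_per_sec(solo_1)
frac_2 = iters_per_sec(tandem_2) / iters_per_sec(solo_2)
combined = frac_1 + frac_2

print(f"instance one: {100 * frac_1:.2f}% of solo rate")
print(f"instance two: {100 * frac_2:.2f}% of solo rate")
print(f"combined:     {100 * combined:.2f}% of solo rate")
```

A combined figure below 100% means the GPU did less total work with two instances than it would have done running either one alone.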
Old 2022-11-14, 18:49   #2825
storm5510
Random Account
 
Aug 2009
Not U. + S.A.

7×19² Posts

I still use a 6.x version of gpuOwl. Why? Because I like running stand-alone P-1 runs. I have participated in this project since 2006 and am used to what I run and how I run it. I have an issue with PRP. It is an acronym containing the word "probably." It is probabilistic and not deterministic like Lucas-Lehmer. IMHO, it was not a good idea for all the programs to jump into PRP with both feet. I do not see the advantages of going that route.

Off topic: I still use a 29.x version of Prime95. It does not contain all the options relative to PRP. I find it easier to use.

So, I will continue with what I am used to. No further.
Old 2022-11-14, 19:34   #2826
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2·29·127 Posts

Quote:
Originally Posted by storm5510
I have an issue with PRP. It is an acronym containing the word "probably." It is probabilistic and not deterministic like Lucas-Lehmer.
Please reconsider. A PRP-C result is definitely composite, no "probably" about it. In the rare case of PRP-P (or LL-P) that is not a software bug, multiple LL tests must be run to confirm the newly discovered Mersenne prime. We haven't needed to do that in nearly four years now.

Code:
Primality test     Error rate at ~100M    Error rate at ~100Mdigit    Effort to prove a composite (tests)
LL                 2%                     20%                         2.04 - 2.5
PRP/GEC            ~0 ppm                 ~0 ppm                      2.00
PRP/GEC/proof      ~0 ppm                 ~0 ppm                      <1.01 (~1 + 1/2^proofpower)
Any time someone runs an LL first test, it is essentially wasted cycles. It is quicker and more reliable to double-check an LL first test's Mersenne number with PRP/GEC/proof (effort ~1.01) than with an additional LL DC, plus TC and QC when needed (effort ~1.04 at the wavefront, increasing to ~1.50 at 100Mdigit and worsening at longer run times for higher exponents).

PRP with GEC is far more reliable than LL with every available error check for LL. The error rate of PRP/GEC is essentially zero, orders of magnitude lower than that of LL in practice on real hardware. PRP with proof generation means that the overwhelming majority of candidates, which are composite, can be shown composite with certainty.

PRP with proof generation means a Mersenne number can be ruled out as a prime with effort < 1.01 primality tests (proof power 8 or higher). PRP with GEC but without proof takes two PRP runs, which will have extremely low error rate so require ~2.00 primality tests effort. Due to the typical LL error rate of 2%/test, LL needs 2.04 primality tests on average, to show a Mersenne number composite, at more than double the cost and with less certainty than PRP first test & proof.
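
The effort figures above can be reproduced with a small model. This is a sketch under a stated assumption: the LL formula `2 / (1 - error_rate)` is not given in the post; it is a simple model (two good results are needed and each run is good with probability `1 - error_rate`) chosen here because it matches the quoted 2.04 and 2.5. The PRP-with-proof formula follows the `1 + 1/2^proofpower` expression in the table.

```python
def ll_effort(error_rate: float) -> float:
    """Assumed model: expected LL runs needed to obtain two good results."""
    return 2.0 / (1.0 - error_rate)


def prp_gec_effort() -> float:
    """PRP with GEC but no proof: a first test plus a double check."""
    return 2.0


def prp_proof_effort(power: int) -> float:
    """PRP with proof: one full test plus a cert costing 1/2**power of a test."""
    return 1.0 + 2.0 ** -power


print(f"LL at 2% error rate:   {ll_effort(0.02):.2f} tests")
print(f"LL at 20% error rate:  {ll_effort(0.20):.2f} tests")
print(f"PRP/GEC, no proof:     {prp_gec_effort():.2f} tests")
print(f"PRP/GEC/proof power 8: {prp_proof_effort(8):.4f} tests")
```

The model ignores the small overhead of GEC checks and proof generation, so the PRP figures are lower bounds rather than measurements.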

PRP with proof generation and cert is also immune to deliberately faking results, even if the person running the Cert assignment issued by the server is the same person that submitted the proof file to the server.

The one drawback of PRP with proof is it needs more disk space, either locally or elsewhere on the network, in which to store many temporary residues. Longer cert time can be traded for reduced space requirements. A proof power 5 still saves ~97% of verification effort while needing ~3% of the temporaries space of a power 10 proof.
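
The trade-off between proof power, cert time, and temporary storage can be tabulated. The sketch below assumes that a power-N proof keeps 2^N temporary residues, that the cert costs 1/2^N of a full test, and that each residue occupies about exponent/8 bytes; the byte estimate in particular is an assumption for illustration, not a measured gpuowl figure.

```python
import math


def residues_stored(power: int) -> int:
    """Number of temporary residues kept for a proof of the given power."""
    return 2 ** power


def cert_fraction(power: int) -> float:
    """Cert cost as a fraction of one full primality test."""
    return 1.0 / 2 ** power


def temp_space_gb(exponent: int, power: int) -> float:
    """Rough temporary space, assuming ceil(exponent / 8) bytes per residue."""
    bytes_per_residue = math.ceil(exponent / 8)
    return residues_stored(power) * bytes_per_residue / 1e9


EXPONENT = 77936867  # the exponent used in the benchmark logs in this thread

for power in (5, 8, 10):
    saved = 100.0 * (1.0 - cert_fraction(power))
    print(f"power {power:2d}: {residues_stored(power):4d} residues, "
          f"~{temp_space_gb(EXPONENT, power):.2f} GB, "
          f"verification effort saved {saved:.2f}%")
```

Under these assumptions a power 5 proof stores 32 residues against 1024 for power 10, which is where the ~3% space figure comes from.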

See also https://www.mersenneforum.org/showpo...06&postcount=9

Last fiddled with by kriesel on 2022-11-14 at 19:39
Old 2022-11-15, 01:51   #2827
moebius
 
Jul 2009
Germany

671₁₀ Posts

Quote:
Originally Posted by kriesel
Possibly related: note "OpenCL v3.0", which is a subset of OpenCL v2.0, not an enhancement, so it might not run gpuowl anyway. Also, its memory bandwidth falls short of the Radeon VII's.
https://www.tomshardware.com/news/in...-fp64-hardware confirms that the absence of FP64 in Intel consumer GPUs is by design, so that GPU is a candidate for mfakto, not gpuowl.
You're right. Only the Arc A310, A350, and the Alchemist Pro series (Arc Pro A40, Pro A50, Pro A30M) have, or will have, FP64 capability, but they also use OpenCL 3.0, so it is not certain that gpuOwl will run at all: https://www.techpowerup.com/gpu-spec...ort=generation

Last fiddled with by moebius on 2022-11-15 at 01:52

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
