mersenneforum.org

gpuOwL: an OpenCL program for Mersenne primality testing (https://www.mersenneforum.org/showthread.php?t=22204)

mathwiz 2020-09-04 14:34

I've just started playing with gpuOwl (latest release from the git repo).

How do I submit proofs generated by gpuOwl? Is there a script somewhere? For results.txt I know I can just paste that into the manual submission form.

moebius 2020-09-04 14:45

[QUOTE=mathwiz;556011]I've just started playing with gpuOwl (latest release from the git repo).

How do I submit proofs generated by gpuOwl? Is there a script somewhere? For results.txt I know I can just paste that into the manual submission form.[/QUOTE]

Go to this page

[B][SIZE="1"][URL="https://mersenneforum.org/showpost.php?p=551735&postcount=9"]Kriesel proof upload reference material[/URL][/SIZE][/B]

kriesel 2020-09-04 14:46

[QUOTE=mathwiz;556011]I've just started playing with gpuOwl (latest release from the git repo).

How do I submit proofs generated by gpuOwl? Is there a script somewhere? For results.txt I know I can just paste that into the manual submission form.[/QUOTE]
See [URL="https://www.mersenneforum.org/showpost.php?p=553120&postcount=26"]post 26[/URL] of the gpuowl reference thread [url]https://www.mersenneforum.org/showthread.php?t=23391[/url]

Xyzzy 2020-09-04 18:58

[URL="https://pcpartpicker.com/product/7dtKHx/amd-radeon-pro-w5500-8-gb-video-card-100-506095"]5500[/URL]:[CODE]2020-06-05 17:13:16 gfx1012-0 OpenCL compilation in 3.10 s
2020-06-05 17:13:17 gfx1012-0 77936867 OK 0 loaded: blockSize 400, 0000000000000003
2020-06-05 17:13:21 gfx1012-0 77936867 OK 800 0.00%; 2982 us/it; ETA 2d 16:34; 1579c241dc63eca6 (check 1.27s)
2020-06-05 17:23:18 gfx1012-0 77936867 OK 200000 0.26%; 2991 us/it; ETA 2d 16:35; f0b04b45b0855bd2 (check 1.28s)
2020-06-05 17:33:15 gfx1012-0 77936867 OK 400000 0.51%; 2979 us/it; ETA 2d 16:10; c03f94396a5aa29e (check 1.27s)
2020-06-05 17:43:17 gfx1012-0 77936867 OK 600000 0.77%; 3004 us/it; ETA 2d 16:32; b9decd65ca71b629 (check 1.28s)[/CODE]

[URL="https://pcpartpicker.com/product/KBtWGX/evga-geforce-gtx-1080-ti-11gb-ftw-gaming-icx-video-card-11g-p4-6696-kr"]1080 Ti[/URL]:[CODE]2020-09-04 13:24:28 GeForce GTX 1080 Ti-0 OpenCL compilation in 2.02 s
2020-09-04 13:24:29 GeForce GTX 1080 Ti-0 77936867 OK 0 loaded: blockSize 400, 0000000000000003
2020-09-04 13:24:32 GeForce GTX 1080 Ti-0 77936867 OK 800 0.00%; 2481 us/it; ETA 2d 05:43; 1579c241dc63eca6 (check 1.04s)
2020-09-04 13:32:54 GeForce GTX 1080 Ti-0 77936867 OK 200000 0.26%; 2514 us/it; ETA 2d 06:18; f0b04b45b0855bd2 (check 1.04s)
2020-09-04 13:41:12 GeForce GTX 1080 Ti-0 77936867 OK 400000 0.51%; 2483 us/it; ETA 2d 05:29; c03f94396a5aa29e (check 1.05s)
2020-09-04 13:49:27 GeForce GTX 1080 Ti-0 77936867 OK 600000 0.77%; 2473 us/it; ETA 2d 05:07; b9decd65ca71b629 (check 1.06s)[/CODE]:mike:
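
As a rough cross-check on the ETA column in logs like the ones above: a PRP test of exponent p needs about p squarings, so the remaining time should be close to (p - iterations done) x us/it. The sketch below assumes exactly that. The function name `eta` is made up for illustration and is not part of gpuowl, and because the logged us/it is rounded, the result can differ from the log by a minute or so.

```python
def eta(exponent, done, us_per_it):
    """Estimate remaining time as (days, hours, minutes).

    Assumes a PRP test needs about `exponent` iterations in total;
    this helper is illustrative and not part of gpuowl.
    """
    remaining_s = (exponent - done) * us_per_it * 1e-6
    total_min = round(remaining_s / 60)
    days, rem = divmod(total_min, 24 * 60)
    hours, minutes = divmod(rem, 60)
    return days, hours, minutes


if __name__ == "__main__":
    # First progress line of the 1080 Ti log: exponent 77936867,
    # iteration 800, 2481 us/it.
    d, h, m = eta(77936867, 800, 2481)
    print(f"ETA {d}d {h:02d}:{m:02d}")
```

The same arithmetic can be applied to any of the progress lines in this thread by plugging in the exponent, the iteration count, and the us/it figure from that line.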

mathwiz 2020-09-04 21:53

[QUOTE=kriesel;556013]See [URL="https://www.mersenneforum.org/showpost.php?p=553120&postcount=26"]post 26[/URL] of the gpuowl reference thread [url]https://www.mersenneforum.org/showthread.php?t=23391[/url][/QUOTE]

OK. So basically:

1. manually upload results.txt
2. ./upload.py <username> <proof file>

?

ATH 2020-09-04 23:21

[QUOTE=ATH;555763]I'm trying to force 3M FFT now and see if it helps.[/QUOTE]

Since I started forcing the 3M FFT, all 15+ results have been good, so gpuowl is most likely using too aggressive an FFT selection around 54M.

I tested a 54.1M exponent in Prime95 and it chose 2880K compared to 2.75M = 2816K for gpuowl.
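
To put numbers on the FFT sizes mentioned here: the gpuowl logs in this thread print a "bpw" (bits per word) figure that matches exponent / FFT length, e.g. 18.76 for 54094109 at 2.75M. Below is a small sketch, assuming bpw is simply that ratio and that K means 1024 words. The helper name `bits_per_word` is hypothetical, not something taken from gpuowl or Prime95.

```python
K = 1024  # FFT lengths in this thread are quoted in K (1024) or M (1024 * 1024) words


def bits_per_word(exponent, fft_len_k):
    """Average bits stored per FFT word (assumed to be exponent / FFT length)."""
    return exponent / (fft_len_k * K)


if __name__ == "__main__":
    exponent = 54094109  # the 54.09M exponent tested elsewhere in this thread
    for label, size_k in [("2.75M", 2816), ("2880K", 2880), ("3M", 3072)]:
        print(f"{label:>6}: {bits_per_word(exponent, size_k):.2f} bpw")
```

A larger FFT stores fewer bits per word, which leaves more headroom against roundoff error at the cost of slower iterations.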

Xyzzy 2020-09-05 00:04

[URL="https://pcpartpicker.com/product/vPVBD3/gigabyte-video-card-gvn98txtreme6gd"]980 Ti[/URL]:[CODE]2020-09-04 17:42:56 GeForce GTX 980 Ti-0 OpenCL compilation in 1.83 s
2020-09-04 17:42:58 GeForce GTX 980 Ti-0 77936867 OK 0 loaded: blockSize 400, 0000000000000003
2020-09-04 17:43:04 GeForce GTX 980 Ti-0 77936867 OK 800 0.00%; 4221 us/it; ETA 3d 19:23; 1579c241dc63eca6 (check 1.73s)
2020-09-04 17:57:13 GeForce GTX 980 Ti-0 77936867 OK 200000 0.26%; 4258 us/it; ETA 3d 19:56; f0b04b45b0855bd2 (check 1.75s)
2020-09-04 18:11:28 GeForce GTX 980 Ti-0 77936867 OK 400000 0.51%; 4263 us/it; ETA 3d 19:49; c03f94396a5aa29e (check 1.75s)
2020-09-04 18:25:42 GeForce GTX 980 Ti-0 77936867 OK 600000 0.77%; 4262 us/it; ETA 3d 19:34; b9decd65ca71b629 (check 1.75s)[/CODE]:mike:

Prime95 2020-09-05 02:18

[QUOTE=ATH;556070]Since I started forcing the 3M FFT, all 15+ results have been good, so gpuowl is most likely using too aggressive an FFT selection around 54M.

I tested a 54.1M exponent in Prime95 and it chose 2880K compared to 2.75M = 2816K for gpuowl.[/QUOTE]


Here is a gpuowl run on a Radeon VII. Note the pErr values show there is a 1 in 200 chance of a 0.5 roundoff error during the entire PRP test. This is roughly what Mihai and I were targeting (actually it is a bit higher, as I never analyzed SMALL_HEIGHT=512).

Does your NVIDIA run show significantly different pErr values?

[CODE]george@haswell:~/testbed/gpuowl$ ./gpuowl -prp 54094109 -use STATS
2020-09-05 02:09:22 gpuowl v6.11-351-ge930f9e-dirty
2020-09-05 02:09:22 config: -device 0 -log 30000
2020-09-05 02:09:22 config: -prp 54094109 -use STATS
2020-09-05 02:09:22 device 0, unique id ''
2020-09-05 02:09:22 gfx906+sram-ecc-0 54094109 FFT: 2.75M 256:11:512 (18.76 bpw)
2020-09-05 02:09:22 gfx906+sram-ecc-0 Expected maximum carry32: 50F80000
2020-09-05 02:09:23 gfx906+sram-ecc-0 OpenCL args "-DEXP=54094109u -DWIDTH=256u -DSMALL_HEIGHT=512u -DMIDDLE=11u -DPM1=0 -DAMDGPU=1 -DMM_CHAIN=2u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0x1.73cb21a140106p-3 -DIWEIGHT_STEP_MINUS_1=-0x1.3aab2a02e6dbp-3 -DSTATS=1 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-09-05 02:09:32 gfx906+sram-ecc-0 OpenCL compilation in 8.82 s
2020-09-05 02:09:33 gfx906+sram-ecc-0 54094109 OK 0 loaded: blockSize 400, 0000000000000003
2020-09-05 02:09:33 gfx906+sram-ecc-0 validating proof residues for power 8
2020-09-05 02:09:33 gfx906+sram-ecc-0 Proof using power 8
2020-09-05 02:09:34 gfx906+sram-ecc-0 54094109 OK 800 0.00%; 1012 us/it; ETA 0d 15:12; c9130b584a5c2453 (check 0.51s)
2020-09-05 02:10:04 gfx906+sram-ecc-0 Roundoff: N=29673, mean 0.246350, SD 0.014346, CV 0.058235, max 0.364949, z 17.7 (pErr 0.429960%)
2020-09-05 02:10:04 gfx906+sram-ecc-0 54094109 OK 30000 0.06%; 1001 us/it; ETA 0d 15:02; 8035402484acbe73 (check 0.53s)
2020-09-05 02:10:34 gfx906+sram-ecc-0 Roundoff: N=30475, mean 0.246477, SD 0.014465, CV 0.058689, max 0.376085, z 17.5 (pErr 0.523875%)
[/CODE]
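
The z value in the Roundoff lines above appears to be the distance from the mean to 0.5, measured in standard deviations, and CV appears to be SD / mean. Both definitions reproduce the printed numbers. The sketch below parses one such line and recomputes them. The regular expression and function names are illustrative and are not gpuowl code, and how gpuowl turns z into pErr is not reproduced here.

```python
import re

# Matches the "Roundoff:" lines printed with -use STATS in the logs in this thread.
ROUNDOFF_RE = re.compile(
    r"Roundoff: N=(?P<n>\d+), mean (?P<mean>[\d.]+), SD (?P<sd>[\d.]+), "
    r"CV (?P<cv>[\d.]+), max (?P<max>[\d.]+), z (?P<z>[\d.]+) "
    r"\(pErr (?P<perr>[\d.]+)%\)"
)


def parse_roundoff(line):
    """Return the numeric fields of a Roundoff log line as a dict, or None."""
    m = ROUNDOFF_RE.search(line)
    if m is None:
        return None
    fields = {k: float(v) for k, v in m.groupdict().items()}
    fields["n"] = int(fields["n"])
    return fields


def recompute(fields):
    """Recompute z and CV from mean and SD (assumed definitions, see above)."""
    z = (0.5 - fields["mean"]) / fields["sd"]
    cv = fields["sd"] / fields["mean"]
    return z, cv


if __name__ == "__main__":
    line = ("2020-09-05 02:10:04 gfx906+sram-ecc-0 Roundoff: N=29673, mean 0.246350, "
            "SD 0.014346, CV 0.058235, max 0.364949, z 17.7 (pErr 0.429960%)")
    f = parse_roundoff(line)
    z, cv = recompute(f)
    print(f"z = {z:.1f} (logged {f['z']}), CV = {cv:.6f} (logged {f['cv']})")
```

Running the same check over a full log makes it easy to compare the roundoff statistics of two cards on the same exponent and FFT size.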

moebius 2020-09-05 03:26

[QUOTE=Prime95;556089]Here is a gpuowl run on a Radeon VII. [/QUOTE]
Here is a gpuowl run on a Vega 64, available in Germany (maybe in the EU) for 269 EUR, but only with PCI Express 3.0 x16, a maximum resolution of 2560 x 1600, and DVI only.

[SIZE="1"][B]C:\Users\gesch\gpuowl-v6.11-364-g36f4e2a>gpuowl-win -prp 54094109
2020-09-05 05:07:11 gpuowl v6.11-364-g36f4e2a
2020-09-05 05:07:11 config: -user geschwen
2020-09-05 05:07:11 config: -cpu AMD_RXVega64
2020-09-05 05:07:11 config: -prp 54094109
2020-09-05 05:07:11 device 0, unique id ''
2020-09-05 05:07:11 AMD_RXVega64 54094109 FFT: 2.75M 256:11:512 (18.76 bpw)
2020-09-05 05:07:11 AMD_RXVega64 Expected maximum carry32: 50F80000
2020-09-05 05:07:12 AMD_RXVega64 OpenCL args "-DEXP=54094109u -DWIDTH=256u -DSMALL_HEIGHT=512u -DMIDDLE=11u -DPM1=0 -DAMDGPU=1 -DMM_CHAIN=2u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0xb.9e590d0a0083p-6 -DIWEIGHT_STEP_MINUS_1=-0x9.d559501736d8p-6 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-09-05 05:07:12 AMD_RXVega64 ASM compilation failed, retrying compilation using NO_ASM
2020-09-05 05:07:18 AMD_RXVega64 OpenCL compilation in 6.88 s
2020-09-05 05:07:19 AMD_RXVega64 54094109 OK 0 loaded: blockSize 400, 0000000000000003
2020-09-05 05:07:19 AMD_RXVega64 validating proof residues for power 8
2020-09-05 05:07:19 AMD_RXVega64 Proof using power 8
2020-09-05 05:07:20 AMD_RXVega64 54094109 OK 800 0.00%; 1035 us/it; ETA 0d 15:33; c9130b584a5c2453 (check 0.47s)
2020-09-05 05:11:01 AMD_RXVega64 Stopping, please wait[/B][/SIZE]

[URL="https://www.ebay.de/itm/ASUS-ROG-Strix-Radeon-RX-Vega-64-Special-Edition-Gaming-Grafikkarte-/184431161372?_trksid=p2349526.m4383.l4275.c10#rwid"]https://www.ebay.de/itm/ASUS-ROG-Strix-Radeon-RX-Vega-64-Special-Edition-Gaming-Grafikkarte-/184431161372?_trksid=p2349526.m4383.l4275.c10#rwid[/URL]

and a present for Mr. Ernst Mayer, available again by chance.
[URL="https://www.mindfactory.de/product_info.php/16GB-XFX-Radeon-VII-Aktiv-PCIe-3-0-x16--Retail-_1296273.html"]https://www.mindfactory.de/product_info.php/16GB-XFX-Radeon-VII-Aktiv-PCIe-3-0-x16--Retail-_1296273.html[/URL]

ATH 2020-09-05 14:42

[QUOTE=Prime95;556089]Here is a gpuowl run on a Radeon VII. Note the pErr values show there is a 1 in 200 chance of a 0.5 roundoff error during the entire PRP test. This is roughly what Mihai and I were targeting (actually it is a bit higher, as I never analyzed SMALL_HEIGHT=512).

Does your NVIDIA run show significantly different pErr values?[/QUOTE]

No, it is weird. I will do an entire exponent with 2.75M FFT and -use STATS on.

I had about 11 DC mismatches (not all confirmed yet) with P100 and V100 cards, and none since I started using the 3M FFT. All failures are >54M; I have done many DCs in 51M-53M without any issues.

[CODE]root@cuda2:/content/drive/My Drive/test# ./gpuowl -prp 54094109 -use STATS
2020-09-05 14:35:09 gpuowl v6.11-380-g79ea0cc
2020-09-05 14:35:09 config: -use CARRY32,OUT_WG=64,OUT_SIZEX=8,OUT_SPACING=4,IN_WG=64,IN_SIZEX=8,IN_SPACING=4
2020-09-05 14:35:10 config: -prp 54094109 -use STATS
2020-09-05 14:35:10 device 0, unique id ''
2020-09-05 14:35:10 Tesla V100-SXM2-16GB-0 54094109 FFT: 2.75M 256:11:512 (18.76 bpw)
2020-09-05 14:35:10 Tesla V100-SXM2-16GB-0 Expected maximum carry32: 50F80000
2020-09-05 14:35:11 Tesla V100-SXM2-16GB-0 OpenCL args "-DEXP=54094109u -DWIDTH=256u -DSMALL_HEIGHT=512u -DMIDDLE=11u -DPM1=0 -DMM_CHAIN=2u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0x1.73cb21a140106p-3 -DIWEIGHT_STEP_MINUS_1=-0x1.3aab2a02e6dbp-3 -DCARRY32=1 -DIN_SIZEX=8 -DIN_SPACING=4 -DIN_WG=64 -DOUT_SIZEX=8 -DOUT_SPACING=4 -DOUT_WG=64 -DSTATS=1 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-09-05 14:35:15 Tesla V100-SXM2-16GB-0

2020-09-05 14:35:15 Tesla V100-SXM2-16GB-0 OpenCL compilation in 3.89 s
2020-09-05 14:35:15 Tesla V100-SXM2-16GB-0 54094109 OK 0 loaded: blockSize 400, 0000000000000003
2020-09-05 14:35:15 Tesla V100-SXM2-16GB-0 validating proof residues for power 8
2020-09-05 14:35:15 Tesla V100-SXM2-16GB-0 Proof using power 8
2020-09-05 14:35:16 Tesla V100-SXM2-16GB-0 54094109 OK 800 0.00%; 304 us/it; ETA 0d 04:34; c9130b584a5c2453 (check 0.23s)
2020-09-05 14:36:16 Tesla V100-SXM2-16GB-0 Roundoff: N=200098, mean 0.246357, SD 0.014325, CV 0.058147, max 0.365655, z 17.7 (pErr 0.415867%)
2020-09-05 14:36:16 Tesla V100-SXM2-16GB-0 54094109 OK 200000 0.37%; 304 us/it; ETA 0d 04:33; 40642cc2c4948d4e (check 0.22s)
2020-09-05 14:37:17 Tesla V100-SXM2-16GB-0 Roundoff: N=200900, mean 0.246390, SD 0.014403, CV 0.058455, max 0.411859, z 17.6 (pErr 0.471407%)
2020-09-05 14:37:18 Tesla V100-SXM2-16GB-0 54094109 OK 400000 0.74%; 305 us/it; ETA 0d 04:33; c24b99c4d1130b46 (check 0.22s)
2020-09-05 14:38:19 Tesla V100-SXM2-16GB-0 Roundoff: N=200900, mean 0.246320, SD 0.014335, CV 0.058195, max 0.382945, z 17.7 (pErr 0.420977%)
2020-09-05 14:38:19 Tesla V100-SXM2-16GB-0 54094109 OK 600000 1.11%; 305 us/it; ETA 0d 04:32; 68a6004748a738a2 (check 0.22s)
2020-09-05 14:39:20 Tesla V100-SXM2-16GB-0 Roundoff: N=200900, mean 0.246378, SD 0.014331, CV 0.058168, max 0.361628, z 17.7 (pErr 0.420930%)
2020-09-05 14:39:20 Tesla V100-SXM2-16GB-0 54094109 OK 800000 1.48%; 305 us/it; ETA 0d 04:31; 98ac6e527b6749ef (check 0.22s)
2020-09-05 14:39:37 Tesla V100-SXM2-16GB-0 Stopping, please wait..
2020-09-05 14:39:37 Tesla V100-SXM2-16GB-0 Roundoff: N=55738, mean 0.246389, SD 0.014440, CV 0.058606, max 0.371908, z 17.6 (pErr 0.499662%)
2020-09-05 14:39:37 Tesla V100-SXM2-16GB-0 54094109 OK 855200 1.58%; 307 us/it; ETA 0d 04:32; 3dfde62dfb22e13c (check 0.22s)
2020-09-05 14:39:37 Tesla V100-SXM2-16GB-0 Exiting because "stop requested"
2020-09-05 14:39:37 Tesla V100-SXM2-16GB-0 Bye
[/CODE]

Viliam Furik 2020-09-05 15:25

[QUOTE=moebius;556093]...and a present for Mr. Ernst Mayer, available again by chance.
[URL="https://www.mindfactory.de/product_info.php/16GB-XFX-Radeon-VII-Aktiv-PCIe-3-0-x16--Retail-_1296273.html"]https://www.mindfactory.de/product_info.php/16GB-XFX-Radeon-VII-Aktiv-PCIe-3-0-x16--Retail-_1296273.html[/URL][/QUOTE]
I checked the link. It seems that they only ship to some European countries, unfortunately not including Slovakia.


All times are UTC.