mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Cloud Computing (https://www.mersenneforum.org/forumdisplay.php?f=134)
-   -   Google Diet Colab Notebook (https://www.mersenneforum.org/showthread.php?t=24646)

Fan Ming 2020-06-29 13:25

1 Attachment(s)
Compiled a new version of gpuowl (commit e5a8f2c: [url]https://github.com/preda/gpuowl/commit/e5a8f2c5942b3bdb88d92c160bc5dcd953ae33cf[/url]) on Google Colab.
Not tested, but should work.

moebius 2020-06-29 14:15

[QUOTE=Fan Ming;549346]Compiled new version of gpuowl. Not tested, but should work.[/QUOTE]
Unfortunately, it still doesn't work for me.

2020-06-29 14:12:09 gpuowl
2020-06-29 14:12:09 Note: not found 'config.txt'
2020-06-29 14:12:09 config: -prp 333999549
2020-06-29 14:12:09 device 0, unique id ''
2020-06-29 14:12:09 Exception gpu_error: clGetPlatformIDs(16, platforms, (unsigned *) &nPlatforms) at clwrap.cpp:71 getDeviceIDs
2020-06-29 14:12:09 Bye

kriesel 2020-06-29 15:49

[QUOTE=moebius;549349]Unfortunately, it still doesn't work for me.

2020-06-29 14:12:09 gpuowl
2020-06-29 14:12:09 Note: [B]not found 'config.txt'[/B]
2020-06-29 14:12:09 config: -prp 333999549
2020-06-29 14:12:09 device 0, unique id ''
2020-06-29 14:12:09 Exception gpu_error: clGetPlatformIDs(16, platforms, (unsigned *) &nPlatforms) at clwrap.cpp:71 getDeviceIDs
2020-06-29 14:12:09 Bye[/QUOTE]

Consider using worktodo.txt for what changes (work assignments) and config.txt for what normally doesn't. Create a config.txt that tells gpuowl which device to use, and retry.
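
For anyone doing this from a Colab cell, here is a minimal sketch of creating such a config.txt. It assumes that config.txt holds ordinary command-line options on a single line and that "-device 0" is the option that selects the device; neither is confirmed in this thread, so check the -h output of your build first. The helper name write_config is made up for the example.

```python
import tempfile
from pathlib import Path


def write_config(workdir, options):
    """Write gpuowl options to <workdir>/config.txt on one line.

    Assumption: gpuowl reads config.txt from its working directory and
    treats the content as extra command-line options.
    """
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    config_path = workdir / "config.txt"
    config_path.write_text(" ".join(options) + "\n")
    return config_path


# Demo in a throwaway directory; "-device 0" is an assumed option name.
with tempfile.TemporaryDirectory() as tmp:
    path = write_config(tmp, ["-device", "0"])
    print(path.read_text(), end="")
```

In a real session you would pass the directory that holds the gpuowl binary instead of a temporary one, and keep the exponent itself in worktodo.txt.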

Fan Ming 2020-06-29 15:56

[QUOTE=moebius;549349]Unfortunately, it still doesn't work for me.

2020-06-29 14:12:09 gpuowl
2020-06-29 14:12:09 Note: not found 'config.txt'
2020-06-29 14:12:09 config: -prp 333999549
2020-06-29 14:12:09 device 0, unique id ''
2020-06-29 14:12:09 Exception gpu_error: clGetPlatformIDs(16, platforms, (unsigned *) &nPlatforms) at clwrap.cpp:71 getDeviceIDs
2020-06-29 14:12:09 Bye[/QUOTE]

Is this a GPU session (Runtime type -> Accelerator should be GPU)? It seems that no GPU was detected.
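
A quick way to check from a notebook cell before starting gpuowl is to ask nvidia-smi which GPUs it sees. This is only a sketch: it assumes the usual "GPU 0: <name> (UUID: ...)" layout of `nvidia-smi -L`, and the function names are made up for the example.

```python
import re
import subprocess


def parse_gpu_list(text):
    """Extract GPU model names from `nvidia-smi -L` style output."""
    return re.findall(r"^GPU \d+: (.+?) \(UUID", text, flags=re.MULTILINE)


def attached_gpus():
    """Return the GPU names visible to this session, or [] if none."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
        )
    except OSError:
        # nvidia-smi is missing or cannot be executed: likely a CPU-only runtime.
        return []
    if result.returncode != 0:
        return []
    return parse_gpu_list(result.stdout)


gpus = attached_gpus()
print(gpus if gpus else "No GPU detected; switch the runtime accelerator to GPU.")
```

An empty list here would match the clGetPlatformIDs exception in the quoted log, since there is no device for OpenCL to enumerate.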

moebius 2020-06-29 17:06

[QUOTE=Fan Ming;549355]Is this a GPU session (Runtime type -> Accelerator should be GPU)? It seems that no GPU was detected.[/QUOTE]

No, but I've changed this. Thanks for the support.

Unfortunately, the Tesla K80 does not seem to run flawlessly.


2020-06-29 17:02:24 gpuowl
2020-06-29 17:02:24 Note: not found 'config.txt'
2020-06-29 17:02:24 config: -prp 333999549
2020-06-29 17:02:24 device 0, unique id ''
2020-06-29 17:02:24 Tesla K80-0 333999549 FFT: 18M 1K:9:1K (17.70 bpw)
2020-06-29 17:02:24 Tesla K80-0 Expected maximum carry32: 6E890000
2020-06-29 17:02:26 Tesla K80-0 OpenCL args "-DEXP=333999549u -DWIDTH=1024u -DSMALL_HEIGHT=1024u -DMIDDLE=9u -DPM1=0 -DMM2_CHAIN=1u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0x1.e0807b194ba13p-3 -DIWEIGHT_STEP_MINUS_1=-0x1.8530a90b718e5p-3 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-06-29 17:02:29 Tesla K80-0

2020-06-29 17:02:29 Tesla K80-0 OpenCL compilation in 2.77 s
2020-06-29 17:02:36 Tesla K80-0 333999549 OK 0 loaded: blockSize 400, 0000000000000003
2020-06-29 17:02:53 Tesla K80-0 333999549 EE 800 0.00%; 13847 us/it; ETA 53d 12:43; e5e2d3b8fb255c88 (check 5.99s)
2020-06-29 17:02:59 Tesla K80-0 333999549 OK 0 loaded: blockSize 400, 0000000000000003
2020-06-29 17:03:16 Tesla K80-0 333999549 EE 800 0.00%; 13862 us/it; ETA 53d 14:06; e5e2d3b8fb255c88 (check 6.00s) 1 errors
2020-06-29 17:03:22 Tesla K80-0 333999549 OK 0 loaded: blockSize 400, 0000000000000003
2020-06-29 17:03:39 Tesla K80-0 333999549 EE 800 0.00%; 13889 us/it; ETA 53d 16:36; e5e2d3b8fb255c88 (check 5.99s) 2 errors
2020-06-29 17:03:39 Tesla K80-0 3 sequential errors, will stop.
2020-06-29 17:03:39 Tesla K80-0 Exiting because "too many errors"
2020-06-29 17:03:39 Tesla K80-0 Bye
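
To make the pattern in this log easier to see, here is a small sketch that counts the "OK" and "EE" status lines. It assumes the line layout shown above (timestamp, device, exponent, status, iteration) and that "EE" marks a failed error check, which is how the log reads but is not stated anywhere in this thread. The function name is made up.

```python
import re
from collections import Counter

# timestamp, device name, exponent, status (OK or EE), iteration
STATUS_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} .*? (\d+) (OK|EE)\s+(\d+)"
)


def count_statuses(log_text):
    """Count OK and EE status lines in a gpuowl-style log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = STATUS_LINE.match(line)
        if match:
            counts[match.group(2)] += 1
    return counts


sample = """\
2020-06-29 17:02:36 Tesla K80-0 333999549 OK 0 loaded: blockSize 400, 0000000000000003
2020-06-29 17:02:53 Tesla K80-0 333999549 EE 800 0.00%; 13847 us/it; ETA 53d 12:43; e5e2d3b8fb255c88 (check 5.99s)
2020-06-29 17:03:39 Tesla K80-0 3 sequential errors, will stop.
"""
print(count_statuses(sample))
```

On the full log above this gives three OK lines (each a reload at iteration 0) and three EE lines at iteration 800, i.e. the run never gets past the first check.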

Fan Ming 2020-06-29 17:12

[QUOTE=moebius;549361]No, but I've changed this. Thanks for the support.

Unfortunately, the Tesla K80 does not seem to run flawlessly.

[/QUOTE]

Try the previous version I provided last year. Maybe optimizations in the newer gpuowl make the run unstable, or the newly compiled binary is itself faulty.

Fan Ming 2020-06-29 17:13

[QUOTE=moebius;549361]No, but I've changed this. Thanks for the support.

Unfortunately, the Tesla K80 does not seem to run flawlessly.


....
2020-06-29 17:02:24 Tesla K80-0 333999549 FFT: 18M 1K:9:1K (17.70 bpw)
2020-06-29 17:02:24 Tesla K80-0 Expected maximum carry32: 6E890000
2020-06-29 17:02:26 Tesla K80-0 OpenCL args "-DEXP=333999549u -DWIDTH=1024u -DSMALL_HEIGHT=1024u -DMIDDLE=9u -DPM1=0 -DMM2_CHAIN=1u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0x1.e0807b194ba13p-3 -DIWEIGHT_STEP_MINUS_1=-0x1.8530a90b718e5p-3 [COLOR="Red"][B]-cl-unsafe-math-optimizations[/B][/COLOR] -cl-std=CL2.0 -cl-finite-math-only "
2020-06-29 17:02:29 Tesla K80-0

[/QUOTE]

Or try disabling these optimizations.

chalsall 2020-06-29 17:19

[QUOTE=moebius;549361]Unfortunately, the Tesla K80 does not seem to run flawlessly.[/QUOTE]

When you get a K80 (the slowest GPU provided by Colab) choose "Factory Reset", and then reattach. Within a few attempts, you should get a P4, a P100, or a coveted T4.

moebius 2020-06-29 18:14

1) LL in the double-check range seems to work with the newly compiled version.

2020-06-29 17:58:56 gpuowl
2020-06-29 17:58:56 Note: not found 'config.txt'
2020-06-29 17:58:56 config: -ll 53537243
2020-06-29 17:58:56 device 0, unique id ''
2020-06-29 17:58:56 Tesla T4-0 53537243 FFT: 2.75M 256:11:512 (18.57 bpw)
2020-06-29 17:58:56 Tesla T4-0 Expected maximum carry32: 46D30000
2020-06-29 17:58:56 Tesla T4-0 OpenCL args "-DEXP=53537243u -DWIDTH=256u -DSMALL_HEIGHT=512u -DMIDDLE=11u -DPM1=0 -DMM_CHAIN=1u -DMM2_CHAIN=1u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0x1.6730c3da2ac15p-2 -DIWEIGHT_STEP_MINUS_1=-0x1.09ea3b9d0d90fp-2 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-06-29 17:58:56 Tesla T4-0

2020-06-29 17:58:56 Tesla T4-0 OpenCL compilation in 0.01 s
2020-06-29 17:58:56 Tesla T4-0 53537243 LL 119000 loaded: b7fe47edf1df9c7f
2020-06-29 18:02:54 Tesla T4-0 53537243 LL 200000 0.37%; 2937 us/it; ETA 1d 19:31; d1177beaa336d0f0
2020-06-29 18:07:47 Tesla T4-0 53537243 LL 300000 0.56%; 2932 us/it; ETA 1d 19:21; 3cf6b471d02c92b7


2) Disabling -cl-unsafe-math-optimizations

Unfortunately, I am not familiar enough with gpuowl to do that quickly.


3) With -prp 333999549, the same error occurred again on the Tesla T4.


I will run double checks for now.
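
As a sanity check on the ETA column in the LL log above, here is a rough sketch of the arithmetic. It assumes the test needs about one iteration per unit of the exponent and that the ETA is simply the remaining iterations times the reported us/it; gpuowl's own calculation may differ slightly.

```python
from datetime import timedelta


def estimate_eta(exponent, iteration, us_per_iteration):
    """Remaining time, assuming roughly `exponent` iterations in total."""
    remaining_iterations = exponent - iteration
    return timedelta(microseconds=remaining_iterations * us_per_iteration)


# Values taken from the "LL 200000 ... 2937 us/it" line above.
print(estimate_eta(53537243, 200000, 2937))
```

The result lands within a minute or so of the "ETA 1d 19:31" printed in the log, so the time per iteration on the T4 is what matters when deciding whether a run will fit into a Colab session.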

kriesel 2020-06-29 19:09

[QUOTE=moebius;549370]2) Disabling -cl-unsafe-math-optimizations[/QUOTE]
-h is your friend. Help output includes the option

-safeMath : do not use -cl-unsafe-math-optimizations (OpenCL)
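
As an illustration, this is a sketch of how the option could be added to the earlier -prp run when launching from a notebook cell. The binary path "./gpuowl" and the helper name are assumptions; only -prp and -safeMath come from this thread.

```python
import shlex


def gpuowl_command(binary, exponent, safe_math=True):
    """Build the argument list for a PRP run, optionally with -safeMath."""
    args = [binary, "-prp", str(exponent)]
    if safe_math:
        args.append("-safeMath")
    return args


# "./gpuowl" is a hypothetical path to the compiled binary.
command = gpuowl_command("./gpuowl", 333999549)
print(shlex.join(command))
```

The list can be passed directly to subprocess.run, or the option can go into config.txt so it applies to every run.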

moebius 2020-06-29 20:08

[QUOTE=kriesel;549375]-h is your friend. Help output includes the option

-safeMath : do not use -cl-unsafe-math-optimizations (OpenCL)[/QUOTE]


Thanks, I missed that; I will try it later. I'll let the DC run through before Colab throws me out.


2020-06-29 19:35:51 Tesla T4-0 53537243 OK 2000000 (jacobi == -1)
2020-06-29 19:40:44 Tesla T4-0 53537243 LL 2200000 4.11%; 2934 us/it; ETA 1d 17:51; 0ff568d9b44b522e
2020-06-29 19:45:37 Tesla T4-0 53537243 LL 2300000 4.30%; 2933 us/it; ETA 1d 17:44; 251446abea375f31
2020-06-29 19:50:31 Tesla T4-0 53537243 LL 2400000 4.48%; 2933 us/it; ETA 1d 17:40; ccfb451c0f8d3213
2020-06-29 19:55:24 Tesla T4-0 53537243 LL 2500000 4.67%; 2932 us/it; ETA 1d 17:34; 29045b8768c16eff
2020-06-29 20:00:17 Tesla T4-0 53537243 LL 2600000 4.86%; 2937 us/it; ETA 1d 17:33; bf1b2da355335b21
2020-06-29 20:00:17 Tesla T4-0 53537243 OK 2500000 (jacobi == -1)
2020-06-29 20:05:12 Tesla T4-0 53537243 LL 2700000 5.04%; 2941 us/it; ETA 1d 17:32; 4131529510cd3e19
All times are UTC.