mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Cloud Computing (https://www.mersenneforum.org/forumdisplay.php?f=134)
-   -   Google Diet Colab Notebook (https://www.mersenneforum.org/showthread.php?t=24646)

chalsall 2019-10-21 19:39

[QUOTE=kriesel;528512]!top -d n sorta does. Here n=120.[/QUOTE]

I stand corrected! :smile:

Thanks. You just taught me something.

mnd9 2019-10-21 20:37

I'm still a bit confused about using Kaggle. If I leave a job running in the edit window, it seems to time out and power off after a short while (well before the 6-hour max), making me lose everything. I tried committing, but then I'm confused as to how to re-enter the session and see the output from my code cells, or how to download any output files from the committed session. When I click on the committed session I see the "code" page, which I can't really make sense of, and if I click "edit" it seems to just open a new draft session...

Can someone give me some basic pointers on this?

EdH 2019-10-21 20:55

[QUOTE=Dylan14;528508]@EdH: issuing locate cuda.h yields the following results:
. . .
so maybe try adding /usr/include/linux/ to your makefile (where you define where the cuda library is located).[/QUOTE]
The "--with-cuda=" option worked for cuda.h, but the troubles aren't over:
[code]checking cuda.h usability... yes
checking cuda.h presence... yes
checking for cuda.h... yes
checking that CUDA Toolkit version is at least 3.0... no
configure: error: a newer version of the CUDA Toolkit is needed
[/code][code]cuda-toolkit-10-0/unknown,now 10.0.130-1 amd64 [installed,automatic]
CUDA Toolkit 10.0 meta-package
cuda-toolkit-10-1/unknown,now 10.1.243-1 amd64 [installed]
CUDA Toolkit 10.1 meta-package
[/code]Now 10 is somehow older than 3? (Maybe the check compares 1 to 3 instead of 10.) I guess I'm going to have to go into the configure code, make a change and see where that goes.
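
To illustrate that guess (and it is only a guess, since the configure script's actual test is not shown here): if version strings are compared character by character, "10.0" sorts before "3.0" because "1" comes before "3". A minimal Python sketch of the difference; the helper names `version_tuple` and `at_least` are made up for illustration and are not part of any configure script:

```python
def version_tuple(version: str) -> tuple:
    """Split a dotted version such as '10.1.243' into a tuple of integers.

    Assumes purely numeric components; a packaging suffix such as '-1'
    would have to be stripped before calling this.
    """
    return tuple(int(part) for part in version.split("."))


def at_least(found: str, required: str) -> bool:
    """Component-wise numeric comparison of two dotted versions."""
    return version_tuple(found) >= version_tuple(required)


# Plain string comparison goes wrong: '1' sorts before '3'.
naive_ok = "10.0" >= "3.0"

# Integer tuples compare the way a human reads version numbers.
numeric_ok = at_least("10.0", "3.0")

print("string comparison says new enough:", naive_ok)
print("numeric comparison says new enough:", numeric_ok)
```

If the configure check really does something like the first comparison, changing it to a numeric one would be the kind of edit to try.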

More later. Thanks all!

chalsall 2019-10-21 22:32

[QUOTE=mnd9;528521]I'm still a bit confused about using Kaggle. If I leave a job running in the edit window, it seems to time out and power off after a short while (well before the 6-hour max), making me lose everything. I tried committing, but then I'm confused as to how to re-enter the session and see the output from my code cells, or how to download any output files from the committed session. When I click on the committed session I see the "code" page, which I can't really make sense of, and if I click "edit" it seems to just open a new draft session...

Can someone give me some basic pointers on this?[/QUOTE]

All I can offer you is my own empirical observations. They might be of some use.

I have found the Kaggle browser-based User Interface (UI) somewhat confusing.

Right now I have a GPU attached instance up and running, working away. But the UI tells me that "The kernel is powered off. Click this banner to turn it back on."

Two things:

1. I'm logged into the instance "tail -f"'ing logs. I know it's still running.

2. My "GPU Quota" continues to count down. I once wasted ~8.5 hours of a 9-hour instance "happening" this way...

I've found that clicking on the banner causes a restart of the instance -- any SSH connections immediately drop, and the instance which becomes available by way of the UI is "virgin" (although its "uptime" might be several hours).

With regard to "Committed" jobs, my understanding is that there is somewhere within the VM's FS where you can place data for later harvesting. I haven't investigated that myself; I believe others here have.

As an aside, it's a good thing I "Hash". Last Saturday was ~12 km up and down steep hills. And interacting with humans in "meat-space".

If it wasn't for that weekly event, I might never get off my sorry little ass... :smile:

PhilF 2019-10-21 22:40

1 Attachment(s)
[QUOTE=mnd9;528521]I'm still a bit confused about using Kaggle. If I leave a job running in the edit window, it seems to time out and power off after a short while (well before the 6-hour max), making me lose everything. I tried committing, but then I'm confused as to how to re-enter the session and see the output from my code cells, or how to download any output files from the committed session. When I click on the committed session I see the "code" page, which I can't really make sense of, and if I click "edit" it seems to just open a new draft session...

Can someone give me some basic pointers on this?[/QUOTE]

After getting your code all ready to run, click on commit. Then you should get a screen like the one I have included here. At the top is a link to your committed run. You can use that link to get to that session, or, once the code has completed, your session will be listed over on the right side of the screen (where you can see my previous V1 through V6 commits; V7 will appear there once the code completes and exits).

Please note that files are not saved, only screen output is retained. What I have done in the code you see here is run some ECM curves (via GMP-ECM), which is configured to save its output to a file called ecm-out.txt. Then, in my code, after the line that invokes ecm, I use !cat ecm-out.txt to send it to the (virtual) screen.

After the run is complete, I can call up that commit (V7 in this case), and simply copy/paste the output into a local text file.
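
For anyone who prefers to do the same thing from a Python cell, here is a minimal sketch of the "write to a file, then echo the file to the screen" pattern. The function name `run_and_echo` and the stand-in command are assumptions for illustration; the real ecm command line is not shown in this thread, so none is guessed here, and the stdout redirect merely stands in for whatever setting makes ecm write its own output file.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_echo(command, log_path):
    """Run command with stdout/stderr sent to log_path, then print the log.

    Returns the exit code and the logged text.
    """
    log_path = Path(log_path)
    with log_path.open("w") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    text = log_path.read_text()
    print(text, end="")  # same effect as `!cat` after the job exits
    return result.returncode, text


# Stand-in for the real job: a child Python process that prints one line.
demo_command = [sys.executable, "-c", "print('stand-in for ecm output')"]
demo_log = Path(tempfile.mkdtemp()) / "ecm-out.txt"
returncode, logged = run_and_echo(demo_command, demo_log)
```

Like the `!cat` approach, this only echoes the file once the command has exited on its own.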

Hope this helps...

mnd9 2019-10-21 23:20

[QUOTE=chalsall;528530]All I can offer you is my own empirical observations. They might be of some use.

I have found the Kaggle browser-based User Interface (UI) somewhat confusing.

Right now I have a GPU attached instance up and running, working away. But the UI tells me that "The kernel is powered off. Click this banner to turn it back on."

Two things:

1. I'm logged into the instance "tail -f"'ing logs. I know it's still running.

2. My "GPU Quota" continues to count down. I once wasted ~8.5 hours of a 9-hour instance "happening" this way...

I've found that clicking on the banner causes a restart of the instance -- any SSH connections immediately drop, and the instance which becomes available by way of the UI is "virgin" (although its "uptime" might be several hours).

With regard to "Committed" jobs, my understanding is that there is somewhere within the VM's FS where you can place data for later harvesting. I haven't investigated that myself; I believe others here have.

As an aside, it's a good thing I "Hash". Last Saturday was ~12 km up and down steep hills. And interacting with humans in "meat-space".

If it wasn't for that weekly event, I might never get off my sorry little ass... :smile:[/QUOTE]

Thanks, Chris, for your insight. Today my kernels kept powering off, with all cells grayed out, so it seemed my only option was to click power on, which loses everything, as you noted.

How are you able to see things are still running with power “off” without access to the code cells? It seems I’m missing something...

PhilF 2019-10-21 23:29

[QUOTE=mnd9;528534]Thanks, Chris, for your insight. Today my kernels kept powering off, with all cells grayed out, so it seemed my only option was to click power on, which loses everything, as you noted.

How are you able to see things are still running with power “off” without access to the code cells? It seems I’m missing something...[/QUOTE]

I've noticed that too. If a kernel powers off, don't click on the banner to power it back on. Instead, close the window. Then open a new Kaggle window, go to your Notebooks, and choose the powered off notebook from there. Going at it that way, I have found the files are still there.

chalsall 2019-10-21 23:33

[QUOTE=mnd9;528534]How are you able to see things are still running with power “off” without access to the code cells? It seems I’m missing something...[/QUOTE]

It's a bit "Geeky", but please [URL="https://mersenneforum.org/showthread.php?t=24840"]see this[/URL].

These instances are full-blown Ubuntu (read: Linux) environments. Although all incoming network traffic is firewalled, all outgoing (and "established") traffic is allowed.

Thanks to the GPL, it is relatively trivial to get shell access into things like this.

mnd9 2019-10-21 23:35

[QUOTE=PhilF;528532]After getting your code all ready to run, click on commit. Then you should get a screen like the one I have included here. At the top is a link to your committed run. You can use that link to get to that session, or, once the code has completed, your session will be listed over on the right side of the screen (where you can see my previous V1 through V6 commits; V7 will appear there once the code completes and exits).

Please note that files are not saved, only screen output is retained. What I have done in the code you see here is run some ECM curves (via GMP-ECM), which is configured to save its output to a file called ecm-out.txt. Then, in my code, after the line that invokes ecm, I use !cat ecm-out.txt to send it to the (virtual) screen.

After the run is complete, I can call up that commit (V7 in this case), and simply copy/paste the output into a local text file.

Hope this helps...[/QUOTE]

This is helpful! What if I wanted to work on a single exponent over several sessions (e.g., a wavefront exponent using cudalucas or gpuowl)? Is there a way to cat the checkpoint file and snag it? And when you say the run “completes”, you mean it reaches the time limit and is killed, right?

PhilF 2019-10-21 23:39

[QUOTE=mnd9;528537]This is helpful! What if I wanted to work on a single exponent over several sessions (e.g., a wavefront exponent using cudalucas or gpuowl)? Is there a way to cat the checkpoint file and snag it? And when you say the run “completes”, you mean it reaches the time limit and is killed, right?[/QUOTE]

No, I mean the executable exits to the shell, like GMP-ECM does when its work is complete. Unfortunately, mprime does not do that.

I don't know about the other executables people are running. If they won't exit, I assume the job is killed after 9 hours, but then the cat statement(s) would never get a chance to execute. :no:

However, if the executable can send its output data to the screen as it is running, then you probably could go into the job after 9 hours and retrieve that screen output.
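
One way around the "never exits" problem might be to give the program a time budget of your own, shorter than the session limit, so that whatever comes after it still gets a chance to run. The sketch below echoes the child's output to the screen and to a log file while it runs, and asks it to stop when the budget is spent. The names `run_with_budget` and `_pump` are made up for illustration, the demo command is a stand-in, and whether a given program saves a checkpoint when asked to stop is not something this sketch assumes.

```python
import subprocess
import sys
import tempfile
import threading
from pathlib import Path


def _pump(stream, log):
    """Copy each line from the child to the screen and to the log file."""
    for line in stream:
        print(line, end="", flush=True)
        log.write(line)
        log.flush()


def run_with_budget(command, budget_seconds, log_path):
    """Run command, echoing its output live; stop it after budget_seconds.

    Returns True if the command exited on its own, False if it was stopped.
    """
    with open(log_path, "w") as log:
        proc = subprocess.Popen(
            command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )
        pump = threading.Thread(target=_pump, args=(proc.stdout, log))
        pump.start()
        try:
            proc.wait(timeout=budget_seconds)
            finished = True
        except subprocess.TimeoutExpired:
            proc.terminate()  # ask the child to stop
            proc.wait()
            finished = False
        pump.join()  # the pipe closes when the child exits, ending the pump
        proc.stdout.close()
    return finished


# Stand-in for a program that never exits: prints "tick" forever.
demo_command = [
    sys.executable, "-u", "-c",
    "import time\nwhile True:\n    print('tick')\n    time.sleep(0.1)",
]
log_file = str(Path(tempfile.mkdtemp()) / "run.log")
finished = run_with_budget(demo_command, 2, log_file)
print("exited on its own:", finished)
```

In a real session the budget would be set comfortably below the 9-hour limit mentioned above, leaving time for any follow-up cat statements.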

axn 2019-10-22 02:45

[QUOTE=mnd9;528521]I'm still a bit confused about using Kaggle. If I leave a job running in the edit window, it seems to time out and power off after a short while (well before the 6-hour max), making me lose everything. I tried committing, but then I'm confused as to how to re-enter the session and see the output from my code cells, or how to download any output files from the committed session.[/QUOTE]
Once your committed job has finished, you will be able to see all your files in the Output tab.

[QUOTE=PhilF;528532]Please note that files are not saved, only screen output is retained.[/QUOTE]
That's exactly upside down. Only files are retained, in the Output tab. Screen output is lost (unless you redirect it to a file).

[QUOTE=mnd9;528537]Is there a way to cat the checkpoint file and snag it?[/QUOTE]
Yes. In the Output tab, all the files in the kaggle folder will be available once the session completes (either the code has run to completion, or the session was killed after 9 hours (CPU) / 6 hours (GPU)).
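
If the program writes its checkpoint files somewhere other than the folder that gets kept, something along these lines could gather them up before the session ends. This is only a sketch: `collect_files` is a made-up helper, the file names in the demo are hypothetical (the thread does not say what cudalucas or gpuowl call their checkpoint files), and the demo uses temporary directories rather than any real Kaggle path.

```python
import shutil
import tempfile
from pathlib import Path


def collect_files(source_dir, output_dir, patterns):
    """Copy files matching any glob pattern from source_dir into output_dir.

    Returns the names of the files that were copied.
    """
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in patterns:
        for path in sorted(source_dir.glob(pattern)):
            if path.is_file():
                shutil.copy2(path, output_dir / path.name)
                copied.append(path.name)
    return copied


# Demo with throwaway directories and hypothetical file names.
work_dir = Path(tempfile.mkdtemp())
keep_dir = Path(tempfile.mkdtemp()) / "kept"  # stand-in for the retained folder (assumption)
(work_dir / "run.ckpt").write_text("checkpoint data")
(work_dir / "scratch.tmp").write_text("not needed")

copied = collect_files(work_dir, keep_dir, ["*.ckpt"])
print("copied:", copied)
```

Running it as the last step of the notebook only helps if that step is actually reached, so it pairs naturally with a program that exits (or is stopped) before the session limit.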

EDIT:- Key to the kingdom: https://www.kaggle.com/<yourid>/<yourkernel>/

