mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Cloud Computing (https://www.mersenneforum.org/forumdisplay.php?f=134)
-   -   Google Diet Colab Notebook (https://www.mersenneforum.org/showthread.php?t=24646)

kriesel 2019-10-22 19:27

No Colab for you!
 
I had been reliably getting a new session on one account, with iffy results on the other. Shortly before noon both terminated early, and I haven't been able to get a session on either for hours, so I can't test any scripts at the moment.

PhilF 2019-10-22 20:26

[QUOTE=kriesel;528617]I had been reliably getting a new session on one account, with iffy results on the other. Shortly before noon both terminated early, and I haven't been able to get a session on either for hours, so I can't test any scripts at the moment.[/QUOTE]

CPU or GPU?

mnd9 2019-10-22 21:58

1 Attachment(s)
So my Kaggle commit just finished after running 9 hours, and no output tab exists! I was running a single long job with no intention of it finishing, but was hoping to collect the checkpoint file and resume with another session... the log says it exited with error code 137.

The run info section also says output size 0 (see attached).

Does it matter that I'm moving my input and exe to /usr/local/bin/ and running everything there?

Is there somewhere else I need to move/run files in order to be captured as output??

kriesel 2019-10-22 22:08

1 Attachment(s)
[QUOTE=PhilF;528621]CPU or GPU?[/QUOTE]
No VM backend at all, with or without the GPU accelerator.

PhilF 2019-10-22 22:15

[QUOTE=mnd9;528626]So my Kaggle commit just finished after running 9 hours, and no output tab exists! I was running a single long job with no intention of it finishing, but was hoping to collect the checkpoint file and resume with another session... the log says it exited with error code 137.

The run info section also says output size 0 (see attached).

Does it matter that I'm moving my input and exe to /usr/local/bin/ and running everything there?

Is there somewhere else I need to move/run files in order to be captured as output??[/QUOTE]

In the upper left click on where it says "1 commit". Then on that screen click on the output tab.

mnd9 2019-10-22 23:07

1 Attachment(s)
[QUOTE=PhilF;528629]In the upper left click on where it says "1 commit". Then on that screen click on the output tab.[/QUOTE]

Here's what I see when I click "1 commit" -- nothing is clearly labelled as tabs, and clicking on any of the columns returns me to the same page I showed in my prior post.

PhilF 2019-10-23 00:19

[QUOTE=mnd9;528634]Here's what I see when I click "1 commit" -- nothing is clearly labelled as tabs, and clicking on any of the columns returns me to the same page I showed in my prior post.[/QUOTE]

On my screen every word on that "Version 1" line is clickable (even though it doesn't look like it), and takes me to the screen where the Output tab is.

The difference is that I have a green check mark on the far left, indicating a successful run, instead of a red X. If your Version 1 isn't clickable, then that must be why.

mnd9 2019-10-23 00:40

[QUOTE=PhilF;528639]On my screen every word on that "Version 1" line is clickable (even though it doesn't look like it), and takes me to the screen where the Output tab is.

The difference is that I have a green check mark on the far left, indicating a successful run, instead of a red X. If your Version 1 isn't clickable, then that must be why.[/QUOTE]

This is what’s confusing: my Version 1 is clickable, but it just returns me to the page with Code, Data, Log, and Comments, with no Output.

In post 418, axn said I should have an output tab regardless of whether the kernel is killed after 9 hours or completes... I lost 9 hours of GPU quota and the kernel ran, so why is there no output?

Also, can anyone confirm what my log is telling me? It makes it look like the run failed with error code 137 after 5 seconds, but it gives no explanation and the kernel didn’t stop running. Supposedly that’s a memory error code, but I’m running something that requires little memory and that I know works from trying it in draft mode.
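For what it's worth, by the usual shell convention an exit status above 128 means the process was terminated by signal number (status - 128), so 137 would map to signal 9, SIGKILL. The kernel's out-of-memory killer sends SIGKILL, but so does any other forced kill, so the code alone doesn't prove a memory problem. Whether Kaggle's log follows this convention is an assumption on my part, and `describe_exit_status` below is just a hypothetical helper to show the arithmetic:

```python
import signal


def describe_exit_status(status):
    """Decode a shell-style exit status (hypothetical helper).

    Assumes the common convention that a status above 128 means
    "terminated by signal (status - 128)".
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # No named signal with this number on the current platform.
            return f"exit status {status}: signal {signum} (unnamed here)"
        return f"exit status {status}: terminated by {name} (signal {signum})"
    return f"exit status {status}: ordinary exit code"


print(describe_exit_status(137))
```

On Linux this names SIGKILL for 137, which fits a kernel being killed from outside rather than the program itself reporting an error.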

Finally, I’d like to test something, but I need help from a Python-savvy user out there: is there a way to issue a keyboard interrupt (i.e. Ctrl+C) after a certain time delay? My thought is to have my code interrupt my script before the kernel times out. Maybe that will result in a successful “complete” status, since all of the code cells will have run, hopefully giving me some output... thoughts?

PhilF 2019-10-23 01:50

[QUOTE=mnd9;528644]This is what’s confusing: my Version 1 is clickable, but it just returns me to the page with Code, Data, Log, and Comments, with no Output.

In post 418, axn said I should have an output tab regardless of whether the kernel is killed after 9 hours or completes... I lost 9 hours of GPU quota and the kernel ran, so why is there no output?[/quote]

I'll queue up an mprime job tomorrow that won't complete in 9 hours, then let you know if I can get to its output.

[QUOTE=mnd9;528644]Finally, I’d like to test something, but I need help from a Python-savvy user out there: is there a way to issue a keyboard interrupt (i.e. Ctrl+C) after a certain time delay? My thought is to have my code interrupt my script before the kernel times out. Maybe that will result in a successful “complete” status, since all of the code cells will have run, hopefully giving me some output... thoughts?[/QUOTE]

That's a good idea, and could help when using mprime also.
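One way to try the timed Ctrl+C idea from Python is to launch the long job as a child process, wait up to a deadline, and then send it SIGINT, which is the signal Ctrl+C produces on Linux. This is only a sketch under assumptions: `run_with_deadline`, `deadline_s`, and `grace_s` are names made up for illustration, it needs a POSIX system, and whether Kaggle then marks the commit as complete is untested.

```python
import signal
import subprocess
import sys


def run_with_deadline(cmd, deadline_s, grace_s=30.0):
    """Run cmd; if it is still running after deadline_s seconds, send SIGINT.

    Returns (returncode, interrupted). Hypothetical helper, POSIX only.
    """
    proc = subprocess.Popen(cmd)
    try:
        # Normal case: the job finishes on its own before the deadline.
        return proc.wait(timeout=deadline_s), False
    except subprocess.TimeoutExpired:
        pass

    # Deadline reached: same signal the terminal sends for Ctrl+C.
    proc.send_signal(signal.SIGINT)
    try:
        # Give the job time to write its checkpoint and exit cleanly.
        return proc.wait(timeout=grace_s), True
    except subprocess.TimeoutExpired:
        # Last resort if the job ignores SIGINT.
        proc.kill()
        return proc.wait(), True


if __name__ == "__main__":
    # Stand-in for a long job: a child Python process that sleeps.
    demo = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_deadline(demo, deadline_s=1, grace_s=5))
```

For a real run the deadline would be set somewhat under the 9-hour limit (for example 8.5 * 3600 seconds), leaving the grace period for the program to save its state before the cell returns.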

axn 2019-10-23 02:40

[QUOTE=mnd9;528626]Does it matter that I'm moving my input and exe to /usr/local/bin/ and running everything there?

Is there somewhere else I need to move/run files in order to be captured as output??[/QUOTE]

The Output tab contains the contents of the default folder (/kaggle/<i forget what it is>). If the checkpoint files, results files, etc. are not written to that particular folder, they won't show up in the Output tab.

EDIT:- /kaggle/working
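If the program has to run somewhere else, one option is to copy whatever should be kept into /kaggle/working before the session ends. A minimal sketch, assuming the files can be matched by glob patterns; `collect_outputs`, `src_dir`, and the patterns are hypothetical names, and `src_dir` must be a different folder from the destination:

```python
import glob
import os
import shutil


def collect_outputs(src_dir, patterns, dest_dir="/kaggle/working"):
    """Copy files in src_dir matching any glob pattern into dest_dir.

    Returns the list of destination paths. Hypothetical helper.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for pattern in patterns:
        for path in sorted(glob.glob(os.path.join(src_dir, pattern))):
            if os.path.isfile(path):
                # copy2 keeps timestamps and returns the new file's path.
                copied.append(shutil.copy2(path, dest_dir))
    return copied
```

Something like `collect_outputs("/usr/local/bin", ["checkpoint*"])` (pattern made up here) would be called from the notebook after the job stops. It only helps if the cell actually gets to run, so it does nothing for a kernel that is killed outright.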

axn 2019-10-23 02:54

[QUOTE=PhilF;528649]I'll queue up an mprime job tomorrow that won't complete in 9 hours, then let you know if I can get to its output[/QUOTE]
No need. This is what I'm already doing. If my run completes within 9 hours, I get results.json.txt; otherwise I get the pXXXXX files. I also get the mprime executable and all the accompanying text files, because I run the program directly from the default folder.


All times are UTC.