mersenneforum.org  

Old 2019-10-22, 19:27   #430
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

153D₁₆ Posts
No Colab for you!

I had been reliably getting a new session on one account, and it was iffy on the other. Shortly before noon both terminated early, and I haven't been able to get a session on either for hours, so I can no longer test any scripts at the moment.
Old 2019-10-22, 20:26   #431
PhilF
 
 
Feb 2005
Colorado

2·7·47 Posts

Quote:
Originally Posted by kriesel View Post
I had been reliably getting a new session on one account, and it was iffy on the other. Shortly before noon both terminated early, and I haven't been able to get a session on either for hours, so I can no longer test any scripts at the moment.
CPU or GPU?
Old 2019-10-22, 21:58   #432
mnd9
 
Jun 2019
Boston, MA

3·13 Posts

So my Kaggle commits just finished after running 9 hours, and no output tab exists! I was running a single long job with no intention of it finishing, but was hoping to collect the checkpoint file and resume with another session... it says exited with error code 137 in the log.

The run info section also says output size 0 (see attached).

Does it matter that I'm moving my input and exe to /usr/local/bin/ and running everything there?

Is there somewhere else I need to move/run files in order to be captured as output??
Attachment: test.jpg (145.4 KB)
Old 2019-10-22, 22:08   #433
kriesel
 
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

5,437 Posts

Quote:
Originally Posted by PhilF View Post
CPU or GPU?
No VM backend at all. With gpu accelerator, or without.
Attachment: no backend.png (23.1 KB)

Last fiddled with by kriesel on 2019-10-22 at 22:28
Old 2019-10-22, 22:15   #434
PhilF
 
 
Feb 2005
Colorado

658₁₀ Posts

Quote:
Originally Posted by mnd9 View Post
So my Kaggle commits just finished after running 9 hours, and no output tab exists! I was running a single long job with no intention of it finishing, but was hoping to collect the checkpoint file and resume with another session... it says exited with error code 137 in the log.

The run info section also says output size 0 (see attached).

Does it matter that I'm moving my input and exe to /usr/local/bin/ and running everything there?

Is there somewhere else I need to move/run files in order to be captured as output??
In the upper left click on where it says "1 commit". Then on that screen click on the output tab.
Old 2019-10-22, 23:07   #435
mnd9
 
Jun 2019
Boston, MA

47₈ Posts

Quote:
Originally Posted by PhilF View Post
In the upper left click on where it says "1 commit". Then on that screen click on the output tab.
Here's what I see when I click "1 commit" -- nothing is clearly labelled as tabs, and clicking on any of the columns returns me to the same page I showed in my prior post.
Attachment: versions.jpg (42.1 KB)
Old 2019-10-23, 00:19   #436
PhilF
 
 
Feb 2005
Colorado

2·7·47 Posts

Quote:
Originally Posted by mnd9 View Post
Here's what I see when I click "1 commit" -- nothing is clearly labelled as tabs, and clicking on any of the columns returns me to the same page I showed in my prior post.
On my screen every word on that "Version 1" line is clickable (even though it doesn't look like it), and takes me to the screen where the Output tab is.

The difference is that I have a green check mark on the far left, indicating a successful run, instead of a red X. If your Version 1 isn't clickable, then that must be why.
Old 2019-10-23, 00:40   #437
mnd9
 
Jun 2019
Boston, MA

3·13 Posts

Quote:
Originally Posted by PhilF View Post
On my screen every word on that "Version 1" line is clickable (even though it doesn't look like it), and takes me to the screen where the Output tab is.

The difference is that I have a green check mark on the far left, indicating a successful run, instead of a red X. If your Version 1 isn't clickable, then that must be why.
This is what’s confusing: my Version 1 is clickable, but it just returns me to the page with Code, Data, Log, and Comments, and no Output.

In post 418, axn said I should have an output tab regardless of whether the kernel is killed after 9 hours or completes... I lost 9 hours of GPU quota and the kernel ran, so why is there no output?

Also, can anyone confirm this? My log makes it look like the run failed with error code 137 after 5 seconds, but it gives no explanation and the kernel didn’t stop running. Supposedly that’s a memory error code, but I’m running something that requires little memory and that I know works from trying it in draft mode.

Finally, I’d like to test something, but I need help from a Python-savvy user out there: is there a way to issue a keyboard interrupt (i.e. Ctrl+C) after a certain time delay? My thought is to have my code interrupt my script before the kernel times out. Maybe that will result in a successful “complete” status, since all of the code cells will run, hopefully giving me some output... thoughts?

Last fiddled with by mnd9 on 2019-10-23 at 01:15 Reason: More info
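
On the 137 question: by common shell convention, an exit status above 128 means the process was terminated by signal number (status minus 128), so 137 corresponds to signal 9, SIGKILL. That is what a forced kill looks like, whether it comes from an out-of-memory killer or from the platform ending the run; the log line alone does not say which. A small sketch for decoding such codes (the helper name `describe_exit_code` is made up for illustration):

```python
import signal


def describe_exit_code(code):
    """Interpret a shell-style exit status.

    Values above 128 conventionally mean "killed by signal (code - 128)".
    """
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not defined on this platform.
            name = "unknown signal"
        return f"terminated by signal {signum} ({name})"
    if code == 0:
        return "exited normally"
    return f"exited with error status {code}"


print(describe_exit_code(137))
```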
Old 2019-10-23, 01:50   #438
PhilF
 
 
Feb 2005
Colorado

292₁₆ Posts

Quote:
Originally Posted by mnd9 View Post
This is what’s confusing: my Version 1 is clickable, but it just returns me to the page with Code, Data, Log, and Comments, and no Output.

In post 418, axn said I should have an output tab regardless of whether the kernel is killed after 9 hours or completes... I lost 9 hours of GPU quota and the kernel ran, so why is there no output?
I'll queue up a mprime job tomorrow that won't complete in 9 hours, then let you know if I can get to its output.

Quote:
Originally Posted by mnd9 View Post
Finally, I’d like to test something, but I need help from a Python-savvy user out there: is there a way to issue a keyboard interrupt (i.e. Ctrl+C) after a certain time delay? My thought is to have my code interrupt my script before the kernel times out. Maybe that will result in a successful “complete” status, since all of the code cells will run, hopefully giving me some output... thoughts?
That's a good idea, and could help when using mprime also.
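
A minimal sketch of the timed-interrupt idea, assuming the long job is launched from Python as a child process and that it saves its state when it receives SIGINT (the signal Ctrl+C sends); that behaviour should be checked for the program in question. The function name, the `./my_job` command, and the timing values are placeholders for illustration, not anything taken from Kaggle or mprime documentation.

```python
import signal
import subprocess


def run_with_timed_interrupt(cmd, delay_seconds, grace_seconds=30):
    """Run cmd; if it is still running after delay_seconds, send SIGINT.

    Returns the child's exit code. If the child ignores the interrupt for
    longer than grace_seconds, it is killed outright.
    """
    proc = subprocess.Popen(cmd)
    try:
        # The job may finish on its own before the deadline.
        return proc.wait(timeout=delay_seconds)
    except subprocess.TimeoutExpired:
        # Same signal a terminal sends on Ctrl+C.
        proc.send_signal(signal.SIGINT)
        try:
            return proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            # Last resort: the child did not shut down in time.
            proc.kill()
            return proc.wait()
```

For a 9-hour limit, one might call `run_with_timed_interrupt(["./my_job"], delay_seconds=8.5 * 3600)`, which leaves some margin for the program to write its checkpoint before the session is cut off.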
Old 2019-10-23, 02:40   #439
axn
 
 
Jun 2003

5,087 Posts

Quote:
Originally Posted by mnd9 View Post
Does it matter that I'm moving my input and exe to /usr/local/bin/ and running everything there?

Is there somewhere else I need to move/run files in order to be captured as output??
The output tab contains the contents of the default folder (/kaggle/<i forget what it is>). If the checkpoint files, results files, etc. are not written in that particular folder, they won't show up in the Output tab.

EDIT:- /kaggle/working
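
If the job has to run from some other directory, one option is to copy its files into /kaggle/working before the session ends. A rough sketch, assuming the files of interest can be matched by glob patterns; the helper name `collect_outputs` and the example pattern are hypothetical:

```python
import shutil
from pathlib import Path


def collect_outputs(run_dir, output_dir="/kaggle/working", patterns=("*.txt",)):
    """Copy files matching the glob patterns from run_dir into output_dir.

    Returns the names of the files that were copied.
    """
    src = Path(run_dir)
    dst = Path(output_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in patterns:
        for path in sorted(src.glob(pattern)):
            if path.is_file():
                shutil.copy2(path, dst / path.name)
                copied.append(path.name)
    return copied
```

This only helps if the copy actually runs before the kernel is stopped, so writing checkpoints directly into /kaggle/working is the simpler route when the program allows it.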

Last fiddled with by axn on 2019-10-23 at 02:50
Old 2019-10-23, 02:54   #440
axn
 
 
Jun 2003

13DF₁₆ Posts

Quote:
Originally Posted by PhilF View Post
I'll queue up a mprime job tomorrow that won't complete in 9 hours, then let you know if I can get to its output
No need. This is what I'm doing. If my run completes in 9 hours, I get results.json.txt, else I get the pXXXXX files. I also get the mprime executable and all the accompanying text files, because I run the program directly from the default folder.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.