[QUOTE=xx005fs;527954]Has anyone attempted to run 2 colab instances on one account without interrupting the other one? that is running 1 on CPU workload and another on GPU. If it's indeed possible then I could always attempt to run some mprime alongside GPUOWL.[/QUOTE]
It is indeed possible. However, empirical experiments suggest it is best to run 1 instance per account -- otherwise the sessions seem to get interrupted at approximately the six-hour mark, instead of twelve. Note that the dual-run experiments were both GPU instances, however (across multiple Google accounts). You might have more success running 1 GPU and 1 CPU -- give it a try and see! |
This was the message that I received after trying to connect when I had done more than 30 hours of execution time, quite a bit more actually:
[CODE]20191014_121438: *** FATAL ERROR ***
20191014_121438: GPU not found. Make sure GPU is enabled (Runtime menu -> Change runtime type).[/CODE] |
Could you write down how to run an exe (compiled cuda code) on Kaggle?
On Google Colab I know one way that works (following your approach): compile the code with the CUDA toolkit (you don't even need a GPU for this), upload it to Google Drive, then run it from a notebook. |
[QUOTE=R. Gerbicz;527963]Could you write down how to run an exe (compiled cuda code) on Kaggle?[/QUOTE]
[CODE]!/PATH/TO/EXECUTABLE/EXECUTABLE[/CODE]
Or are you instead asking how to get the executable into Kaggle? In that case:
[CODE]!wget URL_OF_PACKAGE
!tar -xzvf PACKAGE
!PATH/EXECUTABLE[/CODE]
BTW, it wasn't clear from your post to whom you were speaking. There are many concurrent sub-threads going on at the moment; best to explicitly address the person whose attention you are seeking. |
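For anyone who would rather do the download/unpack/run steps from a Python cell than with "!" shell lines, here is a minimal sketch using only the standard library. The helper names (fetch, unpack, run) are made up for illustration, and it assumes the package is a .tar.gz archive that you trust.

```python
import stat
import subprocess
import tarfile
import urllib.request
from pathlib import Path


def fetch(url, dest):
    """Download url to dest (roughly what !wget does)."""
    dest = Path(dest)
    urllib.request.urlretrieve(url, dest)
    return dest


def unpack(archive, target_dir):
    """Extract a .tar.gz archive (roughly what !tar -xzvf does).

    Only use this on archives you trust.
    """
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(target_dir)


def run(executable, *args):
    """Set the owner-execute bit, run the file, and return its exit code."""
    exe = Path(executable).resolve()
    exe.chmod(exe.stat().st_mode | stat.S_IXUSR)
    return subprocess.run([str(exe), *args], cwd=exe.parent).returncode


# Usage with the placeholders from the post (not real names):
# unpack(fetch("URL_OF_PACKAGE", "PACKAGE"), ".")
# run("PATH/EXECUTABLE")
```

The chmod step is there because files that travel through an upload or a cloud drive often lose their execute permission.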
[QUOTE=xx005fs;527953]. With 10 instances running (kaggle's upper CPU instance limit) the power will be significant. [/QUOTE]
How does a kaggle mprime script differ from a google colab script? |
[QUOTE=Prime95;527968]How does a kaggle mprime script differ from a google colab script?[/QUOTE]
Other than the option of attaching a Google Drive to a Colab instance for persistent storage, no difference. For other packages, there might be a need to install a particular set of libraries; it's safe to just request installation of everything you need -- that way your Notebook will work unaltered across both platforms.

To share: for Kaggle CPU batch jobs, I've been setting up small "payloads" which include mprime, worktodo.txt, prime.txt and local.txt. The worktodo file contains work assignments that will complete within nine hours (P-1 jobs in my case). I then rename the Notebook (P0 -> P9, so I can keep track of what's running from the Kaggle UI) and use "Run" menu -> "Commit". mprime reserves the work from Primenet, does the work, reports back, and exits. The instance then shuts down, and I can relaunch another. |
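A payload like the one described above could be sanity-checked before committing the Notebook. This is only a sketch: the function names are hypothetical, the file list is taken from the post, and the nine-hour default simply mirrors the limit mentioned there. The command to launch is passed in by the caller, so nothing here assumes particular mprime options.

```python
import subprocess
from pathlib import Path

# File names taken from the post above.
PAYLOAD_FILES = ("mprime", "worktodo.txt", "prime.txt", "local.txt")


def missing_files(payload_dir):
    """Return the payload files that are absent from payload_dir."""
    root = Path(payload_dir)
    return [name for name in PAYLOAD_FILES if not (root / name).is_file()]


def run_batch(payload_dir, command, limit_hours=9.0):
    """Run command inside payload_dir, giving up after limit_hours.

    Returns the exit code, or None if the time limit was reached.
    """
    missing = missing_files(payload_dir)
    if missing:
        raise FileNotFoundError("payload is missing: " + ", ".join(missing))
    try:
        done = subprocess.run(command, cwd=payload_dir, timeout=limit_hours * 3600)
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child at this point.
        return None
    return done.returncode
```

Failing early on a missing worktodo.txt is cheaper than discovering it after the batch job has been queued and started.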
[QUOTE=chalsall;527961]It is indeed possible. However, empirical experiments suggest it is best to run 1 instance per account -- otherwise the sessions seem to get interrupted at approximately the six-hour mark, instead of twelve.
Note that the dual-run experiments were both GPU instances, however (across multiple Google accounts). You might have more success running 1 GPU and 1 CPU -- give it a try and see![/QUOTE]
I ran a CPU mprime instance from one browser on one system with one Gmail account, and a GPU mfaktc instance from a different browser on a different system with a different account (and therefore a separate Google Drive root folder), and got what appeared to be the full 12-hour duration for each. What I would find interesting is a way to run a CPU task and a GPU task in one Colab session, perhaps by launching one as a subprocess from a script before running the other from the same script. |
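The subprocess idea from the post above could look roughly like the sketch below. It is untested against real mprime/mfaktc runs; run_side_by_side is a hypothetical name, and the two commands are whatever you would normally launch in the session.

```python
import subprocess


def run_side_by_side(cpu_cmd, gpu_cmd):
    """Start cpu_cmd in the background, run gpu_cmd in the foreground,
    then wait for the background job and return (cpu_rc, gpu_rc)."""
    cpu_proc = subprocess.Popen(cpu_cmd)           # returns immediately
    try:
        gpu_rc = subprocess.run(gpu_cmd).returncode  # blocks until the GPU job ends
    finally:
        cpu_rc = cpu_proc.wait()                   # do not leave the CPU job orphaned
    return cpu_rc, gpu_rc
```

Both commands are given as argument lists, for example ["./mprime", ...] and ["./mfaktc.exe"], with the actual paths and options depending on your setup. Whether the session survives the full 12 hours with both running is a separate question that only an experiment can answer.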
[QUOTE=kriesel;527971]What I would find interesting is a way to run on one colab session, a cpu task and a gpu task, perhaps by launching one as a subprocess from a script before running the other from the same script.[/QUOTE]
Indeed! This has always been in the back of my mind. Now that I have my [URL="https://instanceroot.com/"]Reverse SSH Tunnel Service[/URL] in production (sorry for the plug, but it actually works), I plan to explore this in more depth, and look at having mprime jobs run concurrently with mfaktc. Early experiments suggested that having the CPU at 100% had no impact on mfaktc's throughput on the K80s, but did (slightly) on the T4s. But it appears that Google aren't giving out T4s to anyone who works the way we do anymore. P.S. Just to share, my next "driving problem" is to figure out how to use these resources to run [URL="https://www.openfoam.com/"]OpenFOAM[/URL] jobs... |
[QUOTE=Prime95;527968]How does a kaggle mprime script differ from a google colab script?[/QUOTE]
I actually wouldn't recommend running 10 instances of mprime on Kaggle: every time you complete a 9-hour session, you have to export the data into a brand new dataset version, then repeat, and with 10 instances that's impossible to manage. Otherwise, executing the program is the same, except that when you download mprime you have to make sure it's in the /kaggle/working directory instead of the default location. Also, use Dylan's script to preconfigure everything so that the initial setup steps that require console interaction can be skipped. All of this is because Google Colab gives you persistent storage, but there's no easy way to get that in Kaggle.
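Getting the files into /kaggle/working can be scripted. A minimal sketch, assuming the previous session's files were saved as a dataset and attached to the notebook; stage_payload and the dataset name in the usage line are hypothetical.

```python
import shutil
from pathlib import Path


def stage_payload(source_dir, work_dir="/kaggle/working"):
    """Copy everything under source_dir into work_dir.

    Returns the files now present in work_dir.
    """
    target = Path(work_dir)
    # dirs_exist_ok lets the copy merge into an existing directory (Python 3.8+).
    shutil.copytree(source_dir, target, dirs_exist_ok=True)
    return sorted(p for p in target.rglob("*") if p.is_file())


# Usage (hypothetical dataset name):
# stage_payload("/kaggle/input/mprime-payload")
```

Attached datasets are mounted read-only, which is why the files are copied into the writable working directory before anything is run from there.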
[QUOTE=xx005fs;527978]...then repeat, and with 10 it's impossible to manage.[/QUOTE]
My apologies for a flippant response, but this is also serious... Never send a human to do a machine's job... |
[QUOTE=chalsall;527970]Other than the possible attaching of a Google Drive to a Colab instance for persistent storage, no difference.[/QUOTE]
I played with Kaggle last night a bit. I got it to work, but I really need some persistent storage. I guess your comment confirms that Kaggle does not offer persistent storage? Has anyone been successful at connecting a Google Drive account to Kaggle? |