#441
Einyen
Dec 2003
Denmark
3158₁₀ Posts
Quote:
When you start your notebook, remember to enable internet on the right side if your script downloads its files from somewhere. Test the script by starting it in the editor so you are sure it works before committing it.
#442
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
5,437 Posts
I managed to get a couple of sessions very early this morning, finish a threads benchmarking run, and run a selftest, using CUDAPm1 v0.20 with CUDA 5.5. See the selftest section of https://www.mersenneforum.org/showpo...28&postcount=5 for some details.

Has anyone else run CUDAPm1 selftests? Has anyone obtained a successful selftest of CUDAPm1 on Colab, or seen failures? What version, what exponent, etc.?

Last fiddled with by kriesel on 2019-10-23 at 13:02
#443
Einyen
Dec 2003
Denmark
2×1,579 Posts
I found out how to restart a kernel from the command line with the Kaggle API.

I have a kernel here that successfully ran for 9 hours on mprime (no GPU) and finished: kernel3abfb13938. I'm getting the output files with the script I explained in post #424, using the Kaggle API and wput to upload the files I need to an FTP site. Do not power off the kernel, because it needs to still exist so we can restart it.

Now I can download the script from the kernel using:

~/.local/bin/kaggle kernels pull -w -m <username>/kernel3abfb13938

-w means it downloads to the current directory; otherwise use "-p PATH". It downloads 2 files: kernel3abfb13938.ipynb + kernel-metadata.json.

Now I want to rename it to something sensible like "mprime1", so I rename the .ipynb file to mprime1.ipynb. The kernel-metadata.json looks like this:

Code:
{
"id": "<username>/kernel3abfb13938",
"id_no": 6347611,
"title": "kernel3abfb13938",
"code_file": "kernel3abfb13938.ipynb",
"language": "python",
"kernel_type": "notebook",
"is_private": true,
"enable_gpu": false,
"enable_internet": true,
"keywords": [],
"dataset_sources": [],
"kernel_sources": [],
"competition_sources": []
}
Now we restart the kernel under the new name with:

~/.local/bin/kaggle kernels push -p PATH

where PATH is the folder the two files are located in.
I suspect I can just push the same two files next time without "pulling" down the script again, but I have not tested this yet. If we combine this with xx005fs's method of having output files attached as a dataset, we can probably restart it without downloading and uploading the files.
When you have a dataset attached to a notebook, how do you use the files in the script? Do you have to run an "import" command or something like that, or do the files just appear in "/kaggle/working", ready for use right away?

Last fiddled with by ATH on 2019-10-23 at 18:16
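The pull/rename/push steps described above can be sketched in a few lines of Python (a hedged sketch, not ATH's exact procedure: the kernel slug and new title follow the post, and the JSON editing is an assumption about how the rename should be reflected in kernel-metadata.json):

```python
# Sketch of the rename step: after "kaggle kernels pull -w -m" has put the
# notebook and kernel-metadata.json in the current directory, rewrite the
# metadata so the kernel pushes back under a new title. The "id" field is
# left alone, which (as I understand it) is what makes the push update the
# existing kernel rather than create a new one.
import json
import os

def rename_kernel(meta_path, old_name, new_name):
    """Point kernel-metadata.json at a renamed notebook file."""
    with open(meta_path) as f:
        meta = json.load(f)
    meta["title"] = new_name
    meta["code_file"] = new_name + ".ipynb"
    with open(meta_path, "w") as f:
        json.dump(meta, f, indent=1)
    # rename the notebook itself so it matches code_file
    if os.path.exists(old_name + ".ipynb"):
        os.rename(old_name + ".ipynb", new_name + ".ipynb")
    return meta

# e.g. rename_kernel("kernel-metadata.json", "kernel3abfb13938", "mprime1"),
# then: ~/.local/bin/kaggle kernels push -p .
```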
#444
If I May
"Chris Halsall"
Sep 2002
Barbados
2×67×73 Posts |
#445
"Eric"
Jan 2018
USA
212₁₀ Posts
I just use a command to copy whatever is in /kaggle/input/<your dataset> to /kaggle/working.
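That copy step might look like the following sketch (the dataset slug is whatever Kaggle mounts under /kaggle/input; this is an illustration, not the poster's exact command, and dirs_exist_ok needs Python 3.8+):

```python
# Stage an attached dataset's files from the read-only input mount into the
# writable working directory, as described above.
import shutil

def stage_dataset(src, dst):
    """Recursively copy the dataset directory into the working directory."""
    shutil.copytree(src, dst, dirs_exist_ok=True)

# e.g. stage_dataset("/kaggle/input/mydataset", "/kaggle/working")
```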
#446
Feb 2005
Colorado
2·7·47 Posts
I performed another test that confirmed what I reported earlier. If you get disconnected from a non-committed session, DO NOT click on the banner to power it back on. That will reset the connection and all will be lost.

However, if you note the name of the disconnected notebook, close the window it is running in completely, open a new Kaggle browser window, and then choose that notebook from your notebook list, you will get dumped back into your previously disconnected session. In my case at least, the total session time was correct and incrementing, and the CPU usage was at 100%.

However, the code block had the "Play" arrow next to it, not the stop button, making one think that it isn't running. Also, the screen output window was gone, which might further make one think the code isn't running. But it is. Only when the CPU usage indicator dropped to near zero could I tell that my code had completed its run. At that point !ls -l (actually it takes 2 of them for some reason) reveals that your working directory and output file(s) are still there.
#447
If I May
"Chris Halsall"
Sep 2002
Barbados
2·67·73 Posts
It is quota'ed to 5 GB of storage, even though "df -h" shows the "/" partition has something like 490 GB of free space. Doing a "dd" into a test file caused an "out of space" error at exactly 5 GB.

Interestingly, there appears to be about 1 TB of storage scattered around the file system. I've had no problem creating files several hundred GB in size in locations other than "/kaggle/working/". These, obviously, don't survive restarting, but the storage is (temporarily) there if you ever need it (can't think why you would need anything larger than 5 GB, mind you).
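The "dd" probe described above can be reproduced with a short script; a minimal sketch, assuming you write somewhere disposable (the target path and chunk size here are illustrative, not from the post):

```python
# Rough equivalent of the "dd" quota probe above: keep appending fixed-size
# chunks to a test file until the filesystem refuses, then report how far
# we got. max_bytes caps the probe so it can also be run safely elsewhere.
import os

def probe_quota(path, chunk_bytes=64 * 1024 * 1024, max_bytes=None):
    """Write chunks to `path` until ENOSPC (or max_bytes); return bytes written."""
    written = 0
    chunk = b"\0" * chunk_bytes
    try:
        with open(path, "wb") as f:
            while max_bytes is None or written < max_bytes:
                f.write(chunk)
                f.flush()
                os.fsync(f.fileno())
                written += chunk_bytes
    except OSError:
        pass  # "No space left on device" ends the probe
    finally:
        if os.path.exists(path):
            os.remove(path)  # clean up the test file
    return written

# e.g. probe_quota("/kaggle/working/fill.test")  # should stop near the 5 GB quota
```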
#448
"Eric"
Jan 2018
USA
2²×53 Posts
#449
If I May
"Chris Halsall"
Sep 2002
Barbados
23066₈ Posts
My understanding is that there are HUGE datasets available to train with; I think you have to ask (via the UI) for them to be exposed within the FS. I haven't investigated that myself, but the primary purpose of this environment is to provide (human) training in AI, so the datasets must be available (somehow).
#450
"Dylan"
Mar 2017
1001000100₂ Posts
If anyone wants to work the ranges above 1G with mfaktc on Colab (to help the project on mersenne.ca), I have written some Python code to do it. See below:
Code:
# Script to automate trial factoring of Mersennes above 1G, using mfaktx.
# Requires wget and curl.
# version history:
# v0.01 - Testing version.
# v0.02 - First public release.

# import needed packages
import sys, os, time, subprocess

# set path to mfaktx (change for your system)
mfaktx_path = "C:\\Users\\Dylan\\Desktop\\mfaktc\\mfaktc-0.21\\"

# names of executables, change if needed
MFAKTX = 'mfaktc.exe'
WGET = 'wget.exe'
CURL = 'curl.exe'

# specify certain parameters for later, when we go and fetch assignments:
TF_LIMIT = str(71)
TF_MIN = str(68)
MAXASSIGNMENTS = str(1)
BIGGEST = str(1)

# changes should not be needed below
print("---------------------------------")
print("This is tf1G.py v0.02, a Python script to automate Mersenne trial factoring for exponents above 1 billion.")
print("It is copyleft, 2019, by Dylan Delgado.")
print("---------------------------------")

# run checks to see if we have the paths correct
if not os.path.exists(mfaktx_path):
    print("The path for Mfaktx does not exist. Check your setting for mfaktx_path.")
    sys.exit()
# do we have mfaktx?
elif not os.path.exists(mfaktx_path + MFAKTX):
    print("Mfaktx does not exist. Check your path or the name of your executable.")
    sys.exit()

# Now we define our URL
URL = ("https://www.mersenne.ca/tf1G.php?download_worktodo=1&tf_limit=" + TF_LIMIT
       + "&tf_min=" + TF_MIN + "&max_assignments=" + MAXASSIGNMENTS + "&biggest=" + BIGGEST)
print(URL)

# delete a file (code courtesy of Brian Gladman)
def delete_file(fn):
    if os.path.exists(fn):
        try:
            os.unlink(fn)
        except OSError:  # was WindowsError, which only exists on Windows
            pass

# submit work to mersenne.ca
def submit_results():
    results = mfaktx_path + "results.txt"
    if not os.path.exists(results):  # nothing to submit yet
        return
    print("Submitting work...")
    subprocess.run([CURL, "-F", "results_file=@" + results,
                    "https://www.mersenne.ca/bulk-factors.php"])
    delete_file(results)

# main loop - fetch work with wget, run mfaktx, and submit results
while True:
    worktodo = mfaktx_path + "worktodo.txt"
    # fetch more work if worktodo.txt is missing or empty
    if not os.path.exists(worktodo) or os.stat(worktodo).st_size == 0:
        print("No work to do, fetching more work in 5 seconds...")
        time.sleep(5)
        submit_results()
        subprocess.run([WGET, URL, "-O", worktodo])
    # run mfaktx from its own directory so it finds worktodo.txt
    subprocess.run([mfaktx_path + MFAKTX], cwd=mfaktx_path)
Last fiddled with by Dylan14 on 2019-10-24 at 23:17
#451
If I May
"Chris Halsall"
Sep 2002
Barbados
10011000110110₂ Posts