mersenneforum.org

JuanTutors 2021-07-13 02:55

Running through an endless loop in Google Colab
 
Still working out the kinks on how to use Google Colab. Please, no jabs! I am pasting my code, and also the endless loop it seems to be running into:

[code]
#@title The GPU72 Trial Factoring Colaboratory Notebook. { output-height: 8, form-width: "90%", display-mode: "form" }
#@markdown -- Please see https://www.gpu72.com/ for more details.
#@markdown <br><br>
#@markdown -- Enter your Access Key below. Leave blank to work anonymously.
#@markdown <br><br>
#@markdown -- Leave "Logging_Level" as Default unless you are a developer.
Access_Key = "acccesskeyacccesskeyacccesskey" #@param {type:"string"}
Logging_Level = "Default" #@param ["Default", "Verbose"]

import subprocess
import signal

!echo "Beginning GPU Trial Factoring Environment Bootstrapping..."
!echo "Please see https://www.gpu72.com/ for additional details."
!echo

!mkdir -p /home/gpu72/
%cd -q /home/gpu72/

!wget -qO bootstrap.tgz https://www.gpu72.com/colab/bootstrap.tgz
!tar -xzf bootstrap.tgz >/dev/null

p = subprocess.Popen(['./bootstrap.pl', Access_Key, Logging_Level],
                     stdout=subprocess.PIPE,
                     universal_newlines=True,
                     bufsize=0)

try:
    for line in p.stdout:
        line=line[:-1]
        print(line)
except KeyboardInterrupt:
    print("\nExiting...")
    p.send_signal(signal.SIGINT)
    !./comms.pl ShutDown 5
    print("Done.")

Insert_Work = "Factor=N/A,332370583,78,79" #@param {type:"string"}
Delete_Results = True #@param {type:"boolean"}

import os.path
from google.colab import drive

if not os.path.exists('/content/drive/My Drive'):
    drive.mount('/content/drive')

%cd '/content/drive/My Drive/colab/'
!ls -lha
!echo

if Insert_Work:
    print("Add string is:" + Insert_Work + "\n")
    f=open("worktodo.add","w+")
    f.write(Insert_Work + "\n")
    f.close()
    !cat worktodo.add
    !echo

!cat "results.txt"
!echo

if Delete_Results:
    print("Deleting results!\n")
    !rm results.txt

!cat worktodo.txt
!wc -l worktodo.txt
[/code]

And here is the loop it's running into:

[code]
20210713_025217 ( 0:07): running a simple selftest...
20210713_025222 ( 0:07): Selftest statistics
20210713_025222 ( 0:07): number of tests 107
20210713_025222 ( 0:07): successfull tests 107
20210713_025222 ( 0:07): selftest PASSED!
20210713_025222 ( 0:07): Fetching initial work...
20210713_025223 ( 0:07): Running GPU type Tesla T4

[/code]

It repeats the same thing over and over. I don't see any errors yet, although I'm shooting in the dark at this point. Any pointers to get it started?

Uncwilly 2021-07-13 06:30

Does it look like it has fetched work? It might be a problem on Chris's end.

JuanTutors 2021-07-13 10:23

[QUOTE=Uncwilly;583088]Does it look like it has fetched work? It might be a problem on Chris's end.[/QUOTE]

I used the code given in one of the posts, as is, to assign it TF work on a specific exponent. That may be the issue. I also know that a few posts mentioned trouble with the Tesla T4; perhaps the code just doesn't work with that GPU?

[code]
Insert_Work = "Factor=N/A,332370583,78,79" #@param {type:"string"}
[/code]

Uncwilly 2021-07-13 12:38

The T4 was working last I saw one.

chalsall 2021-07-13 14:00

[QUOTE=Uncwilly;583088]It might be a problem on Chris's end.[/QUOTE]

Yup. Sorry. My bad. Fixed.

JuanTutors 2021-07-13 23:52

[QUOTE=chalsall;583107]Yup. Sorry. My bad. Fixed.[/QUOTE]
Thanks! It seems to be running, but I can't get it to work on assignments of my choosing. I tried the code posted above with line 38 set to each of the following two lines in turn.
[code]
Insert_Work = "Factor=N/A,332370583,78,79" #@param {type:"string"}
[/code]
[code]
Insert_Work = "Factor=[ASSIGNMENT ID FROM MERSENNE.ORG],332370583,78,79" #@param {type:"string"}
[/code]
I just got assigned a few hundred assignments to TF instead of factoring the assignment I chose. I tried on a third Google account and now I seem to be doing a TF and a P-1 on some smaller exponents, one of them being 120698189. I read through the instructions and I seem to have copied and pasted everything correctly.
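
For reference, the work line used above has four comma-separated fields after `Factor=`: the assignment ID (or `N/A`), the exponent, the starting bit level and the ending bit level. The following is only a minimal sketch for sanity-checking such a line before writing it to a worktodo file. `FactorWork` and `parse_factor_line` are hypothetical names, not part of mfaktc or the GPU72 scripts, and the sketch assumes the four-field form shown in this thread.

```python
from typing import NamedTuple, Optional


class FactorWork(NamedTuple):
    """One trial-factoring work item (hypothetical helper type)."""
    assignment_id: Optional[str]  # None when the line says N/A
    exponent: int
    bit_from: int
    bit_to: int


def parse_factor_line(line: str) -> FactorWork:
    """Parse 'Factor=<id or N/A>,<exponent>,<bit_from>,<bit_to>'.

    Assumption: only the four-field form from this thread is handled.
    """
    keyword, sep, rest = line.strip().partition("=")
    if keyword != "Factor" or not sep:
        raise ValueError(f"not a Factor= line: {line!r}")
    fields = [f.strip() for f in rest.split(",")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    aid, exponent, bit_from, bit_to = fields
    work = FactorWork(None if aid.upper() == "N/A" else aid,
                      int(exponent), int(bit_from), int(bit_to))
    if work.bit_from >= work.bit_to:
        raise ValueError("starting bit level must be below the ending one")
    return work


work = parse_factor_line("Factor=N/A,332370583,78,79")
print(work)
```

A check like this only confirms that the line is well formed. It says nothing about whether the program that is running will ever read the file the line is written to.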

Flaukrotist 2021-07-14 17:11

I think there are some problems with your code. You are basically using the scripts of GPU72 to run an instance of mprime together with an instance of mfaktc. Chris wrote this code so that it avoids using Google Drive, and it is designed to fetch work from GPU72.com in a loop until the session dies. You can't see the actual execution in the code you pasted because it is hidden in some Perl code (bootstrap.pl) which is loaded from gpu72.com after the start.

So if I understand it correctly, you will never reach the code block where you define your worktodo.add contents. And even if you ever reach it, it won't have any effect because the mfaktc execution doesn't use Google Drive and will never detect your worktodo.add. The GPU72 program is simply not designed for a scenario where you set a work item explicitly. Instead you do what GPU72 serves you based on the work type you selected when you created the Colab notebook reference on GPU72. So, either you do that, then you should use exactly the code from gpu72.com without any changes. Or if you are interested enough in defining your own work items you have to write your own script based on Python experience and the very helpful Colab section in Kriesel's help pages to run mprime and/or mfaktc using Google Drive.
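
To illustrate why the later cell contents are never reached: a `for line in p.stdout:` loop only ends when the child process closes its output, which normally means it has exited. Here is a minimal sketch, assuming a short-lived Python child as a stand-in for `./bootstrap.pl` (the real script keeps fetching work until the session dies, so the loop would not end on its own).

```python
import subprocess
import sys

# Stand-in for ./bootstrap.pl (an assumption for illustration only):
# a child process that prints three lines and then exits.
child_code = "for i in range(3): print('working', i, flush=True)"

p = subprocess.Popen([sys.executable, "-c", child_code],
                     stdout=subprocess.PIPE,
                     universal_newlines=True)

seen = []
for line in p.stdout:  # blocks here until the child closes its stdout
    seen.append(line.rstrip("\n"))
    print(line, end="")

p.wait()
# Execution only gets this far once the child has exited.
print("child finished with exit code", p.returncode)
```

With a child that never exits, everything placed after the loop (such as the `Insert_Work` handling) simply never runs unless the loop is interrupted.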

chalsall 2021-07-14 18:38

[QUOTE=Flaukrotist;583190]So, either you do that, then you should use exactly the code from gpu72.com without any changes. Or if you are interested enough in defining your own work items you have to write your own script based on Python experience and the very helpful Colab section in Kriesel's help pages to run mprime and/or mfaktc using Google Drive.[/QUOTE]

Ah... Thank you for that. I didn't have the bandwidth to parse the code delta.

I /did/ have a bug on the server-side (a stupid constrained select for a particular work type) which was corrected.

But, yeah. As with all things "tech", you touch it, you own it! :wink:

kriesel 2021-07-14 19:27

[QUOTE=Flaukrotist;583190]I think there are some problems with your code. ... you will never reach the code block where you define your worktodo.add contents. And even if you ever reach it, it won't have any effect because the mfaktc execution doesn't use Google Drive and will never detect your worktodo.add. ... So, either you do that, then you should use exactly the code from gpu72.com without any changes. Or if you are interested enough in defining your own work items you have to write your own script based on Python experience and the very helpful Colab section in Kriesel's help pages to run mprime and/or mfaktc using Google Drive.[/QUOTE]
Thanks for the sleuthing and the positive mention of the [URL="https://www.mersenneforum.org/showthread.php?t=24839"]Colab thread[/URL] (which covers attempts to run almost all GIMPS software titles on free Colab, by various script authors). I haven't seen anything indicating that a T4 won't run any appropriately selected and configured app. The T4 is just slow in double precision (DP), and may be limited in maximum exponent like any other GPU.

Sounds like Juan created a chimera. Shouldn't shock anyone much that half a horse plus half a tortoise doesn't run well.
A good outcome is it led to detection and resolution of a GPU72 bug.

mathwiz 2021-07-14 22:40

[QUOTE=kriesel;583200]Sounds like Juan created a chimera. Shouldn't shock anyone much that half a horse plus half a tortoise doesn't run well.[/QUOTE]

Would that not be... a centaurtoise?

LaurV 2021-07-18 08:11

[QUOTE=mathwiz;583205]Would that not be... a centaurtoise?[/QUOTE]
Nope. That's a turthorse.


All times are UTC.