
2019-09-12, 23:31   #45
chalsall
If I May

"Chris Halsall"
Sep 2002

10011011110011₂ Posts

Quote:
 Originally Posted by De Wandelaar From now on, GPU usage is limited to 30 hours per week.
It never fails...

On the other hand, we can hardly complain about them ***giving*** each of us 1,500 GHzD of free compute every week!!!

Also, in some ways, this is comforting. It means they're OK with us doing what we're doing.

2019-09-12, 23:45   #46
retina
Undefined

"The unspeakable one"
Jun 2006
My evil lair

14223₈ Posts

Quote:
 Originally Posted by chalsall Also, in some ways, this is comforting. It means they're OK with us doing what we're doing.
Until they discover they can't monetise the work and ban all of your accounts.

You don't really think it is free, do you?

2019-09-13, 03:58   #47
Dylan14

"Dylan"
Mar 2017

2·293 Posts

I have been running the Colab script for the GPU72 project, and while it runs well, there should be a way for it to avoid requesting more assignments than the session time allows. I was thinking something along these lines (in pseudocode):

Code:
* upon starting the script, record the current Unix timestamp as start, and detect which platform we are on (Colab or Kaggle)
* while the current timestamp < start + 12 hours (Colab) or 9 hours (Kaggle), fetch an assignment
* estimate the time the assignment would take to run, and add it to the current timestamp
* if the estimated completion time would fall after the session deadline, drop the assignment and break
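The pseudocode above could be fleshed out roughly as follows. This is only a sketch: `fetch_assignment()` and `estimate_runtime()` are placeholders for whatever the actual GPU72 script uses to get and size work, not real API calls.

```python
import time

# Session limits from the pseudocode above, in seconds.
SESSION_LIMITS = {"colab": 12 * 3600, "kaggle": 9 * 3600}

def plan_assignments(platform, fetch_assignment, estimate_runtime, now=time.time):
    """Queue assignments only while they can finish before the session ends.

    fetch_assignment() returns the next assignment (or None when the server
    has nothing); estimate_runtime(assignment) returns its expected runtime
    in seconds. Both are hypothetical hooks for illustration.
    """
    start = now()
    deadline = start + SESSION_LIMITS[platform]
    accepted = []
    clock = start  # projected completion time of everything queued so far
    while clock < deadline:
        assignment = fetch_assignment()
        if assignment is None:
            break
        eta = clock + estimate_runtime(assignment)
        if eta > deadline:
            # Would not finish before the session disconnects: drop it.
            break
        accepted.append(assignment)
        clock = eta
    return accepted
```

The loop tracks a simulated clock rather than wall time, so it can decide up front whether one more assignment still fits in the session budget.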
2019-09-13, 11:52   #48
chalsall
If I May

"Chris Halsall"
Sep 2002

9971₁₀ Posts

Quote:
 Originally Posted by Dylan14 I have been running the Colab script for the GPU72 project, and while it runs well, there should be a way for it to avoid requesting more assignments than the session time allows. I was thinking something along these lines (in pseudocode):
Thanks for the idea; I've been modeling different strategies in my head, trying to come to convergence.

The problem with your suggestion is that with mfaktc you can't know when a factor will be found, so it's important to keep a bit of a buffer of work queued. This is more of an issue with Colab than with Kaggle: with the former, if you run out of work and mfaktc stops, the GPU's availability is wasted.

My current methodology is to keep three candidates in the worktodo file. The checkpoint file for the candidate currently being worked is uploaded to the server every two minutes.

If a candidate assigned to a Colab / Kaggle instance is more than 12 hours old and no checkpoint file has been returned, it is recycled.

What I'm currently working on is reissuing unfinished candidates back to the same GPU72 worker with the checkpoint file so they can continue, and finish off the work.

By the end of the weekend I'll have the UI stuff built on GPU72, to let anyone participate. However, if anyone else would like to give this a whirl, please PM me with your GPU72 account details (UN, Display Name or email).

And thanks to the current beta-testers. Lots of great feedback (and factors found!).

2019-09-13, 13:33   #49
Chuck

May 2011
Orange Park, FL

3·13·23 Posts

I was assigned a Tesla T4 (1720 GHzD/D) for my first two 12-hour sessions (it disconnects automatically after that time), but this morning I am running on a much slower K80 (410 GHzD/D).

The checkpoint capability is much more important with this much slower GPU. The estimated time to complete a 69M 74→75 TF is a little over three hours, so there is a potential loss of up to three hours of computing time. Maybe I should consider stopping the run and reconnecting after three exponents have been processed (assuming no factor is found) until the checkpoint code is in place.

Last fiddled with by Chuck on 2019-09-13 at 14:22 Reason: Restart
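The runtime difference between the two GPUs follows directly from the GHzD/D rates quoted above: hours ≈ assignment credit / rate × 24. The ~55 GHzD credit assumed for a 69M 74→75 TF assignment is my estimate, chosen to be consistent with the "a little over three hours" figure in the post.

```python
def est_hours(assignment_ghzd, gpu_ghzd_per_day):
    """Estimate wall-clock hours for an assignment on a given GPU.

    assignment_ghzd: GHz-days of credit for the assignment (assumed ~55
    for a 69M 74->75 TF; not stated in the thread).
    gpu_ghzd_per_day: the GPU's throughput in GHz-days per day.
    """
    return assignment_ghzd / gpu_ghzd_per_day * 24

k80_hours = est_hours(55, 410)   # slower K80, roughly 3.2 hours
t4_hours = est_hours(55, 1720)   # Tesla T4, well under an hour
```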
2019-09-13, 16:38   #50
chalsall
If I May

"Chris Halsall"
Sep 2002

10011011110011₂ Posts

Quote:
 Originally Posted by Chuck I was assigned a Tesla T4 (1720 GHzD/D) for my first two 12-hour sessions (it disconnects automatically after that time), but this morning I am running on a much slower K80 (410 GHzD/D)
LOL... Yeah, a bit ironic, but... It's now disappointing to only get a K80 for free!

Quote:
 Originally Posted by Chuck Maybe I should consider stopping the run and reconnecting after three exponents have been processed (assuming no factor found) until the checkpoint code is in place.
The most important part of the checkpoint code ***is*** in place; everyone is now running it. At most two minutes of work will be lost (and thus, on average, only one minute).

By EOD today I'll have the code implemented to hand an unfinished assignment back for completion. Importantly, it will be given back to the previous worker.

Edit: BTW, you can see the checkpoint file status by looking at your Assignments page. The percentage completed is calculated from the submitted checkpoint files.

Last fiddled with by chalsall on 2019-09-13 at 16:47

2019-09-14, 13:06   #51
Chuck

May 2011
Orange Park, FL

3·13·23 Posts

No backend with GPU available

This morning my notebook disconnected in the middle of a run, and when I attempted to reconnect I got the message:

Code:
Failed to assign a backend
No backend with GPU available. Would you like to use a runtime with no accelerator?

I don't know if this is due to overuse of GPUs by my account or general unavailability of hardware. Is the Colab honeymoon over?
2019-09-14, 14:17   #52
Chuck

May 2011
Orange Park, FL

897₁₀ Posts

Running again...

An hour later I tried reloading the notebook, and it is running again with a T4.

Last fiddled with by Chuck on 2019-09-14 at 14:17 Reason: 1 hour
2019-09-14, 15:22   #53
pinhodecarlos

"Carlos Pinho"
Oct 2011
Milton Keynes, UK

5²×199 Posts

Can I use two accounts from the same IP address?
2019-09-14, 15:34   #54
De Wandelaar

"Yves"
Jul 2017
Belgium

73 Posts

Quote:
 Originally Posted by pinhodecarlos Can I use two accounts from the same IP address?
I'm doing so, and no problems so far ...

2019-09-14, 15:36   #55
chalsall
If I May

"Chris Halsall"
Sep 2002

23363₈ Posts

Quote:
 Originally Posted by pinhodecarlos Can I use two accounts from the same IP address?
Yup. I'm currently running two different accounts concurrently, from the same IP. And, in fact, from the same browser (different tabs).

I've found the GPU backend availability can vary considerably. Sometimes one account can get a GPU, while the other can't. Sometimes neither can, and sometimes both can.

A small sample set suggests the GPUs are in high demand during "working hours" Eastern time, with availability opening up at around 18:00 (22:00 UTC).
