[QUOTE=kriesel;533043]Now I'm repeatedly getting no GPU at all. On the account from which I made many unsuccessful tries to get a K80:
[CODE]NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running. [/CODE]Meanwhile the other account I'm using has no trouble getting a P100.[/QUOTE] Today: the one where I need a K80 to finish benchmarking gets a P100; the one where I don't care if it issues a P100 can't get any GPU. (I had started setting that account up for P100 since it was reliably getting GPUs, usually P100, while the other one had been getting nothing.) CPU-only Colab for me now.
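For what it's worth, a session can test for a GPU programmatically before queuing work, rather than discovering the failure in mid-run. A minimal sketch in Python (the helper name is mine; `nvidia-smi --query-gpu=name` is the standard query form):

```python
# Minimal sketch (not from anyone's notebook): ask nvidia-smi which GPU,
# if any, the session was given, instead of parsing its error text.
import subprocess

def gpu_model():
    """Return the GPU name reported by nvidia-smi, or None if no GPU/driver."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=30,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return None  # nvidia-smi absent or hung: treat as a CPU-only session
    if result.returncode != 0:
        return None  # "NVIDIA-SMI has failed ..." -> driver unreachable
    name = result.stdout.strip()
    return name or None

print(gpu_model())  # e.g. "Tesla K80", "Tesla P100-PCIE-16GB", or None
```

On a CPU-only session this returns None instead of raising, so a notebook can fall back to mprime work cleanly.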
[QUOTE=kriesel;533068]cpu-only colab now for me.[/QUOTE]
Hmmm... This morning I spun up three instances. Got two P100s and a T4 (at ~50% throughput). It seems Google, like many gods, works in mysterious ways... :wink:
And I got a K80 right away...
On Kaggle, [U]all[/U] of my CPU instances for the last few days have been Skylake Xeons. Up until then those were rare, with Haswell Xeons being the norm.
[QUOTE=chalsall;533069]It seems Google, like many gods, works in mysterious ways... :wink:[/QUOTE]If at all. Like a bureaucracy.
[QUOTE=kriesel;533068]
cpu-only colab now for me.[/QUOTE] I hear you! I started playing with Colab using chalsall's GPU72 TF notebook. Everything went well for some days, then I started getting occasional "No GPU backend available" messages. Later I set up the notebook to run mprime and began running ECM, quitting using GPUs. Over time I added several instances, using some Gmail accounts I have access to, and for several weeks I have been running 4 Colab instances (CPU only, mprime doing ECM on small exponents). Always getting 12 hours, no disconnects whatsoever, always a TPU available upon any session restart.

A couple of days ago I threw some GPU instances into the mix (more precisely, on two of the accounts I am using I added a GPU instance). It wasn't very long until I started getting the "No GPU backend available" message again. Today, on top of that and for the very first time, I am also getting "No TPU available" on the accounts where I was concurrently running a GPU instance. The other two accounts, which have run just TPUs, don't suffer from that problem. So yes, it looks like our friends at Google are becoming fed up with us overusing their precious GPUs... I will return to CPU-only instances and see what happens (if and when I regain access to them, that is). Oh well, we shouldn't complain, should we?
1 Attachment(s)
Introducing tf1G.py v0.11:
The major change to the script this time is the addition of the min and max exponent variables, which are allowable parameters on mersenne.ca. In addition, some sanity checks were added.
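The sort of sanity checks described might look like the sketch below. This is not tf1G.py's actual code: the names (`min_exponent`, `max_exponent`) and the numeric bounds are illustrative placeholders, and the real limits enforced by the script and by mersenne.ca may differ.

```python
# Hedged sketch of range sanity checks for user-supplied min/max exponent
# variables. The floor/ceiling values are illustrative only; the actual
# limits tf1G.py and mersenne.ca enforce may differ.
def check_exponent_range(min_exponent, max_exponent,
                         floor=1_000_000_000, ceiling=2_000_000_000):
    """Validate an exponent range before requesting assignments."""
    if not (isinstance(min_exponent, int) and isinstance(max_exponent, int)):
        raise TypeError("exponent bounds must be integers")
    if min_exponent > max_exponent:
        raise ValueError("min exponent exceeds max exponent")
    if min_exponent < floor or max_exponent > ceiling:
        raise ValueError(f"range must lie within [{floor}, {ceiling}]")
    return min_exponent, max_exponent
```

Failing fast like this before contacting the server avoids burning a session's limited runtime on a request that would be rejected anyway.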
[QUOTE=lycorn;533076]I hear you!
... Today, on top of that and for the very first time, I am also getting "No TPU available" on the accounts where I was concurrently running a GPU instance. The other two accounts, which have run just TPUs, don't suffer from that problem. So yes, it looks like our friends at Google are becoming fed up with us overusing their precious GPUs... I will return to CPU-only instances and see what happens (if and when I regain access to them, that is). Oh well, we shouldn't complain, should we?[/QUOTE]I remind myself that it's free, and to be grateful it's available at all. What do you run on the TPUs? I'm not aware of any GIMPS use for them. If running mprime only, I select "NONE", to leave the TPUs available for others to use.

The good news is that about an hour ago I got K80s on both accounts, so they do still exist! It will take a few more such sessions on one of the accounts to finish the current exponent's P-1 benchmark.
On the TPUs (which are actually CPUs) I run ECM factoring on exponents < 1M.
Today I got them available again, so I'm back to 4 CPU-only instances. I'm pretty sure I won't have any more problems (unless, of course, the Google gods/goddesses decide it's time to cut the resources altogether).
[QUOTE=lycorn;533095]On the TPUs (which are actually CPUs) I run ECM factoring on exponents < 1M.
Today I got them available again, so I'm back to 4 CPU-only instances. I'm pretty sure I won't have any more problems (unless, of course, the Google gods/goddesses decide it's time to cut the resources altogether).[/QUOTE]Google says that TPUs are ASICs: [url]https://cloud.google.com/tpu/docs/tpus[/url] "Tensor Processing Units (TPUs) are Google’s custom-developed application-specific integrated circuits (ASICs) used to accelerate machine learning workloads. TPUs are designed from the ground up with the benefit of Google’s deep experience and leadership in machine learning."
[QUOTE=kriesel;533122]Google says that TPUs are ASICs.
[url]https://cloud.google.com/tpu/docs/tpus[/url] "Tensor Processing Units (TPUs) are Google’s custom-developed application-specific integrated circuits (ASICs) used to accelerate machine learning workloads. TPUs are designed from the ground up with the benefit of Google’s deep experience and leadership in machine learning."[/QUOTE] That is true, but the CPUs of all my Colab instances are identified as "Intel Xeon @ 2.30GHz Linux64" under My Account -> CPUs on mersenne.org. Next time I use it I'll select "no accelerator" and see what happens. It probably won't change anything, meaning the "Intel Xeon @ 2.30GHz" reported is simply the VM's CPU made available in each session.
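That "VM's CPU" hypothesis is easy to check from inside a session. A small sketch (the helper is my own, not from the notebook): mprime only ever sees the VM's CPU, so this is what gets reported to mersenne.org regardless of which accelerator, if any, is attached.

```python
# Sketch: read the VM's CPU model string from /proc/cpuinfo. This is what
# mprime sees and reports to mersenne.org, independent of whether a GPU,
# a TPU, or no accelerator is attached to the Colab session.
import os

def vm_cpu_model(path="/proc/cpuinfo"):
    """Return the first 'model name' entry from /proc/cpuinfo, or None."""
    if not os.path.exists(path):
        return None  # non-Linux host
    with open(path) as f:
        for line in f:
            if line.startswith("model name"):
                return line.split(":", 1)[1].strip()
    return None

print(vm_cpu_model())  # e.g. "Intel(R) Xeon(R) CPU @ 2.30GHz" on Colab
```

If the string stays the same across "GPU", "TPU", and "None" runtimes, that confirms the reported CPU belongs to the VM, not the accelerator.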
Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.