[QUOTE=storm5510;530212]It might actually take much longer to run a DC this way instead of on a local GPU with [I]CUDALucas[/I].[/QUOTE]It's not instead of, it's in addition to.
|
[QUOTE=kriesel;530345]It's not instead of, it's in addition to.[/QUOTE]
This implies a person could jump from one to another with one, or more, checkpoint files. Is this correct? |
[QUOTE=storm5510;530351]This implies a person could jump from one to another with one, or more, checkpoint files. Is this correct?[/QUOTE]
Yes. Checkpoints are compatible, assuming you run an mfaktc version newer than 1.18 or so, when they were changed, and assuming you do not mix in "special" builds (such as the less-classes variant). You can freely move assignments and checkpoint files between computers, Colab included. However, keep in mind that moving the checkpoint alone accomplishes nothing unless the assignment is in worktodo too. This is how mfaktX works: it gets the work from the worktodo file and [U]then[/U] checks for a checkpoint. The checkpoint only stores the last class completed for an exponent, so classes already done are not repeated. Each class is sieved and powmodded separately. I used this method to split huge assignments (like M666666667 to 86 bits or so) among several computers/cards, by creating "fake" checkpoints so each computer/card does different classes. |
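For readers who want the "worktodo first, checkpoint second" order spelled out, here is a minimal Python sketch of that resume logic. It is not mfaktc's code and does not reproduce its checkpoint file format: the dictionary layout, the function names, and the assumption of 4620 classes per assignment (a standard build, not the less-classes variant) are all illustrative.

```python
# Illustrative only: NOT mfaktc's real code or checkpoint file format.
from typing import Optional, Tuple

# Assumed class count for a standard build; "less classes" builds differ.
NUM_CLASSES = 4620

Assignment = Tuple[int, int, int]  # (exponent, bit_min, bit_max)


def parse_worktodo_line(line: str) -> Optional[Assignment]:
    """Parse 'Factor=[id,]exponent,bit_min,bit_max'; return None otherwise."""
    line = line.strip()
    if not line.startswith("Factor="):
        return None
    fields = line[len("Factor="):].split(",")
    try:
        # The last three fields are the numbers; an optional id may precede them.
        exponent, bit_min, bit_max = (int(f) for f in fields[-3:])
    except ValueError:
        return None
    return exponent, bit_min, bit_max


def classes_to_run(assignment: Assignment, checkpoint: Optional[dict]) -> range:
    """Return the classes still to do for this assignment.

    The assignment comes from worktodo; the checkpoint (a hypothetical dict
    with keys 'assignment' and 'last_class') only matters if it matches.
    """
    start = 0
    if checkpoint is not None and checkpoint["assignment"] == assignment:
        start = checkpoint["last_class"] + 1
    return range(start, NUM_CLASSES)


def fake_checkpoint(assignment: Assignment, first_class: int) -> dict:
    """Hypothetical 'fake' checkpoint that makes a worker begin at first_class."""
    return {"assignment": assignment, "last_class": first_class - 1}
```

With this sketch, a checkpoint without a matching worktodo entry is never consulted, which mirrors the point that the checkpoint alone means nothing. A fake checkpoint only moves the starting class; where each machine stops when an assignment is split this way is left to the operator. |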
[QUOTE=storm5510;530351]This implies a person could jump from one to another with one, or more, checkpoint files. Is this correct?[/QUOTE]Not necessary. Start and finish each exponent separately works. Colab finishes what it starts, owned gpu finishes what it starts. Completely parallel. If you decide to move an mprime or gpuowl run off Colab, yes, the same app run on your own pc can finish what was started on Colab and stored on Google drive, and vice versa (as long as the application versions are compatible). [url]https://www.mersenneforum.org/showpost.php?p=508637&postcount=12[/url]
|
Hmm got the following error while running colab on TF.
[CODE]Failed to execute cell. Could not send execute message to runtime: TypeError: Cannot read property 'getKernelInfo' of null
Cannot read property 'getKernelInfo' of null
TypeError: Cannot read property 'getKernelInfo' of null
    at d (https://colab.research.google.com/v2/external/external_polymer_binary.js?vrz=colab-20191111-080000-RC00_279737042:3474:140)
    at w8 (https://colab.research.google.com/v2/external/external_polymer_binary.js?vrz=colab-20191111-080000-RC00_279737042:3474:275)
    at za.program_ (https://colab.research.google.com/v2/external/external_polymer_binary.js?vrz=colab-20191111-080000-RC00_279737042:3467:302)
    at Ba (https://colab.research.google.com/v2/external/external_polymer_binary.js?vrz=colab-20191111-080000-RC00_279737042:12:336)
    at za.next_ (https://colab.research.google.com/v2/external/external_polymer_binary.js?vrz=colab-20191111-080000-RC00_279737042:10:453)
    at Da.next (https://colab.research.google.com/v2/external/external_polymer_binary.js?vrz=colab-20191111-080000-RC00_279737042:13:206)
    at b (https://colab.research.google.com/v2/external/external_polymer_binary.js?vrz=colab-20191111-080000-RC00_279737042:22:43)
[/CODE] |
[QUOTE=dcheuk;530482]Hmm got the following error while running colab on TF.[/QUOTE]
Hmmm... That is a *deep* error. Never seen it before myself. Taking a stab in the dark, this looks like the "supervisor" hosting the VM is undergoing maintenance, or having a hardware issue. Edit: Actually, maybe not really that deep an error. Did you try reconnecting? |
[QUOTE=storm5510;530212]Sometimes, I think maybe [I]Colab[/I] sees all we do as crypto-mining because of the high utilization.[/QUOTE]
While I lived in the university apartments, they (the university IT department) thought I was "mining cryptocurrency" because of my comparatively high electricity consumption and "suspicious network activities," and tried to discipline me for such misbehavior. I had to send them a bunch of friendly emails explaining that I was using the machines for parallel computation on a research project. :smile: |
[QUOTE=chalsall;530484]Hmmm... That is a *deep* error. Never seen it before myself.
Taking a stab in the dark, this looks like the "supervisor" hosting the VM is undergoing maintenance, or having a hardware issue. Edit: Actually, maybe not really that deep an error. Did you try reconnecting?[/QUOTE] Yes, after reconnecting everything seems to work fine. Only saw this error message once. Error codes are scary. I noticed that Colab now halts my session every couple of hours instead of after the full 12 hours. I guess they're onto us hehehe |
Colab Exiting on "Getting Initial Work" Phase.
I tried several times including restarting the tunnels.
|
[QUOTE=petrw1;530947]I tried several times including restarting the tunnels.[/QUOTE]
There appears to have been a change in the underlying VM on Colab. The mfaktc executable which has worked since the beginning of September is no longer working on Colab (but still is under Kaggle). Absolutely no changes to the bootstrap payload nor server code. I'm currently seriously handicapped wrt workstation capability. If anyone can build an mfaktc which works in the new environment, please post it here or email it to me. An exceptionally unhappy day today. Tomorrow (or, actually, now, today) is unlikely to be much more fun... |
[QUOTE=chalsall;530954]There appears to have been a change in the underlying VM on Colab.
The mfaktc executable which has worked since the beginning of September is no longer working on Colab (but still is under Kaggle). Absolutely no changes to the bootstrap payload nor server code.[/QUOTE] I can confirm that it is an issue with Colab, not an issue with chalsall's creation. Mfaktc has stopped working for me on Colab, and I don't use chalsall's tunneling approach. It went from [CODE]ERROR: get_next_assignment(): no valid assignment found in "worktodo.txt"[/CODE]to [CODE] ./mfaktc.exe: error while loading shared libraries: libcudart.so.10.0: cannot open shared object file: No such file or directory[/CODE]sometime between Nov 16 and Nov 18, after I replenished an exhausted worktodo file. Meanwhile gpuowl and mprime continue to work. [URL]https://www.mersenneforum.org/showthread.php?p=527911#post527911[/URL] Unfortunately, while [URL]https://download.mersenne.ca/[/URL] has NVIDIA DLLs for Windows, it does not have the corresponding .so files for Linux, perhaps because there are so many flavors. So off to NVIDIA for a download for x86_64 Ubuntu: [URL]https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804[/URL] |
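A quick way to confirm whether the dynamic loader can find the CUDA runtime that the error message names is to try loading it from Python, which a Colab cell can run directly. This is only a diagnostic sketch: the helper name `can_load` is made up for illustration, and a successful load says nothing about whether mfaktc itself will run.

```python
import ctypes


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the file is missing or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    name = "libcudart.so.10.0"  # the library named in the mfaktc error above
    status = "loadable" if can_load(name) else "not found by the dynamic loader"
    print(name, status)
```

If the check reports the library as not found, the runtime on the VM no longer provides that exact CUDA version, which is consistent with mfaktc failing while gpuowl and mprime keep working. |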