Colab migration to mprime v30.3b6
Just a heads up for those using the GPU72_TF Notebook.
I'm in the process of migrating to the latest version of mprime to take advantage of the better algorithm George et al recently implemented. Unfortunately, the checkpoint files aren't compatible for stage 2. I don't have much time available at the moment, so I've done a "quick and dirty" hack. If anyone sees in their logs something about "bad checksum; temporarily skipping" on an assignment, don't worry -- your work isn't lost. I'm waiting for a few workers to finish off some older v29 assignments (should be a couple of days). Then everything will be processed using v30 moving forward.
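If you want to check whether any of your own assignments were hit by this, a minimal sketch like the following could scan a worker log for that message. The message text and log layout here are assumptions based on the wording quoted above, not the exact mprime log format -- adjust the pattern to what your logs actually show.

```python
import re

# Assumed wording of the harmless warning described above; the real
# mprime log line may differ slightly, so tweak the pattern as needed.
PATTERN = re.compile(r"bad checksum; temporarily skipping")

def flagged_lines(log_text: str) -> list[str]:
    """Return log lines mentioning the (harmless) checksum warning."""
    return [line for line in log_text.splitlines() if PATTERN.search(line)]
```

Any lines this returns correspond to assignments that were skipped temporarily, not lost.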
I will try to kill my worker right after the current stage 2 finishes, so that nothing gets tangled up in the switchover.
[QUOTE=Uncwilly;582222]I will try to kill my worker right after the current stage 2 finishes, so that nothing gets tangled up in the switchover.[/QUOTE]
That's not necessary. Just let things run as usual. What we're waiting for is a few v29s to finish their stage 2 work. Any v30s will work as deep into stage 2 as they can (after entering stage 2 from stage 1) while they survive, and then will be given stage 1 work (when they're restarted) until later this week.
I have seen the errors you mention in the logs (not in the screen output). They do indeed seem harmless; none of my work appears to be lost (yet).
One strange thing that happens for all new instances (not for the old ones, which are still running) is that I see every line in the output doubled, for both the GPU and CPU outputs. It looks as if two instances of mprime and two instances of mfaktc are running in parallel, but from the iteration times there is only one instance of each (otherwise the GHzDays/Day would halve).
[QUOTE=LaurV;582231]One strange thing that happens for all new instances (not for the old ones, which are still running) is that I see every line in the output doubled, for both the GPU and CPU outputs. It looks as if two instances of mprime and two instances of mfaktc are running in parallel, but from the iteration times there is only one instance of each (otherwise the GHzDays/Day would halve).[/QUOTE]
No idea. Stopping and restarting /should/ be sane. There's IPC specifically for shutting down all children. But... If you see it again, please take a screenshot and PM me (or post here; whatever). Then, try a "Factory Reset". There were no code deltas on the payloads handed out to the ephemeral instances, other than the v30 mprime executable.
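One way to tell the two hypotheses apart from a captured log alone: if every line is printed twice by a duplicated log handler, each line should be immediately followed by an identical copy; two independent processes would interleave more irregularly. This helper is my own sketch, not anything shipped with mprime or the notebook:

```python
def doubled_fraction(lines: list[str]) -> float:
    """Fraction of lines that exactly repeat the line immediately before them.

    A value near 0.5 suggests each line is emitted twice in a row (e.g. a
    duplicated output handler); a much lower value with sporadic repeats is
    more consistent with two separate processes interleaving their output.
    """
    if len(lines) < 2:
        return 0.0
    dupes = sum(1 for prev, cur in zip(lines, lines[1:]) if prev == cur)
    return dupes / (len(lines) - 1)
```

Run it over the captured console output (one string per line) before doing the Factory Reset, and the number itself is a useful thing to include in the PM.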
[URL="https://www.mersenne.org/report_exponent/default.php?exp_lo=119744587&full=1"]119744587[/URL]
This one (and a couple of others) has been hanging around on my "Assignments" page, even though they were turned in quite a while ago. Normally no big deal, but I've since gotten more assignments and turned those in, and they properly disappeared. Not sure why this one is "stuck" (so are 119747029 and 119863531).
Seems like (in my case anyway) all of the recently acquired 119M exponents aren't updating on GPU72 after submitting the TF results to PrimeNet.