mersenneforum.org  

mersenneforum.org > Great Internet Mersenne Prime Search > PrimeNet > GPU to 72

Old 2020-05-16, 15:40   #4896
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

9,767 Posts

Quote:
Originally Posted by Runtime Error View Post
It seems that a notebook instance likes to first start a new exponent upon launch, and if it finishes, it will move on to any partially completed jobs. But unless it gets a T4, it will not finish the job within 12 hours. I currently have a handful of partially completed 81-bit jobs, and this evening's notebooks just started fresh exponents. Do you have any advice?
A new instance will first be re-issued work partially completed, in descending order. However, an assignment is only re-issued after no updates for 30 minutes.

This can lead to a bit of a "queue" if someone does the "Factory Reset" trick a few times -- anything which hasn't had work done on it is thrown back into the pool, but if the instance isn't reset within two minutes it will report back some progress, and then the candidate is held until completion.
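
The re-issue rule described above can be sketched roughly as follows. This is only an illustration, not the actual GPU to 72 server code: the `Assignment` fields and the `pick_reissue` function are hypothetical names, and reading "descending order" as "most complete first" is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Idle time before a held assignment may be handed out again (from the post).
REISSUE_AFTER = timedelta(minutes=30)


@dataclass
class Assignment:
    exponent: int
    percent_done: float    # 0.0 means no progress was ever reported
    last_update: datetime  # time of the most recent progress report


def pick_reissue(held: List[Assignment], now: datetime) -> Optional[Assignment]:
    """Return the stale, partially completed assignment to hand out next, if any."""
    stale = [
        a for a in held
        # Work with no progress is assumed to have gone back to the pool already.
        if a.percent_done > 0.0 and now - a.last_update >= REISSUE_AFTER
    ]
    # Assumption: "descending order" means the most complete job goes first.
    return max(stale, key=lambda a: a.percent_done, default=None)
```

Under these assumptions, a job at 95% that reported ten minutes ago is skipped, which matches the "queue" effect: it only comes back once it has been silent for the full 30 minutes.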

My advice is to just stick with it -- all work will (eventually) be completed.

P.S. Oh, also... I set up a P-1 assignment for myself in 332M as a test. 17 days on the lone CPU core... I don't think it will make sense to make this worktype available to the Colab instances.

Last fiddled with by chalsall on 2020-05-16 at 15:41 Reason: Smelling mistake.
Old 2020-05-16, 16:37   #4897
Runtime Error
 
Sep 2017
USA

11001110₂ Posts

Quote:
Originally Posted by chalsall View Post
A new instance will first be re-issued work partially completed, in descending order. However, an assignment is only re-issued after no updates for 30 minutes.

This can lead to a bit of a "queue" if someone does the "Factory Reset" trick a few times -- anything which hasn't had work done on it is thrown back into the pool, but if the instance isn't reset within two minutes it will report back some progress, and then the candidate is held until completion.

My advice is to just stick with it -- all work will (eventually) be completed.

P.S. Oh, also... I set up a P-1 assignment for myself in 332M as a test. 17 days on the lone CPU core... I don't think it will make sense to make this worktype available to the Colab instances.
Got it. I've been cycling until I get P100s, that makes sense. Thanks!

And wow 17 days = 34+ days with the time limitations. Ouch!
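
The arithmetic behind that estimate, as a small sketch. It assumes one 12-hour session per calendar day (the 12-hour figure is the session limit mentioned earlier in the thread); `wall_clock_days` is a made-up helper name.

```python
def wall_clock_days(compute_days: float, hours_per_day: float) -> float:
    """Calendar days needed when only hours_per_day of compute is available daily."""
    return compute_days * 24.0 / hours_per_day


print(wall_clock_days(17, 12))  # 34.0
```

Restart overhead and shorter sessions push the real figure above that, hence the "34+".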
Old 2020-05-17, 17:19   #4898
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

23047₈ Posts

Quote:
Originally Posted by Runtime Error View Post
Got it. I've been cycling until I get P100s, that makes sense. Thanks!
So, I've been thinking about how to handle these long-running tasks (and the restarting of same) in a better way.

One thing which could be done immediately would be to drop the number of assigned tasks per instance down to two, instead of three. The chances of a factor being found immediately after the start of the next job are very, very small, so mfaktc would (almost) never run out of work.

However, something to put out there... It would also be possible to only assign a single job at a time. The downside to this is there would be about 30 seconds of wasted compute between a job being finished, the next job being fetched, and mfaktc starting up again (along with the short self-test).
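
To put the 30 seconds in proportion, here is a rough sketch of the idle fraction per job. The 30-second gap is the figure from the post; the job lengths in the loop are arbitrary examples, and `wasted_fraction` is a hypothetical helper.

```python
GAP_SECONDS = 30.0  # idle time between jobs, as estimated in the post


def wasted_fraction(job_seconds: float, gap_seconds: float = GAP_SECONDS) -> float:
    """Share of wall-clock time spent idle when every job is followed by one gap."""
    return gap_seconds / (job_seconds + gap_seconds)


# Example job lengths in hours (illustrative values only).
for hours in (0.25, 1, 12):
    print(f"{hours:>5} h job: {wasted_fraction(hours * 3600):.3%} idle")
```

The cost only becomes noticeable for short jobs; for the multi-hour 81-bit work it is a small fraction of a percent.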

Thoughts? Perhaps make this optional, on a per-instance basis?
Old 2020-05-17, 17:45   #4899
axn
 
Jun 2003

2×3×7×11² Posts

Quote:
Originally Posted by chalsall View Post
The downside to this is there would be about 30 seconds of wasted compute between a job being finished, the next job being fetched, and mfaktc starting up again (along with the short self-test).
The horror, the horror!
Old 2020-05-17, 17:58   #4900
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

D5D₁₆ Posts

I think the benefit of having fewer half-done assignments hanging around is well worth losing a minute a day or thereabouts.
Old 2020-05-17, 18:51   #4901
Runtime Error
 
Sep 2017
USA

2×103 Posts

Quote:
Originally Posted by chalsall View Post
So, I've been thinking about how to handle these long-running tasks (and the restarting of same) in a better way.

One thing which could be done immediately would be to drop the number of assigned tasks per instance down to two, instead of three. The chances of a factor being found immediately after the start of the next job are very, very small, so mfaktc would (almost) never run out of work.

However, something to put out there... It would also be possible to only assign a single job at a time. The downside to this is there would be about 30 seconds of wasted compute between a job being finished, the next job being fetched, and mfaktc starting up again (along with the short self-test).

Thoughts? Perhaps make this optional, on a per-instance basis?
Sweet! I currently have 9 in-progress assignments at the 81-bit level. One of them has been stuck at 95% for a few days, presumably due to me cycling. The one-assignment per notebook rule for 81-bits would be welcomed!

Also, I just got kicked out of Colab after 1hr30min
Old 2020-05-18, 03:47   #4902
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

2·4,909 Posts

GPUto72 is not seeing (in the stats) the factor found up in the 332M range on one of my colab sessions.
Old 2020-05-18, 04:10   #4903
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

2·3·1,693 Posts

Quote:
Originally Posted by Runtime Error View Post
Sweet! I currently have 9 in-progress assignments at the 81-bit level. One of them has been stuck at 95% for a few days, presumably due to me cycling. The one-assignment per notebook rule for 81-bits would be welcomed!

Also, I just got kicked out of Colab after 1hr30min
That hasn't happened to me, yet. However, I am feeling that "THEY" are onto my dodges around usage limits. The GPU Nazi is giving me a lot of "No GPU for You!" I only seem to be able to run GPUs on my paid account, and that has shot its wad, at the moment.

EDIT: But when I turned off GPU in the settings of all the notebooks, and then tried running, I got 4 CPU/P-1 instances running with RAM showing 'BUSY' on all of them. I think I was getting more push-back when I left GPUs enabled, when I knew it wasn't going to let me have them. There were more 'excess sessions' warnings and it seemed the system would only allow one. This preemptive disabling of GPUs might help with free accounts, too. I can usually run two notebooks on free accounts. Trying to run multiple accounts simultaneously has not worked out for me at least, so 2 (free) or 4 (paid) notebooks are the limits in my experience. I would be happy to hear if others have gotten more going. (I know that chalsall and others are running multiple instances through VPNs and other more abstruse means, but I am not running in that class so far. Ignorance and laziness stand in the way.)

Last fiddled with by kladner on 2020-05-18 at 04:53
Old 2020-05-18, 20:37   #4904
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

2627₁₆ Posts

Quote:
Originally Posted by Runtime Error View Post
The one-assignment per notebook rule for 81-bits would be welcomed!
OK, version 0.423 is now in production. Does the one assignment at a time thing.

A bit of wasted compute (as a %) for the shorter running jobs, but I'm too busy at the moment on other things to make this smarter. I have mapped out a solution of giving the next assignment a few minutes before the first expires, but it will take a little while to implement.
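
One way the "hand out the next assignment a few minutes early" idea could look, purely as a sketch. It is not the planned implementation: it assumes progress is roughly linear in time, and `seconds_remaining`, `should_prefetch` and the 180-second lead are all hypothetical.

```python
def seconds_remaining(elapsed_seconds: float, fraction_done: float) -> float:
    """Estimate remaining run time by linear extrapolation of progress so far."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_seconds * (1.0 - fraction_done) / fraction_done


def should_prefetch(elapsed_seconds: float, fraction_done: float,
                    lead_seconds: float = 180.0) -> bool:
    """True once the current job is expected to finish within lead_seconds."""
    return seconds_remaining(elapsed_seconds, fraction_done) <= lead_seconds
```

With a lead of a few minutes the instance would hold two assignments only briefly, so the fetch gap disappears without leaving many half-done jobs behind after a reset.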
Old 2020-05-18, 20:50   #4905
chalsall
If I May
 
chalsall's Avatar
 
"Chris Halsall"
Sep 2002
Barbados

2627₁₆ Posts

Quote:
Originally Posted by Uncwilly View Post
GPUto72 is not seeing (in the stats) the factor found up in the 332M range on one of my colab sessions.
Copy. Will drill down in the next 48 hours.
Old 2020-05-18, 23:19   #4906
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

2×4,909 Posts

Quote:
Originally Posted by chalsall View Post
Copy. Will drill down in the next 48 hours.
Specifically I should have said the graph.

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.