#23
If I May
"Chris Halsall"
Sep 2002
Barbados
23002₈ Posts

All's cool!

P.S. I think it would be great if you made your AMI available publicly for anyone else who wanted to play.
#24
"/X\(‘-‘)/X\"
Jan 2013
2²×733 Posts
#25
"/X\(‘-‘)/X\"
Jan 2013
2²×733 Posts
Speaking of making the AMI, the biggest concern is handling exponents that don't get completed when an instance is terminated. I'm doing that now with rsync/ssh and manually looking at the backup directories, but that's not a turn-key setup. I have some ideas of features that GPU72 could implement to make it easier:

1. Being able to request a 1-day assignment window when fetching work. If the machine can do 280 GHz-days/day, and the largest assignment (LL, 71→74) is about 50 GHz-days, making the needed buffer of 2 assignments 100 GHz-days, assignments should be complete in 9 hours. A 200 GHz-days/day machine could complete in 12 hours.

2. Being able to specify a machine name that is stored with the assignment, and then manually going through the assignment list when an instance is terminated and unreserving those assignments (which could be scripted). More convenient would be a way of unreserving all the assignments of a machine by name.

I may go the route of creating another script to deal with the stale backup directories, with the option of unreserving anything stale, or making a new worktodo.txt with the lost exponents (along with any checkpoint files).
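The arithmetic behind point 1 can be sketched as a small calculation. The throughput and assignment-size figures are the ones quoted above; everything else is just unit conversion:

```python
# Sketch of the buffer arithmetic from point 1: how long a two-assignment
# buffer takes to drain at a given daily throughput. The figures
# (280 and 200 GHz-days/day, ~50 GHz-days per 71->74 assignment)
# are the ones quoted in the post.

def hours_to_complete(buffer_ghz_days: float, throughput_ghz_days_per_day: float) -> float:
    """Hours needed to work through a buffer at the given daily throughput."""
    return buffer_ghz_days / throughput_ghz_days_per_day * 24

LARGEST_ASSIGNMENT = 50           # GHz-days, approx. for a 71->74 assignment
BUFFER = 2 * LARGEST_ASSIGNMENT   # keep two assignments queued

print(round(hours_to_complete(BUFFER, 280)))  # -> 9  (hours, on a 280 GHz-days/day machine)
print(round(hours_to_complete(BUFFER, 200)))  # -> 12 (hours, on a 200 GHz-days/day machine)
```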
#26
If I May
"Chris Halsall"
Sep 2002
Barbados
2×5×7×139 Posts
You have to look at the PrimeNet Exponent Status Report, and then take into consideration the PrimeNet Thresholds Report. Then, compare the Trial Factored Depth Change Report with the Exponent Status Change Report (this is for the 30M to 40M range)... and then try to figure out how things stand.

This is further complicated by the fact that the category thresholds move every day as candidates are DCed and LLed, and by the fact that you can't entirely rely on the DC or LL completion rates to estimate how many assignments will be requested per day, since some Cat 4 assignments can be held for *years* before being recycled (even under the new assignment/recycling rules).

But the short version is: we've probably now got about a month's worth of DC Cat 4 buffer -- LL Cat 4 is right at the edge. I'm actually in the process of migrating my EC2 instances to LL Cat 4 work based on the recent surge in the DC range.
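The kind of estimate being described reduces to a simple rate calculation. All numbers below are hypothetical placeholders; the real values come from the PrimeNet status and threshold reports named above:

```python
# Rough sketch of the buffer estimate described above: candidates already
# trial-factored and ready to assign, divided by the daily assignment
# rate, gives days of buffer remaining. Both inputs here are
# hypothetical placeholders, not real report figures.

def buffer_days(candidates_ready: int, assignments_per_day: float) -> float:
    """Days until the ready-to-assign buffer is exhausted at the current rate."""
    return candidates_ready / assignments_per_day

# e.g. 3000 DC Cat 4 candidates ready, ~100 assignments handed out per day
print(buffer_days(3000, 100))  # -> 30.0, i.e. about a month of buffer
```

The complication noted in the post is that neither input is stable: the thresholds move daily, and the assignment rate is distorted by long-held Cat 4 reservations.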
#27
If I May
"Chris Halsall"
Sep 2002
Barbados
2·5·7·139 Posts
But if I may suggest an alternative solution space...

I've been using a custom AMI which is launched with a 1GB volume attached at launch (at /dev/sdb, auto-mounted at /home). A crontab then contains @reboot entries which launch the mfaktc.exe process(es) and an mprime process after a successful launch.

This has the advantage that when offline, an instance's store is only 1GB, and the same AMI could be used by anyone. The downside is that only one instance can be launched at a time from the Management Console, in order to select the 1GB volume to attach.

Thoughts?

P.S. I'm very much enjoying this experiment, and working with you. Collaboration is good!
#28
If I May
"Chris Halsall"
Sep 2002
Barbados
2602₁₆ Posts
#29
"/X\(‘-‘)/X\"
Jan 2013
2²×733 Posts

Last fiddled with by Mark Rose on 2014-06-25 at 18:35
#30
"/X\(‘-‘)/X\"
Jan 2013
2²·733 Posts
I think a simpler solution will be just giving up on the assignments, and telling people not to bid too close to the spot price, to avoid frequent instance termination. Amazon also says they try to gracefully shut down instances, so I may write a shutdown script that stops mfaktc, submits any final results to PrimeNet, and unreserves any assignments still in the worktodo. All of that can happen as-is right now; it only fails if the instance dies too quickly.

On the EC2 instances, I'm only buffering 25 GHz-days of whatever DCTF work GPU72 decides to hand out. The largest assignments are 7 GHz-days and the cron job runs once an hour, so that works. On EC2 I'm more concerned about handling lost assignments than coping with unexpected ISP or GPU72 downtime.
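The shutdown script being described could be sketched as a SIGTERM handler (what a gracefully terminating instance's processes receive). The three step functions here are hypothetical placeholders; in practice stopping the workers, submitting results, and unreserving assignments would involve real process and HTTP calls:

```python
# Sketch of the graceful-shutdown idea above: on SIGTERM, stop the
# workers, submit finished results, then unreserve whatever remains in
# the worktodo. The three callables are hypothetical placeholders, not
# a real GPU72/PrimeNet API.
import signal
import sys

def graceful_shutdown(stop_workers, submit_results, unreserve_remaining):
    """Run the three shutdown steps in order."""
    stop_workers()         # e.g. signal mfaktc/mprime and wait for exit
    submit_results()       # e.g. POST completed results to PrimeNet
    unreserve_remaining()  # e.g. release unfinished worktodo assignments

def install_handler(stop_workers, submit_results, unreserve_remaining):
    """Attach the shutdown sequence to SIGTERM."""
    def handler(signum, frame):
        graceful_shutdown(stop_workers, submit_results, unreserve_remaining)
        sys.exit(0)
    signal.signal(signal.SIGTERM, handler)
```

As the post notes, this still fails if the instance is killed before the handler finishes, which is why unreserving stale assignments out-of-band remains useful.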
#31
May 2013
East. Always East.
11·157 Posts
A lot of this is going right over my head but it sounds promising enough.
One question I do have is regarding the buffer that has been mentioned so much. I've just been grabbing assignments in batches of 100 per card, which can last me two weeks if they're all 71→74. It always seemed like a bit much, but I figured it wasn't hurting anything. Should I be doing smaller batches?
#32
If I May
"Chris Halsall"
Sep 2002
Barbados
2·5·7·139 Posts
#33
If I May
"Chris Halsall"
Sep 2002
Barbados
10011000000010₂ Posts

But this gives the advantage of distributing an AMI and a sample /home image. LOL, we're arguing about technical details which cost cents per day!
Similar Threads

| Thread | Thread Starter | Forum | Replies | Last Post |
|---|---|---|---|---|
| Let GPU72 Decide and other questions | jschwar313 | GPU to 72 | 11 | 2016-10-14 19:16 |
| Factor not recorded by GPU72 | bayanne | GPU to 72 | 24 | 2014-05-16 09:20 |
| GPU72 out of 332M exponents? | Uncwilly | GPU to 72 | 16 | 2014-04-11 11:31 |
| Cooperative Agreement or Capitalist Takeover? You decide! | cheesehead | Lounge | 97 | 2013-11-16 21:19 |
| GPU72.COM is down | swl551 | GPU to 72 | 1 | 2013-01-11 12:54 |