mersenneforum.org  

2016-07-15, 06:28   #12
GP2

Setting up an EFS filesystem: terminating the instance you created

After you finish creating all the prime-init.txt and local-init.txt files in each of the subdirectories in the previous step, you can stay logged in to observe what happens when you launch the instances that will do the actual LL testing. For instance, you can run the following commands:

Code:
cd /mnt-efs/mprime/instances
ls -lrt *
(or instead of "mnt-efs", use whatever mount point name you chose in the previous set of commands)

You should see only the subdirectories and files you created as part of the setup, namely c4.large, c4.xlarge, c4.2xlarge, c4.4xlarge and c4.8xlarge, with their associated local-init.txt and prime-init.txt files. This will change when you launch some new instances to start running mprime.

If you logged in to an existing instance in the previous sections in order to do the setup and configuration, then you can just keep it running for its original purpose. However, if you launched a new instance solely to carry out the initial setup and configuration of the EFS filesystem, then do remember to terminate it eventually; otherwise it will keep running up a bill even while sitting idle.

Merely logging out of the ssh client session will not terminate the instance. To terminate the instance:

Go to the EC2 console at http://console.aws.amazon.com/ec2/, then click on the "Instances" link in the left-hand-side menu.

Make sure you are in the same AWS region where you launched the instance in the previous steps, and change it if necessary. The region name is indicated at the top right part of the page.

Left-click the instance you want to terminate to select it, then right-click it to bring up a popup menu. In the popup menu, select "Instance State" and then "Terminate". In the resulting confirmation window, click on the blue "Yes, Terminate" button.
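
If you have the AWS command-line interface installed and configured, the same thing can be done from a terminal. This is purely an optional alternative to the console steps above; the instance ID shown is a placeholder for your own:

Code:
aws ec2 terminate-instances --instance-ids i-0123456789abcdef0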


Next section: Using "spot instances" to run mprime

2016-07-15, 06:44   #13
GP2

Using "spot instances" to run mprime

At last we are done with the one-time setup and configuration, and we're ready to actually start crunching some LL tests. We will do so using Amazon's "spot instances".

Some applications require 24/7 availability: for example, websites should never be down. However, an application like mprime/Prime95 does not have strict deadlines or uptime requirements, and can tolerate being interrupted.

Cloud platforms like Amazon EC2 and Google Compute Engine (but not Microsoft Azure) offer big discounts to customers who are willing to let their cloud applications be interrupted whenever a higher-paying customer shows up. Amazon's interruptible instances are called "spot instances". We will now launch one.

Go to the EC2 console at http://console.aws.amazon.com/ec2/, then click on the "Spot Requests" link in the left-hand-side menu.

Make sure you are in the same AWS region where you created the EFS filesystem in the previous steps, and change it if necessary. The region name is indicated at the top right part of the page.

You might see a "Pricing History" button at the top (if not, don't worry). Make note of it; it can be useful in the future, and its use is described below.


Launching a spot instance to run mprime

Click on the blue "Request Spot Instances" button (or the blue "Get started" button)

In the next page, set the following:

Request type: Request and Maintain

Target capacity: 1 instance

AMI: Amazon Linux AMI

(Advanced note: unfortunately you can't choose an "instance store" AMI instead of an EBS-backed AMI in order to save EBS storage charges, because the former isn't compatible with c4 instances.)

Instance type(s): It should say "c4.large". Note: if you see "c3.large", click on the black "x" circle to get rid of it, then click on the "Select" button and choose "c4.large". Note: NEVER run instances in the "c3" family; they cost roughly the same as the more modern "c4" family but are considerably slower.

Allocation strategy: Lowest price

Network: Keep this at the default, it will say something like "vpc-........ (default)"

Availability zone: No preference (launch in the cheapest Availability Zone)

Maximum price: Set your max price (per instance/hour)


When "Maximum price" is set to "Set your max price", a "Pricing History" button appears next to it. If you click on it, a window pops up, and you can change the header settings:

Code:
Product: Linux/UNIX   Instance type: c4.large   Date range: 1 day   Availability zone: All Zones
By looking at the graphs in the Spot Instance Pricing History window, you can get a feel for the cost trends. In this window, change the Instance type in the header settings to "c4.large" if necessary (not "c3.large" or any of the others).

Note that spot instance prices fluctuate according to market forces, and different regions can have quite different prices; however, prices should average roughly 2 cents per hour for instance type c4.large in a US region. Make a mental note of the level at which you want to set your limit price: if the spot price rises above that level, your spot instances will terminate. New spot instances will launch automatically when the price drops back down, and they will resume the work left behind by the old ones.

Click the blue "Close" button to close the Spot Instance Pricing History window.
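
As an optional aside, the same pricing data can be pulled with the AWS command-line interface, if you have it installed; this is just a convenience, and the region and start time shown are placeholders:

Code:
aws ec2 describe-spot-price-history \
    --region us-east-1 \
    --instance-types c4.large \
    --product-descriptions "Linux/UNIX" \
    --start-time 2016-07-14T00:00:00 \
    --output table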


Now that you have formed some idea of what you want your limit price to be, enter it into the "$" field next to the "Set your max price" selection. For instance, enter 0.0175 if you want to pay no more than 1.75 cents per hour for your spot instance.

Click on the blue "Next" button

In the next page,

Instance store: ignore this, we don't need it.

EBS volumes: ignore this, we are using the EFS filesystem now instead.

EBS-optimized: ignore this

Monitoring enabled: ignore this; note that CloudWatch detailed monitoring would incur additional charges

Health check: select the checkbox for "Replace unhealthy instances"

Tenancy: keep this at "Default - run a shared hardware instance". Note: selecting "Dedicated" instead would incur large minimum charges.

Interruption behavior: keep this at "Terminate". If you set it to "Stop", it should still work, but if your AWS account is more than one year old, you will unnecessarily incur storage charges for the 8 GB EBS root volumes for each instance, even while your instances are not running.

User data: skip over this for the moment, we'll get back to it shortly

Instance Tags: you probably don't need these, but define some if you want.



Key pair name: choose the key pair name that was mentioned (or created) in the "Make sure that you have a key pair for ssh logins" section earlier

IAM instance profile: choose the IAM role (IAM instance role) that was mentioned (or created) in the "Make sure your IAM instance role exists and it has the right permissions" section, in other words mprime-instance-role or whatever you named it.

IAM fleet role: keep the "aws-ec2-spot-fleet-tagging-role"


Security groups: select "default"

Auto-assign IPv4 Public IP: keep this at "Use subnet setting (Enable)"



Request valid from: keep this at "Now"
Request valid until: you can keep this at "1 year from now"

Also, although it is fairly unimportant, you should unselect the "Terminate instances at expiration" box.



Now go back to the "User data" field.

Select the "As file" option. A "Choose File" button will appear, but before we can choose a file we have to create it first.

Download the "user_data_mprime" file that is an attachment to this forum post. Its exact name will vary depending on the version, for instance "user_data_mprime_1.10.txt".

That is the file you will use, but before using it, you must make one change.

In the "Setting up an EFS filesystem: create the filesystem" section, you made note of the File System ID of the newly-created EFS filesystem, which is of the form fs-xxxxxxxx (where each "x" is a hexadecimal digit). The exact value will be unique to you, and will be different for each AWS region where you created an EFS filesystem.

For each AWS region where you created an EFS filesystem, there should be a pair of lines of the form:

Code:
region-xxxx-x)
    readonly FILE_SYSTEM_ID=fs-xxxxxxxx;;
Change the "region-xxxx-x" to the AWS region where you created the EFS filesystem (for instance, "us-east-1" for N. Virginia, "us-east-2" for Ohio, "us-west-2" for Oregon, "eu-west-1" for Ireland, etc.)

Change the "fs-xxxxxxxx" to the correct value for the EFS filesystem, as described above; this value will be different for each AWS region, and it will be unique to you and no other user.

If you don't use a particular region, and haven't created an EFS filesystem in that region, then just delete the corresponding lines.
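
For example, if you created EFS filesystems in N. Virginia and Oregon only, the edited portion of the script might end up looking something like this (the fs- values shown are placeholders; substitute the File System IDs you noted down):

Code:
us-east-1)
    readonly FILE_SYSTEM_ID=fs-xxxxxxxx;;
us-west-2)
    readonly FILE_SYSTEM_ID=fs-yyyyyyyy;;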

After you have downloaded the attached "user_data_mprime" file and made the above edits to set the correct FILE_SYSTEM_ID values, click on the "Choose File" button and select that file.



Now that you've filled in the User Data field and everything else, click on the blue "Review" button.

Check the "Max price" is what you wanted it to be. Make sure that "Key pair name" and "IAM instance profile" are what you wanted them to be.

Click on the blue "Launch" button.


Next section: Monitoring the progress of the spot instance(s)
Attached Files
File Type: txt user_data_mprime_1.16.txt (15.8 KB, 319 views)

Last fiddled with by GP2 on 2018-04-04 at 07:07 Reason: updating user data script (runs itself at reboot time too, not just instance launch time; take into account that spot instances now stoppable; dirsync)
2016-07-15, 07:06   #14
GP2

Monitoring the progress of the spot instance(s)

EC2 console

Go to the EC2 console at http://console.aws.amazon.com/ec2/, then click on the "Spot Requests" link in the left-hand-side menu.

Make sure you are in the AWS region where you launched the spot instance in the previous step. The region name is indicated at the top right part of the page.

Select "Request type: fleet" and "State: all". You should see a line for your newly-created spot request. Click on it.

As long as the current spot price in at least one of the "availability zones" in the current AWS region is lower than the limit price you specified when you created your spot request, an instance will launch within a minute or two.

In the bottom half of the page, click on the "Instances" tab. When your instance launches, it will be visible there. It will have an instance ID of the form i-................. (where each "." is a hexadecimal digit, and there are 17 hexadecimal digits).

Now click on the "Instances" link in the left-hand-side menu. You will see the newly launched instance, and the "Instance State" will be "running", with a green ball next to it. The "Instance Type" will be c4.large (assuming that's what you selected when you made the spot request), and the "Availability Zone" name will be the same as the region, but with an extra letter added (such as "a", "b", etc.)

Click on the "Spot Requests" link in the left-hand-side menu. You can click on the "Pricing History" button again. You can change the "Availability Zone" setting to show only one graph, the one for the availability zone where your instance actually launched. This graph will tell you the actual hourly rate that you'll be paying in each passing hour.

SSH client terminal window

Presumably, you left the ssh client program terminal window open for the time being.

If that window is still open, you can enter the commands:

Code:
cd /mnt-efs/mprime/instances
ls -lrt *
(or instead of "mnt-efs", use whatever name you chose earlier)

At first, you should see only the subdirectories you created as part of the setup, namely c4.large, c4.xlarge, c4.2xlarge, c4.4xlarge and c4.8xlarge; however, after the instance launches, you will notice an extra subdirectory underneath c4.large whose name is identical to the instance ID. You can go to that subdirectory:

Code:
cd c4.large/
Now type cd i-, but instead of hitting the Enter key, hit the Tab key. The command line should then auto-complete with the full name of the subdirectory, assuming it is the only subdirectory there. Now you can hit Enter.
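
Alternatively, since there is only one subdirectory there at this point, a shell glob gets you to the same place without tab completion:

Code:
cd i-*/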

You are now in the work directory where mprime is running.

Enter the command
Code:
ls -lrt *
for a directory listing. You will notice a worktodo.txt file and a file whose name is of the form p......., which is the save file.

Enter the command
Code:
cat worktodo.txt
to view the contents of the worktodo.txt file. There will be lines of the form Test= or DoubleCheck=, which list the exponents of the Mersenne numbers you are testing.
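
An assignment line looks something like the following (this example is made up for illustration; the long hexadecimal string is the PrimeNet assignment ID, followed by the exponent, how far it has been trial-factored, and whether P-1 has been done):

Code:
DoubleCheck=6A1B2C3D4E5F60718293A4B5C6D7E8F9,37297031,71,1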

Rerun the previous commands:
Code:
cd /mnt-efs/mprime/instances
ls -lrt *
(or instead of "mnt-efs", use whatever name you chose earlier)

The "ls" commands shows the time-and-date for last modification of the i-................. subdirectory. As time goes on, if you repeat the ls -lrt * command, you will see that time-and-date change, because the p....... save files get written at regular intervals. If this time-and-date ever lags behind the current clock time and calendar date, then it almost certainly means that mprime has stopped working for some reason (usually because the spot instance was terminated due to a rise in the spot price above your limit price, and the spot price hasn't fallen back down again yet).

mersenne.org pages

If you have a PrimeNet user account (you can create one at http://www.mersenne.org/gettingstarted/ ), and if you entered that account name in the V5UserID= field in the prime-init.txt file in the Setting up an EFS filesystem: initial setup and configuration section, then you can log into that account at the mersenne.org website.

You can go to http://www.mersenne.org/cpus/ to see the list of your computers that are currently doing LL testing on Mersenne numbers. You should see an entry for your newly-launched instance, along with the Mersenne exponent that it is working on.

Then go to http://www.mersenne.org/workload/ to see the complete list of Mersenne exponents assigned to you. One of them should be the one from the newly-launched instance. The "Stage, %" field will be blank for now, but if you return tomorrow and take another look, it should show what percentage of completion has been achieved so far.

When the percentage of completion indicates that the LL test is nearly finished, you can wait a while and then go to http://www.mersenne.org/results/ to see the completed result. Unless you are very, very lucky, it will not be a Mersenne prime; rather, it will display a "Residue" of the form XXXXXXXXXXXXXX__, which is a sort of "certificate" indicating that this Mersenne number is not prime.

The website http://mersenneforum.org/ contains discussions about Mersenne prime hunting and the Prime95/mprime program itself.

Changing the number of simultaneous instances to more than one, or none

The spot request you created will keep running indefinitely. If spot instances are terminated due to the spot price getting too high, it will keep respawning new spot instances whenever the spot price falls sufficiently. It will keep doing more and more Lucas-Lehmer tests, and continue to run up a bill.

You might eventually choose to run more than one instance simultaneously, or you might choose to stop running them altogether. See the next section.


Next section: Running two or more Lucas-Lehmer testing instances at the same time, or quitting completely

2016-07-15, 07:17   #15
GP2

Running two or more Lucas-Lehmer testing instances at the same time, or quitting completely

Go to the EC2 console at http://console.aws.amazon.com/ec2/, then click on the "Spot Requests" link in the left-hand-side menu.

Make sure you are in the AWS region you intended to be in (the one where you launched your spot instances in the previous steps), and change it if necessary. The region name is indicated at the top right part of the page.

Select the spot request you launched, by clicking on it. In the "Capacity" column, it should show "1 of 1"; however, it might say "0 of 1" if the spot price has risen above your limit price. Now go to the "Actions" drop-down menu at the top of the page.


Running more than one spot instance simultaneously

Maybe you want to run two spot instances simultaneously instead of just one. Then just choose "Modify target capacity" from the Actions drop-down menu. The "Old target capacity" field will say "1", and under "New target capacity", you can enter "2". Make sure you unselect the "Terminate instances" box, then click on the blue "Submit" button.

Wait a couple of minutes and you will now have two instances crunching Lucas-Lehmer tests simultaneously. It's that simple. Of course, each instance incurs additional costs. You can run as many simultaneous instances as you want, if you can afford it.
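
If you prefer the AWS command-line interface, the equivalent change can be made with modify-spot-fleet-request; the sfr- request ID below is a placeholder for your own (shown in the "Request ID" column of the Spot Requests page):

Code:
aws ec2 modify-spot-fleet-request \
    --spot-fleet-request-id sfr-01234567-89ab-cdef-0123-456789abcdef \
    --target-capacity 2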


Stopping all LL testing

If you want to quit LL testing completely, select "Cancel spot request" from the Actions drop-down menu. Make sure the "Terminate instances" box is selected, then click on the blue "Confirm" button. This will kill your instances and not create any new ones.

Note: if you click on the "Instances" link in the left-hand-side menu and use that page to terminate the running instances, that won't work. The spot request will still be active, and it will simply respawn new instances to replace the ones you terminated. You must cancel the spot request to ensure that no new instances are launched.
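
Again, for command-line users, cancelling the request and terminating its instances can be done in one step (the sfr- request ID is a placeholder):

Code:
aws ec2 cancel-spot-fleet-requests \
    --spot-fleet-request-ids sfr-01234567-89ab-cdef-0123-456789abcdef \
    --terminate-instances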


Other instance types

Note: the c4.large instance type has one processor core, and is the least expensive. You can try running the other instance types: c4.xlarge = 2 cores, c4.2xlarge = 4 cores, c4.4xlarge = 8 cores, c4.8xlarge = 18 cores (not 16). If you want to run those, you have to create a new spot request. Repeat the same steps described in earlier sections, but select one of the other instance types instead of c4.large.

You might want to run some of the other instance types because there are often temporary price disparities that can be exploited. For instance, the current spot price of a c4.xlarge instance might often be less than twice the spot price of a c4.large instance even though it has two cores instead of one.

There are also often considerable pricing disparities between different AWS regions. You might be tempted to shop around for the lowest spot prices; however, these prices can fluctuate considerably and today's bargain might be considerably more expensive a week from now.
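
If you have the AWS command-line interface installed, a quick way to eyeball such disparities is to ask for the most recent spot price of each instance type; this loop is only a rough sketch (add --region to compare different regions):

Code:
for t in c4.large c4.xlarge c4.2xlarge c4.4xlarge c4.8xlarge; do
    aws ec2 describe-spot-price-history --instance-types "$t" \
        --product-descriptions "Linux/UNIX" --max-items 1 \
        --query 'SpotPriceHistory[0].[InstanceType,AvailabilityZone,SpotPrice]' \
        --output text
done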


The end

2016-07-15, 14:14   #16
chalsall

Quote:
Originally Posted by GP2
The end
2016-07-15, 15:16   #17
Madpoo

Quote:
Originally Posted by GP2
...I also strongly recommend lowering the DiskWriteTime from the default 30 minutes to something smaller like 10 minutes. There are some circumstances where save files do not get written when instances are terminated, in particular when you yourself terminate the instance from the EC2 console. This modified setting helps to ensure that no more than 10 minutes' work maximum is lost under those circumstances.
Good info on all of this. For the disk write, may as well do it every couple of minutes. Back when it actually took a while to write a file of a few MB to disk (or floppy?) it made sense to do it every 30 minutes, but that was so 1990s... LOL
2016-07-15, 18:10   #18
GP2

An up-to-date list of AWS regions that support EFS (Elastic File System) is maintained by Amazon at this page.

PS: Sorry for the length; if I had spent more time, I might have tried to make it more concise. I just wanted to document everything before I forgot what I did to set it up.

I'm not familiar enough with AWS CloudFormation or similar tools; there's probably some way that the initial setup and configuration could be automated a lot more.
2016-07-15, 18:54   #19
kladner

Quote:
Originally Posted by GP2
An up-to-date list of AWS regions that support EFS (Elastic File System) is maintained by Amazon at this page.

PS: Sorry for the length; if I had spent more time, I might have tried to make it more concise. I just wanted to document everything before I forgot what I did to set it up.

I'm not familiar enough with AWS CloudFormation or similar tools; there's probably some way that the initial setup and configuration could be automated a lot more.
Please don't be sorry. Completeness counts at least as much as conciseness.
2016-07-23, 21:31   #20
GP2

How to monitor the progress of a p??????? save file

When you run mprime in the background on a server, you don't see output lines showing the progress:

Code:
[July 09 15:23] Iteration: 21631169 / 40000003 [54.08%]
However, this information can be extracted from the p??????? savefiles. The exponent is stored at offset 20 and the iteration count is stored at offset 56; both are four bytes in little-endian order.

Here is a Python script that reads one or more p??????? savefiles and outputs one line for each, showing its state of progress:

(Name the script showprogress.py, or whatever you want.)

Code:
#! /usr/bin/env python

"""
This program takes one or more arguments, which are filenames of save files
created by various programs related to Mersenne prime testing.

Filenames are of the following forms (the ? represent alphanumeric characters):

    p??????? : created by mprime/Prime95, doing Lucas-Lehmer testing
    c??????? : created by CUDALucas, doing Lucas-Lehmer testing
    e??????? : created by mprime/Prime95, doing ECM testing
    m??????? : created by mprime/Prime95, doing P-1 testing

For each filename, a line is output, similar to:
	[July 09 15:23] LL (mprime): 21631169 / 40000003 [54.08%]

or if the -s (--show_filename) option was set, the output is like:
	p0F00003: [July 09 15:23] LL (mprime): 21631169 / 40000003 [54.08%]

or if the -n (--no_timestamp) option was set, the output is like:
	LL (mprime): 21631169 / 40000003 [54.08%]
"""

from __future__ import print_function, division
import sys
import os
import os.path
import datetime
import struct



class p_file:
    def __init__(self, f):
        self.f = f

    def read(self):
        self.s = self.f.read(60)

    def unpack(self):
        self.p = struct.unpack_from("<L", self.s, 20)[0]
        #self.percent_complete = struct.unpack_from("d", self.s, 40)[0]
        self.i = struct.unpack_from("<L", self.s, 56)[0]

    def progress_string(self):
        return "LL (mprime): {} / {} [{:.2f}%]".format(self.i, self.p, 100*self.i/self.p)


class c_file:
    def __init__(self, f):
        self.f = f

    def read(self):
        self.f.seek(-10 * 4, 2)   # seek to 40 bytes before the end of the file
        self.s = self.f.read(12)

    def unpack(self):
        self.p = struct.unpack_from("<L", self.s, 0)[0]
        self.i = struct.unpack_from("<L", self.s, 8)[0]

    def progress_string(self):
        return "LL (CUDALucas): {} / {} [{:.2f}%]".format(self.i, self.p, 100*self.i/self.p)


class e_file:
    def __init__(self, f):
        self.f = f

    def read(self):
        self.s = self.f.read(96)

    def unpack(self):
        # k*b^n+c
        self.b, self.p, self.c = struct.unpack_from("<LLl", self.s, 16)
        self.percent_complete = struct.unpack_from("d", self.s, 40)[0]
        self.flag, self.curve = struct.unpack_from("<LL", self.s, 52)
        self.B1, self.B1_i, self.B2_i = struct.unpack_from("<QQQ", self.s, 68)

    def progress_string(self):
        if self.flag == 1 and self.B1_i == self.B1:
            return "ECM {}: curve #{}; stage 2 at {:.2f}% ({}/)".format(self.p, self.curve, 100*self.percent_complete, self.B2_i)
        elif self.flag == 0:
            return "ECM {}: curve #{}; stage 1 at {:.2f}% ({}/{} [{:.2f}%])".format(self.p, self.curve, 100*self.percent_complete, self.B1_i, self.B1, 100*self.B1_i/self.B1)
        else:
            return "ECM {}: curve #{}; ({:.2f}%) (flag:{} {}/{} {}/)".format(self.p, self.curve, 100*self.percent_complete, self.flag, self.B1_i, self.B1, self.B2_i)


class m_file:
    def __init__(self, f):
        self.f = f

    def read(self):
        self.s = self.f.read(112)

    def unpack(self):
        # k*b^n+c
        self.b, self.p, self.c = struct.unpack_from("<LLl", self.s, 16)
        self.percent_complete = struct.unpack_from("d", self.s, 40)[0]
        self.flag, self.Bwhat_0, self.Bwhat_1, self.Bwhat_2, self.Bwhat_3, self.Bwhat_4, self.Bwhat_5 = \
                        struct.unpack_from("<LQQQQQQ", self.s, 52)

    def progress_string(self):
        if self.flag == 1 and self.Bwhat_1 == self.Bwhat_0 and self.Bwhat_5 == 0:
            return "P-1 {}: stage 2 at {:.2f}%".format(self.p, 100*self.percent_complete)
        elif self.flag == 0:
            return "P-1 {}: stage 1 at {:.2f}% ({}/{} [{:.2f}%])".format(self.p, 100*self.percent_complete, self.Bwhat_5, self.Bwhat_1, 100*self.Bwhat_5/self.Bwhat_1)
        else:
            return "P-1 {}: ({:.2f}%) (flag:{} {};{};{};{};{};{})".format(self.p, 100*self.percent_complete, self.flag,
                                                           self.Bwhat_0, self.Bwhat_1, self.Bwhat_2, self.Bwhat_3, self.Bwhat_4, self.Bwhat_5)



def savefile_progress(filename, dont_show_timestamp, show_filename):
    basename = os.path.basename(filename)

    with open(filename,'rb') as f:
        if not dont_show_timestamp:
            st = os.fstat(f.fileno())

        if basename.startswith('p'):
            cep = p_file(f)
        elif basename.startswith('e'):
            cep = e_file(f)
        elif basename.startswith('m'):
            cep = m_file(f)
        elif basename.startswith('c'):
            cep = c_file(f)
        else:
            raise ValueError

        cep.read()

    cep.unpack()

    if show_filename:
        print(filename, end=": ")
    if not dont_show_timestamp:
        t = datetime.datetime.fromtimestamp(st.st_mtime)
        print(t.strftime("[%b %d %H:%M]"), end=" ")
    print(cep.progress_string())


if __name__ == "__main__":
    try:
        import argparse

        parser = argparse.ArgumentParser(description="Output the progress status of Mersenne-related savefiles")
        parser.add_argument('filenames', metavar='filename', nargs='+',
            help="name of a savefile (p??????? for mprime/Prime95 LL, c??????? for CUDALucas, e??????? for mprime/Prime95 ECM, m??????? for mprime/Prime95 P-1)")
        parser.add_argument('-n', '--no_timestamp', dest='dont_show_timestamp', action='store_true',
            help="don't show the timestamp")
        parser.add_argument('-s', '--show_filename', action='store_true',
            help="show the filename")

        args = parser.parse_args()
    except ImportError:
        import getopt

        class argparse_Namespace:
            def __init__(self):
                self.filenames = []
                self.dont_show_timestamp = False
                self.show_filename = False

        args = argparse_Namespace()

        try:
            opts, args.filenames = getopt.getopt(sys.argv[1:], "ns", ["no_timestamp", "show_filename"])
        except getopt.GetoptError as err:
            print(str(err))
            sys.exit(2)
        for o, a in opts:
            if o in ("-n", "--no_timestamp"):
                args.dont_show_timestamp = True
            elif o in ("-s", "--show_filename"):
                args.show_filename = True
            else:
                assert False, "unhandled option"

    if sys.hexversion < 0x30000f0:
        class FileNotFoundError(OSError):
            pass

    for filename in args.filenames:
        try:
            savefile_progress(filename, args.dont_show_timestamp, args.show_filename)
        except FileNotFoundError:
            print("No such file: ", filename, file=sys.stderr)
        except IOError as e:
            print(os.strerror(e.errno), filename)
If you are running multiple instances in an EFS filesystem as described previously in this thread, you can install this script in the "instances" subdirectory and check progress globally by doing:

Code:
cd /mnt-efs/mprime/instances
# you previously installed the showprogress.py script in this directory, now run it:
find . -name 'p???????' -print | xargs python ./showprogress.py
# show the filenames too:
find . -name 'p???????' -print | xargs python ./showprogress.py -s
# show only the c4.large subdirectory instead of all subdirectories
find c4.large -name 'p???????' -print | xargs python ./showprogress.py -s

Last fiddled with by GP2 on 2016-10-10 at 18:03 Reason: update script, now handles e* (ECM savefiles), m* (P-1 savefiles), c* (CUDALucas savefiles) and p* (mprime LL savefiles)
2016-10-10, 00:53   #21
GP2

The startup script (in an earlier message in this thread) for launching mprime has now been updated to also work with the new p2.xlarge instance, which has a GPU and four CPU cores. The updated script will run CUDALucas simultaneously with mprime.

Note: only a single instance of CUDALucas is run, so the script isn't (yet) suitable for the larger p2.8xlarge and p2.16xlarge instance types, which have 8 GPUs and 16 GPUs respectively. In principle, the script could be readily modified to launch multiple CUDALucas instances, each with a different -d <device> on the command line, but I haven't tried this yet (mostly because the spot prices for these larger instance types are impractically high at the moment).
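
For what it's worth, here is a rough, untested sketch of what that modification might look like on a p2.8xlarge, assuming one work directory per GPU has already been created (the gpu-N directory names are hypothetical, not something the current script creates):

Code:
# hypothetical: launch one CUDALucas process per GPU device
for d in 0 1 2 3 4 5 6 7; do
    ( cd /mnt-efs/CUDALucas/instances/p2.8xlarge/gpu-$d && \
      /mnt-efs/CUDALucas/prog/CUDALucas -d "$d" ) &
done
wait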

To make this work, you need to create a p2.xlarge directory alongside the c4.large, c4.xlarge, etc. directories mentioned in the earlier messages.

Sample local-init.txt file for p2.xlarge subdirectory:

Code:
OldCpuSpeed=2600
NewCpuSpeedCount=0
NewCpuSpeed=0
RollingAverage=1000
RollingAverageIsFromV27=1
ComputerID=P2_XL
Memory=58000 during 7:30-23:30 else 58000
Affinity=100
WorkerThreads=1
ThreadsPerTest=2
The prime-init.txt file for the p2.xlarge subdirectory is the same as for the other subdirectories.

You also need to create a parallel subdirectory structure for CUDALucas.

Download the CUDALucas source code and compile it, and place the executable at:
/mnt-efs/CUDALucas/prog/CUDALucas

Create a directory
/mnt-efs/CUDALucas/instances/p2.xlarge

and copy the CUDALucas.ini file (from the source code distribution) into that subdirectory.
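
Collected as shell commands, the CUDALucas setup looks like this (a sketch assuming you build in the current directory and your EFS mount point is /mnt-efs):

Code:
mkdir -p /mnt-efs/CUDALucas/prog
mkdir -p /mnt-efs/CUDALucas/instances/p2.xlarge
cp CUDALucas /mnt-efs/CUDALucas/prog/
cp CUDALucas.ini /mnt-efs/CUDALucas/instances/p2.xlarge/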
2016-10-17, 23:08   #22
GP2

AWS us-east-2 region (Ohio) is now online, has Elastic File System

A new AWS region, Ohio (us-east-2) is now available.

The Elastic File System (EFS) is available for this new region, so the scripts and methods mentioned in this thread will work for it too.

In addition to Ohio, EFS is also available in us-west-2 (Oregon), us-east-1 (Northern Virginia) and eu-west-1 (Ireland). Note that EFS is still not available in any other regions, including Northern California (us-west-1) and Frankfurt (eu-central-1).

Other new regions coming soon include London, Paris and Montreal.