mersenneforum.org  

Old 2017-08-03, 21:32   #34
chalsall
If I May
 
 
"Chris Halsall"
Sep 2002
Barbados

3·31·97 Posts

Quote:
Originally Posted by GP2 View Post
Note that anything bigger, such as the four-core c4.2xlarge all the way up to the 18-core (not 16-core) c4.8xlarge is rarely worthwhile, because the price of an N-core instance for larger N will usually be a lot more than N times the cost of a one-core c4.large instance, and the throughput of an N-core instance running mprime will usually be significantly less than the total throughput of N one-core c4.large instances.
It depends on what you are trying to do. Not everyone here is using AWS / EC2 only for mprime.

Last fiddled with by chalsall on 2017-08-03 at 21:40 Reason: s/EC3/EC2/ ; # Humor is such a subjective thing....
Old 2017-08-03, 22:53   #35
GP2
 
 
Sep 2003

2,579 Posts

Quote:
Originally Posted by chalsall View Post
It depends on what you are trying to do. Not everyone here is using AWS / EC2 only for mprime.
The User Data script that is included with this guide (as an attachment to one of the posts) lets you run CUDALucas or mfaktc on the GPU of a p2.xlarge instance, while simultaneously running mprime on the CPU. But in general using GPUs on the AWS cloud is not at all cost-effective because the great demand for GPUs in machine learning keeps the spot prices high for instances with GPUs.

I'm curious what other CPU-based programs people are running on AWS and if they adapted this framework (EFS + the User Data script) or use something else.

By the way, the User Data script is now at version 1.11, with various bug fixes, and you can now reboot the machine and mprime will resume automatically. Normally a User Data script only runs at instance launch time, but now the script puts itself into crontab so that it will run at reboot time as well. Anyone using an older version should upgrade.
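The reboot behavior described above can be sketched roughly as follows. This is not the actual User Data script (which is a shell script attached to the guide); it is a minimal Python illustration, assuming the instance's cron supports the `@reboot` schedule keyword. The function name and the script path are hypothetical.

```python
def add_reboot_entry(crontab_text: str, script_path: str) -> str:
    """Return crontab text with an @reboot line for script_path, added at most once."""
    entry = f"@reboot {script_path}"
    lines = crontab_text.splitlines()
    # Only append if an identical entry is not already present,
    # so running this at every boot does not pile up duplicates.
    if entry not in (line.strip() for line in lines):
        lines.append(entry)
    return "\n".join(lines) + "\n"


# Hypothetical path; the real script's location is not given in this thread.
SCRIPT = "/usr/local/bin/user-data.sh"

current = 'MAILTO=""\n'
updated = add_reboot_entry(current, SCRIPT)
print(updated, end="")
```

The resulting text could then be installed by piping it to `crontab -`. The duplicate check matters because a script that re-registers itself on every boot would otherwise add one more line each time.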
Old 2017-08-03, 23:03   #36
chalsall
If I May
 
 
"Chris Halsall"
Sep 2002
Barbados

3×31×97 Posts

Quote:
Originally Posted by GP2 View Post
Anyone using an older version should upgrade.
Extreme coolness.

That's sincere.
Old 2017-08-04, 09:09   #37
ET_
Banned
 
 
"Luigi"
Aug 2002
Team Italia

3·5·317 Posts

Quote:
Originally Posted by GP2 View Post
I'm curious what other CPU-based programs people are running on AWS and if they adapted this framework (EFS + the User Data script) or use something else.
You would be amazed!
I am running it with llr, and have done so with primegaps, Fermat searching programs and my sieve. That's because a top-level FMA3-aware server or desktop would still cost me more than the AWS system.

Quote:
Originally Posted by GP2 View Post
By the way, the User Data script is now at version 1.11, with various bug fixes, and you can now reboot the machine and mprime will resume automatically. Normally a User Data script only runs at instance launch time, but now the script puts itself into crontab so that it will run at reboot time as well. Anyone using an older version should upgrade.
Cool! Thanks! :bow:
Old 2017-08-09, 03:07   #38
GP2
 
 
Sep 2003

2,579 Posts

There was a bug introduced in v 1.10 and v 1.11 that prevented CUDALucas/mfaktc from starting on p2.xlarge instances (two shell variables were needlessly and incorrectly made "readonly"). This didn't have any effect on mprime.

Version 1.12 is now attached to this earlier post.
Old 2017-08-29, 18:55   #39
amaurymartiny
 
Aug 2017

12 Posts

Thanks, GP2, for this excellent and fully up-to-date tutorial.

I ran the tutorial against a c4.8xlarge, and after some time, the CPU utilization is 50 percent in the monitoring tab in AWS console, as shown in the picture:

http://i.imgur.com/Yykn0sZ.png

If I do a top in the instance, I see a CPU usage of 1800%, which makes sense with the fact that there are 18 cores. I just want to make sure that seeing 50% in AWS console is normal.

Thanks.
Old 2017-08-30, 00:34   #40
GP2
 
 
Sep 2003

2,579 Posts

Quote:
Originally Posted by amaurymartiny View Post
I ran the tutorial against a c4.8xlarge, and after some time, the CPU utilization is 50 percent in the monitoring tab in AWS console, as shown in the picture:

http://i.imgur.com/Yykn0sZ.png

If I do a top in the instance, I see a CPU usage of 1800%, which makes sense with the fact that there are 18 cores. I just want to make sure that seeing 50% in AWS console is normal.
I don't know if it's "normal", but I do get the same thing. I'm running c4.large and c4.xlarge instances, and I get %CPU = 100% and 200% respectively in top, so I think that's what really matters. I'm not sure what that Monitoring tab is measuring.
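One plausible explanation for the 50% figure, offered as an assumption rather than something confirmed in this thread: top counts 100% per fully busy logical CPU, while the console's CPU utilization is averaged over all vCPUs of the instance. C4 instances expose two vCPUs (hyperthreads) per physical core, so one mprime thread per core would show up as about half. The arithmetic, as a small Python sketch (the function name is made up for illustration):

```python
def console_cpu_percent(top_cpu_percent: float, vcpus: int) -> float:
    """Convert top's %CPU (100 = one busy logical CPU) to a whole-instance average."""
    return top_cpu_percent / vcpus


# vCPU counts are taken from AWS's published C4 specifications, not from this thread.
for name, top_pct, vcpus in [
    ("c4.large", 100, 2),
    ("c4.xlarge", 200, 4),
    ("c4.8xlarge", 1800, 36),
]:
    pct = console_cpu_percent(top_pct, vcpus)
    print(f"{name}: top {top_pct}% -> about {pct:.0f}% instance-wide")
```

If this reading is right, the top figures are the ones that show whether every physical core is busy.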

If you're using an instance with lots of cores like c4.8xlarge, it's probably best to run version 29.2 of mprime, because it uses the hwloc library to automatically figure out the topology. The tutorial still mentions version 28.10 because that's what the official download page still uses.

In terms of pricing, us-east-2 (Ohio) has the lowest spot prices, and the spot price for the 18-core c4.8xlarge is consistently well above 18 times that of a 1-core c4.large, by roughly 50%. The 8-core c4.4xlarge and 4-core c4.2xlarge also carry a premium, though a less pronounced one, so I stick to the c4.large and c4.xlarge for my own usage.
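The comparison above can be made concrete with a short calculation. This is only a sketch: the prices below are hypothetical values chosen to illustrate a premium of about 50%, not real spot quotes, and the function name is invented for this example.

```python
def premium_over_linear(big_price: float, small_price: float, core_ratio: int) -> float:
    """Fractional premium of one large instance over core_ratio small instances."""
    return big_price / (core_ratio * small_price) - 1.0


# Hypothetical hourly spot prices (USD), for illustration only.
small = 0.02  # one-core c4.large
big = 0.54    # 18-core c4.8xlarge

premium = premium_over_linear(big, small, 18)
print(f"c4.8xlarge premium over 18 x c4.large: {premium:.0%}")
```

A positive result means the large instance costs more per core than the equivalent number of small ones. This looks at price only; the lower mprime throughput per core on large instances mentioned earlier in the thread would widen the gap further.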
Old 2017-09-23, 07:32   #41
GP2
 
 
Sep 2003

2579₁₀ Posts

A small change to the script (now version 1.13) takes into account a new AWS feature: spot instances can now be stopped.

You can choose the behavior at launch time (stopping or termination). See [the documentation](http://docs.aws.amazon.com/AWSEC2/la...le-differences) for an explanation of the difference. Resuming a stopped instance vs. relaunching a new instance to replace a terminated instance is like the difference between rebooting a computer and buying a new computer to replace an old computer.

It still makes sense for "Interruption behavior" to be "Terminated" rather than "Stopped", since this avoids incurring unnecessary EBS storage charges (at 8 GB for the root volume of each instance) while instances aren't running. However, the script should now still work even if spot instances are stopped rather than terminated.
Old 2017-11-13, 21:19   #42
GP2
 
 
Sep 2003

101000010011₂ Posts

The user-data script (linked in this post) is now at version 1.15.

The EFS filesystem is now mounted with dirsync. Previously it was possible for instances to end up pointing to the wrong work directory due to race conditions.
Old 2020-05-28, 21:49   #43
Cheetahgod
 
May 2020

32 Posts
p95v298b6

I followed your steps but I can't seem to get this working with the latest version of prime95. It worked with the version you used in your examples. I want to use the AVX-512 support of the c5 instances.
Old 2020-05-29, 19:38   #44
Uncwilly
6809 > 6502
 
 
"""""""""""""""""""
Aug 2003
101×103 Posts

2015₁₆ Posts

Quote:
Originally Posted by Cheetahgod View Post
I followed your steps but I can't seem to get this working with the latest version of prime95.
Make sure you are using mprime not Prime95.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.