mersenneforum.org  

Old 2015-04-25, 01:21   #1
nucleon
 
Mar 2003
Melbourne

5×103 Posts
18 core Haswell/P-1 CPU load

Thought this screenshot might be of interest. It shows htop during stage 2 of P-1 on an exponent. It's a c4.8xlarge instance on AWS EC2, with an Intel Xeon E5-2666 v3.

Memory starvation, anyone?

I'll have to play around to see what the best utilization of this hardware is.
[Attached image: 18c-haswell-p-1.PNG]
Old 2015-04-25, 03:45   #2
aurashift
 
Jan 2015

11×23 Posts

Quote:
Originally Posted by nucleon
Thought this screenshot might be of interest. It shows htop during stage 2 of P-1 on an exponent. It's a c4.8xlarge instance on AWS EC2, with an Intel Xeon E5-2666 v3.

Memory starvation, anyone?

I'll have to play around to see what the best utilization of this hardware is.

I'm not familiar with AWS. Is this running on dedicated hardware with dual or quad channel RAM?
Old 2015-04-25, 04:08   #3
nucleon
 

Blackbox.

Given the amount of RAM we get, I think it's safe to assume quad channel.

Public information on instance types:

https://aws.amazon.com/ec2/instance-types/
Old 2015-04-25, 04:47   #4
Mark Rose
 
"/X\(‘-‘)/X\"
Jan 2013

2·5·293 Posts

A c4.8xlarge is a dual CPU machine.

If you get a c4.4xlarge or smaller, the virtual cores will all be allocated on a single CPU. I asked Amazon about this.

Each virtual core in the current generation is a hyperthreaded core. I don't know if you get both halves of the same physical core or not, but I would imagine so, so other tenants on the same hardware would get consistent performance.

I haven't asked if they use quad channel or not, but it would be very strange if they didn't.
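
On the question of whether a pair of virtual cores shares one physical core: on a Linux guest, one way to look is to read the hyperthread sibling lists the kernel exposes in sysfs. The following is a minimal sketch, not something from the posts above. The helper names `parse_cpu_list` and `sibling_groups` are made up for illustration, and under a hypervisor such as Xen the topology the guest sees may not match the real hardware layout, so treat the result as a hint only.

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a Linux CPU list such as "0,18" or "0-1,4" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def sibling_groups(base="/sys/devices/system/cpu"):
    """Return the distinct groups of logical CPUs that the guest reports as sharing a core."""
    groups = set()
    pattern = os.path.join(base, "cpu[0-9]*", "topology", "thread_siblings_list")
    for path in glob.glob(pattern):
        with open(path) as f:
            groups.add(tuple(parse_cpu_list(f.read())))
    return sorted(groups)


if __name__ == "__main__":
    # Prints one tuple per reported physical core, e.g. a pair of logical CPU ids.
    for group in sibling_groups():
        print(group)
```

If every group has two entries, the guest at least believes it has both halves of each hyperthreaded core.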
Old 2015-04-25, 04:51   #5
aurashift
 

Does EC2 run VMware? It'll schedule everything on the physical cores first before going for the logical ones (which is true for everything, I think).
Old 2015-04-25, 05:40   #6
Mark Rose
 

Quote:
Originally Posted by aurashift
Does EC2 run VMware? It'll schedule everything on the physical cores first before going for the logical ones (which is true for everything, I think).

EC2 is Xen under the hood.
Old 2015-04-25, 12:02   #7
nucleon
 

I chose the c4.8xlarge mainly for the amount of RAM (60 GB).

Performance varies. In my experimentation, you're basically sharing the same physical box with other VMs. If the other VMs hosted on the box are conservative with memory bandwidth, that's a good win for you. If not, you can see performance drop.

From my mucking around this evening, I was getting good results with 2 workers: (1) P-1 testing and (2) a double check. As the double check doesn't need much memory, I can allocate the vast bulk of the RAM to the P-1 process.

The difference in performance between 8 cores and 16 cores was minimal.

Results below.

Code:
#16cores; 1x double check
[Work thread Apr 25 11:05] Iteration: 290000 / 38961211 [0.74%], ms/iter:  1.228, ETA: 13:11:47
[Work thread Apr 25 11:05] Iteration: 300000 / 38961211 [0.76%], ms/iter:  1.228, ETA: 13:11:33
[Work thread Apr 25 11:05] Iteration: 310000 / 38961211 [0.79%], ms/iter:  1.227, ETA: 13:10:28

#1core; 1x double check
[Work thread Apr 25 11:07] Iteration: 330000 / 38961211 [0.84%], ms/iter:  9.567, ETA: 4d 06:39
[Work thread Apr 25 11:09] Iteration: 340000 / 38961211 [0.87%], ms/iter:  9.412, ETA: 4d 04:58
[Work thread Apr 25 11:10] Iteration: 350000 / 38961211 [0.89%], ms/iter:  9.425, ETA: 4d 05:04

#8cores; 1x double check
[Work thread Apr 25 11:13] Iteration: 370000 / 38961211 [0.94%], ms/iter:  1.492, ETA: 15:59:55
[Work thread Apr 25 11:13] Iteration: 380000 / 38961211 [0.97%], ms/iter:  1.496, ETA: 16:01:44
[Work thread Apr 25 11:14] Iteration: 390000 / 38961211 [1.00%], ms/iter:  1.500, ETA: 16:04:29

#8cores x2 workers; 2x double checks
[Worker #1 Apr 25 11:19] Iteration: 510000 / 38961211 [1.30%], ms/iter:  1.502, ETA: 16:02:50
[Worker #2 Apr 25 11:19] Iteration: 80000 / 38967377 [0.20%], ms/iter:  1.432, ETA: 15:27:54
[Worker #1 Apr 25 11:19] Iteration: 520000 / 38961211 [1.33%], ms/iter:  1.504, ETA: 16:03:19
[Worker #2 Apr 25 11:19] Iteration: 90000 / 38967377 [0.23%], ms/iter:  1.432, ETA: 15:27:38
[Worker #1 Apr 25 11:19] Iteration: 530000 / 38961211 [1.36%], ms/iter:  1.500, ETA: 16:00:52
[Worker #2 Apr 25 11:19] Iteration: 100000 / 38967377 [0.25%], ms/iter:  1.430, ETA: 15:26:30
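
To put numbers on the scaling in the log above, here is a small sketch that turns the posted ms/iter figures into speedup, per-core efficiency, and combined throughput. The timings are copied from the log lines; averaging the three lines of each run and treating "cores" as the divisor for efficiency are assumptions of this sketch, not something the posts specify.

```python
from statistics import mean

# ms/iter values copied from the log lines above, three samples per run.
runs = {
    "1 core": [9.567, 9.412, 9.425],
    "8 cores": [1.492, 1.496, 1.500],
    "16 cores": [1.228, 1.228, 1.227],
}
cores = {"1 core": 1, "8 cores": 8, "16 cores": 16}

baseline = mean(runs["1 core"])


def speedup(label):
    """How many times faster one iteration is than in the single-core run."""
    return baseline / mean(runs[label])


def efficiency(label):
    """Speedup divided by the number of cores used (1.0 would be perfect scaling)."""
    return speedup(label) / cores[label]


# Throughput in iterations per millisecond, summed over workers.
two_workers = 1 / mean([1.502, 1.504, 1.500]) + 1 / mean([1.432, 1.432, 1.430])
one_worker_16 = 1 / mean(runs["16 cores"])

if __name__ == "__main__":
    for label in runs:
        print(f"{label}: speedup {speedup(label):.2f}, efficiency {efficiency(label):.2f}")
    print(f"2 workers x 8 cores vs 1 worker x 16 cores: {two_workers / one_worker_16:.2f}x throughput")
```

On these figures, going from 8 to 16 cores on one worker buys little, while two 8-core workers together complete noticeably more iterations per unit time, which fits the memory-bandwidth explanation discussed in the thread.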
Old 2015-04-25, 17:04   #8
Mark Rose
 

Quote:
Originally Posted by nucleon
I chose the c4.8xlarge mainly for the amount of ram given (60GB).

I would try using the r3.2xlarge instance type. You lose AVX2 and 100 MHz of base clock, but you get the same memory for 38% of the price. It only has 8 virtual cores, but there's a good chance your hardware mates aren't running memory-bandwidth-intensive applications, so those 8 virtual cores can be fully fed.

For instance, we use r3.2xlarge for memcache. Our memory bandwidth usage is limited by the network bandwidth, and we keep spare capacity by not maxing out the network bandwidth. Our CPU usage is around 10%, and the memory bandwidth usage is less than 1 Gb/s, out of the 59.7 GB/s the E5-2670 v2 chip has available.

Last fiddled with by Mark Rose on 2015-04-25 at 17:05
Old 2015-04-26, 01:22   #9
nucleon
 

This is what I love about EC2.

I don't need to go out and buy hardware to test a theory. I just spin up an instance :)

OK, I'm starting up an r3.2xlarge to experiment with and see how it compares.
Old 2015-04-26, 07:58   #10
nucleon
 

For those who don't know, you can bid a 'spot price', and if the market rate for a particular instance type falls below your bid, an instance of that type is allocated to you.

I did some price analysis over the last 7 days of market prices for the two instance types discussed above, c4.8xlarge and r3.2xlarge; the result is the attached pic.

I grabbed the last 7 days of per-hour rates for the different zones (rows 43-62 and 66-85) and worked out the 50th, 75th, and 90th percentiles (yellow bars). The first block is r3.2xlarge, the second block is c4.8xlarge.

The blue bars are the aggregate figures for 7 days given 50%/75%/90% uptime, i.e. how much the week will cost you if you bid for 50/75/90% uptime.

I was bidding on r3.4xlarge and found the price too volatile for the cheapskate that I am. :) This was another reason for the c4 instance: I wanted to see how stable the prices were. The c4 instances seem a lot less volatile than the r3.4xlarge prices I was chasing.

From the graph, it looks like us-west-1 (a or b) is the better value for r3.4xlarge. I'm currently in the us-east-1c zone with my c4 instances.
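
The percentile analysis described above can be sketched roughly as follows. This is one plausible reading of the method, not the actual spreadsheet: it assumes you pay the market price (not your bid) for every hour in which the market price is at or below your bid, and the `hourly_prices` values are made-up placeholders, not real EC2 prices. The function names are hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def weekly_cost(hourly_prices, bid):
    """Return (total cost, uptime fraction) for a bid, paying the market price while running."""
    running = [p for p in hourly_prices if p <= bid]
    return sum(running), len(running) / len(hourly_prices)


# Placeholder data: 168 hourly prices (7 days) in dollars. NOT real spot prices.
hourly_prices = [0.10, 0.12, 0.11, 0.30, 0.10, 0.95, 0.13, 0.12] * 21

if __name__ == "__main__":
    for pct in (50, 75, 90):
        bid = percentile(hourly_prices, pct)
        cost, uptime = weekly_cost(hourly_prices, bid)
        print(f"bid at {pct}th percentile = ${bid:.2f}: week costs ${cost:.2f}, uptime {uptime:.0%}")
```

With repeated prices, the achieved uptime can come out above the target percentile, because every hour priced exactly at the bid also runs.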
[Attached image: bid-price.PNG]
Old 2015-04-26, 14:48   #11
Xyzzy
 
"Mike"
Aug 2002

8234₁₀ Posts

http://www.businessinsider.com/amazo...nothing-2015-4
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.