mersenneforum.org


stars10250 2015-03-03 03:45

My computing trial is done. I burned through $200 of free Microsoft Azure money doing LL testing. I ran on two 8-core machines. One was a 2.2 GHz E5-2660 Xeon and the other was an AMD Opteron at 2.1 GHz. These were on the low end of capabilities, but the upper-end machines had features I didn't need for LL testing, such as solid state hard drives, fast internet connection, etc., and they were very expensive hourly (over $4/hr versus $0.61/hr; they have a funny formula for CPU compute time). So the final analysis works out that the two machines burned through $200 in a week and completed approximately 95% of 1 LL test (combined) with exponents of size 72M. For comparison, my desktop PC does this calculation, assuming hardware amortization over 2 years, for about $9 including the cost of electricity.

Setting up Azure and running prime95 was straightforward. You give them your info, select the hardware, and it builds a virtual machine that operates just like a PC running Windows Server 2012. I went to the internet, downloaded prime95, and was up and running like it was just another build. You can monitor your daily costs (compute, bandwidth, storage). I seemed to get about 92% of each 8 core machine. At the end I offloaded the save-files to be completed by my machine.

I can see MS making a profit but this seems a bit harsh at 22 times more expensive than a desktop pc. I know it's more reliable, and has all sorts of features, but for what we do it doesn't make sense. I thought cloud computing was the next big thing but it's doomed if this is what they charge. It makes me consider going into the hpc business and charging something reasonable.
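The arithmetic behind that comparison can be sketched as follows. This is only an illustration using the figures quoted in the post ($200 spent, about 95% of one test finished, about $9 per test at home); the function name is hypothetical. Dividing the raw $200 by $9 gives the roughly 22x figure, while extrapolating to a fully completed test gives a slightly higher ratio.

```python
def cost_per_full_test(spent_dollars, fraction_done):
    """Extrapolate the cost of one complete test from a partial run."""
    return spent_dollars / fraction_done

# Figures taken from the post above.
azure_spent = 200.0      # free credit used up
azure_fraction = 0.95    # about 95% of one LL test completed
home_cost = 9.0          # desktop cost per test, hardware plus electricity

azure_full = cost_per_full_test(azure_spent, azure_fraction)
raw_ratio = azure_spent / home_cost        # the "22 times" figure
extrapolated_ratio = azure_full / home_cost

print(f"Azure, one full test: ${azure_full:.2f}")
print(f"Ratio on raw spend: {raw_ratio:.1f}x")
print(f"Ratio on a full test: {extrapolated_ratio:.1f}x")
```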

stars10250 2015-03-08 03:22

I'm running on Amazon's elastic computing cloud EC2 now. They gave me 1 CPU for a year free. It's an Intel Xeon at 2.6 GHz with 1 GB memory. I get 30 GB storage. It's similar to Microsoft's setup with a virtual machine running Windows Server 2012. There was more to do with setup of an RSA encryption key though. On a 72M exponent it is giving 30 ms iteration times.

chalsall 2015-03-08 16:27

[QUOTE=stars10250;397248]I'm running on Amazon's elastic computing cloud EC2 now. They gave me 1 CPU for a year free.[/QUOTE]

Excellent. EC2 is worth making friends with.

[QUOTE=stars10250;397248]On a 72M exponent it is giving 30 ms iteration times.[/QUOTE]

Keep in mind if you're running a "micro" instance that you will be computationally constrained over time. As in, you probably won't see 30 ms over the long term (very useful for things like DNS and SCP though!).

stars10250 2015-03-08 17:31

Holy $&*! After 2 hours it slowed to an iteration time of 300 ms! Well, so much for that.
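To put those iteration times in perspective: a Lucas-Lehmer test of the Mersenne number with exponent p takes p - 2 squaring iterations, so the wall-clock time scales directly with the time per iteration. The sketch below uses 72,000,000 as a round stand-in for "a 72M exponent" (it is not an actual prime exponent), and the function name is hypothetical.

```python
def ll_test_days(exponent, ms_per_iteration):
    """Estimate wall-clock days for one LL test at a steady iteration time."""
    iterations = exponent - 2              # LL test needs p - 2 squarings
    seconds = iterations * ms_per_iteration / 1000.0
    return seconds / 86400.0               # seconds per day

exponent = 72_000_000                      # illustrative stand-in for "72M"
for ms in (30, 300):
    print(f"{ms} ms/iteration: about {ll_test_days(exponent, ms):.0f} days")
```

At the unthrottled 30 ms this is a few weeks per test; at the throttled 300 ms it is most of a year.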

chalsall 2015-03-08 23:18

[QUOTE=stars10250;397268]Holy $&*! After 2 hours it slowed to an iteration time of 300 ms! Well, so much for that.[/QUOTE]

Don't give up on it! But, for serious compute, expect to pay real money. I sometimes spend several hundred dollars a month on EC2; Mark's company (I understand) regularly spends several orders of magnitude more.

For myself (who sometimes needs to process a lot of data quickly) it's wonderful being able to spin up instances when (and only when) I need them. It's actually cheaper than what it would cost me in electricity, and I don't have to buy any hardware. Plus, I can choose the instance types as appropriate; some GPUs here, some big CPUs there, some massive amounts of memory as needed, etc.

And, be sure to look at the "spot" instances offerings -- these are dirt cheap and very useful if you don't mind them disappearing from time to time.

stars10250 2015-04-04 20:09

I'm running LL on Google compute now as a test. They gave me $300 play money and 60 days. It was going great, and then 3 days in they suspended all my jobs for suspected abuse. I'm appealing that decision. I'm not sure why they consider that abuse on a compute virtual machine. I've read their acceptable use policy and I don't think I'm violating it. We will see. Otherwise their product seems more reasonably priced than Microsoft Azure and Amazon elastic cloud.

Mark Rose 2015-04-04 21:22

[QUOTE=stars10250;399358]I'm running LL on Google compute now as a test. They gave me $300 play money and 60 days. It was going great, and then 3 days in they suspended all my jobs for suspected abuse. I'm appealing that decision. I'm not sure why they consider that abuse on a compute virtual machine. I've read their acceptable use policy and I don't think I'm violating it. We will see. Otherwise their product seems more reasonably priced than Microsoft Azure and Amazon elastic cloud.[/QUOTE]

They probably suspect crypto-coin mining.

chalsall 2015-04-04 22:45

[QUOTE=stars10250;399358]Otherwise their product seems more reasonably priced than Microsoft Azure and Amazon elastic cloud.[/QUOTE]

Compute costs. "Cloud compute" providers offer these "free" periods so you can learn their systems, not so that people who are not serious customers can come in and burn many cycles on mining, SETI@home, GIMPS, et al.

Spend a bit of real money, and the providers will take you much more seriously.

stars10250 2015-04-04 22:45

They wrote back. I created too many projects, so they suspended everything. It's kind of weird that they don't have it set to not let you do that. I got my main projects reinstated.

I don't think it's unreasonable to use their free trial to do compute and get a sense of the costs. The cost formula is detailed in that they charge for every aspect of computing: CPU, memory, network traffic, storage, etc. So it's not straightforward for me to figure out how much an LL test genuinely costs without actually trying it. If they are reasonable, I will pay. I'd love to have my office not be 85 degrees.

stars10250 2015-05-04 02:35

My cloud computing trial is essentially over. I say essentially because the Amazon trial runs for a year.

I tried 4 services. All had limits on how much I could run under a free trial, but this didn't really matter. I normalized the work by the cost required to test one 70M exponent for primality. Some services gave me new (fast) hardware, some old (slower), some throttled me (very slow), etc. But at the end I could see how much it cost for one test, which is all I needed for comparison. All services have faster hardware which you can get if you pay for it, but it seemed like the cost went up faster than the compute speed. So I don't think faster would be a good choice for LL testing. Cores were a mix of AMD and Intel Xeon. In all cases I set up a virtual machine running Windows Server 2012. Costs are approximate.

Microsoft Azure
$222/1 test
Two 8 core cpu limit under free trial
No real issues

Profit Bricks
$100/1 test
4 CPU core limit under free trial
No real issues

Google Cloud Computing
$94/1 test
8 cpu core limit under free trial

Compute continually triggered abuse alarm that had to be appealed daily to get reinstated. Presumably this wouldn’t happen if I paid for service, but it seemed like a poor impression to make on a customer.

Amazon EC2
$108/1 test
1 cpu core limit under free trial

This one was hard to estimate because the “bill” is all $0.00 under the free trial. I accidentally launched a 2nd instance, which isn’t allowed under the free trial, and that generated a bill immediately. So I estimated the cost from that. Amazon provides a weird trial. You get massive cpu throttling (10x) but get 1 cpu for a year. Everyone else gave more compute power over a shorter time.

The cost to do the same calculation on my home computer is $9. That includes hardware amortization of 1 year and electricity at 25 c/kWh but no cooling cost.

nucleon 2015-05-09 11:13

I've been on EC2 for a while.

In short, loving it.

The way to be efficient on EC2 is spot prices.

8 cores on a c4.8xlarge instance (18 real cores usable) takes about 60 hours.

Spot price hovers 26c-35c.

That yields a price much lower than the $108 you've listed.

For our requirements, I don't think paying the on-demand or reserved prices is value for money.

I buy one on-demand t2.micro unit and attach sufficient storage to it. Then I spawn a number of spot compute units and set up scripts to mount back to that storage. (The big thing with spot compute units: the storage is wiped when the market rate rises above your threshold.)

When the market rate rises above my threshold, the compute unit shuts down and saves back to the on-demand unit. When the price is low enough again, the compute unit starts, mounts the storage on the on-demand unit, and picks up where it left off.
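The save-back step described above could look roughly like the sketch below. It is not Craig's actual script, which is not shown in the thread. It assumes the persistent storage is already mounted as an ordinary directory, and the `p*` file pattern and the directory names are placeholders that would need to match wherever the client really keeps its save files.

```python
import os
import shutil
from pathlib import Path

def sync_save_files(source_dir, dest_dir, pattern="p*"):
    """Copy matching files from source_dir to dest_dir; return the names copied.

    Each file is copied to a temporary name first and then renamed, so an
    interrupted copy never leaves a half-written save file under the real name.
    """
    source, dest = Path(source_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(source.glob(pattern)):   # pattern is an assumption
        if not src.is_file():
            continue
        tmp = dest / (src.name + ".tmp")
        shutil.copy2(src, tmp)                  # copy data and timestamps
        os.replace(tmp, dest / src.name)        # rename over any older copy
        copied.append(src.name)
    return copied
```

Run periodically on the spot instance (working directory to persistent mount), this bounds the lost work to one interval. Called with the two directories swapped at boot, the same function restores the latest save files so the client resumes where it left off.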

Each to their own; my life situation is a bit fluid. I can get shipped off to different locations pretty easily. With physical hardware, I lose time when I move. And also when I'm away, it's not safe to be in another state with gear running back home with no one to check up on it.

-- Craig

stars10250 2015-05-13 18:12

If you mean ballpark $0.26*60 = $15.60 for 1 LL test, that's much more reasonable and I'd be willing to pay that. When I started this I figured I'd be willing to pay up to twice what it costs me to have the machine in my house do the calculation, or ~$18/test. Your method seems much more reasonable. I'd need to understand the script more, though, since the job could arbitrarily stop and I'd need the savefile. As for starting and stopping, I don't care about that for LL tests. I just want to make ~consistent progress. Cool, thanks.
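Extending that ballpark across the quoted spot price range gives a quick sanity check. The sketch assumes, as the post does, that the 26c-35c figure is an hourly price covering the whole 60-hour run; the variable names are illustrative only.

```python
hours_per_test = 60.0          # run time quoted for one test
spot_low, spot_high = 0.26, 0.35   # quoted spot price range, dollars per hour
budget_per_test = 18.0         # twice the ~$9 home cost

cost_low = spot_low * hours_per_test
cost_high = spot_high * hours_per_test
break_even_rate = budget_per_test / hours_per_test

print(f"Cost per test: ${cost_low:.2f} to ${cost_high:.2f}")
print(f"Stays within ${budget_per_test:.0f} while the spot price is at or below "
      f"${break_even_rate:.2f}/hr")
```

Under those assumptions the low end of the range fits the $18 budget and the high end does not, which suggests setting the spot bid threshold near 30c per hour.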

chalsall 2015-05-13 18:37

[QUOTE=stars10250;402244]As for starting and stopping, I don't care about that for LL tests. I just want to make ~consistent progress. Cool, thanks.[/QUOTE]

Further to this... You should care about "Starting and Stopping", because when an AWS instance stops, the root file system is (by default) deleted -- thus any changes are lost. You can choose to not delete the root file system under the "Add Storage" section, but you will then incur the costs of that even when your instance isn't running.

My work-around for this is to have a default AMI (read: "/") which is "expendable" (read: nothing changes, thus no need to save the state), and then I mount a little 1G or so /home/ volume on each instance which contains the data and is persistent (read: it is not deleted upon instance termination).

And further to Craig's comments... I also find EC2 to be a very cost effective way to "compute". Partially because electricity is very expensive here in Bim. But also because I sometimes need a lot of compute for a particular job; it is nice being able to spin up, for example, 10 big machines in a few minutes, let them do the work, and then shut them down again.

Far more cost effective than having a server stack hanging around waiting for work.

stars10250 2015-05-14 02:41

I think his script is to copy the save-file somewhere that doesn't get deleted. So if the job stops you don't lose too much compute. Maybe lose 30 minutes or less. I meant I don't care in the sense that if it does 1% of a calculation and stops, I get charged for 1% which is fair. I don't mind it limping along like that. I just want LL tests to eventually get done at a fair overall price.

chalsall 2015-05-14 16:23

[QUOTE=stars10250;402270]I think his script is to copy the save-file somewhere that doesn't get deleted. So if the job stops you don't lose too much compute.[/QUOTE]

OK, I didn't fully understand what you were saying.

I was speaking about non-GIMPS processing where the processed data state may update several megabytes every second or so.

But, then, maybe my solution space isn't optimal. I just let my Spot Instances die, and then bring them back online (with the associated /home/ file system) and continue once the prices become reasonable again.

Mark Rose communicated to us all that there is a new feature of EC2 which gives the instance/owner an opportunity of having a couple of minutes of warning before it is shut down, so that the state can be quickly saved.

Mark Rose 2015-05-14 18:44

[QUOTE=chalsall;402285]Mark Rose communicated to us all that there is a new feature of EC2 which gives the instance/owner an opportunity of having a couple of minutes of warning before it is shut down, so that the state can be quickly saved.[/QUOTE]

It's less than 60 seconds, and it's not guaranteed. My experience with Amazon is that things that aren't guaranteed are almost always good though.
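A minimal sketch of how an instance might watch for that warning: EC2 exposes a `spot/termination-time` entry in the instance metadata service, which returns 404 until a termination has been scheduled. The polling interval, the callback, and the function names below are assumptions for illustration, and instances that require session tokens for metadata requests would need an extra step not shown here.

```python
import urllib.error
import urllib.request

TERMINATION_URL = (
    "http://169.254.169.254/latest/meta-data/spot/termination-time"
)

def fetch_termination_time(url=TERMINATION_URL, timeout=2.0):
    """Return the scheduled termination timestamp, or None if none is pending."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode().strip() or None
    except urllib.error.HTTPError as err:
        if err.code == 404:        # no termination scheduled yet
            return None
        raise
    except OSError:                # not on EC2, or metadata unreachable
        return None

def watch(fetch, on_notice, sleep, max_polls, interval=5.0):
    """Poll fetch() until it reports a termination time, then call on_notice."""
    for _ in range(max_polls):
        stamp = fetch()
        if stamp is not None:
            on_notice(stamp)       # e.g. flush save files to persistent storage
            return stamp
        sleep(interval)
    return None
```

On a real instance this would be driven with something like `watch(fetch_termination_time, save_state, time.sleep, max_polls=10**9)`, where `save_state` is whatever hypothetical routine copies the save files somewhere safe. Since the warning is short and not guaranteed, periodic saves are still the main protection.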

