
[Sticky] EC2 BigAdv Folding Setup Guide  

vorksholk
(@imported_vorksholk)
Trusted Member Customer

Hello! Folding on GPU hardware is a great way to earn Folding@home points for protein research, but if you do not have GPUs, or want to do a lot of folding, EC2 can be an alternative way to rack up those folding points.

For this tutorial, you need an Amazon EC2 account and a computer. The tutorial was written on a Windows computer; the steps on Linux and Mac OS X are very similar. It uses the Oregon region. You can select other regions to compare pricing, but each region requires its own private key.

Step 1:  
Log in to your EC2 account. Click on "Request Spot Instance".

Step 2:
Click "Select" for Ubuntu Server 14.04 LTS (PV). HVM will also work, but will perform about 1%-2% slower.

Step 3:
Choose the c3.8xlarge instance type (108 ECUs).

Step 4:
Enter the maximum per-hour price you are willing to pay. Make sure it is at least 25% greater than the above-shown "Current Price" options.
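
The 25% margin in Step 4 is simple arithmetic. Here is a minimal sketch, assuming `awk` is available on your machine; `suggest_bid` is a hypothetical helper name (not an EC2 tool), and $0.26/hour is only an example price taken from later in this thread:

```shell
#!/bin/sh
# suggest_bid: hypothetical helper, not part of EC2 or FAHClient.
# Prints a maximum bid 25% above the given current spot price (USD/hour).
suggest_bid() {
    awk -v price="$1" 'BEGIN { printf "%.4f\n", price * 1.25 }'
}

# Example: a current price of $0.26/hour gives a bid of 0.3250
suggest_bid 0.26
```

Treat the result as a floor; bidding higher reduces the chance of being out-bid.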

Step 5:
Click the "Launch" button.

Step 6:
Select "Create a new key pair" from the dropdown menu, then name your keypair, and click the "Download Key Pair" button. NOTE: if you have already set up one instance before, you can skip down to step 25, and just double-click on the saved instance.

Step 7:
After downloading the private key (.pem) file, click on "Request Spot Instance".

Step 8:
Make sure that EC2 has successfully created your Spot Instance request (A green check-mark in a green confirmation box will appear, as shown). Then click on "View Spot Requests".

Step 9:
Click on the "Instances" option, as shown below. At first, nothing will show up. Leave this window as is for now.

Step 10:
Download PuTTYGen from http://the.earth.li/~sgtatham/putty/latest/x86/puttygen.exe , then open it up. Click on the "Load" button.

Step 11:
Change the file selector to "All Files (*.*)" to allow you to choose the .pem file you downloaded from Amazon. Then browse for and select the keyfile, and open it up.

Step 12:
You will get a message about the successful import. Click "Ok" to continue.

Step 13:
Inside of PuTTYGen, make sure the SSH-2 RSA radiobutton is selected (this is the default), and click on "Save Private Key".

Step 14:
You will get a warning prompt about saving the private key without a password. This is fine; click "Yes".

Step 15:
You will get a "Save As" prompt. Choose a place to save the keyfile, and name it something simple.

Step 16:
Download PuTTY from http://the.earth.li/~sgtatham/putty/latest/x86/putty.exe and open it up. Then, click on the "Data" option under "Connection" on the left options pane.

Step 17:
In the "Auto-login username" box, enter "ubuntu".

Step 18:
Click on the + sign next to "SSH" under "Connection" in the left options pane.

Step 19:
Click on the word "Auth" (not the plus) under "SSH" under "Connection" in the left options pane.

Step 20:
Click on the "Browse..." button next to the box for "Private key file for authentication".

Step 21:
An "open file" dialog prompt will appear. Browse to and select the new keyfile (.ppk) that we generated with PuTTYGen. Click on the "Open" button.

Step 22:
Click on the text at the top labeled "Session" in the left options pane.

Step 23:
Type a name in the "Saved Sessions" box. Then, click the "Save" button.

Step 24:
Go back to the EC2 "Instances" page. You may need to click on the refresh button (either in the browser or on the page) in order to see your server. Note the IP.

Step 25:
Enter this IP address into PuTTY, in the "Host Name (or IP address)" box, then click "Open" to connect.

Step 26:
A new window will pop up. You may be asked whether to trust the server's key fingerprint; accept it. Then type into the window:

sudo -i

Step 27:
Now, type:

apt-get update

Step 28:
Now, type:

apt-get install htop

Step 29:
Now, type:

mkdir /etc/fahclient

Step 30:
Now, type:

wget 1.curecoinmirror.com/config.xml -O /etc/fahclient/config.xml
 http://i.imgur.com/yT6Gk8e.png 

Step 31:
Now, type:

nano /etc/fahclient/config.xml

Step 32:
Now, use the arrow keys to move through the file. Enter your passkey and username as shown below.

Step 33:
Now, type:

wget  https://fah.stanford.edu/file-releases/public/release/fahclient/debian-testing-64bit/v7.4/fahclient_7.4.4_amd64.deb 

Step 34:
Now, type:

sudo dpkg -i --force-depends fahclient_7.4.4_amd64.deb

Step 35:
When asked whether FAHClient should be started automatically, press the "Enter" key to accept.

Step 36:
Wait 30 seconds.
Now, type:

nano /var/lib/fahclient/log.txt

Step 37:
Press Control + V four times to scroll down the log until it shows your Project. If it is 8101, 8102, 8103, 8104, or 8105, you successfully got a BigAdv WU. Alternatively, check that it says 0xa5 to the left of the colon. Press Control + X when you are done.
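
If you would rather not scroll through the log by hand, the project check in Step 37 can be sketched with `grep`. This assumes the log contains lines with the text "Project: " followed by the project number; `is_bigadv_log` is a hypothetical helper name, and you may need to adjust the pattern if your log looks different:

```shell
#!/bin/sh
# is_bigadv_log: hypothetical helper. Reads FAHClient log text on stdin and
# succeeds if any line names project 8101-8105 (the BigAdv projects in Step 37).
# Assumption: log lines contain "Project: <number>".
is_bigadv_log() {
    grep -Eq 'Project: ?810[1-5]([^0-9]|$)'
}

# Usage on the instance (log path from Step 36):
#   is_bigadv_log < /var/lib/fahclient/log.txt && echo "BigAdv WU assigned"
```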

You are done setting up! However, to keep your Folding@home account in good standing, always finish your WUs. If you want to shut down your EC2 server, first tell the client to finish its current WU by typing:

nc 127.0.0.1 36330
finish
exit
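
The same two commands can be sent without an interactive session, which helps when you manage several instances. A minimal sketch, assuming your `nc` exits once the client closes the connection after `exit`; `fah_finish_commands` is a hypothetical helper name:

```shell
#!/bin/sh
# fah_finish_commands: hypothetical helper that prints the two commands from
# the interactive session above, one per line.
fah_finish_commands() {
    printf 'finish\nexit\n'
}

# Usage on the instance (same address and port as above):
#   fah_finish_commands | nc 127.0.0.1 36330
```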

You can look in the log again, and scroll all of the way to the bottom to see the current percentage:

nano /var/lib/fahclient/log.txt

Again, use Control + X to exit.

On average, each percentage point takes 14 minutes. It takes less for faster projects such as 8104 (averaging around 12 minutes), and longer for slower projects like 8101 (17 minutes).
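
These per-percent times make it easy to estimate how long a WU still needs before you can shut down. A rough sketch using shell integer arithmetic; `wu_minutes_left` is a hypothetical helper, and the result is only as good as the averages above:

```shell
#!/bin/sh
# wu_minutes_left: hypothetical helper.
# $1 = percent complete (0-100), $2 = minutes per percentage point.
wu_minutes_left() {
    echo $(( (100 - $1) * $2 ))
}

wu_minutes_left 0 14    # whole WU at the 14-minute average: 1400 minutes (about 23.3 hours)
wu_minutes_left 70 17   # 70% done on a slow project such as 8101: 510 minutes
```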
You can type:

htop

to see your current processor usage. When the WU is finished, all 32 cores will drop from 100% utilization to near zero. When you see this, it is safe to terminate your EC2 instance from the EC2 panel without losing any Folding@home work.

Posted : 12/05/2014 8:05 pm
wuffy68
(@wuffy68)
Member Admin

BEAUTIFUL WRITE-UP VORKSHOLK !!!!

I've dealt with AWS storage services, but didn't realize they offered HPC services as well.
Getting a new AWS account set up will be my "next" project.

Here's the link to the AWS "High Performance Computing" page (where one can also sign up for EC2 accounts)
      https://aws.amazon.com/hpc/

In your setup, you're setting up 32 vCPUs for BigAdv WU ... is it more advantageous over purchasing the AWS GPU processing power for other job-types like 0x17 ?
      http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using_cluster_computing.html#gpu-operating-systems

Posted : 13/05/2014 2:49 am
wuffy68
(@wuffy68)
Member Admin

VORKSHOLK ... You may want to highlight the Public IP in the image for Step 24 to clarify that users need to enter the Public IP address of their own Spot Instance ... rather than yours  ;D

I was able to configure a Spot Instance today ... Don't know why I wasn't able to create Spot Instances earlier... maybe Amazon needed to clear my credit card.

Posted : 18/05/2014 4:10 am
wuffy68
(@wuffy68)
Member Admin

I finally got an Ubuntu Spot Instance running on EC2...

....again, I can't stress enough how useful this writeup is - especially for an Ubuntu F@H client newb (and many EC2 newbs as well).
This is very nice ... Thank You!

I'm attaching a couple of screenshots to show what it looks like while it's running:

Image 1. shows tailing the fahclient log

Type the following in the window:

~# tail -f /var/lib/fahclient/log.txt

Image 2. shows running "htop" showing 32 cores

Type the following in the window:

~# htop

Ahhhh ... the power of the cloud  ;D

Posted : 18/05/2014 4:34 am
wuffy68
(@wuffy68)
Member Admin

One interesting thing to keep in mind is the location of your EC2 VM instance determines your price depending on the local power source....

Oregon is favorable because the area uses >40% clean Hydroelectric power, so the VM instance price per hour is lower than say, Ireland or Singapore where power costs 2x, 3x even 4x more...

http://www.oregon.gov/energy/pages/oregons_electric_power_mix.aspx

https://aws.amazon.com/about-aws/globalinfrastructure/regional-product-services/

Posted : 18/05/2014 5:08 am
wuffy68
(@wuffy68)
Member Admin

Vorksholk,

in your experience, how long does it take for BigAdv points to show up in the cryptobullion folding pool? I'm 70% thru my second WU now, and the points from the first show up in F@H, but nothing has shown up in the pool from the first WU ... at least not yet.

Posted : 19/05/2014 7:49 pm
wuffy68
(@wuffy68)
Member Admin

Update: IRC chat session stated that BigAdv folding pool points are usually awarded ~24 hours after being posted on F@H results page.

Posted : 19/05/2014 11:44 pm
EtBIM
(@etbim)
New Member Customer

Hey,

I tried to set up a gpu instance for folding on Windows Server 2012 (with Geforce GRID gpus) but i didn't manage to get F@h to recognize the hardware, it correctly shows the name and architecture but in the log it doesn't find CUDA compatibility and i'm getting memory errors.
I had successfully installed the drivers though.

Using CPU only is interesting, what PPD are you getting approximately ?

Posted : 20/05/2014 10:12 am
wuffy68
(@wuffy68)
Member Admin
"EtBIM" wrote:
Hey,

I tried to set up a gpu instance for folding on Windows Server 2012 (with Geforce GRID gpus) but i didn't manage to get F@h to recognize the hardware, it correctly shows the name and architecture but in the log it doesn't find CUDA compatibility and i'm getting memory errors.
I had successfully installed the drivers though.

Using CPU only is interesting, what PPD are you getting approximately ?

***************
GPU mapping on VM's can be tricky business .... I never tried it. My only advice there would be to peruse the GPU computing forums on Amazon and F@H. You need to make sure your VM has a "dedicated" GPU resource.
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using_cluster_computing.html#gpu-operating-systems

***************
On the PPD for BigAdv EC2 jobs, payout can be generous. Depending on the Spot Instance price, expect to pay about $0.50 US per Curecoin for the EC2 service ...
about $8-$12 per Spot Instance/day depending on prevailing price (Using the Oregon region ... still the cheapest region).

The BigAdv jobs are geared specifically towards a large number of CPUs:
    https://folding.stanford.edu/home/changes-to-the-bigadv-threshold/
They have recently gone from 24 CPUs to 32 CPUs as a threshold ... (IMHO that may have put a lot of home-built systems out of business).

I don't have the numbers exactly, but I asked on the IRC channel what the cost of actual equipment would be to match EC2 performance.

paraphrasing "if you run a spot instance on EC2 for about a year, you could buy the equipment that would allow you to do the same (not including power costs, maintenance and downtime)"

... so, I gather it would be >$4500US for a server with 32 cores, plus another ~$600 in extra power usage, plus time to set it up.

This is a new Dell PowerEdge R820 with 32 cores for $15000...
http://configure.us.dell.com/dellstore/config.aspx?oc=bects4&model_id=poweredge-r820&c=us&l=en&s=bsd&cs=04
You can build something yourself for less, but that's the "retail" reality 🙂

It's really hard to compete against a cloud service since they will always be using the latest 22nm technology, power optimized, temperature controlled - etc.

EDITS: My points have been about 370000 PPD / Spot Instance

You can use the following BigAdv estimator: http://www.linuxforge.net/bonuscalc2.php
Use Project 8101 and TPF of 15:30 to see the breakdown of points (example only)

For comparison, my 24Gh/s Technobit mining rig costs me about $10 in additional power to mine 10 Curecoins over 4 weeks. So, it takes about $1 of power per Curecoin for ASIC mining.

I overestimated the EC2 Spot Instance costs (it's actually cheaper than I originally estimated) ... Adding chart from Spot Instance in the Oregon region. Where it goes up over $11/day, I actually had two (2) spot instances running.
So you can get 740000 PPD for under $12. Again this is subjective, depending on demand.
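
To compare setups, it helps to turn those figures into points per dollar. A quick sketch, assuming `awk` is available; `ppd_per_dollar` is a hypothetical helper, and the inputs are the example numbers from this post (two instances, $12/day as the upper bound):

```shell
#!/bin/sh
# ppd_per_dollar: hypothetical helper.
# $1 = points per day, $2 = cost per day in USD. Prints PPD per dollar, rounded.
ppd_per_dollar() {
    awk -v ppd="$1" -v cost="$2" 'BEGIN { printf "%.0f\n", ppd / cost }'
}

# 740000 PPD for $12/day works out to 61667 PPD per dollar
ppd_per_dollar 740000 12
```

Since the actual cost was under $12, the real figure would be somewhat higher.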

Edit: Update PPD 08/05/14

Posted : 21/05/2014 11:51 pm
wuffy68
(@wuffy68)
Member Admin

Important ....

If your Spot Instance price is out-bid in the middle of processing a work unit, your VM will be shut down and deleted ...
While you can use the "Persistent Request" option to preserve your base VM's configuration (OS, bid price, region, etc.), this alone will not by default preserve your data or WU progress!!!

Here's the link related to Persistent Requests on AWS (you can add this to Step 4 above):
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-bid-specifications.html#concepts-spot-instances-types

Here's a video better explaining what is required to smartly manage job interruptions and forced shutdowns of your Spot Instances...
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances-managing-interruptions.html#manageinterruptionsvideo

Edit2: Rewrote this post ... Preserving your job progress after a forced shutdown requires a "career change"  >:( if you don't want to pay for a dedicated VM (Reserved Instance).

I've seen the prices creeping up ... not sure if it's curecoin folks, or just a lot of people experimenting over the long weekend in the US. They say that on average only 4% of spot instance jobs are interrupted, but lately I've had 50% interrupted because I haven't revised my bid price higher over the last few days.

Posted : 24/05/2014 4:03 am
vorksholk
(@imported_vorksholk)
Trusted Member Customer
"wuffy68" wrote:
BEAUTIFUL WRITE-UP VORKSHOLK !!!!

I've dealt with AWS storage services, but didn't realize they offered HPC services as well.
Getting a new AWS account set up will be my "next" project.

Here's the link to the AWS "High Performance Computing" page (where one can also sign up for EC2 accounts)
      https://aws.amazon.com/hpc/

In your setup, you're setting up 32 vCPUs for BigAdv WU ... is it more advantageous over purchasing the AWS GPU processing power for other job-types like 0x17 ?
      http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using_cluster_computing.html#gpu-operating-systems

Yup, GPU instances are significantly lower-performance per dollar than CPU instances, unfortunately. 🙁

"EtBIM" wrote:
Hey,

I tried to set up a gpu instance for folding on Windows Server 2012 (with Geforce GRID gpus) but i didn't manage to get F@h to recognize the hardware, it correctly shows the name and architecture but in the log it doesn't find CUDA compatibility and i'm getting memory errors.
I had successfully installed the drivers though.

Using CPU only is interesting, what PPD are you getting approximately ?

A machine with 32 cores (c3.8xlarge) gets around 255k PPD if doing NaCL, or around 260k-330k PPD on BigAdv computations if set up as above.

Posted : 26/05/2014 11:47 am
ChristianVirtual
(@christianvirtual)
Eminent Member Customer

To add a bit of convenience, you could add a rule to the security group allowing TCP port 36330 from a dedicated IP address (like your external IP address at home).
That address also needs to be entered into /etc/fahclient/config.xml, along with a password.
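
For the config.xml side, here is a hedged sketch of what the two entries might look like. It assumes the v7 client's `allow` and `password` options; `remote_access_fragment` is a hypothetical helper, 203.0.113.10 is a documentation-only placeholder address, and you should check the option names against your client's documentation before relying on them:

```shell
#!/bin/sh
# remote_access_fragment: hypothetical helper that prints two config.xml
# entries for remote monitoring (assumed option names: allow, password).
# $1 = your home IP address, $2 = a password you choose.
remote_access_fragment() {
    printf "  <allow v='127.0.0.1 %s'/>\n" "$1"
    printf "  <password v='%s'/>\n" "$2"
}

# Placeholder values; substitute your own before pasting the output
# inside the <config> element of /etc/fahclient/config.xml.
remote_access_fragment 203.0.113.10 'change-me'
```

The security group rule should be limited to that single address on the client's command port (36330, the port used by the `nc` command in the guide).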

This way I can use any FAHControl or even my iPad monitor to check on folding progress of the spot instance.

Posted : 07/06/2014 11:13 pm
jrweiss
(@jrweiss)
Trusted Member Customer
"wuffy68" wrote:
I've dealt with AWS storage services, but didn't realize they offered HPC services as well.
Getting a new AWS account set up will be my "next" project.
. . .

In your setup, you're setting up 32 vCPUs for BigAdv WU ... is it more advantageous over purchasing the AWS GPU processing power for other job-types like 0x17 ?

Before you spend any big money on hardware and/or services for BigAdv projects, just be aware that the BigAdv experiment only goes through Jan 31, 2015 ( https://folding.stanford.edu/home/revised-plans-for-bigadv-ba-experiment/).

Posted : 08/06/2014 2:11 am
DrDaxxy
(@drdaxxy)
New Member Customer

It's a nice writeup, but EC2 deleting your work is a serious problem. Folding like this is very dangerous.

As has been mentioned above, when a spot instance gets terminated for pricing reasons, EC2 will delete its storage, causing you to lose all progress on that instance and significantly hurting the F@H network as the work your instance was doing will not be given out again until the deadline is reached.

And I don't think there's an easy way around this. EC2 allows you to preserve an instance's root storage when it gets terminated, but I don't think there's a way to get it to automatically create new instances using the old volumes when your spot requests get re-fulfilled.

What you could probably do is not make your spot requests persistent (while keeping the storage persistent), and manually recreate them from the old instances' volumes. Or add an extra EBS volume to each spot request (which will be loaded by every instance created for it) and make F@H put its data there. You'd have to configure all of this in advance and make an image though (which you should be doing anyway), and I don't know how to change F@H's data directory - you might have to make /var/lib/fahclient a symlink to your second volume.
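
The symlink idea could be sketched roughly like this. It is untested against FAHClient itself; `relocate_dir` is a hypothetical helper, and /mnt/fahdata is an assumed mount point for the second volume:

```shell
#!/bin/sh
# relocate_dir: hypothetical helper. Moves directory $1 to $2 and leaves a
# symlink at $1 pointing to the new location.
relocate_dir() {
    src=$1
    dst=$2
    [ -d "$src" ] && [ ! -L "$src" ] || return 1   # missing, or already a symlink
    [ ! -e "$dst" ] || return 1                    # refuse to overwrite existing data
    mv "$src" "$dst" && ln -s "$dst" "$src"
}

# Usage on the instance, as root, with FAHClient stopped and the second volume
# already formatted and mounted at the assumed mount point:
#   relocate_dir /var/lib/fahclient /mnt/fahdata/fahclient
```

The image you create afterwards would need to contain the symlink and mount the volume at boot, so that a replacement instance picks up the old data directory.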

All in all, unless someone makes a tutorial taking all this into consideration (no offense to OP, EC2 is tricky and this was easy to miss and I appreciate the work you've put into writing this post),  I cannot recommend trying to fold on EC2 spot instances unless you're familiar with EC2 and server administration and you absolutely know what you're doing.

Side note: I've tried folding on G2 and haven't managed to get a reasonable PPD on the GPU - only about 1500-3000, which is about 1/20th of the power that card should put out. I haven't been able to find out why, but this shouldn't be happening.  I tried the Amazon Linux image with preinstalled drivers, as well as other Linux images with various other drivers (on the GRID K520 inside a G2 instance, drivers <= 327.xx supposedly give about double the performance than the later ones - I got no change, but even if I had, this would still be far lower than the hardware should be capable of). Couldn't get drivers to install on Windows, and I didn't bother investigating further.

If you do end up folding on EC2, please make sure you don't stop your instances while they still have work assigned to them. Again, you'd be hurting the F@H network. Tell your clients to finish their current work units and only terminate them once they're done.

Posted : 08/06/2014 7:00 pm
wuffy68
(@wuffy68)
Member Admin
"DrDaxxy" wrote:
Folding like this is very dangerous.

I have some experience with AWS and EC2... despite this, you're right, I had to go through some growing pains at the cost of several failed FAH WUs (and my sanity). Although I think I finally made up for it in the last several days of solid returns. I'll post a brief guide in the next couple of days to help team members stabilize their Spot Instances.

There are many different ways to "skin this cat" with file and volume backups and manipulation. I've been using the most fundamental way (although it does require much more manual intervention since there isn't a script to do it yet). A reserved instance is really not an option compared to the current price of Cure. It's cheaper to trade cure than to cloud fold with a Reserved Instance.

Using Spot Instances, my best case scenario has been about $0.30 per cure, my worst was about $1.00 per cure ... if I average in failed jobs due to forced termination, my average cost per cure doubled thanks to the learning curve.

A BigAdv capable Dedicated Reserved Instance node quotes out as follows:

Seller    AWS
Term    12 months
Effective Rate  $1.116
Upfront Price  $4,289.00
Hourly Rate    $0.626
Offering Type    Medium Utilization
Desired Quantity  Unlimited
A Reserved Instance would cost you $9772.00 / year.

So as you can see ... being clever about managing your temporary Spot Instances can pay off  - esp. since on some days, the Spot price dips to $0.26 / hr compared to $1.116.
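
The numbers in the quote can be checked with a little arithmetic: the upfront price plus the hourly rate over 8760 hours (365 days) gives the yearly total, and dividing by 8760 gives the effective rate. A sketch assuming `awk` is available; both function names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helpers. $1 = upfront price (USD), $2 = hourly rate (USD).
# 8760 = 365 days * 24 hours.
reserved_yearly_cost() {
    awk -v upfront="$1" -v hourly="$2" 'BEGIN { printf "%.2f\n", upfront + hourly * 8760 }'
}
effective_hourly_rate() {
    awk -v upfront="$1" -v hourly="$2" 'BEGIN { printf "%.3f\n", (upfront + hourly * 8760) / 8760 }'
}

reserved_yearly_cost 4289 0.626    # 9772.76, matching the quoted yearly cost to the dollar
effective_hourly_rate 4289 0.626   # 1.116, the quoted Effective Rate
reserved_yearly_cost 0 0.26        # 2277.60: a full year at a $0.26/hr spot price, no upfront fee
```

The spot figure assumes the price stays at $0.26/hr all year with no interruptions, which it will not, so treat it as a best case.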

I think I saw a pattern develop on EC2 since vorksholk published this guide ... you can see the price had been steady for a long time, but suddenly people were briefly bidding $10/hour for spot instances and blowing away any active WUs despite those already having a 25% premium bid over the average rate.

In a nutshell, I'm backing up my instance, and backing it up often. That's what my guide will center on. I can't afford more than one Spot Instance - even at the Spot Price, so I'm being very protective of my WU's progress now, even if it takes a little extra time during the day  😉

Posted : 08/06/2014 9:21 pm
DrDaxxy
(@drdaxxy)
New Member Customer

I still think a separate volume for holding /var/lib/fahclient is the way to go.
You set that up once and make a matching image, then you never have to worry about backups. And that volume will only cost you $0.05/instance/month extra (EBS is $0.05/month/GB and 1GB is already far more than you'll ever need).

Is there a bonus for BigAdv? The graphics card in the g2 instance is basically an underclocked GTX680. I've seen vastly different results for the latter - from 40k PPD to >100k PPD. Given that g2 instances cost only about $0.08/hr (so $0.10 max price) and have only marginally slower CPU cores (8 of them instead of 32), if you can get them working properly (which I haven't been able to do, but the FAH forums/IRC might be able to help), this should get you far better price/performance, no? Unless, of course, BigAdv gets you way more points.

Posted : 08/06/2014 10:23 pm
wuffy68
(@wuffy68)
Member Admin
"DrDaxxy" wrote:
Is there a bonus for BigAdv?

Yes ... generous ... until they expire in 2015, one 32-core WU will yield ... drum roll....

~# FAHClient --send-command ppd
03:16:04:Connecting to 127.0.0.1:36330
PyON 1 ppd
370000
---

Even after the bonus points expire, large multi-core SMP projects could still be lucrative (no time to investigate)

I agree about the volume (you can script it to pause the slot(s) and scp the working files to another Ubuntu instance persisting on the Free-Tier at $0 cost) ... however, my fear is that the WU is somehow tied to the MAC address assigned to the original Spot Instance. I don't know what would happen if you copy the working directory to a brand new instance with a different MAC and IP address, and I don't feel like wasting a WU experimenting, since I've been "baaaad" recently 🙂

Posted : 08/06/2014 11:29 pm
wuffy68
(@wuffy68)
Member Admin

OK, as promised, the following procedure contains a fundamental system for preserving Active WU progress on EC2, despite "pricing induced terminations".
It involves moving your Active Work Units to a Secondary Volume which can survive Spot Instance Termination... per DrDaxxy's suggestion

  • 20 steps to initially re-direct your WUs to a Secondary Volume (in addition to vorksholk's original 37 steps) - you should only have to go through these once.
  • 17 steps to recover your WUs on a new Spot Instance and start Folding again. (I have these 17 steps down to 10 minutes 🙂 )

    Moving Active Work Units to a Secondary Volume to Preserve Data - Recovery from Spot Instance Terminations
I HIGHLY recommend stepping through this procedure without the FAH components first - just to familiarize yourself with EC2.
You can use your Free-Tier instance with a single processor to

Posted : 29/06/2014 4:21 pm