Most domain providers now offer hosting for your company website and email, so you can establish your business presence on-line in a matter of minutes. After establishing a presence through these channels, many SMB owners go on with their business satisfied that they have an on-line presence, with an always-on email system and website they don't have to administer beyond the occasional update through the provider's tools.
The tech-savvy SMB that requires more control to manage indexing, meta-tagging, directory services, interactive web pages, higher-order mail functions, and auto-response systems will find that these hosted services from their domain vendors fail to completely fill their needs; even so, they may not want to invest in the infrastructure to host their own mail and web services to improve integration with their business functions. In our case, we were unhappy with mail delays in our domain provider's service and were unable to provide the experience we wanted for clients and prospects visiting our website. The improvements we wanted included hosting small applications for time and expense entry, provisioning protected access to proprietary content, and better managing lead information while integrating with our communications infrastructure. Cloud-based technologies appeared to be the best path to satisfy these needs, but digging in to actually roll out these services in the cloud was initially a bit daunting.
We signed up for AWS believing this would be relatively straightforward, but found that the flexibility of these services also introduces unexpected complexity. Sure, signing up was easy, but the first step in getting up and running is to select an Amazon Machine Image (AMI) to launch, and there are hundreds of images to choose from. While we could upload and run our own image, that setup adds another level of complexity and incurs additional charges for upload and storage, not to mention the delay while waiting for the upload to complete. Then we had to choose between EBS and S3 for image persistence, a choice that required some research to understand the differences and determine which was right for our purposes. Fortunately, Amazon publishes the documentation for those persistent enough to push through and do their homework, making informed choices that will ensure the continued operation of a cloud-hosted server.
Deciding on Storage
Fundamentally, our needs revolve around quick startup time and local persistence, so EBS appeared to be the most appropriate storage choice for the AMI. Since we could start multiple instances, we also chose to create mountable EBS volumes that we could snapshot to S3 and access across instances over time by restoring the snapshot to a fresh volume and attaching it to the new instance on startup.
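That snapshot-and-restore cycle can be sketched as a pair of shell helpers. This is a sketch assuming the current AWS CLI, which postdates our original setup (we used the EC2 tooling of the day); the IDs, device names, and mount point are illustrative only.

```shell
# Sketch of the EBS snapshot / restore cycle. Assumes the current AWS
# CLI is installed and configured; IDs and device names are illustrative.

# Snapshot a data volume to S3-backed snapshot storage; print the snapshot id.
snapshot_data_volume() {
    aws ec2 create-snapshot \
        --volume-id "$1" \
        --description "data volume snapshot" \
        --query 'SnapshotId' --output text
}

# Restore a snapshot as a fresh volume and attach it to a new instance.
restore_to_instance() {
    snap_id="$1"; instance_id="$2"; az="$3"
    vol_id=$(aws ec2 create-volume \
        --snapshot-id "$snap_id" --availability-zone "$az" \
        --query 'VolumeId' --output text)
    aws ec2 attach-volume --volume-id "$vol_id" \
        --instance-id "$instance_id" --device /dev/sdf
    # Inside the instance the device typically appears as /dev/xvdf:
    #   mount /dev/xvdf /data
}
```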
Selecting the VM Image
We decided to follow Amazon's getting-started tips and selected the Basic 32-bit Amazon Linux AMI (a 64-bit AMI is now available) because this image contains no software or services beyond those required to run a simple virtual machine within Amazon's cloud infrastructure. If your organization intends to scale the server up to more than 3GB of system memory, start with the 64-bit AMI instead.
Amazon also provides the RPM-based yum tool for software updates and installation, which simplifies setup significantly. Most packages (apache, sendmail, dovecot, openldap-server, php, mysql, wordpress, and so on) can be installed directly from a repository local to the Amazon cloud and incur no additional charges to install and configure. In addition, we were able to install a build environment and compile and install from source the few packages not available through Amazon.
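As a concrete example, a provisioning step along these lines installed most of our stack in one pass. Package names here follow the Amazon Linux repository conventions of the time and may differ on other systems; run as root.

```shell
# Install the web/mail/directory stack from the yum repositories.
# Package names are as found on Amazon Linux; adjust for your distribution.
install_stack() {
    yum -y update
    yum -y install httpd sendmail dovecot openldap-servers \
        php php-mysql mysql-server
}
```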
Configuring the VM Image
We decided to create a script to bring the stock Basic 32-bit Amazon Linux AMI for EBS up to our target configuration. We weren't completely sure how the VM would or could be restored if the entire data center went offline or some other catastrophic failure struck the storage cluster, so we opted to configure through a script that we back up in our own on-premises storage cluster. In theory, the same script could be used to create a bootstrapped AMI at any future point with a simple modification: download and run the script after launching the stock Basic 32-bit Amazon Linux AMI.
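One way to realize that bootstrapped launch, sketched with the current AWS CLI, is to hand the configuration script to the instance as EC2 user data so it runs at first boot. The AMI id is supplied by the caller and the script name is a placeholder.

```shell
# Launch the stock AMI and pass our configuration script as user data,
# which EC2 executes at first boot. Instance type and script name are
# illustrative; prints the new instance id.
launch_bootstrapped() {
    aws ec2 run-instances \
        --image-id "$1" \
        --instance-type t2.micro \
        --user-data file://bootstrap.sh \
        --query 'Instances[0].InstanceId' --output text
}
```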
In addition to the basic configuration, our instance required a static IP address for consistent access to the web and mail services hosted there, which meant configuring an Elastic IP address per the User's Guide. We also found that a call to Amazon was required to allow our mail traffic to flow, as they do not allow SMTP traffic from these instances by default, likely to cut down on the incidence of spam abuse from virtual machines.
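The Elastic IP step reduces to two API calls, shown here with the current AWS CLI (our setup at the time used the older tooling). The SMTP unblock, by contrast, is a request to Amazon support rather than an API call.

```shell
# Allocate an Elastic IP and bind it to the running instance so the
# public address survives restarts. Instance id is supplied by the caller.
attach_static_ip() {
    alloc_id=$(aws ec2 allocate-address \
        --query 'AllocationId' --output text)
    aws ec2 associate-address \
        --instance-id "$1" --allocation-id "$alloc_id"
}
```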
After installing and configuring all our software on the AMI instance, we saved the instance as a custom AMI accessible only from our account. We first shut down the instance to ensure it was in a static state and get a clean image; note that terminating the instance instead would delete it completely, losing all data saved to the root volume to this point. Before terminating the original instance, we started a new VM instance from the newly created AMI to verify that the configuration had persisted properly and the VM was recoverable on restart.
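The stop-then-image sequence looks roughly like this with the current AWS CLI (note stop, not terminate: terminating discards the root volume). The image name is an illustrative choice.

```shell
# Stop the instance so the filesystem is quiescent, then capture it as a
# custom AMI. Images are private to the owning account by default.
image_instance() {
    aws ec2 stop-instances --instance-ids "$1"
    aws ec2 wait instance-stopped --instance-ids "$1"
    aws ec2 create-image \
        --instance-id "$1" \
        --name "custom-base-$(date +%Y%m%d)" \
        --query 'ImageId' --output text
}
```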
Defining and Implementing your Backup Strategy
After completing our research, it was clear that the services we had chosen did not ensure continuity at a level that met our business requirements, although the annual cost of our configuration certainly had appeal. EBS provides continuity through the mechanical failure of a single disk, but a catastrophic failure of the storage cluster could conceivably wipe out our customized AMI and the attached storage blocks containing the mail, directory, and web content, along with the MySQL database files backing our WordPress blog and other applications.
Consequently, we realized we would need to integrate this AMI into our corporate backup strategy to ensure such a failure would be recoverable. The integration would need to account for Amazon's billing structure, optimizing processing, storage, and network traffic so as not to significantly reduce or erase the operating-cost improvements we hoped to achieve through a cloud-based deployment. Considering these factors and the incremental nature of the data changes, we focused on scripting a solution around the rsync utility, available in most Linux environments, to synchronize key data files on our Amazon machine image with the backup store inside our corporate firewall.
The rsync utility worked fairly well for synchronizing the cloud data back to our on-premises storage cluster, with a few exceptions. We used its compression capabilities to optimize the transfer, but compressed transfers occasionally stalled during backups, so we put a timeout in place and retry with compression turned off when the timeout expires. Since security is a prime concern, we run rsync over a pre-configured SSH tunnel. While this adds a little complexity to error handling because of the port bindings between rsync and SSH, we were able to manage by using xinetd to auto-start rsync when accessed from localhost.
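The timeout-and-retry behavior can be sketched as follows. This is a simplified version that runs rsync directly over SSH rather than through the daemon-behind-xinetd arrangement described above; the 30-minute cap, SSH port, and paths are illustrative, and `timeout` comes from GNU coreutils.

```shell
# Pull data from the cloud host over SSH. Compressed transfers
# occasionally stalled for us, so cap the run time and retry without
# compression (-z) on failure. Port and paths are illustrative.
sync_with_fallback() {
    src="$1"; dest="$2"
    if ! timeout 1800 rsync -az -e "ssh -p 2222" "$src" "$dest"; then
        timeout 1800 rsync -a -e "ssh -p 2222" "$src" "$dest"
    fi
}
# Example (not run here):
#   sync_with_fallback backup@cloudhost:/var/spool/mail/ /backup/mail/
```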
We have been running on AWS for nearly a year now; backups are fairly routine, and we have successfully instituted a weekly manual image-update process. We also survived a week-long power outage at our offices with no outward impact, having moved to the cloud a few months beforehand. Our infrastructure costs for server hardware and electricity have dropped by over 90% on a three-year projection as well. While administration is a little more complex, we have become comfortable enough with the tooling to begin incorporating batch processes that automate image and instance maintenance, driving our systems-admin overhead down to nearly zero. In summary, the move to AWS has been well worth the effort!