Automatic Website Deployment using Terraform and Amazon EFS
This article is a follow-up to my previous article. In it I am going to explain how storage works when instances get auto-scaled.
Reference to my previous article:
Automated Website deployment using Terraform
Here I explained in detail about how Cloud Technology works and makes deployment faster and simpler.
When a website is deployed over the public network, many things have to be planned. The website is not deployed on just one web server. If the website receives more traffic than was anticipated, the site will crash and all the effort and business go to waste. To avoid this, DevOps principles are applied. The most common techniques are Auto-Scaling and Load Balancers.
Now as the name suggests Auto-Scaling will scale the instances where the website is running as and when required according to some set rules related to metrics of the instance.
This brings up one issue: each instance requires storage from which it will get the updated WebApp files. One option is using multiple EBS volumes.
Why multiple EBS volumes? Because one EBS volume can be attached to only one instance.
But this is not a good method, because it is a complete waste of resources, space, and money.
In the previous article, I used Amazon EBS Service as the main storage. And we have already discussed its cons in this use case. So instead of using an extra EBS for storing the program files, I will use Amazon EFS Service.
Amazon Elastic File System (Amazon EFS) provides a simple, scalable, fully managed Elastic NFS file system for use with AWS Cloud services and on-premises resources. It is built to scale on demand to petabytes without disrupting applications.
- First I will create a VPC
- Create subnets inside this VPC
- The required Internet Gateway and the Route Tables for NAT
- Create a Security Group
- Create the Elastic File System and its requirements
- Launch the EC2 Instances
- Then create the S3 Bucket for storage of static data
- This will be sent to all the edge locations using CloudFront
- Create an AWS Elastic Load Balancer
- Create Target Groups and Listener Rules for Load Balancer
- Then finally load the website on your favourite browser using the DNS Name of Load Balancer
Let’s start by building the code.
First we setup the provider for downloading the plugins required for AWS Cloud Platform.
The profile is set so that the Terraform code automatically picks up credentials from the local system instead of hard-coding them.
Profile can be set using the following command:
aws configure --profile profilename
Then I have set a Data Source of Availability Zone so that it can be used later in the code.
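A minimal sketch of these two pieces (the region and profile name are my own placeholders; the article's actual code lives in its GitHub repo):

```hcl
# Provider configured with a named profile so no credentials appear in code
provider "aws" {
  region  = "ap-south-1"     # assumption: use whichever region you work in
  profile = "profilename"
}

# Data source listing the Availability Zones available in that region,
# used later to decide how many subnets to create
data "aws_availability_zones" "zones" {
  state = "available"
}
```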
Creating VPC —
In the VPC, set a CIDR Block of your choice. Make sure to set these two attributes to true: enable_dns_support and enable_dns_hostnames.
This enables DNS support and DNS hostnames for the VPC, which EFS will rely on later.
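A sketch of what this VPC resource might look like (the resource name tf_vpc and the CIDR are assumptions of mine):

```hcl
resource "aws_vpc" "tf_vpc" {
  cidr_block           = "192.168.0.0/16" # any CIDR of your choice
  enable_dns_support   = true             # needed for EFS DNS resolution
  enable_dns_hostnames = true
  tags = {
    Name = "tf_vpc"
  }
}
```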
Creating the Subnets —
Next I have created the subnets in the same VPC. I have used the count meta-argument, set to the number of Availability Zones currently available, to create the same number of subnets — one subnet per Availability Zone.
The CIDR Block of each subnet is derived from the same count index. Keep in mind to set map_public_ip_on_launch to true.
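One subnet per Availability Zone could be expressed like this (resource names are assumptions; I assume a VPC resource named aws_vpc.tf_vpc and the AZ data source from earlier):

```hcl
resource "aws_subnet" "tf_subnet" {
  count                   = length(data.aws_availability_zones.zones.names)
  vpc_id                  = aws_vpc.tf_vpc.id
  availability_zone       = data.aws_availability_zones.zones.names[count.index]
  cidr_block              = "192.168.${count.index}.0/24" # one /24 per AZ
  map_public_ip_on_launch = true                          # public IP for each instance
}
```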
Internet Gateway —
The Internet Gateway performs one-to-one NAT between an instance's private IP and its public IP. This lets any instance connect to the internet, i.e. the outside world.
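The gateway itself is a one-liner attached to the VPC (names assumed as before):

```hcl
resource "aws_internet_gateway" "tf_igw" {
  vpc_id = aws_vpc.tf_vpc.id
}
```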
Route Table —
This Route Table consists of the CIDR Block that is the destination of the instance and I have mentioned it as 0.0.0.0/0 which means “Anywhere on the Internet”. The Gateway through which it will go to the Internet is provided by passing the ID of the Internet Gateway created earlier.
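A sketch of that route table, assuming the gateway resource above is named tf_igw:

```hcl
resource "aws_route_table" "tf_rt" {
  vpc_id = aws_vpc.tf_vpc.id

  route {
    cidr_block = "0.0.0.0/0"                    # "anywhere on the internet"
    gateway_id = aws_internet_gateway.tf_igw.id # exit via the Internet Gateway
  }
}
```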
Route Table Association —
Finally I have associated the Routing Table with all the subnets.
Why all subnets? Because I am launching web servers and I want anyone to be able to connect to them, and the web servers also need to reach the internet — for updates, for example.
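Associating the route table with every subnet reuses the same count pattern (resource names assumed):

```hcl
resource "aws_route_table_association" "tf_rta" {
  count          = length(data.aws_availability_zones.zones.names)
  subnet_id      = aws_subnet.tf_subnet[count.index].id
  route_table_id = aws_route_table.tf_rt.id
}
```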
Now we start creating the Security Groups and EFS
Security Group —
This is a very simple Security Group. This can be modified to be open for specific ports. Right now I have kept it all open, i.e. All in and All out.
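An all-open Security Group like the one described might look like this (in production you would restrict it to ports 80/443 and 22):

```hcl
resource "aws_security_group" "tf_sg" {
  name   = "tf_sg"
  vpc_id = aws_vpc.tf_vpc.id

  ingress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"          # all protocols, all ports in
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"          # all protocols, all ports out
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```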
Creating EFS —
This resource creates the EFS file system. The only thing to keep in mind is the creation_token: it must be UNIQUE.
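The file system itself needs very little configuration (the token value here is an arbitrary example):

```hcl
resource "aws_efs_file_system" "tf_efs" {
  creation_token = "tf-efs-unique-token" # must be unique per account/region

  tags = {
    Name = "tf_efs"
  }
}
```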
Mounting the EFS —
Using a count variable, we create an EFS mount target in each subnet and attach the correct Security Group using its ID.
Here the important thing to keep in mind is the ID of the:
- File System
- Security Groups
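One mount target per subnet can be sketched as follows (names carried over from my earlier assumed resources):

```hcl
resource "aws_efs_mount_target" "tf_mount" {
  count           = length(data.aws_availability_zones.zones.names)
  file_system_id  = aws_efs_file_system.tf_efs.id      # which file system
  subnet_id       = aws_subnet.tf_subnet[count.index].id
  security_groups = [aws_security_group.tf_sg.id]      # which SG
}
```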
Specifying the Access Points for EFS —
Amazon EFS access points are application-specific entry points into an EFS file system. We give it the File System ID so that the file system is accessible.
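A minimal access point only needs the file system ID (resource name assumed):

```hcl
resource "aws_efs_access_point" "tf_ap" {
  file_system_id = aws_efs_file_system.tf_efs.id
}
```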
Now we create the EC2 Instances
Here I am creating 2 instances using the AMI with ami_id = “ami-08706cb5f68222d09”. I have attached the Security Group and the Subnets using the IDs created earlier. Since we want to launch our website on these EC2 instances, we need to put our code files onto them.
So I have used a remote-exec using Heredoc to do the following things:
- First install the required softwares — git, php, httpd, amazon-efs-utils
- Then I have started and enabled the httpd services
- Then created 2 shell variables that store the efs_id & accesspt_id
- Then we mount the EFS File System to the folder /var/www/html
- Then to make it permanent I have written it in the fstab file
- Then I have used git clone command to copy the code stored in GitHub Repository
A Heredoc is a file literal or input stream literal: it is a section of a source code file that is treated as if it were a separate file. Its use will be explained later in the article (CloudFront Distribution).
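Pulling the steps above together, the instances might be sketched like this. The key pair name and the GitHub repo URL are placeholders, and the resource names are my assumptions; only the AMI ID comes from the article:

```hcl
resource "aws_instance" "tf_web" {
  depends_on             = [aws_efs_mount_target.tf_mount] # wait for mount targets
  count                  = 2
  ami                    = "ami-08706cb5f68222d09"
  instance_type          = "t2.micro"
  subnet_id              = aws_subnet.tf_subnet[count.index].id
  vpc_security_group_ids = [aws_security_group.tf_sg.id]
  key_name               = "mykey" # placeholder: an existing key pair

  connection {
    type        = "ssh"
    user        = "ec2-user"
    private_key = file("mykey.pem") # placeholder path
    host        = self.public_ip
  }

  provisioner "remote-exec" {
    inline = [
      # install the required software
      "sudo yum install -y git php httpd amazon-efs-utils",
      "sudo systemctl enable --now httpd",
      # mount the shared file system on the web root via the access point
      "sudo mount -t efs -o tls,accesspoint=${aws_efs_access_point.tf_ap.id} ${aws_efs_file_system.tf_efs.id}:/ /var/www/html",
      # record the mount in fstab so it survives reboots
      "echo '${aws_efs_file_system.tf_efs.id}:/ /var/www/html efs _netdev,tls,accesspoint=${aws_efs_access_point.tf_ap.id} 0 0' | sudo tee -a /etc/fstab",
      # clone the website code (placeholder repo URL)
      "sudo git clone https://github.com/<your-user>/<your-repo>.git /var/www/html/webapp"
    ]
  }
}
```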
S3 Bucket —
S3 bucket is used to store the static data like images, videos and other graphics. This is required so that anywhere from the world, if the website is opened, the static contents also get loaded without any delay or latency.
An Amazon S3 bucket name is globally unique. This means that after a bucket is created, the name of that bucket cannot be used by another AWS account in any AWS Region until the bucket is deleted.
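The bucket resource is small; only the globally unique name matters (this name is an arbitrary example):

```hcl
resource "aws_s3_bucket" "tf_s3bucket" {
  bucket = "tf-static-data-bucket-0001" # must be globally unique
  acl    = "public-read"                # static assets readable by anyone
}
```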
Put Object in S3 —
Then I have uploaded the objects in S3 Bucket. Here writing depends_on is important so that this resource starts to build only after the S3 Bucket is ready. Key is specified as the object that is needed to be uploaded & source is the path from where to load this file.
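An upload with the explicit ordering might look like this (the file name and local path are placeholders):

```hcl
resource "aws_s3_bucket_object" "tf_object" {
  depends_on = [aws_s3_bucket.tf_s3bucket] # build only after the bucket exists
  bucket     = aws_s3_bucket.tf_s3bucket.id
  key        = "image.png"        # object name inside the bucket
  source     = "images/image.png" # local path of the file to upload
  acl        = "public-read"
}
```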
The main aim of creating an S3 Bucket is that there is no latency. So this can be achieved in AWS using Edge Locations. These are small data centres that are created by AWS all over the world. To use this mechanism, we use CloudFront service of AWS.
I have used a variable with the default value “S3-”. This is needed because CloudFront expects the origin ID in the form shown in the GUI, i.e. prefixed with “S3-”, while the attribute aws_s3_bucket.tf_s3bucket.id does not include that prefix.
Then I have created the rest of the code by providing the domain name & origin ID.
- Then I have set the default_cache_behavior which is a required block of code. Each cache behavior specifies the one origin from which you want CloudFront to get objects. If you have two origins and only the default cache behavior, the default cache behavior will cause CloudFront to get objects from one of the origins, but the other origin is never used.
- Then I have set the viewer_protocol_policy along with the default and maximum TTL. Choose the protocol policy that you want viewers to use to access your content in CloudFront edge locations: HTTP and HTTPS.
- Then restrictions can be set if required (whitelist & blacklist). I haven’t specified any.
- Then I have set the viewer_certificate as true.
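The whole distribution could be sketched as below. TTL values and the variable name are my assumptions; the “S3-” prefix trick and the open geo restriction follow the article:

```hcl
variable "origin_id_prefix" {
  default = "S3-" # CloudFront shows origins as "S3-<bucket>" in the GUI
}

resource "aws_cloudfront_distribution" "tf_cf" {
  enabled = true

  origin {
    domain_name = aws_s3_bucket.tf_s3bucket.bucket_regional_domain_name
    origin_id   = "${var.origin_id_prefix}${aws_s3_bucket.tf_s3bucket.id}"
  }

  # required block: which origin to fetch objects from, and how to cache them
  default_cache_behavior {
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    target_origin_id       = "${var.origin_id_prefix}${aws_s3_bucket.tf_s3bucket.id}"
    viewer_protocol_policy = "allow-all" # viewers may use HTTP and HTTPS
    default_ttl            = 3600
    max_ttl                = 86400

    forwarded_values {
      query_string = false
      cookies {
        forward = "none"
      }
    }
  }

  restrictions {
    geo_restriction {
      restriction_type = "none" # no whitelist/blacklist
    }
  }

  viewer_certificate {
    cloudfront_default_certificate = true
  }
}
```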
Now there is one last thing to do: the URL that CloudFront provides for the bucket object has to be inserted into the code sent to us by the developer, so that the client can see the images.
For this I have again made an SSH connection using Connection & Provisioner. This is done in a separate Null Resource.
To write the code into an already existing file we have to be the root user, and right now we are ec2-user by default.
- We can login as the root user.
This is possible from GUI or CLI but from Terraform code this is not possible directly.
- Switch user to root on the fly.
This is a good solution and technically that is what I have done.
When we use the command sudo su - root, a child shell is created. If this command is run on the local system, it works seamlessly.
But if it is run from a remote system — here, Terraform — it fails to get the child shell.
- The correct solution is to use sudo to run a shell and use a Heredoc to feed it commands.
As mentioned earlier, a Heredoc is a file literal or input stream literal: it is a section of a source code file that is treated as if it were a separate file.
So using Heredoc I have passed the command to add the URL of CloudFront into the file so that user will see the images seamlessly (along with code) when Web Server is launched.
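A sketch of that null resource follows. The article feeds the commands to a root shell through a Heredoc; this equivalent sketch uses sudo bash -c for brevity. The key path, target file, and object key are placeholders:

```hcl
resource "null_resource" "tf_write_url" {
  depends_on = [aws_cloudfront_distribution.tf_cf, aws_instance.tf_web]
  count      = 2 # patch the file on both instances (shared EFS would need only one)

  connection {
    type        = "ssh"
    user        = "ec2-user"
    private_key = file("mykey.pem") # placeholder path
    host        = aws_instance.tf_web[count.index].public_ip
  }

  provisioner "remote-exec" {
    inline = [
      # append the CloudFront URL of the image as root, via sudo + bash
      "sudo bash -c \"echo '<img src=https://${aws_cloudfront_distribution.tf_cf.domain_name}/image.png>' >> /var/www/html/index.php\""
    ]
  }
}
```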
Since I have made 2 instances, it is not a good idea to provide 2 IPs to my client.
Take example of Google:
Google “might” have multiple running web servers.
But we do not know about any Web Server IP, we just know the IP — 184.108.40.206 and the DNS Name — www.google.com .
This means that Google has multiple backend servers and using this DNS Name we connect to any of the Web Servers, and to us it doesn’t matter because we get the seamless connection. So how is this done ?
One answer could be that Google has only 1 web server. But no…
Google uses a Load Balancer on top of all the Servers, which acts as a reverse proxy and distributes network or application traffic across a number of servers.
Similarly I have also used AWS Elastic Application Load Balancer to distribute the load of my servers and also get 1 single DNS Name instead of 2 IPs.
Creating the Application Load Balancer —
I have used the load_balancer_type “application”. Keep in mind to set internal to “false”. This tells AWS that the Load Balancer is for external use and needs internet connectivity.
Then I have attached the Security Groups and the Subnets using IDs.
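A sketch of the load balancer, reusing the assumed names from earlier:

```hcl
resource "aws_lb" "tf_alb" {
  name               = "tf-alb"
  load_balancer_type = "application"
  internal           = false # internet-facing, not internal
  security_groups    = [aws_security_group.tf_sg.id]
  subnets            = aws_subnet.tf_subnet[*].id # span all subnets/AZs
}
```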
Target Group & Attachment —
Next I have created the Target Group. Here I have specified that Load Balancer’s targets are on Port 80 using HTTP Protocol.
Then we have to explicitly register the instances with the Target Group. Here I have mentioned the Target Group’s ARN & the instance IDs.
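The group plus per-instance registration might look like this (names assumed, as before):

```hcl
resource "aws_lb_target_group" "tf_tg" {
  name     = "tf-tg"
  port     = 80      # targets listen on port 80
  protocol = "HTTP"
  vpc_id   = aws_vpc.tf_vpc.id
}

# register both web-server instances as targets
resource "aws_lb_target_group_attachment" "tf_tg_attach" {
  count            = 2
  target_group_arn = aws_lb_target_group.tf_tg.arn
  target_id        = aws_instance.tf_web[count.index].id
  port             = 80
}
```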
Listener & Listener rule —
A listener is a process that checks for connection requests, using the protocol and port that you configure. The rules that you define for a listener determine how the load balancer routes requests to its registered targets.
In the Listener, the first thing we should specify is the port & protocol. The next most crucial thing is default_action: use the type “forward” and provide the Target Group ARN.
Then in the Listener Rule where I have provided the ARN of the listener and again the action type as forward. Forward actions route requests to one or more target groups.
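Both pieces sketched together (the path pattern in the rule is my assumption for a catch-all match):

```hcl
resource "aws_lb_listener" "tf_listener" {
  load_balancer_arn = aws_lb.tf_alb.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.tf_tg.arn
  }
}

resource "aws_lb_listener_rule" "tf_rule" {
  listener_arn = aws_lb_listener.tf_listener.arn

  action {
    type             = "forward" # route requests to the target group
    target_group_arn = aws_lb_target_group.tf_tg.arn
  }

  condition {
    path_pattern {
      values = ["/*"] # assumption: match every path
    }
  }
}
```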
Then finally for the Bonus Part —
Open automatically on Chrome :
I am running the chrome command on my local machine to open the website using the DNS Name of the Load Balancer.
(PS. To launch chrome from Command Prompt on Windows, you have to set the Environment Variable PATH for Chrome Application.)
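One way to do this from Terraform is a local-exec provisioner that fires once the listener exists (this is a sketch; it assumes chrome is resolvable on your PATH):

```hcl
resource "null_resource" "tf_open_site" {
  depends_on = [aws_lb_listener.tf_listener]

  provisioner "local-exec" {
    # open the site via the load balancer's DNS name
    command = "chrome ${aws_lb.tf_alb.dns_name}"
  }
}
```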
Since the Load Balancer has both instances registered as targets, it redirects traffic to the 2 web servers through its single DNS Name.
Practical to show Terraform Infrastructure code & Load Balancing
The code running on the web servers is PHP that executes the ifconfig command, so each page prints the private IP of the instance serving it. I made a note of both IPs during the demo; you can match them with the picture below:
So this explains Automatic Deployment of Web Server using EC2 and also live demo of Load Balancing.
The use of EFS can also be clearly seen as the data is persistent in both the instances.
You can find these codes on my GitHub profile.
That’s all folks!
For any queries, corrections, or suggestions you can always connect with me on my LinkedIn.