Terraform, EKS and Spot Worker Nodes

January 27, 2020

When searching the web for how to deploy an EKS cluster, the most common thing you will find is eksctl from Weaveworks. It is a great tool, but it is focused on deploying AWS’s EKS and the AWS resources related to EKS. What I want to describe in this post is a straightforward way to create an EKS Cluster that uses Spot Instances for the Worker nodes running your applications. Terraform is what I will be using in this article; if you’re not already familiar with Terraform, here is a description of Terraform from HashiCorp, the makers of Terraform:

“Terraform is a tool for building, changing, and versioning infrastructure safely and efficiently. Terraform can manage existing and popular service providers as well as custom in-house solutions.

“Configuration files describe to Terraform the components needed to run a single application or your entire datacenter. Terraform generates an execution plan describing what it will do to reach the desired state, and then executes it to build the described infrastructure. As the configuration changes, Terraform is able to determine what changed, and create incremental execution plans which can be applied.

“The infrastructure Terraform can manage includes low-level components such as compute instances, storage, and networking, as well as high-level components such as DNS entries, SaaS features, etc.”

Along with being able to deploy and manage on-site and custom infrastructure, Terraform has become the standard cloud-agnostic infrastructure deployment and management tool. To give you an idea of the vendors/infrastructure that Terraform officially supports today, here is a list of 123 vendors:

ACME Docker Logentries Rancher2
Akamai Dome9 LogicMonitor Random
Alibaba Cloud Dyn Mailgun RightScale
Archive Exoscale MetalCloud Rundeck
Arukas External MongoDB Atlas RunScope
Avi Vantage F5 BIG-IP MySQL Scaleway
Aviatrix Fastly Naver Cloud Selectel
AWS FlexibleEngine Netlify SignalFx
Azure FortiOS New Relic Skytap
Azure Active Directory Genymotion Nomad SoftLayer
Azure Stack GitHub NS1 Spotinst
A10 Networks GitLab Null StackPath
Bitbucket Google Cloud Platform Nutanix StatusCake
Brightbox Grafana 1&1 TelefonicaOpenCloud
CenturyLinkCloud Gridscale OpenNebula Template
Check Point Hedvig OpenStack TencentCloud
Chef Helm OpenTelekomCloud Terraform
CherryServers Heroku OpsGenie Terraform Cloud
Circonus Hetzner Cloud Oracle Cloud Infrastructure TLS
Cisco ASA HTTP Oracle Cloud Platform Triton
Cisco ACI HuaweiCloud Oracle Public Cloud UCloud
Cloudflare HuaweiCloudStack OVH UltraDNS
CloudScale.ch Icinga2 Packet Vault
CloudStack Ignition PagerDuty Venafi
Cobbler InfluxDB Palo Alto Networks VMware NSX-T
Consul JDCloud PostgreSQL VMware vCloud Director
Datadog Kubernetes PowerDNS VMware vRA7
DigitalOcean LaunchDarkly ProfitBricks VMware vSphere
DNS Librato Pureport Vultr
DNSimple Linode RabbitMQ Yandex
DNSMadeEasy Local Rancher  

Here is a partial list of 146 community-supported Terraform vendors/infrastructure:

1Password Databricks Keboola SakuraCloud
Abiquo Dead Man's Snitch Keycloak SCVMM
Active Directory - adlerrobert Digital Rebar Keyring Sendgrid
Active Directory - GSLabDev Docker Machine Kibana Sensu
Airtable Drone Kong Sentry
Aiven Dropbox Ksyun Sewan
AlienVault Duo Security Kubectl Shell
AnsibleVault EfficientIP Kubernetes Smartronix
Apigee Elastic Cloud Enterprise (ECE) libvirt Snowflake
Artifactory Elasticsearch Logentries snowflakedb
Auth ElephantSQL Logz.io sops
Auth0 Enterprise Cloud LXD Spinnaker
Automic Continuous Delivery ESXI Manifold SQL
AVI Foreman Matchbox Stateful
Aviatrix Gandi MongoDB Atlas Statuspage
AWX Generic Rest API Nagios XI Stripe
Azure Devops Git Name Sumo Logic
Bitbucket Server GitHub Code Owners Nelson TeamCity
CDAP GitHub File NetApp Telegram
CDS GitInfo NSX-V Transloadit
Centreon Glue Okta Trello
Checkly GoCD Online.net tumblr
Cherry Servers Google Calendar Open Day Light Unifi
Citrix ADC Google G Suite OpenAPI UpCloud
Cloud Foundry GorillaStack OpenFaaS Updown.io
Cloud.dk Greylog Openshift Uptimerobot
Cloudability Harbor OpenvCloud Vaulted
CloudAMQP Hiera oVirt Veeam
Cloudforms HPE OneView Pass Venafi
CloudKarafka HTTP File Upload PHPIPAM vRealize Automation
CloudMQTT IBM Cloud Pingdom Vultr
CloudPassage Halo IIJ GIO Pivotal Tracker Wavefront
CodeClimate Infoblox Proxmox Win DNS
Confidant InsightOPS Puppet CA XML
Confluent Cloud Instana PuppetDB YAML
Consul ACL Iron.io Purestorage Flasharray Zendesk
CoreOS Container Linux Configs Jira QingCloud ZeroTier
Coveo Cloud Jira (Extended) Qiniu Zipper
CouchDB JumpCloud Redshift  
Credhub Kafka RKE  
Cronitor Kafka Connect Rollbar  

For the many organizations that have standardized, or are looking to standardize, on a single tool for the vast majority of their infrastructure, Terraform should be something to consider, with eksctl used where it makes the most sense. If you are in an organization that already uses Terraform, or you hope to standardize on a tool like Terraform, and you want to deploy EKS and Spot Instances, I hope this post helps you get started. For everyone else who just wants to get an EKS Cluster up, use eksctl; or, if you have already standardized on CloudFormation because your organization primarily uses AWS, you can follow AWS’s Quick Start for Amazon EKS here.

What I’m going to show here is an easy way to create an EKS cluster with Spot Instance Worker nodes, access the cluster, and start using Helm (the Kubernetes package manager) to install and run applications. Spot Instances are AWS compute instances drawn from AWS’s unused capacity at an extreme discount -- generally 60 to 90% off on-demand pricing -- but your Spot Instance may be shut down with only 2 minutes’ notice, when that unused capacity is needed by AWS.

I will be using Terraform’s terraform-aws-eks module to create an Elastic Kubernetes Service (EKS) cluster and associated worker instances on AWS, using that project’s Spot Instance example.

To start you will need to use git to clone the terraform-aws-eks project to your local machine.

Enter the commands below from a command prompt:

git clone git@github.com:terraform-aws-modules/terraform-aws-eks.git

cd terraform-aws-eks/examples/spot_instances/

I’ll assume you have your default AWS credentials set up for the account and user you intend to use for the environment we will be deploying. If not, follow these steps to install the CLI here and to set up your AWS credentials here. If you don’t have Terraform installed, you can follow these instructions here.
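Before moving on, it can help to confirm the tools this walkthrough relies on are actually installed. Here is a small, hypothetical pre-flight script (not part of the terraform-aws-eks project) that checks for each CLI on your PATH:

```shell
#!/bin/sh
# Hypothetical pre-flight check -- not part of any project referenced here.
# Reports which of the CLIs used in this walkthrough are installed.

require_cmd() {
  # Succeeds if the named command is on the PATH.
  command -v "$1" >/dev/null 2>&1
}

for cmd in git terraform aws kubectl helm; do
  if require_cmd "$cmd"; then
    echo "ok: $cmd found"
  else
    echo "missing: $cmd -- install it before continuing"
  fi
done
```

Anything reported missing can be installed from the links above before you continue.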

From the spot_instances/ directory, you will want to enter the command:

terraform init

This command is used to initialize a working directory (creates the .terraform directory) with various local settings and data that will be used by subsequent commands.

Next you will want to issue the “terraform plan” command. This command is used to create an execution plan. Terraform performs a refresh, unless explicitly disabled, and then determines what actions are necessary to achieve the desired state specified in the configuration files. You will want to add a variable to the end of this command to define which AWS region you wish to deploy to, or you can update the default region defined in terraform-aws-eks/examples/spot_instances/variables.tf:

terraform plan -var 'region=us-east-1'
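For reference, the region variable that this -var flag overrides looks roughly like this in the example’s variables.tf (an assumed sketch -- check your clone for the exact name and default):

```hcl
# Assumed shape of the region variable in
# terraform-aws-eks/examples/spot_instances/variables.tf -- verify against
# your clone; the default shown here is a placeholder.
variable "region" {
  default = "us-west-2"
}
```

Changing the default here lets you drop the -var flag from the plan and apply commands.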

The output from this command will show everything that Terraform “plans” on doing to achieve the state you have defined in your configuration files from this terraform-aws-modules project. Nothing has actually been deployed or implemented yet; Terraform is just displaying what it intends to deploy when you apply this plan. You can also write the plan to a file with “terraform plan -out=tfplan” and have “terraform apply tfplan” execute exactly that saved plan -- but note that if anything in AWS has drifted from what the plan captured, your apply might not be what you expected.

NOTE: Before issuing the next command, know that with this command you will be creating AWS infrastructure, and that you will be incurring costs. As an example, the EKS cluster service at the time of writing costs $0.10/hr, which adds up to around $72 per month -- which, given the simplicity it provides, is worth the cost. Just understand you will be incurring that cost, plus additional costs for the Spot Instances and other resources described in the plan that was just displayed, if you are following along. Once deployed, you can issue the “terraform destroy” command from this same directory and Terraform will attempt to destroy all the resources you deployed. Please remember this; don’t deploy this infrastructure, forget about it, and get a bigger-than-expected bill later.

We’re just going to issue the apply command below to deploy the infrastructure:

terraform apply -var 'region=us-east-1'

or just “terraform apply” if you updated your variables.tf file with the region. When you issue this command, a plan is shown before anything is deployed, and you are prompted “Do you want to perform these actions?”. If the plan looks good, you can type yes and it will continue. You can jump straight to “terraform apply” and not run the plan at all, but then you will be sitting at that prompt while you search through and evaluate the plan -- which you can definitely do (it’s completely up to you). The standard Terraform workflow is: run “terraform init” once, then repeat the cycle of editing your configuration files, running “terraform plan”, and running “terraform apply” until your environment is deployed.

With the running of this apply command, Terraform will attempt to deploy what you have defined in your configuration files by talking directly to the AWS APIs, issuing create, update, or destroy calls, checking status, and retrying as necessary. Terraform infers the vast majority of the ordering necessary to create, update, or delete infrastructure from references between resources, so you generally do not have to worry about how the configuration files are ordered. But if there is an issue, you can declare the ordering explicitly where necessary.
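To illustrate that last point, here is a hypothetical fragment (not taken from the eks module) showing an explicit dependency declared with depends_on, for the case where no attribute reference exists for Terraform to infer ordering from:

```hcl
# Hypothetical resources for illustration only -- placeholder names and AMI id.
resource "aws_s3_bucket" "logs" {
  bucket = "example-eks-worker-logs"
}

resource "aws_instance" "worker" {
  ami           = "ami-12345678"   # placeholder
  instance_type = "t3.medium"

  # Nothing in this resource references the bucket, so Terraform cannot
  # infer an ordering; depends_on declares it explicitly.
  depends_on = [aws_s3_bucket.logs]
}
```

When one resource’s argument references another (for example, a subnet id), Terraform orders them automatically and depends_on is unnecessary.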

Most of the deployment will happen relatively quickly, but the EKS cluster portion always takes around 15 minutes to initially create (or to destroy later); changes are generally rather quick. You can sit back and watch the status scrolling down the screen from the output of your apply command; just be prepared for this part to take about 15 minutes.

Once this process completes you will have a complete and working Kubernetes Cluster with the Kubernetes ControlPlane provided by the AWS EKS service, and your Kubernetes worker nodes will be AWS Spot Instances saving you 60-90% on the cost of those Worker Nodes.

Next you will need to install the Kubernetes (k8s) client, kubectl, on your local machine so you can issue k8s commands to the k8s cluster you just created. To do so, follow the instructions here.

Then you will have to update the kubeconfig on your local machine to allow your local client (kubectl) to communicate with the cluster you just created. To do so, follow the instructions here. The instructions basically tell you to enter the command “aws eks --region <region> update-kubeconfig --name <cluster_name>” if your AWS CLI and credentials are already set up and working -- but follow the instructions from that link if you have any issues.

Now you can issue kubectl commands and administer your Kubernetes cluster. I’m not going to get into all the possible options and how Kubernetes works; I just wanted to show you how to get an EKS cluster, with Spot Instances as Worker Nodes, up quickly and easily with Terraform. I will show you a few easy things to get you started, so you can play around and learn.

To get status information from your cluster you can issue some of these commands below:

# Get commands with basic output

kubectl get services                          # List all services in the namespace

kubectl get pods --all-namespaces             # List all pods in all namespaces

kubectl get pods -o wide                      # List all pods in the namespace, with more details

kubectl get deployment my-dep                 # List a particular deployment

kubectl get pods                              # List all pods in the namespace

kubectl get pod my-pod -o yaml                # Get a pod's YAML

kubectl get pod my-pod -o yaml --export       # Get a pod's YAML without cluster specific information

# Cluster info

kubectl cluster-info                          # Display addresses of the master and services

kubectl cluster-info dump                     # Dump current cluster state to stdout

kubectl cluster-info dump --output-directory=/path/to/cluster-state   # Dump current cluster state to /path/to/cluster-state

 

Resource types

List all supported resource types along with their shortnames, API group, whether they are namespaced, and Kind:

kubectl api-resources

Other operations for exploring API resources:

kubectl api-resources --namespaced=true       # All namespaced resources

kubectl api-resources --namespaced=false      # All non-namespaced resources

kubectl api-resources -o name                 # All resources with simple output (just the resource name)

kubectl api-resources -o wide                 # All resources with expanded (aka "wide") output

kubectl api-resources --verbs=list,get        # All resources that support the "list" and "get" request verbs

kubectl api-resources --api-group=extensions  # All resources in the "extensions" API group

With the cluster up and working, I will show you how to start using Helm, “The package manager for Kubernetes.” It does require a local client on your machine. To install, follow the steps here.

There are three concepts to know when working with Helm: a Chart, a Repository, and a Release. The Chart is the package. It contains all the resource definitions necessary to run an application, tool, or service inside of a Kubernetes cluster. Think of it like the Kubernetes equivalent of a Homebrew formula, an Apt dpkg, or a Yum RPM file. The Repository is where charts can be shared. It’s like Perl’s CPAN archive or the Fedora Package Database, but for Kubernetes packages. And a Release is an instance of a chart running in a k8s cluster. A chart can be installed many times into the same cluster, and each time it is installed, it is a new release. You name each release as you install it.
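To make “the Chart is the package” concrete, here is a minimal, purely illustrative Chart.yaml, the metadata file every chart carries (none of these names refer to charts used in this post):

```yaml
apiVersion: v2          # chart format used by Helm 3
name: hello-web         # the chart (package) name
description: A toy chart packaging a single web Deployment
version: 0.1.0          # version of the chart itself
appVersion: "1.0.0"     # version of the application it deploys
```

Installing this chart twice -- say, with “helm install web-a ./hello-web” and then “helm install web-b ./hello-web” -- would create two independent releases of the same chart.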

Initially you can add some repositories to your local helm client; these come in addition to the default searchable chart index, which is called the hub:

helm repo add stable https://kubernetes-charts.storage.googleapis.com

helm repo add eks https://aws.github.io/eks-charts

helm repo add monocular https://helm.github.io/monocular

From here you can search for wordpress, as an example, in the stable (Google-hosted) repository:

helm search repo stable/wordpress

or to search the helm hub:

helm search hub wordpress


Next we’re going to install a package on your cluster. Since we are running Spot Instances for our worker nodes, we don’t want the Worker Nodes to just stop working and force Kubernetes to react only after the nodes are shut down. We can add a package called aws-node-termination-handler that watches for the Spot termination notice, which arrives 2 minutes before the Spot service shuts down your Spot Instance. The termination handler then starts a multi-step process of gracefully draining the node that is about to be shut down, so the Pods running on that node can be gracefully moved to a different node before the shutdown actually happens:


helm install aws-node-termination-handler --namespace kube-system eks/aws-node-termination-handler


Now you can issue the command:

kubectl get pods --namespace kube-system

to see the Pods that are running on your cluster. You should see a Pod whose name starts with “aws-node-termination-handler-”.
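Conceptually, the handler is watching the EC2 instance metadata endpoint for an interruption notice. A stripped-down, hypothetical sketch of that check (the real aws-node-termination-handler is a far more robust daemon) looks like this:

```shell
#!/bin/sh
# Hypothetical sketch only -- shows the core idea behind the handler:
# poll the EC2 instance metadata endpoint for a spot interruption notice.

# On a real instance, this is the spot interruption metadata path.
METADATA_URL="${METADATA_URL:-http://169.254.169.254/latest/meta-data/spot/instance-action}"

check_spot_notice() {
  # Succeeds only when the endpoint answers, i.e. a notice exists.
  curl -sf --max-time 2 "$METADATA_URL" >/dev/null 2>&1
}

if check_spot_notice; then
  echo "interruption notice received -- node should be cordoned and drained"
else
  echo "no interruption notice"
fi
```

When a notice appears, the real handler cordons the node and evicts its Pods (conceptually, a kubectl drain), so workloads are rescheduled before the 2-minute window closes.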

Now we will install something a little more interesting: an ingress controller, which will act as a load balancer in front of our cluster:

helm install nginx-ingress stable/nginx-ingress

You can issue the command:

kubectl get pods -n default

and you should see 2 Pods, nginx-ingress-controller- and nginx-ingress-default-backend-. If you then go to your AWS Console, to EC2, then Load Balancers, you will see a new Load Balancer with a long name of random characters. While there, copy the “DNS name” of that Load Balancer; we will be using it later, after installing the next package. You can check out the instances that are part of the load balancer’s pool by clicking on the “Instances” tab of that Load Balancer; you should see a single instance, one of your worker nodes, and it should be “InService”.
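The controller only routes traffic once you give it Ingress resources to act on. Here is a minimal, purely illustrative Ingress manifest (hypothetical hostname and Service name; the networking.k8s.io/v1beta1 API matches clusters of this era):

```yaml
# Illustrative only -- "my-app" is a hypothetical Service, not something
# deployed in this post.
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    kubernetes.io/ingress.class: nginx   # handled by the controller we installed
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            backend:
              serviceName: my-app   # hypothetical backend Service
              servicePort: 80
```

Applied with kubectl, a rule like this would have the nginx controller forward requests for app.example.com to the named Service.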

NOTE: Keep in mind that Terraform is not aware of anything run through kubectl or helm. So if you want to destroy all the AWS infrastructure that was deployed, you might think you can run the “terraform destroy” command referenced above. You can, but that command will only destroy the resources it deployed, and if there are conflicts it will stop and fail. By installing nginx-ingress you have created an AWS Load Balancer that Terraform does not know about, so you will want to uninstall nginx-ingress by issuing the command:

            helm uninstall nginx-ingress

before running “terraform destroy”. Otherwise, running “terraform destroy” will not be able to destroy the subnets and instances it created, because they will still be in use by the load balancer (nginx-ingress).

Now on to a more interesting package/application, an application called Monocular. It is a web-based application that enables the search and discovery of charts from multiple Helm Chart repositories. To install it, enter the command below:

helm install monocular monocular/monocular

To check the status, you can enter the command:

kubectl get pods --namespace default

After a little time for everything to install and start, you should see the Monocular Pods running. If you would like to delete any of the Helm packages you deployed, you can issue the command “helm uninstall <release>”; in the case of Monocular it would be “helm uninstall monocular”.

As you can see, you now have 8 additional Pods running for Monocular. To view the installed service, put the DNS name you copied from the load balancer earlier into your browser; you should see the Monocular web interface. That site is now running on your cluster.

I have shown you how to get an EKS cluster deployed with Spot Instances as your Worker nodes, and how to start adding applications to your Kubernetes/EKS cluster. This is a very barebones glimpse of the things that can be done with Kubernetes, but you now have a deployment system (Terraform) and a working environment to explore using Terraform, AWS EKS (Kubernetes), and all the related AWS and Kubernetes ecosystems.

Reminder: the default configurations that were deployed are not intended for any sort of production use; the security rules/groups deployed should be considered insecure, and there are costs associated with deploying and running AWS services. If you want to delete the infrastructure created, you can issue the “terraform destroy” command from the directory you ran the other Terraform commands from, as referenced in this guide. You can also reference the Terraform apply output to identify all the resources deployed to AWS, then go to the console and delete them from there if you like. Remember to uninstall the Helm applications that created AWS services first, before using the “terraform destroy” command.
