The Digital Team uses Terraform to manage the AWS configuration.
Terraform is a CLI utility that synchronizes AWS with scripts: in essence, it uses a series of scripts to detect and make changes to AWS. Terraform commands are run from a terminal session on a machine with the Terraform libraries installed. See the Terraform website and documentation.
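To give a flavour of what those scripts look like, here is a minimal, generic example (not taken from our own repository): it declares the AWS state we want, and `terraform plan` / `terraform apply` work out and make whatever changes are needed to bring AWS in line with it.

```hcl
# Minimal, generic illustration only -- the provider region and the resource
# are hypothetical and are not part of the Digital Team's Terraform scripts.
provider "aws" {
  region = "us-east-1"
}

# Terraform compares this declared state against what actually exists in AWS
# and proposes whatever create/update/delete actions are needed to match it.
resource "aws_s3_bucket" "example" {
  bucket = "digital-example-bucket"
}
```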
How to update the AMI on our ECS cluster instances
The Digital webapps cluster uses the Elastic Container Service on AWS. We have a handful of EC2 instances that actually host the containers.
These instances use a stock Amazon Machine Image (AMI) from Amazon designed for Docker that comes with the ECS agent pre-installed. From time to time, Amazon releases a new version of this “ECS-optimized” image, either to upgrade the ECS agent or the underlying OS.
This process is sometimes referred to as “rolling” the cluster, though it’s more accurate to say that we set up a second cluster of machines and migrate to it.
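Both procedures below start from the new AMI ID found in step 1. As an aside, AWS also publishes the currently recommended ECS-optimized AMI ID in SSM Parameter Store, so it can be looked up from Terraform itself; the snippet below is an illustrative sketch of that lookup rather than something our scripts necessarily do.

```hcl
# Illustrative sketch: AWS publishes the recommended ECS-optimized
# Amazon Linux 2 AMI ID at this public SSM parameter path. This lookup is
# not necessarily part of the digital-terraform scripts.
data "aws_ssm_parameter" "ecs_optimized_ami" {
  name = "/aws/service/ecs/optimized-ami/amazon-linux-2/recommended/image_id"
}

# The looked-up AMI ID is then available as
# data.aws_ssm_parameter.ecs_optimized_ami.value
```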
Update the `instance_image_id` value for the `staging_cluster` module to the new AMI ID from step 1 above (a sketch of roughly what this block looks like follows these steps). Save/commit the file to a new branch, not directly to the `production` branch.
Make a PR which merges the new branch into the `production` branch, and assign a person to review the changes.
After viewing the plan, if you need to update the terraform scripts, be sure to save the changes to the new branch.
If committing your changes does not trigger the Atlantis plan automatically, you can run it manually by creating a new comment with `atlantis plan`.
Keep an eye on the “ECS Instances” tab in the cluster’s UI. You should see the “Running tasks” on the draining instance(s) go down, and go up on the new instances.
Once all the tasks have moved, the old instance(s) will terminate and Terraform will complete. Check a few URLs on staging to make sure that everything’s up-and-running.
Now that Atlantis’s apply has finished, you can merge the staging PR and repeat the process (steps 2-6) for the production cluster.
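For reference, the `staging_cluster` block being edited (it lives in `/apps/clusters.tf`, as described in the local workflow below) looks roughly like the sketch here. Everything except `instance_image_id` is an illustrative placeholder; the real module source and arguments differ.

```hcl
# Rough sketch only -- the module source and the other arguments are
# placeholders, not the actual digital-terraform code. The one value that
# changes during this procedure is instance_image_id.
module "staging_cluster" {
  source = "../modules/ecs-cluster"            # assumption: illustrative path

  cluster_name      = "digital-staging"        # assumption
  instance_type     = "t3.medium"              # assumption
  instance_image_id = "ami-0123456789abcdef0"  # <-- update to the new AMI ID
}
```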
Ensure your cloned copy of the `digital-terraform` repository is on the `production` branch, and that the branch is up to date with the origin on GitHub.
Create a new branch from the `production` branch.
In your preferred IDE, open the `/apps/clusters.tf` file and update the `instance_image_id` value for the `staging_cluster` module to the new AMI ID from step 1 above. Save/commit the file to the new branch (not directly to the `production` branch).
In a terminal/shell, from the `repo/apps/` folder, run the command: `terraform plan`
When the plan is done, inspect the output and expect to see changes to:
- `resource "aws_autoscaling_group" "instances"`
- `resource "aws_cloudwatch_metric_alarm" "low_instance_alarm"`
- `resource "aws_launch_configuration" "instances"` (this last one will have the new AMI ID)

Any other changes the plan identifies should be carefully investigated: Terraform may be proposing changes to the AWS environment that you don't want, or at least are not expecting. (See the sketch at the end of this page for how these three resources are typically wired together.)
Keep an eye on the “ECS Instances” tab in the cluster’s UI. You should see the “Running tasks” on the draining instance(s) go down, and go up on the new instances.
Once all the tasks have moved, the old instance(s) will terminate and Terraform will complete. Check a few URLs on staging to make sure that everything’s up-and-running.
Now that Terraform's apply is finished, you can repeat the process (steps 2-9) for the production cluster.
Finally, you should merge the changes in your new (local) branch into the local `production` branch, and then push your local `production` branch to the origin on GitHub.
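As an aside on why the plan lists those three resources: the sketch below (illustrative names and arguments only, not our actual module code) shows a common wiring. The launch configuration consumes the AMI ID and launch configurations are immutable, so a new image forces a replacement; in the pattern sketched here the autoscaling group's name is tied to the launch configuration, so the group is replaced too (the “second cluster of machines” described above), and the low-instance alarm that points at the group is updated along with it.

```hcl
# Illustrative sketch (not the actual digital-terraform module) of how a new
# instance_image_id ripples through the three resources listed in the plan.
variable "instance_image_id" {}

# Launch configurations are immutable, so a new image_id forces Terraform to
# create a replacement launch configuration.
resource "aws_launch_configuration" "instances" {
  name_prefix   = "ecs-instances-"        # assumption
  image_id      = var.instance_image_id   # the value updated in clusters.tf
  instance_type = "t3.medium"             # assumption

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "instances" {
  # Tying the group's name to the launch configuration name forces Terraform
  # to replace the whole group when the AMI changes -- one common way to get
  # the "second cluster of machines" behaviour; our module may differ.
  name                 = aws_launch_configuration.instances.name
  launch_configuration = aws_launch_configuration.instances.name
  min_size             = 2                             # assumption
  max_size             = 4                             # assumption
  availability_zones   = ["us-east-1a", "us-east-1b"]  # assumption

  lifecycle {
    create_before_destroy = true
  }
}

# The low-instance alarm refers to the autoscaling group by name, which is
# why it also appears in the plan output.
resource "aws_cloudwatch_metric_alarm" "low_instance_alarm" {
  alarm_name          = "ecs-low-instance-count"       # assumption
  namespace           = "AWS/AutoScaling"
  metric_name         = "GroupInServiceInstances"
  comparison_operator = "LessThanThreshold"
  threshold           = 2                              # assumption
  evaluation_periods  = 1
  period              = 60
  statistic           = "Minimum"

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.instances.name
  }
}
```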