EKS terraform automation project

AWS EKS Infrastructure Automation with Terraform & Jenkins

AWS EKS Infrastructure Automation with Terraform & Jenkins

Terraform AWS Kubernetes Jenkins MySQL


Table of Contents


Project Overview

The Challenge

Modern cloud infrastructure requires more than just clicking buttons in a web console. Organizations need automated, repeatable, and version-controlled infrastructure that can be deployed consistently across environments.

The Solution

I built a production-grade automated infrastructure deployment pipeline that provisions a complete AWS EKS (Elastic Kubernetes Service) cluster using Infrastructure as Code principles. The entire infrastructure—from networking to database deployment—is defined in code, version-controlled in GitLab, and automatically deployed through Jenkins CI/CD.

Real-World Impact

This project demonstrates skills that directly translate to enterprise DevOps roles:

  • Automated cloud provisioning reduces deployment time from hours to minutes
  • Infrastructure as Code eliminates configuration drift and manual errors
  • CI/CD integration enables rapid, reliable deployments
  • Kubernetes orchestration provides scalable application hosting
  • Cost optimization through smart resource management


🛠️ Technologies Used

Infrastructure & Cloud

  • Terraform (v1.6.1) – Infrastructure as Code tool
  • AWS EKS (v1.32) – Managed Kubernetes service
  • AWS VPC – Virtual Private Cloud networking
  • AWS EC2 – Compute instances for Kubernetes nodes
  • AWS EBS – Persistent storage volumes

Container Orchestration

  • Kubernetes (v1.32) – Container orchestration platform
  • Helm (v2.11.0) – Kubernetes package manager
  • AWS Fargate – Serverless container runtime

CI/CD & Version Control

  • Jenkins – Automation server for CI/CD
  • GitLab – Source code management and version control
  • Git – Distributed version control

Database

  • MySQL (Bitnami Chart v9.14.1) – Relational database with replication
    • 1 Primary instance
    • 2 Secondary replicas for high availability

Configuration Management

  • Terraform Modules:
    • terraform-aws-modules/vpc/aws (v5.2.0)
    • terraform-aws-modules/eks/aws (v19.20.0)

✨ Key Features

🔄 Automated Infrastructure Provisioning

  • One-Click Deployment: Entire infrastructure created from a single Terraform apply
  • Repeatable: Destroy and recreate identical environments instantly
  • Version Controlled: All infrastructure changes tracked in Git

🌐 High-Availability Networking

  • Multi-AZ Deployment: Resources spread across 3 availability zones
  • Private Networking: Worker nodes isolated in private subnets
  • Secure Internet Access: NAT Gateway for outbound connections
  • Load Balancer Ready: Subnet tagging for automatic LB discovery

☸️ Production-Grade Kubernetes Cluster

  • Managed Control Plane: AWS handles master node management
  • Auto-Scaling Nodes: 1-3 worker nodes based on demand
  • Fargate Integration: Serverless pods for “my-app” namespace
  • EBS CSI Driver: Dynamic persistent volume provisioning

🗄️ Highly Available Database

  • Master-Slave Replication: 1 primary + 2 secondary MySQL instances
  • Persistent Storage: EBS volumes with gp2 storage class
  • Automatic Failover: Kubernetes handles pod rescheduling

🚀 CI/CD Pipeline

  • Manual Approval Gates: Review changes before applying
  • Plan Before Apply: Preview infrastructure changes
  • Safe Destroy Option: Protected destruction with confirmation
  • Automated Testing: Terraform validation on every build

💰 Cost Optimization

  • Single NAT Gateway: Reduced from 3 to 1 (saves ~$64/month)
  • Right-Sized Instances: t3.small instead of t3.medium
  • On-Demand Pricing: No long-term commitments
  • Easy Teardown: Destroy infrastructure when not needed

What I Learned

Technical Skills

Infrastructure as Code

  • Writing modular, reusable Terraform configurations
  • Managing state files remotely in S3
  • Using Terraform modules for best practices
  • Handling provider version constraints and lock files

AWS Networking

  • Designing VPCs with public/private subnet architecture
  • Implementing NAT gateways for private subnet internet access
  • Subnet tagging for Kubernetes integration
  • Security group configuration for EKS

Kubernetes & EKS

  • Understanding EKS control plane vs worker nodes
  • Configuring node groups with launch templates
  • Setting up Fargate profiles for serverless workloads
  • Installing and managing the EBS CSI driver

CI/CD & Automation

  • Building Jenkins pipelines with Jenkinsfile
  • Integrating GitLab with Jenkins
  • Implementing approval gates and safeguards
  • Managing credentials securely in CI/CD

Helm & Package Management

  • Deploying applications using Helm charts
  • Configuring values for production deployments
  • Understanding chart versions and dependencies

Problem-Solving Skills

Debugging Complex Issues

  • Dependency Cycles: Resolved Terraform circular dependencies by using exec authentication instead of data sources
  • Provider Syntax Errors: Learned version-specific Helm provider syntax differences
  • Version Conflicts: Managed Terraform lock file conflicts across environments
  • Timeout Issues: Diagnosed and optimized slow infrastructure deployments

Performance Optimization

  • Reduced deployment time from 60+ minutes to 20-25 minutes
  • Optimized NAT gateway configuration for cost and speed
  • Right-sized EC2 instances for workload requirements

Best Practices

  • Always use module outputs instead of data sources for provider configuration
  • Pin provider versions for reproducibility
  • Implement approval gates for destructive operations
  • Document configuration decisions for future reference

Quick Start

Prerequisites

# Required tools
terraform --version  # >= 1.6.1
aws --version       # AWS CLI configured
kubectl version     # Kubernetes CLI
git --version       # Git for version control

5-Minute Setup

# 1. Clone the repository
git clone https://gitlab.com/ascharle/eks-terra.git
cd eks-terra

# 2. Configure AWS credentials
aws configure

# 3. Initialize Terraform
terraform init

# 4. Review the plan
terraform plan

# 5. Deploy infrastructure
terraform apply

# 6. Configure kubectl
aws eks update-kubeconfig --name my-cluster --region us-east-2

# 7. Verify deployment
kubectl get nodes
kubectl get pods

That’s it! Your EKS cluster is running.


📖 Detailed Setup Guide

Step 1: Prerequisites Setup

Install Required Tools

Terraform:

# macOS
brew install terraform

# Linux
wget https://releases.hashicorp.com/terraform/1.6.1/terraform_1.6.1_linux_amd64.zip
unzip terraform_1.6.1_linux_amd64.zip
sudo mv terraform /usr/local/bin/

# Verify
terraform version

AWS CLI:

# macOS
brew install awscli

# Linux
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install

# Configure
aws configure
# Enter your Access Key ID, Secret Key, Region (us-east-2), Output format (json)

kubectl:

# macOS
brew install kubectl

# Linux
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

AWS Account Setup

  1. Create an AWS account (free tier eligible)
  2. Create an IAM user with permissions:
    • AmazonEKSClusterPolicy
    • AmazonEKSWorkerNodePolicy
    • AmazonEC2FullAccess
    • AmazonVPCFullAccess
    • IAMFullAccess
  3. Generate access keys
  4. Configure AWS CLI with credentials

Step 2: Repository Setup

# Clone the repository
git clone https://gitlab.com/ascharle/eks-terra.git
cd eks-terra

# Review the configuration files
ls -la
# providers.tf - AWS provider and backend configuration
# vpc.tf - Network infrastructure
# eks.tf - EKS cluster and node groups
# mysql.tf - MySQL Helm chart deployment
# variables.tf - Variable definitions
# terraform.tfvars - Variable values
# outputs.tf - Output definitions

Step 3: Customize Configuration (Optional)

Edit terraform.tfvars:

env_prefix     = "dev"           # Environment prefix
k8s_version    = "1.32"          # Kubernetes version
cluster_name   = "my-cluster"    # EKS cluster name
region         = "us-east-2"     # AWS region

Modify instance types in eks.tf:

eks_managed_node_groups = {
  main = {
    instance_types = ["t3.small"]  # Change to t3.medium for more resources
    min_size       = 1             # Minimum nodes
    max_size       = 3             # Maximum nodes
    desired_size   = 2             # Desired nodes
  }
}

Step 4: Deploy Infrastructure

# Initialize Terraform (downloads providers and modules)
terraform init

# Validate configuration syntax
terraform validate

# Preview changes (IMPORTANT: Review before applying!)
terraform plan

# Apply changes
terraform apply

# Review the plan output
# Type 'yes' when prompted

# Wait 20-25 minutes for deployment to complete

What happens during deployment:

  1. VPC and networking (3-5 min)
  2. EKS cluster control plane (10-12 min)
  3. Worker nodes (5-7 min)
  4. Fargate profile (2-3 min)
  5. MySQL database (2-3 min)

Step 5: Verify Deployment

# Configure kubectl to use the new cluster
aws eks update-kubeconfig --name my-cluster --region us-east-2

# Check cluster nodes
kubectl get nodes
# Should show 2 nodes in "Ready" status

# Check all pods
kubectl get pods -A
# Should see system pods + MySQL pods

# Check MySQL specifically
kubectl get pods -l app.kubernetes.io/name=mysql
# Should show:
# mysql-primary-0      1/1  Running
# mysql-secondary-0    1/1  Running
# mysql-secondary-1    1/1  Running

# Get cluster endpoint
kubectl cluster-info

Step 6: Access MySQL (Optional)

# Forward MySQL port to localhost
kubectl port-forward svc/mysql 3306:3306

# In another terminal, connect to MySQL
mysql -h 127.0.0.1 -P 3306 -u root -p
# Password: my-secret-pw (from mysql.tf)

# Run test queries
mysql> SHOW DATABASES;
mysql> CREATE DATABASE test;
mysql> USE test;
mysql> CREATE TABLE users (id INT, name VARCHAR(50));
mysql> INSERT INTO users VALUES (1, 'Test User');
mysql> SELECT * FROM users;

Step 7: Clean Up (When Done)

# IMPORTANT: This destroys everything!
terraform destroy

# Review what will be deleted
# Type 'yes' when prompted

# Wait 10-15 minutes for cleanup

# Verify all resources are deleted
aws eks list-clusters --region us-east-2
# Should be empty

CI/CD Pipeline

Pipeline Architecture

graph LR
    A[Developer] -->|Push Code| B[GitLab]
    B -->|Manual Trigger| C[Jenkins]
    C -->|Pull Code| B
    C -->|Plan| D{Review Plan}
    D -->|Approve| E[Terraform Apply]
    D -->|Reject| F[Cancel]
    E --> G[AWS Infrastructure]
    

Jenkins Pipeline Stages

Pipeline Flow:

graph TD
    A[Checkout Code] --> B[Verify AWS Access]
    B --> C[Terraform Init]
    C --> D[Terraform Validate]
    D --> E{Action Selected}
    
    E -->|plan| F[Generate Plan]
    E -->|apply| G[Generate Plan]
    E -->|destroy| H[Destroy Approval]
    
    G --> I[Manual Approval]
    I --> J[Apply Infrastructure]
    
    H --> K[Type DESTROY]
    K --> L[Destroy Infrastructure]
    
    F --> M[Complete]
    J --> N[Get Outputs]

    

Pipeline Parameters

When you trigger a build in Jenkins, you choose:

1. ACTION:

  • plan – Preview changes without modifying anything (always safe)
  • apply – Create or update infrastructure (requires approval)
  • destroy – Delete all resources (requires typing “DESTROY”)

2. AUTO_APPROVE:

  • false (recommended) – Manual approval required
  • true – Skip approval (dangerous!)

How to Use the Pipeline

Deploy New Infrastructure

1. Open Jenkins → eks-build job
2. Click "Build with Parameters"
3. Select:
   - ACTION: apply
   - AUTO_APPROVE: false
4. Click "Build"
5. Watch the plan output
6. Click "Apply" to proceed or "Abort" to cancel
7. Wait for completion (~20-25 minutes)

Preview Changes Only

1. Jenkins → eks-build job
2. Build with Parameters
3. Select:
   - ACTION: plan
4. Click "Build"
5. Review the plan (no changes made)

Destroy Infrastructure

1. Jenkins → eks-build job
2. Build with Parameters
3. Select:
   - ACTION: destroy
   - AUTO_APPROVE: false
4. Click "Build"
5. Jenkins asks: "Type DESTROY to confirm"
6. Type exactly: DESTROY
7. Click OK
8. Infrastructure is deleted (~10 minutes)

Pipeline Security Features

No automatic deployments – You control when changes happen
Manual approval gates – Review before applying
Destroy confirmation – Type “DESTROY” to prevent accidents
Plan artifacts – Download and review plans offline
Credential management – AWS keys secured in Jenkins

Setting Up the Pipeline

See JENKINS file for complete setup instructions including:

  • Jenkins installation
  • GitLab integration
  • Credential configuration
  • Webhook setup (optional)

Project Structure

eks-terra/
├── README.md                  # This file
├── JENKINS.md                 # CI/CD setup guide
├── .gitignore                 # Ignored files
├── Jenkinsfile                # Jenkins pipeline definition
│
├── providers.tf               # Terraform & provider configuration
├── variables.tf               # Variable declarations
├── terraform.tfvars          # Variable values (DO NOT COMMIT with secrets)
├── outputs.tf                # Output definitions
│
├── vpc.tf                    # VPC and networking resources
├── eks.tf                    # EKS cluster and node groups
├── mysql.tf                  # MySQL Helm chart deployment
│

File Descriptions

Core Terraform Files:

  • providers.tf

    • Terraform version requirements (>= 1.6.1)
    • Provider versions (AWS, Kubernetes, Helm)
    • S3 backend configuration for state storage
    • Provider authentication settings
  • variables.tf

    • Input variable definitions
    • Types, descriptions, and default values
    • Includes: region, cluster_name, k8s_version, env_prefix
  • terraform.tfvars

    • Actual variable values
    • DO NOT commit with real AWS credentials!
    • Example values for non-sensitive data
  • outputs.tf

    • Cluster endpoint URL
    • kubectl configuration command
    • VPC ID and subnet IDs
    • Security group IDs

Infrastructure Files:

  • vpc.tf

    • VPC with CIDR 10.0.0.0/16
    • 3 public subnets (10.0.101-103.0/24)
    • 3 private subnets (10.0.1-3.0/24)
    • Internet Gateway for public subnets
    • NAT Gateway for private subnet internet access
    • Route tables and associations
    • Kubernetes-specific subnet tags
  • eks.tf

    • EKS cluster (v1.32)
    • Managed node group (1-3 t3.small instances)
    • Fargate profile for “my-app” namespace
    • EBS CSI driver addon
    • IAM roles and policies
    • Security groups
    • CloudWatch logging
  • mysql.tf

    • Helm release configuration
    • Bitnami MySQL chart (v9.14.1)
    • Replication architecture (1 primary + 2 secondaries)
    • EBS persistent volumes
    • Database credentials

CI/CD Files:

  • Jenkinsfile

    • Declarative pipeline syntax
    • Three actions: plan, apply, destroy
    • Approval gates for safety
    • Error handling and cleanup
  • .gitignore

    • Terraform state files
    • Lock files
    • Variable files with secrets
    • Local directories

Challenges & Solutions

Challenge 1: Terraform Dependency Cycle

Problem:

Error: Cycle detected:
  provider[kubernetes] → data.aws_eks_cluster
  data.aws_eks_cluster → module.eks
  module.eks → kubernetes_config_map.aws_auth
  kubernetes_config_map.aws_auth → provider[kubernetes]

Root Cause:
The Kubernetes provider tried to authenticate using data sources that depended on the EKS module, which created resources using the Kubernetes provider. Classic circular dependency!

Solution:
Switched from data source authentication to exec authentication:

# OLD (causes cycle):
provider "kubernetes" {
  host  = data.aws_eks_cluster.cluster.endpoint
  token = data.aws_eks_cluster_auth.cluster.token
}

# NEW (no cycle):
provider "kubernetes" {
  host = module.eks.cluster_endpoint
  exec {
    command = "aws"
    args    = ["eks", "get-token", "--cluster-name", var.cluster_name]
  }
}

Why it works: Module outputs are direct references (no dependency on data sources), and exec authentication generates tokens dynamically without data sources.

Learning: Always prefer module outputs over data sources when configuring providers.


Challenge 2: Helm Provider Version Syntax

Problem:

Error: Unsupported block type "kubernetes"
Did you mean to define argument "kubernetes"? 
If so, use the equals sign to assign it a value.

Root Cause:
Helm provider syntax changed between versions:

  • v2.x uses block syntax: kubernetes { }
  • v3.x uses argument syntax: kubernetes = { }

Solution:
Matched syntax to provider version:

# For Helm v2.11.0 (project requirement):
provider "helm" {
  kubernetes {  # No equals sign!
    host = module.eks.cluster_endpoint
    # ...
  }
}

Learning: Always check version-specific documentation. Lock file mismatches can cause subtle syntax errors.


Challenge 3: Instance Type Free Tier Restriction

Problem:

Error: InvalidParameterCombination
The specified instance type is not eligible for Free Tier.

Root Cause:
AWS Free Tier only allows t2.micro and t3.micro instances. Initially used t3.medium which isn’t eligible.

Solution:
Changed to t3.small (good balance of resources and cost):

instance_types = ["t3.small"]  # 2 vCPUs, 2GB RAM

Cost comparison:

  • t3.medium: ~$0.0416/hour (~$90/month for 3 nodes)
  • t3.small: ~$0.0208/hour (~$45/month for 3 nodes)

Learning: Always verify instance types against account quotas and Free Tier eligibility.


Challenge 4: Slow Infrastructure Deployment

Problem:
Initial deployments took 60+ minutes, with node group timeouts.

Root Cause:

  • 3 NAT Gateways (one per AZ) instead of 1
  • Unnecessary VPN gateway
  • Complex networking setup

Solution:
Optimized VPC configuration:

# Before:
enable_nat_gateway = true
# (created 3 NAT gateways)

# After:
enable_nat_gateway   = true
single_nat_gateway   = true  # Only 1 NAT gateway
# Removed: enable_vpn_gateway = true

Results:

  • Deployment time: 60+ min → 20-25 min
  • Monthly cost savings: ~$64 (2 fewer NAT gateways)

Learning: Start with minimum viable configuration, scale up only when needed.


Challenge 5: Terraform Lock File Conflicts

Problem:

Error: hashicorp/helm locked to 3.1.1 
but configured version constraint requires ~> 2.11.0

Root Cause:
.terraform.lock.hcl file committed to Git locked providers to wrong versions across different environments.

Solution:
Added lock file to .gitignore:

.terraform.lock.hcl

Each environment (local, Jenkins) maintains its own lock file with correct versions.

Alternative for teams:
Run terraform init -upgrade to update lock file, then commit the updated version.

Learning: For personal projects, don’t commit lock files. For teams, commit them but keep them updated.


Challenge 6: Swapped Subnet Tags

Problem:
Subnet tags were backwards:

  • Private subnets tagged for public load balancers
  • Public subnets tagged for internal load balancers

Solution:

# Private subnets (for internal ELB):
tags = {
  "kubernetes.io/role/internal-elb" = "1"
}

# Public subnets (for external ELB):
tags = {
  "kubernetes.io/role/elb" = "1"
}

Impact:
Won’t break deployment but will cause issues when creating LoadBalancer services later.

Learning: Subnet tagging is critical for Kubernetes AWS integration.


Challenge 7: Path Issues in CI/CD

Problem:

Error: permission denied: /home/charles/terra-pro/values.yaml

Root Cause:
Hardcoded absolute path that worked locally but failed in Jenkins:

# BAD:
values = [file("/home/charles/terra-pro/values.yaml")]

Solution:
Use relative path with ${path.module}:

# GOOD:
values = [file("${path.module}/values.yaml")]

Even better solution:
Replaced values file with set blocks (no file needed):

set {
  name  = "architecture"
  value = "replication"
}

Learning: Never use absolute paths in infrastructure code. Use relative paths or eliminate external files.


Cost Optimization

Current Monthly Cost Estimate

If running 24/7 in us-east-2:

Resource Quantity Unit Cost Monthly Cost
EKS Control Plane 1 $0.10/hour ~$72
Worker Nodes (t3.small) 2 $0.0208/hour ~$30
NAT Gateway 1 $0.045/hour ~$32
NAT Data Transfer ~100GB $0.045/GB ~$5
EBS Volumes (gp2) 3 x 20GB $0.10/GB/month ~$6
CloudWatch Logs Minimal Variable ~$2
TOTAL ~$147/month

Cost Savings Implemented

Single NAT Gateway

  • Before: 3 NAT gateways ($96/month)
  • After: 1 NAT gateway ($32/month)
  • Savings: $64/month (67% reduction)

Right-Sized Instances

  • Before: 3x t3.medium ($90/month)
  • After: 2x t3.small ($30/month)
  • Savings: $60/month (67% reduction)

Removed Unnecessary Resources

  • Removed VPN Gateway (would be $36/month)
  • Reduced node count from 3 to 2

Total Monthly Savings: ~$160 🎉

Additional Cost Optimization Tips

1. Use Spot Instances (Advanced)

capacity_type = "SPOT"  # Instead of "ON_DEMAND"
# Savings: Up to 70% on EC2 costs

2. Auto-Scaling Based on Time

# Scale down during nights/weekends
kubectl scale deployment --replicas=0 my-app
# Or use Karpenter for intelligent scaling

3. Destroy When Not In Use

# Weekend shutdown
terraform destroy  # Friday evening
terraform apply    # Monday morning
# Savings: ~$100/weekend

4. Use AWS Free Tier

  • 750 hours/month of t2.micro (covers 1 node 24/7)
  • Some EBS storage included
  • First 12 months free

5. Reserved Instances (Production)

  • 1-year commitment: 30-40% savings
  • 3-year commitment: 50-60% savings
  • Only for long-running production workloads

Current Setup vs Production

Current (Learning) Setup:

  • On-Demand instances
  • Development environment
  • Can be destroyed when not in use
  • Cost: ~$147/month (if running 24/7)
  • Actual cost: ~$0-50/month (destroy when not testing)

Production Setup:

  • Reserved or Spot instances
  • Multi-region deployment
  • Auto-scaling enabled
  • Enhanced monitoring
  • Cost: $500-2000/month (depending on scale)

Future Enhancements

Short-Term Improvements

1. Monitoring & Observability

  • Set up Prometheus for metrics collection
  • Deploy Grafana for visualization dashboards
  • Configure CloudWatch Container Insights
  • Implement AWS X-Ray for distributed tracing
  • Set up alerting with PagerDuty or Slack

2. Security Hardening

  • Enable AWS Secrets Manager for database passwords
  • Implement Pod Security Policies
  • Set up AWS WAF for web application firewall
  • Enable VPC Flow Logs for network monitoring
  • Implement least-privilege IAM policies

3. High Availability

  • Multi-region deployment with Route53 failover
  • Implement database backups with snapshots
  • Set up cluster autoscaling
  • Configure pod disruption budgets
  • Implement horizontal pod autoscaling

4. CI/CD Enhancements

  • Add automated testing to pipeline (Terratest)
  • Implement automatic plan on PR
  • Set up cost estimation with Infracost
  • Add security scanning (Checkov, tfsec)
  • Implement drift detection

5. GitOps Approach

  • Migrate to ArgoCD or Flux for GitOps
  • Separate infrastructure and application repos
  • Implement automated sync from Git
  • Set up progressive delivery

Long-Term Improvements

6. Advanced Kubernetes Features

  • Service mesh implementation (Istio or Linkerd)
  • Implement network policies
  • Set up Kubernetes operators for database management
  • Advanced scheduling with node affinity
  • Implement custom metrics for HPA

8. Cost Management

  • Implement Karpenter for cost-optimized scaling
  • Set up Kubecost for Kubernetes cost visibility
  • Configure AWS Compute Optimizer
  • Implement spot instance interruption handling
  • Create detailed cost allocation tags

9. Disaster Recovery

  • Automated backup and restore procedures
  • Multi-region replication
  • Regular disaster recovery drills
  • RPO/RTO documentation and testing

📞 Contact

Charles Asirifi


Acknowledgments

  • Terraform AWS Modules – For well-maintained VPC and EKS modules
  • Bitnami – For the production-ready MySQL Helm chart
  • AWS Documentation – For comprehensive EKS guides
  • HashiCorp Learn – For Terraform tutorials and best practices
  • DevSecOps Bootcamp – For project requirements and guidance

Project Stats

Terraform AWS Resources Kubernetes Version Deployment Time


Leave a Comment

Your email address will not be published. Required fields are marked *