AWS EKS Infrastructure Automation with Terraform & Jenkins
Table of Contents
- Project Overview
- Architecture
- Technologies Used
- Key Features
- What I Learned
- Quick Start
- Detailed Setup Guide
- CI/CD Pipeline
- Project Structure
- Challenges & Solutions
- Cost Optimization
- Future Enhancements
- Contact
Project Overview
The Challenge
Modern cloud infrastructure requires more than just clicking buttons in a web console. Organizations need automated, repeatable, and version-controlled infrastructure that can be deployed consistently across environments.
The Solution
I built a production-grade automated infrastructure deployment pipeline that provisions a complete AWS EKS (Elastic Kubernetes Service) cluster using Infrastructure as Code principles. The entire infrastructure—from networking to database deployment—is defined in code, version-controlled in GitLab, and automatically deployed through Jenkins CI/CD.
Real-World Impact
This project demonstrates skills that directly translate to enterprise DevOps roles:
- Automated cloud provisioning reduces deployment time from hours to minutes
- Infrastructure as Code eliminates configuration drift and manual errors
- CI/CD integration enables rapid, reliable deployments
- Kubernetes orchestration provides scalable application hosting
- Cost optimization through smart resource management
🛠️ Technologies Used
Infrastructure & Cloud
- Terraform (v1.6.1) – Infrastructure as Code tool
- AWS EKS (v1.32) – Managed Kubernetes service
- AWS VPC – Virtual Private Cloud networking
- AWS EC2 – Compute instances for Kubernetes nodes
- AWS EBS – Persistent storage volumes
Container Orchestration
- Kubernetes (v1.32) – Container orchestration platform
- Helm (Terraform Helm provider v2.11.0) – Kubernetes package manager, driven through Terraform
- AWS Fargate – Serverless container runtime
CI/CD & Version Control
- Jenkins – Automation server for CI/CD
- GitLab – Source code management and version control
- Git – Distributed version control
Database
- MySQL (Bitnami Chart v9.14.1) – Relational database with replication
  - 1 primary instance
  - 2 secondary replicas for high availability
Configuration Management
- Terraform Modules (composed as sketched below):
  - terraform-aws-modules/vpc/aws (v5.2.0)
  - terraform-aws-modules/eks/aws (v19.20.0)
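The two modules are wired together in vpc.tf and eks.tf. A minimal sketch of that wiring, assuming the standard input names of terraform-aws-modules/eks v19 (the real arguments live in this repo's .tf files):

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "19.20.0"

  cluster_name    = var.cluster_name
  cluster_version = var.k8s_version

  # Worker nodes run in the VPC module's private subnets
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets
}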
✨ Key Features
🔄 Automated Infrastructure Provisioning
- One-Click Deployment: Entire infrastructure created from a single Terraform apply
- Repeatable: Destroy and recreate identical environments instantly
- Version Controlled: All infrastructure changes tracked in Git
🌐 High-Availability Networking
- Multi-AZ Deployment: Resources spread across 3 availability zones
- Private Networking: Worker nodes isolated in private subnets
- Secure Internet Access: NAT Gateway for outbound connections
- Load Balancer Ready: Subnet tagging for automatic LB discovery (see the VPC sketch below)
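A hedged sketch of how these networking features map onto vpc.tf, using the CIDRs documented under Project Structure (the AZ list is an assumption for us-east-2; argument names follow terraform-aws-modules/vpc v5):

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.2.0"

  cidr            = "10.0.0.0/16"
  azs             = ["us-east-2a", "us-east-2b", "us-east-2c"] # assumption: 3 AZs in us-east-2
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  enable_nat_gateway = true
  single_nat_gateway = true # one shared NAT gateway (see Cost Optimization)

  # Tags that let Kubernetes discover subnets for load balancers
  public_subnet_tags  = { "kubernetes.io/role/elb" = "1" }
  private_subnet_tags = { "kubernetes.io/role/internal-elb" = "1" }
}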
☸️ Production-Grade Kubernetes Cluster
- Managed Control Plane: AWS operates the Kubernetes control plane for you
- Auto-Scaling Nodes: 1-3 worker nodes based on demand
- Fargate Integration: Serverless pods for the “my-app” namespace (profile sketch below)
- EBS CSI Driver: Dynamic persistent volume provisioning
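The Fargate piece of eks.tf, sketched with the fargate_profiles input of terraform-aws-modules/eks v19 (anything beyond the namespace selector is an assumption):

fargate_profiles = {
  my_app = {
    name = "my-app"
    selectors = [
      { namespace = "my-app" } # pods created in this namespace run on Fargate
    ]
  }
}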
🗄️ Highly Available Database
- Primary-Secondary Replication: 1 primary + 2 secondary MySQL instances
- Persistent Storage: EBS volumes with gp2 storage class
- Automatic Failover: Kubernetes handles pod rescheduling
🚀 CI/CD Pipeline
- Manual Approval Gates: Review changes before applying
- Plan Before Apply: Preview infrastructure changes
- Safe Destroy Option: Protected destruction with confirmation
- Automated Testing: Terraform validation on every build
💰 Cost Optimization
- Single NAT Gateway: Reduced from 3 to 1 (saves ~$64/month)
- Right-Sized Instances: t3.small instead of t3.medium
- On-Demand Pricing: No long-term commitments
- Easy Teardown: Destroy infrastructure when not needed
What I Learned
Technical Skills
Infrastructure as Code
- Writing modular, reusable Terraform configurations
- Managing state files remotely in S3 (backend sketch after this list)
- Using Terraform modules for best practices
- Handling provider version constraints and lock files
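The remote state setup lives in providers.tf; a minimal sketch, assuming an illustrative bucket name (the real bucket and key are project-specific):

terraform {
  backend "s3" {
    bucket = "eks-terra-state" # assumption: placeholder bucket name
    key    = "eks-terra/terraform.tfstate"
    region = "us-east-2"
  }
}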
AWS Networking
- Designing VPCs with public/private subnet architecture
- Implementing NAT gateways for private subnet internet access
- Subnet tagging for Kubernetes integration
- Security group configuration for EKS
Kubernetes & EKS
- Understanding EKS control plane vs worker nodes
- Configuring node groups with launch templates
- Setting up Fargate profiles for serverless workloads
- Installing and managing the EBS CSI driver (addon sketch below)
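The EBS CSI driver is installed as an EKS managed addon; a sketch using the cluster_addons input of terraform-aws-modules/eks v19 (the real eks.tf may pin an explicit addon version and wire up IAM differently):

cluster_addons = {
  aws-ebs-csi-driver = {
    most_recent = true # assumption: an explicit addon version may be pinned instead
    # The driver also needs an IAM role that allows it to create EBS volumes
  }
}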
CI/CD & Automation
- Building Jenkins pipelines with Jenkinsfile
- Integrating GitLab with Jenkins
- Implementing approval gates and safeguards
- Managing credentials securely in CI/CD
Helm & Package Management
- Deploying applications using Helm charts (see the sketch after this list)
- Configuring values for production deployments
- Understanding chart versions and dependencies
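Putting those together, mysql.tf drives the Bitnami chart through the Helm provider; a hedged sketch (the release name and any values beyond the ones this README mentions are assumptions):

resource "helm_release" "mysql" {
  name       = "mysql"
  repository = "https://charts.bitnami.com/bitnami"
  chart      = "mysql"
  version    = "9.14.1"

  set {
    name  = "architecture"
    value = "replication" # primary/secondary topology
  }
  set {
    name  = "secondary.replicaCount"
    value = "2" # two read replicas, as described above
  }
}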
Problem-Solving Skills
Debugging Complex Issues
- Dependency Cycles: Resolved Terraform circular dependencies by using exec authentication instead of data sources
- Provider Syntax Errors: Learned version-specific Helm provider syntax differences
- Version Conflicts: Managed Terraform lock file conflicts across environments
- Timeout Issues: Diagnosed and optimized slow infrastructure deployments
Performance Optimization
- Reduced deployment time from 60+ minutes to 20-25 minutes
- Optimized NAT gateway configuration for cost and speed
- Right-sized EC2 instances for workload requirements
Best Practices
- Always use module outputs instead of data sources for provider configuration
- Pin provider versions for reproducibility (sketch after this list)
- Implement approval gates for destructive operations
- Document configuration decisions for future reference
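Version pinning in practice, sketched as the required_providers block in providers.tf (only the Helm constraint is confirmed elsewhere in this README; the other providers are pinned the same way):

terraform {
  required_version = ">= 1.6.1"

  required_providers {
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.11.0" # see Challenge 5 below
    }
    # aws and kubernetes are pinned similarly
  }
}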
Quick Start
Prerequisites
# Required tools
terraform --version # >= 1.6.1
aws --version # AWS CLI configured
kubectl version # Kubernetes CLI
git --version # Git for version control
5-Minute Setup
# 1. Clone the repository
git clone https://gitlab.com/ascharle/eks-terra.git
cd eks-terra
# 2. Configure AWS credentials
aws configure
# 3. Initialize Terraform
terraform init
# 4. Review the plan
terraform plan
# 5. Deploy infrastructure
terraform apply
# 6. Configure kubectl
aws eks update-kubeconfig --name my-cluster --region us-east-2
# 7. Verify deployment
kubectl get nodes
kubectl get pods
That’s it! Your EKS cluster is running. ✅
📖 Detailed Setup Guide
Step 1: Prerequisites Setup
Install Required Tools
Terraform:
# macOS
brew install terraform
# Linux
wget https://releases.hashicorp.com/terraform/1.6.1/terraform_1.6.1_linux_amd64.zip
unzip terraform_1.6.1_linux_amd64.zip
sudo mv terraform /usr/local/bin/
# Verify
terraform version
AWS CLI:
# macOS
brew install awscli
# Linux
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
# Configure
aws configure
# Enter your Access Key ID, Secret Key, Region (us-east-2), Output format (json)
kubectl:
# macOS
brew install kubectl
# Linux
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
AWS Account Setup
- Create an AWS account (free tier eligible)
- Create an IAM user with these managed policies attached (see the Terraform sketch after this list):
  - AmazonEKSClusterPolicy
  - AmazonEKSWorkerNodePolicy
  - AmazonEC2FullAccess
  - AmazonVPCFullAccess
  - IAMFullAccess
- Generate access keys
- Configure AWS CLI with credentials
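If you prefer to manage that IAM user as code too, a hypothetical Terraform sketch (the user name and layout are illustrative and not part of this repo):

resource "aws_iam_user" "deployer" {
  name = "eks-deployer" # hypothetical user name
}

resource "aws_iam_user_policy_attachment" "eks_cluster" {
  user       = aws_iam_user.deployer.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
}

# Add one attachment per policy in the list above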
Step 2: Repository Setup
# Clone the repository
git clone https://gitlab.com/ascharle/eks-terra.git
cd eks-terra
# Review the configuration files
ls -la
# providers.tf - AWS provider and backend configuration
# vpc.tf - Network infrastructure
# eks.tf - EKS cluster and node groups
# mysql.tf - MySQL Helm chart deployment
# variables.tf - Variable definitions
# terraform.tfvars - Variable values
# outputs.tf - Output definitions
Step 3: Customize Configuration (Optional)
Edit terraform.tfvars:
env_prefix = "dev" # Environment prefix
k8s_version = "1.32" # Kubernetes version
cluster_name = "my-cluster" # EKS cluster name
region = "us-east-2" # AWS region
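Each of these values is declared in variables.tf; a sketch of two of the declarations (descriptions and defaults are assumptions beyond the names and values shown above):

variable "env_prefix" {
  description = "Environment prefix for resource names"
  type        = string
  default     = "dev"
}

variable "k8s_version" {
  description = "Kubernetes version for the EKS cluster"
  type        = string
  default     = "1.32"
}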
Modify instance types in eks.tf:
eks_managed_node_groups = {
main = {
instance_types = ["t3.small"] # Change to t3.medium for more resources
min_size = 1 # Minimum nodes
max_size = 3 # Maximum nodes
desired_size = 2 # Desired nodes
}
}
Step 4: Deploy Infrastructure
# Initialize Terraform (downloads providers and modules)
terraform init
# Validate configuration syntax
terraform validate
# Preview changes (IMPORTANT: Review before applying!)
terraform plan
# Apply changes
terraform apply
# Review the plan output
# Type 'yes' when prompted
# Wait 20-25 minutes for deployment to complete
What happens during deployment:
- VPC and networking (3-5 min)
- EKS cluster control plane (10-12 min)
- Worker nodes (5-7 min)
- Fargate profile (2-3 min)
- MySQL database (2-3 min)
Step 5: Verify Deployment
# Configure kubectl to use the new cluster
aws eks update-kubeconfig --name my-cluster --region us-east-2
# Check cluster nodes
kubectl get nodes
# Should show 2 nodes in "Ready" status
# Check all pods
kubectl get pods -A
# Should see system pods + MySQL pods
# Check MySQL specifically
kubectl get pods -l app.kubernetes.io/name=mysql
# Should show:
# mysql-primary-0 1/1 Running
# mysql-secondary-0 1/1 Running
# mysql-secondary-1 1/1 Running
# Get cluster endpoint
kubectl cluster-info
Step 6: Access MySQL (Optional)
# Forward MySQL port to localhost
kubectl port-forward svc/mysql-primary 3306:3306  # primary service, matching the pod names above
# In another terminal, connect to MySQL
mysql -h 127.0.0.1 -P 3306 -u root -p
# Password: my-secret-pw (from mysql.tf)
# Run test queries
mysql> SHOW DATABASES;
mysql> CREATE DATABASE test;
mysql> USE test;
mysql> CREATE TABLE users (id INT, name VARCHAR(50));
mysql> INSERT INTO users VALUES (1, 'Test User');
mysql> SELECT * FROM users;
Step 7: Clean Up (When Done)
# IMPORTANT: This destroys everything!
terraform destroy
# Review what will be deleted
# Type 'yes' when prompted
# Wait 10-15 minutes for cleanup
# Verify all resources are deleted
aws eks list-clusters --region us-east-2
# Should be empty
CI/CD Pipeline
Pipeline Architecture
graph LR
A[Developer] -->|Push Code| B[GitLab]
B -->|Manual Trigger| C[Jenkins]
C -->|Pull Code| B
C -->|Plan| D{Review Plan}
D -->|Approve| E[Terraform Apply]
D -->|Reject| F[Cancel]
E --> G[AWS Infrastructure]
Jenkins Pipeline Stages
Pipeline Flow:
graph TD
A[Checkout Code] --> B[Verify AWS Access]
B --> C[Terraform Init]
C --> D[Terraform Validate]
D --> E{Action Selected}
E -->|plan| F[Generate Plan]
E -->|apply| G[Generate Plan]
E -->|destroy| H[Destroy Approval]
G --> I[Manual Approval]
I --> J[Apply Infrastructure]
H --> K[Type DESTROY]
K --> L[Destroy Infrastructure]
F --> M[Complete]
J --> N[Get Outputs]
Pipeline Parameters
When you trigger a build in Jenkins, you choose:
1. ACTION:
- plan – Preview changes without modifying anything (always safe)
- apply – Create or update infrastructure (requires approval)
- destroy – Delete all resources (requires typing “DESTROY”)
2. AUTO_APPROVE:
- false (recommended) – Manual approval required
- true – Skip approval (dangerous!)
How to Use the Pipeline
Deploy New Infrastructure
1. Open Jenkins → eks-build job
2. Click "Build with Parameters"
3. Select:
- ACTION: apply
- AUTO_APPROVE: false
4. Click "Build"
5. Watch the plan output
6. Click "Apply" to proceed or "Abort" to cancel
7. Wait for completion (~20-25 minutes)
Preview Changes Only
1. Jenkins → eks-build job
2. Build with Parameters
3. Select:
- ACTION: plan
4. Click "Build"
5. Review the plan (no changes made)
Destroy Infrastructure
1. Jenkins → eks-build job
2. Build with Parameters
3. Select:
- ACTION: destroy
- AUTO_APPROVE: false
4. Click "Build"
5. Jenkins asks: "Type DESTROY to confirm"
6. Type exactly: DESTROY
7. Click OK
8. Infrastructure is deleted (~10 minutes)
Pipeline Security Features
- No automatic deployments – you control when changes happen
- Manual approval gates – review before applying
- Destroy confirmation – type “DESTROY” to prevent accidents
- Plan artifacts – download and review plans offline
- Credential management – AWS keys secured in Jenkins
Setting Up the Pipeline
See JENKINS.md for complete setup instructions including:
- Jenkins installation
- GitLab integration
- Credential configuration
- Webhook setup (optional)
Project Structure
eks-terra/
├── README.md # This file
├── JENKINS.md # CI/CD setup guide
├── .gitignore # Ignored files
├── Jenkinsfile # Jenkins pipeline definition
│
├── providers.tf # Terraform & provider configuration
├── variables.tf # Variable declarations
├── terraform.tfvars # Variable values (DO NOT COMMIT with secrets)
├── outputs.tf # Output definitions
│
├── vpc.tf # VPC and networking resources
├── eks.tf # EKS cluster and node groups
├── mysql.tf # MySQL Helm chart deployment
File Descriptions
Core Terraform Files:
- providers.tf
  - Terraform version requirements (>= 1.6.1)
  - Provider versions (AWS, Kubernetes, Helm)
  - S3 backend configuration for state storage
  - Provider authentication settings
- variables.tf
  - Input variable definitions
  - Types, descriptions, and default values
  - Includes: region, cluster_name, k8s_version, env_prefix
- terraform.tfvars
  - Actual variable values
  - DO NOT commit with real AWS credentials!
  - Example values for non-sensitive data
- outputs.tf
  - Cluster endpoint URL
  - kubectl configuration command
  - VPC ID and subnet IDs
  - Security group IDs
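A sketch of how two of those outputs might be declared, assuming the standard terraform-aws-modules/eks output names (the exact expressions in outputs.tf may differ):

output "cluster_endpoint" {
  description = "EKS API server endpoint"
  value       = module.eks.cluster_endpoint
}

output "configure_kubectl" {
  description = "Command to point kubectl at the new cluster"
  value       = "aws eks update-kubeconfig --name ${var.cluster_name} --region ${var.region}"
}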
Infrastructure Files:
- vpc.tf
  - VPC with CIDR 10.0.0.0/16
  - 3 public subnets (10.0.101.0/24, 10.0.102.0/24, 10.0.103.0/24)
  - 3 private subnets (10.0.1.0/24, 10.0.2.0/24, 10.0.3.0/24)
  - Internet Gateway for public subnets
  - NAT Gateway for private subnet internet access
  - Route tables and associations
  - Kubernetes-specific subnet tags
- eks.tf
  - EKS cluster (v1.32)
  - Managed node group (1-3 t3.small instances)
  - Fargate profile for the “my-app” namespace
  - EBS CSI driver addon
  - IAM roles and policies
  - Security groups
  - CloudWatch logging
- mysql.tf
  - Helm release configuration
  - Bitnami MySQL chart (v9.14.1)
  - Replication architecture (1 primary + 2 secondaries)
  - EBS persistent volumes
  - Database credentials
CI/CD Files:
- Jenkinsfile
  - Declarative pipeline syntax
  - Three actions: plan, apply, destroy
  - Approval gates for safety
  - Error handling and cleanup
- .gitignore
  - Terraform state files
  - Lock files
  - Variable files with secrets
  - Local directories
Challenges & Solutions
Challenge 1: Terraform Dependency Cycle
Problem:
Error: Cycle detected:
provider[kubernetes] → data.aws_eks_cluster
data.aws_eks_cluster → module.eks
module.eks → kubernetes_config_map.aws_auth
kubernetes_config_map.aws_auth → provider[kubernetes]
Root Cause:
The Kubernetes provider tried to authenticate using data sources that depended on the EKS module, which created resources using the Kubernetes provider. Classic circular dependency!
Solution:
Switched from data source authentication to exec authentication:
# OLD (causes cycle):
provider "kubernetes" {
host = data.aws_eks_cluster.cluster.endpoint
token = data.aws_eks_cluster_auth.cluster.token
}
# NEW (no cycle):
provider "kubernetes" {
host = module.eks.cluster_endpoint
exec {
command = "aws"
args = ["eks", "get-token", "--cluster-name", var.cluster_name]
}
}
Why it works: Module outputs are direct references (no dependency on data sources), and exec authentication generates tokens dynamically without data sources.
Learning: Always prefer module outputs over data sources when configuring providers.
Challenge 2: Helm Provider Version Syntax
Problem:
Error: Unsupported block type "kubernetes"
Did you mean to define argument "kubernetes"?
If so, use the equals sign to assign it a value.
Root Cause:
Helm provider syntax changed between versions:
- v2.x uses block syntax: kubernetes { }
- v3.x uses argument syntax: kubernetes = { }
Solution:
Matched syntax to provider version:
# For Helm v2.11.0 (project requirement):
provider "helm" {
kubernetes { # No equals sign!
host = module.eks.cluster_endpoint
# ...
}
}
Learning: Always check version-specific documentation. Lock file mismatches can cause subtle syntax errors.
Challenge 3: Instance Type Free Tier Restriction
Problem:
Error: InvalidParameterCombination
The specified instance type is not eligible for Free Tier.
Root Cause:
AWS Free Tier only allows t2.micro and t3.micro instances. I initially used t3.medium, which isn’t eligible.
Solution:
Changed to t3.small (good balance of resources and cost):
instance_types = ["t3.small"] # 2 vCPUs, 2GB RAM
Cost comparison:
- t3.medium: ~$0.0416/hour (~$90/month for 3 nodes)
- t3.small: ~$0.0208/hour (~$45/month for 3 nodes)
Learning: Always verify instance types against account quotas and Free Tier eligibility.
Challenge 4: Slow Infrastructure Deployment
Problem:
Initial deployments took 60+ minutes, with node group timeouts.
Root Cause:
- 3 NAT Gateways (one per AZ) instead of 1
- Unnecessary VPN gateway
- Complex networking setup
Solution:
Optimized VPC configuration:
# Before:
enable_nat_gateway = true
# (created 3 NAT gateways)
# After:
enable_nat_gateway = true
single_nat_gateway = true # Only 1 NAT gateway
# Removed: enable_vpn_gateway = true
Results:
- Deployment time: 60+ min → 20-25 min
- Monthly cost savings: ~$64 (2 fewer NAT gateways)
Learning: Start with minimum viable configuration, scale up only when needed.
Challenge 5: Terraform Lock File Conflicts
Problem:
Error: hashicorp/helm locked to 3.1.1
but configured version constraint requires ~> 2.11.0
Root Cause:
A .terraform.lock.hcl file committed to Git pinned providers to versions that conflicted across environments.
Solution:
Added lock file to .gitignore:
.terraform.lock.hcl
Each environment (local, Jenkins) maintains its own lock file with correct versions.
Alternative for teams:
Run terraform init -upgrade to update the lock file, then commit the updated version.
Learning: For personal projects, don’t commit lock files. For teams, commit them but keep them updated.
Challenge 6: Swapped Subnet Tags
Problem:
Subnet tags were backwards:
- Private subnets tagged for public load balancers
- Public subnets tagged for internal load balancers
Solution:
# Private subnets (for internal ELB):
private_subnet_tags = {
  "kubernetes.io/role/internal-elb" = "1"
}

# Public subnets (for external ELB):
public_subnet_tags = {
  "kubernetes.io/role/elb" = "1"
}
Impact:
Left uncorrected, the swap wouldn’t break the deployment itself, but it would cause problems later when creating LoadBalancer services.
Learning: Subnet tagging is critical for Kubernetes AWS integration.
Challenge 7: Path Issues in CI/CD
Problem:
Error: permission denied: /home/charles/terra-pro/values.yaml
Root Cause:
Hardcoded absolute path that worked locally but failed in Jenkins:
# BAD:
values = [file("/home/charles/terra-pro/values.yaml")]
Solution:
Use relative path with ${path.module}:
# GOOD:
values = [file("${path.module}/values.yaml")]
Even better solution:
Replaced values file with set blocks (no file needed):
set {
name = "architecture"
value = "replication"
}
Learning: Never use absolute paths in infrastructure code. Use relative paths or eliminate external files.
Cost Optimization
Current Monthly Cost Estimate
If running 24/7 in us-east-2:
| Resource | Quantity | Unit Cost | Monthly Cost |
|---|---|---|---|
| EKS Control Plane | 1 | $0.10/hour | ~$72 |
| Worker Nodes (t3.small) | 2 | $0.0208/hour | ~$30 |
| NAT Gateway | 1 | $0.045/hour | ~$32 |
| NAT Data Transfer | ~100GB | $0.045/GB | ~$5 |
| EBS Volumes (gp2) | 3 x 20GB | $0.10/GB/month | ~$6 |
| CloudWatch Logs | Minimal | Variable | ~$2 |
| TOTAL | | | ~$147/month |
Cost Savings Implemented
✅ Single NAT Gateway
- Before: 3 NAT gateways ($96/month)
- After: 1 NAT gateway ($32/month)
- Savings: $64/month (67% reduction)
✅ Right-Sized Instances
- Before: 3x t3.medium ($90/month)
- After: 2x t3.small ($30/month)
- Savings: $60/month (67% reduction)
✅ Removed Unnecessary Resources
- Removed VPN Gateway (would be $36/month)
- Reduced node count from 3 to 2
Total Monthly Savings: ~$160 🎉
Additional Cost Optimization Tips
1. Use Spot Instances (Advanced)
capacity_type = "SPOT" # Instead of "ON_DEMAND"
# Savings: Up to 70% on EC2 costs
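In context, the flag sits inside the node group definition from eks.tf; a sketch:

eks_managed_node_groups = {
  main = {
    instance_types = ["t3.small"]
    capacity_type  = "SPOT" # default is "ON_DEMAND"
  }
}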
2. Auto-Scaling Based on Time
# Scale down during nights/weekends
kubectl scale deployment my-app --replicas=0
# Or use Karpenter for intelligent scaling
3. Destroy When Not In Use
# Weekend shutdown
terraform destroy # Friday evening
terraform apply # Monday morning
# Savings: roughly $12 per weekend at this setup’s ~$0.20/hour run rate
4. Use AWS Free Tier
- 750 hours/month of t2.micro (covers 1 node 24/7)
- Some EBS storage included
- First 12 months free
5. Reserved Instances (Production)
- 1-year commitment: 30-40% savings
- 3-year commitment: 50-60% savings
- Only for long-running production workloads
Current Setup vs Production
Current (Learning) Setup:
- On-Demand instances
- Development environment
- Can be destroyed when not in use
- Cost: ~$147/month (if running 24/7)
- Actual cost: ~$0-50/month (destroy when not testing)
Production Setup:
- Reserved or Spot instances
- Multi-region deployment
- Auto-scaling enabled
- Enhanced monitoring
- Cost: $500-2000/month (depending on scale)
Future Enhancements
Short-Term Improvements
1. Monitoring & Observability
- Set up Prometheus for metrics collection
- Deploy Grafana for visualization dashboards
- Configure CloudWatch Container Insights
- Implement AWS X-Ray for distributed tracing
- Set up alerting with PagerDuty or Slack
2. Security Hardening
- Enable AWS Secrets Manager for database passwords
- Implement Pod Security Policies
- Set up AWS WAF for web application firewall
- Enable VPC Flow Logs for network monitoring
- Implement least-privilege IAM policies
3. High Availability
- Multi-region deployment with Route53 failover
- Implement database backups with snapshots
- Set up cluster autoscaling
- Configure pod disruption budgets
- Implement horizontal pod autoscaling
4. CI/CD Enhancements
- Add automated testing to pipeline (Terratest)
- Implement automatic plan on PR
- Set up cost estimation with Infracost
- Add security scanning (Checkov, tfsec)
- Implement drift detection
5. GitOps Approach
- Migrate to ArgoCD or Flux for GitOps
- Separate infrastructure and application repos
- Implement automated sync from Git
- Set up progressive delivery
Long-Term Improvements
6. Advanced Kubernetes Features
- Service mesh implementation (Istio or Linkerd)
- Implement network policies
- Set up Kubernetes operators for database management
- Advanced scheduling with node affinity
- Implement custom metrics for HPA
7. Cost Management
- Implement Karpenter for cost-optimized scaling
- Set up Kubecost for Kubernetes cost visibility
- Configure AWS Compute Optimizer
- Implement spot instance interruption handling
- Create detailed cost allocation tags
8. Disaster Recovery
- Automated backup and restore procedures
- Multi-region replication
- Regular disaster recovery drills
- RPO/RTO documentation and testing
📞 Contact
Charles Asirifi
- 📧 Email: as@charlesas.com
- 💼 LinkedIn: linkedin.com/in/as-charles
- 🌐 Portfolio: charlesas.com
- 📁 GitLab: gitlab.com/ascharle
Acknowledgments
- Terraform AWS Modules – For well-maintained VPC and EKS modules
- Bitnami – For the production-ready MySQL Helm chart
- AWS Documentation – For comprehensive EKS guides
- HashiCorp Learn – For Terraform tutorials and best practices
- DevSecOps Bootcamp – For project requirements and guidance