AWS Deep Dive
Amazon Web Services (AWS) is one of the most comprehensive and broadly adopted cloud platforms. This guide covers production-grade configurations with real CLI commands and Terraform snippets.
Compute
EC2 — Elastic Compute Cloud
EC2 provides resizable virtual servers in the cloud. Choosing the right instance type is critical for performance and cost efficiency.
| Family | Optimized For | Examples | Use Cases |
|---|---|---|---|
| General Purpose | Balanced CPU/Memory | m7g, m6i, t3, t4g | Web servers, app servers, dev environments |
| Compute Optimized | High CPU performance | c7g, c6i, c6a | Batch processing, media encoding, gaming, HPC |
| Memory Optimized | Large RAM workloads | r7g, r6i, x2idn, u-24tb1 | In-memory databases, SAP HANA, real-time analytics |
| Storage Optimized | High disk I/O | i4i, im4gn, d3en | NoSQL DBs, data warehouses, OLTP, distributed FS |
| Accelerated Computing | GPU / FPGA | p4d, g5, inf2, trn1 | ML training/inference, video rendering, HPC |
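The example names in the table follow a pattern that can be decoded mechanically. The sketch below assumes the common `<series><generation><attributes>.<size>` layout (for example `m6i.xlarge`); `parse_instance_type` is a hypothetical helper, not an AWS API, and special cases such as `u-24tb1` deliberately fall outside the pattern.

```python
import re
from typing import NamedTuple


class InstanceType(NamedTuple):
    series: str      # e.g. "m", "c", "r", "inf"
    generation: int  # e.g. 6 in m6i
    attributes: str  # e.g. "g" (Graviton), "i" (Intel), "a" (AMD), "d" (local NVMe), "n" (network)
    size: str        # e.g. "xlarge"


# Assumed layout: letters, digits, optional letters, a dot, then the size.
_PATTERN = re.compile(r"^([a-z]+)(\d+)([a-z]*)\.([a-z0-9]+)$")


def parse_instance_type(name: str) -> InstanceType:
    """Split an instance type name into its parts, or raise ValueError."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type: {name!r}")
    series, generation, attributes, size = match.groups()
    return InstanceType(series, int(generation), attributes, size)


if __name__ == "__main__":
    for name in ("m6i.xlarge", "c7g.large", "x2idn.16xlarge", "inf2.xlarge"):
        print(name, parse_instance_type(name))
```

A parser like this is mainly useful for sanity-checking instance types in tooling, for example flagging when a launch template mixes Graviton (`g`) and x86 types that need different AMIs.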
# Launch EC2 instance with detailed options
aws ec2 run-instances \
--image-id ami-0c55b159cbfafe1f0 \
--instance-type m6i.xlarge \
--key-name my-keypair \
--subnet-id subnet-0a1b2c3d4e5f \
--security-group-ids sg-0abc123 \
--iam-instance-profile Name=my-instance-profile \
--user-data file://user-data.sh \
--block-device-mappings '[{"DeviceName":"/dev/xvda","Ebs":{"VolumeSize":50,"VolumeType":"gp3","Iops":3000,"Encrypted":true}}]' \
--tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=web-server-01},{Key=Environment,Value=prod}]' \
--metadata-options "HttpTokens=required,HttpEndpoint=enabled" \
--placement '{"AvailabilityZone":"ap-southeast-1a","Tenancy":"default"}'
# Describe instances with filter
aws ec2 describe-instances \
--filters "Name=tag:Environment,Values=prod" "Name=instance-state-name,Values=running" \
--query 'Reservations[*].Instances[*].[InstanceId,InstanceType,PrivateIpAddress,Tags[?Key==`Name`].Value|[0]]' \
--output table
# Create AMI from running instance (--no-reboot skips the shutdown, so file system consistency is not guaranteed)
aws ec2 create-image \
--instance-id i-1234567890abcdef0 \
--name "my-app-ami-$(date +%Y%m%d)" \
--description "Production app AMI" \
--no-reboot
# Modify instance type (must be stopped)
aws ec2 modify-instance-attribute \
--instance-id i-1234567890abcdef0 \
--instance-type '{"Value":"m6i.2xlarge"}'
# Create placement group for low-latency cluster
aws ec2 create-placement-group \
--group-name my-cluster-pg \
--strategy cluster
Security best practice: Always set HttpTokens=required in the metadata options to enforce IMDSv2, which mitigates SSRF attacks that target the instance metadata service.
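To make the IMDSv2 requirement concrete, here is a minimal sketch of the two-step, token-based request flow that `HttpTokens=required` enforces. The helper names are hypothetical; the endpoint and header names are the standard IMDSv2 ones. The request builders can be inspected anywhere, while `fetch_instance_id` only works from inside an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: a PUT request that asks IMDS for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(token: str, path: str = "meta-data/instance-id") -> urllib.request.Request:
    """Step 2: a GET request that presents the token in a header."""
    return urllib.request.Request(
        f"{IMDS_BASE}/{path}",
        method="GET",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_instance_id(timeout: float = 2.0) -> str:
    """Run both steps; only reachable from inside an EC2 instance."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(build_metadata_request(token), timeout=timeout) as resp:
        return resp.read().decode()
```

The PUT verb and the custom header are the point: a typical SSRF bug lets an attacker trigger a plain GET to an arbitrary URL, which is not enough to obtain a token.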
EKS — Elastic Kubernetes Service
# Create EKS cluster with eksctl (recommended)
eksctl create cluster \
--name prod-cluster \
--region ap-southeast-1 \
--version 1.29 \
--nodegroup-name managed-ng-1 \
--node-type m6i.xlarge \
--nodes 3 \
--nodes-min 2 \
--nodes-max 10 \
--managed \
--asg-access \
--with-oidc \
--ssh-access \
--ssh-public-key my-keypair \
--vpc-private-subnets subnet-aaa,subnet-bbb,subnet-ccc
# Add Fargate profile for serverless pods
# (the CLI flags define a single namespace selector; use an eksctl config file for several)
eksctl create fargateprofile \
--cluster prod-cluster \
--name fp-default \
--namespace default \
--labels env=fargate
# Install EKS add-ons
# VPC CNI
aws eks create-addon \
--cluster-name prod-cluster \
--addon-name vpc-cni \
--addon-version v1.16.0-eksbuild.1 \
--service-account-role-arn arn:aws:iam::123456789012:role/AmazonEKSVPCCNIRole
# CoreDNS
aws eks create-addon \
--cluster-name prod-cluster \
--addon-name coredns \
--addon-version v1.11.1-eksbuild.4
# kube-proxy
aws eks create-addon \
--cluster-name prod-cluster \
--addon-name kube-proxy \
--addon-version v1.29.0-eksbuild.1
# EBS CSI Driver
aws eks create-addon \
--cluster-name prod-cluster \
--addon-name aws-ebs-csi-driver \
--addon-version v1.26.0-eksbuild.1 \
--service-account-role-arn arn:aws:iam::123456789012:role/AmazonEKSEBSCSIDriverRole
# AWS Load Balancer Controller (via Helm)
helm repo add eks https://aws.github.io/eks-charts
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
-n kube-system \
--set clusterName=prod-cluster \
--set serviceAccount.create=false \
--set serviceAccount.name=aws-load-balancer-controller
# Update kubeconfig
aws eks update-kubeconfig --region ap-southeast-1 --name prod-cluster
Lambda — Serverless Functions
# Create Lambda function
aws lambda create-function \
--function-name process-orders \
--runtime python3.12 \
--role arn:aws:iam::123456789012:role/lambda-execution-role \
--handler app.handler \
--zip-file fileb://function.zip \
--timeout 30 \
--memory-size 512 \
--environment Variables="{DB_HOST=db.example.com,STAGE=prod}" \
--vpc-config SubnetIds=subnet-aaa,subnet-bbb,SecurityGroupIds=sg-0abc123 \
--layers arn:aws:lambda:ap-southeast-1:123456789012:layer:my-deps:3
# Configure reserved concurrency (caps this function so it cannot starve other functions in the account)
aws lambda put-function-concurrency \
--function-name process-orders \
--reserved-concurrent-executions 100
# Configure provisioned concurrency (reduces cold starts; the "prod" alias or version must already exist)
aws lambda put-provisioned-concurrency-config \
--function-name process-orders \
--qualifier prod \
--provisioned-concurrent-executions 10
# Add SQS trigger
aws lambda create-event-source-mapping \
--event-source-arn arn:aws:sqs:ap-southeast-1:123456789012:orders-queue \
--function-name process-orders \
--batch-size 10 \
--maximum-batching-window-in-seconds 5
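For context on what the SQS trigger delivers, the following is a minimal sketch of an `app.py` that would match `--handler app.handler`. The order payload shape (`order_id`) and the `process_order` helper are assumptions for illustration; only the `Records` / `body` layout comes from the SQS event format.

```python
import json


def process_order(order: dict) -> str:
    """Hypothetical business logic; returns the ID of the order it handled."""
    return str(order["order_id"])


def handler(event: dict, context: object) -> dict:
    """Entry point for the SQS event source mapping (up to 10 records per call)."""
    processed = []
    for record in event.get("Records", []):
        order = json.loads(record["body"])  # SQS delivers the message text in "body"
        processed.append(process_order(order))
    # An unhandled exception above makes Lambda return the whole batch to the queue.
    return {"processed": processed}


if __name__ == "__main__":
    sample = {"Records": [{"body": json.dumps({"order_id": 42})}]}
    print(handler(sample, None))
```

With the mapping above, Lambda waits up to 5 seconds to gather as many as 10 messages before invoking the function, so the handler should always iterate over `Records` rather than assume a single message.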
Networking
VPC — Virtual Private Cloud
# Create VPC with CIDR
aws ec2 create-vpc \
--cidr-block 10.10.0.0/16 \
--tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=prod-vpc}]'
# Create subnets across 3 AZs
# Public subnets (for ALB, NAT GW)
for i in 1 2 3; do
aws ec2 create-subnet \
--vpc-id vpc-0abc123 \
--cidr-block "10.10.${i}.0/24" \
--availability-zone "ap-southeast-1$(echo 'abc' | cut -c${i})" \
--tag-specifications "ResourceType=subnet,Tags=[{Key=Name,Value=public-subnet-${i}},{Key=Type,Value=public}]"
done
# Private subnets (app tier)
for i in 1 2 3; do
aws ec2 create-subnet \
--vpc-id vpc-0abc123 \
--cidr-block "10.10.1${i}.0/24" \
--availability-zone "ap-southeast-1$(echo 'abc' | cut -c${i})" \
--tag-specifications "ResourceType=subnet,Tags=[{Key=Name,Value=private-subnet-${i}},{Key=Type,Value=private}]"
done
# Isolated subnets (database tier — no internet access)
for i in 1 2 3; do
aws ec2 create-subnet \
--vpc-id vpc-0abc123 \
--cidr-block "10.10.2${i}.0/24" \
--availability-zone "ap-southeast-1$(echo 'abc' | cut -c${i})" \
--tag-specifications "ResourceType=subnet,Tags=[{Key=Name,Value=isolated-subnet-${i}},{Key=Type,Value=isolated}]"
done
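The three loops above encode an addressing plan that is easy to get wrong when edited by hand. As a rough cross-check, this sketch rebuilds the same plan with Python's `ipaddress` module and verifies that every subnet sits inside the VPC CIDR and that none overlap. The function names are hypothetical; the CIDRs and AZ letters mirror the shell loops.

```python
import ipaddress

VPC_CIDR = ipaddress.ip_network("10.10.0.0/16")
AZ_SUFFIXES = "abc"
# Third-octet prefix per tier, mirroring 10.10.${i}, 10.10.1${i} and 10.10.2${i}.
TIER_PREFIX = {"public": "", "private": "1", "isolated": "2"}


def subnet_plan(region: str = "ap-southeast-1") -> list[dict]:
    """Return one entry per subnet: name, availability zone and CIDR."""
    plan = []
    for tier, prefix in TIER_PREFIX.items():
        for i in (1, 2, 3):
            plan.append({
                "name": f"{tier}-subnet-{i}",
                "az": f"{region}{AZ_SUFFIXES[i - 1]}",
                "cidr": ipaddress.ip_network(f"10.10.{prefix}{i}.0/24"),
            })
    return plan


def validate(plan: list[dict]) -> None:
    """Raise ValueError if a subnet leaves the VPC range or overlaps another."""
    cidrs = [entry["cidr"] for entry in plan]
    for cidr in cidrs:
        if not cidr.subnet_of(VPC_CIDR):
            raise ValueError(f"{cidr} is outside {VPC_CIDR}")
    for index, first in enumerate(cidrs):
        for second in cidrs[index + 1:]:
            if first.overlaps(second):
                raise ValueError(f"{first} overlaps {second}")


if __name__ == "__main__":
    plan = subnet_plan()
    validate(plan)
    for entry in plan:
        print(entry["name"], entry["az"], entry["cidr"])
```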
# Create and attach Internet Gateway
aws ec2 create-internet-gateway \
--tag-specifications 'ResourceType=internet-gateway,Tags=[{Key=Name,Value=prod-igw}]'
aws ec2 attach-internet-gateway --internet-gateway-id igw-0abc123 --vpc-id vpc-0abc123
# Create NAT Gateway with EIP (one per AZ for HA)
aws ec2 allocate-address --domain vpc
aws ec2 create-nat-gateway \
--subnet-id subnet-public-1a \
--allocation-id eipalloc-0abc123 \
--tag-specifications 'ResourceType=natgateway,Tags=[{Key=Name,Value=nat-gw-1a}]'
# VPC Endpoint — Gateway type for S3/DynamoDB (free)
aws ec2 create-vpc-endpoint \
--vpc-id vpc-0abc123 \
--service-name com.amazonaws.ap-southeast-1.s3 \
--vpc-endpoint-type Gateway \
--route-table-ids rtb-private-1 rtb-private-2 rtb-private-3
# VPC Endpoint — Interface type for ECR (charges apply)
aws ec2 create-vpc-endpoint \
--vpc-id vpc-0abc123 \
--service-name com.amazonaws.ap-southeast-1.ecr.dkr \
--vpc-endpoint-type Interface \
--subnet-ids subnet-private-1 subnet-private-2 \
--security-group-ids sg-endpoints \
--private-dns-enabled
Load Balancers
# Create Application Load Balancer (Layer 7)
aws elbv2 create-load-balancer \
--name prod-alb \
--type application \
--scheme internet-facing \
--subnets subnet-public-1a subnet-public-1b subnet-public-1c \
--security-groups sg-alb \
--ip-address-type ipv4
# Create target group with health check
aws elbv2 create-target-group \
--name prod-tg-app \
--protocol HTTP \
--port 8080 \
--vpc-id vpc-0abc123 \
--health-check-path /health \
--health-check-interval-seconds 15 \
--healthy-threshold-count 2 \
--unhealthy-threshold-count 3 \
--target-type ip
# Create HTTPS listener that forwards to the target group
aws elbv2 create-listener \
--load-balancer-arn arn:aws:elasticloadbalancing:... \
--protocol HTTPS \
--port 443 \
--certificates CertificateArn=arn:aws:acm:ap-southeast-1:123456789012:certificate/abc \
--default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:...
# Add path-based routing rule
aws elbv2 create-rule \
--listener-arn arn:aws:elasticloadbalancing:...:listener/... \
--priority 10 \
--conditions '[{"Field":"path-pattern","Values":["/api/*"]}]' \
--actions '[{"Type":"forward","TargetGroupArn":"arn:aws:elasticloadbalancing:...:targetgroup/api-tg/..."}]'
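Listener rules are evaluated in priority order (lowest number first), and the listener's default action applies when nothing matches. The sketch below imitates that behaviour for path patterns only. It is an approximation: `fnmatchcase` is used as a stand-in for the load balancer's wildcard matching and also accepts `[...]` character classes that a path-pattern condition does not, and the target names are illustrative.

```python
from fnmatch import fnmatchcase

# (priority, path pattern, target group name); mirrors the rule created above.
RULES = [
    (10, "/api/*", "api-tg"),
]
DEFAULT_TARGET = "prod-tg-app"  # the listener's default forward action


def route(path: str, rules=RULES, default: str = DEFAULT_TARGET) -> str:
    """Return the target group for a request path, lowest priority number first."""
    for _priority, pattern, target in sorted(rules):
        if fnmatchcase(path, pattern):  # case-sensitive, like path-pattern conditions
            return target
    return default


if __name__ == "__main__":
    for path in ("/api/orders", "/index.html", "/api"):
        print(path, "->", route(path))
```

Note that `/api` itself does not match `/api/*`; if the bare prefix should also reach the API target group, the rule needs a second pattern value.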
# Create Network Load Balancer (Layer 4 — static IPs, PrivateLink)
aws elbv2 create-load-balancer \
--name prod-nlb \
--type network \
--scheme internal \
--subnets subnet-private-1a subnet-private-1b subnet-private-1c
Route 53
# Create private hosted zone
aws route53 create-hosted-zone \
--name internal.example.com \
--caller-reference $(date +%s) \
--hosted-zone-config Comment="Internal DNS",PrivateZone=true \
--vpc VPCRegion=ap-southeast-1,VPCId=vpc-0abc123
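# Create A record (alias to ALB)
# AliasTarget.HostedZoneId is the load balancer's own zone ID for its region, not your hosted zone:
# aws elbv2 describe-load-balancers --names prod-alb --query 'LoadBalancers[0].CanonicalHostedZoneId'
aws route53 change-resource-record-sets \
--hosted-zone-id Z1234ABCD \
--change-batch '{
"Changes": [{
"Action": "UPSERT",
"ResourceRecordSet": {
"Name": "app.example.com",
"Type": "A",
"AliasTarget": {
"HostedZoneId": "Z1LMS91P8CMLE5",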
"DNSName": "prod-alb-123456789.ap-southeast-1.elb.amazonaws.com",
"EvaluateTargetHealth": true
}
}
}]
}'
# Weighted routing (for canary/A-B testing)
# 90% to v1, 10% to v2
aws route53 change-resource-record-sets \
--hosted-zone-id Z1234ABCD \
--change-batch '{
"Changes": [
{
"Action": "UPSERT",
"ResourceRecordSet": {
"Name": "api.example.com",
"Type": "A",
"SetIdentifier": "v1",
"Weight": 90,
"AliasTarget": { "HostedZoneId": "...", "DNSName": "v1-alb...", "EvaluateTargetHealth": true }
}
},
{
"Action": "UPSERT",
"ResourceRecordSet": {
"Name": "api.example.com",
"Type": "A",
"SetIdentifier": "v2",
"Weight": 10,
"AliasTarget": { "HostedZoneId": "...", "DNSName": "v2-alb...", "EvaluateTargetHealth": true }
}
}
]
}'
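With weighted records, the chance that a given DNS query is answered with a record is its weight divided by the sum of all weights in the set. The following sketch computes that share for the 90/10 split above and runs a small simulation; the function names are hypothetical, and real traffic will only approximate these numbers because resolvers cache answers for the record's TTL.

```python
import random
from collections import Counter

WEIGHTS = {"v1": 90, "v2": 10}


def traffic_share(weights: dict[str, int]) -> dict[str, float]:
    """Expected fraction of DNS answers per record: weight / sum of weights."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def simulate(weights: dict[str, int], queries: int, seed: int = 0) -> Counter:
    """Draw `queries` weighted answers with a seeded generator."""
    rng = random.Random(seed)
    names = list(weights)
    picks = rng.choices(names, weights=[weights[n] for n in names], k=queries)
    return Counter(picks)


if __name__ == "__main__":
    print(traffic_share(WEIGHTS))
    print(simulate(WEIGHTS, 10_000))
```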
# Latency-based routing (multi-region)
aws route53 change-resource-record-sets \
--hosted-zone-id Z1234ABCD \
--change-batch '{
"Changes": [{
"Action": "UPSERT",
"ResourceRecordSet": {
"Name": "api.example.com",
"Type": "A",
"SetIdentifier": "ap-southeast-1",
"Region": "ap-southeast-1",
"AliasTarget": { "HostedZoneId": "...", "DNSName": "sg-alb...", "EvaluateTargetHealth": true }
}
}]
}'
CloudFront
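# Create CloudFront distribution with S3 origin and WAF
# CallerReference is required and must be unique per request; the value below is a placeholder
aws cloudfront create-distribution --distribution-config '{
"CallerReference": "prod-cdn-001",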
"Origins": {
"Quantity": 1,
"Items": [{
"Id": "S3-prod-static",
"DomainName": "prod-bucket.s3.ap-southeast-1.amazonaws.com",
"S3OriginConfig": { "OriginAccessIdentity": "origin-access-identity/cloudfront/ABCDEF" }
}]
},
"DefaultCacheBehavior": {
"TargetOriginId": "S3-prod-static",
"ViewerProtocolPolicy": "redirect-to-https",
"CachePolicyId": "658327ea-f89d-4fab-a63d-7e88639e58f6",
"Compress": true
},
"WebACLId": "arn:aws:wafv2:us-east-1:123456789012:global/webacl/prod-waf/abc",
"PriceClass": "PriceClass_All",
"Enabled": true,
"Comment": "Production CDN"
}'
Storage
S3 — Simple Storage Service
# Create bucket with versioning and encryption
aws s3api create-bucket \
--bucket prod-app-assets-123456789012 \
--region ap-southeast-1 \
--create-bucket-configuration LocationConstraint=ap-southeast-1
aws s3api put-bucket-versioning \
--bucket prod-app-assets-123456789012 \
--versioning-configuration Status=Enabled
aws s3api put-bucket-encryption \
--bucket prod-app-assets-123456789012 \
--server-side-encryption-configuration '{
"Rules": [{
"ApplyServerSideEncryptionByDefault": {
"SSEAlgorithm": "aws:kms",
"KMSMasterKeyID": "arn:aws:kms:ap-southeast-1:123456789012:key/abc123"
},
"BucketKeyEnabled": true
}]
}'
# Block all public access
aws s3api put-public-access-block \
--bucket prod-app-assets-123456789012 \
--public-access-block-configuration \
BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
# Lifecycle policy: transition to cheaper storage tiers
aws s3api put-bucket-lifecycle-configuration \
--bucket prod-app-assets-123456789012 \
--lifecycle-configuration '{
"Rules": [{
"ID": "transition-old-objects",
"Status": "Enabled",
"Filter": { "Prefix": "logs/" },
"Transitions": [
{ "Days": 30, "StorageClass": "STANDARD_IA" },
{ "Days": 90, "StorageClass": "GLACIER_IR" },
{ "Days": 365, "StorageClass": "DEEP_ARCHIVE" }
],
"Expiration": { "Days": 2555 }
}]
}'
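To sanity-check the lifecycle rule, it can help to express it as a function of object age. This is a simplified sketch, assuming objects under `logs/` are uploaded as STANDARD; the helper name is hypothetical, and S3 applies lifecycle actions asynchronously, so real transitions can lag the thresholds.

```python
# (minimum age in days, storage class), copied from the rule's Transitions list.
TRANSITIONS = [(30, "STANDARD_IA"), (90, "GLACIER_IR"), (365, "DEEP_ARCHIVE")]
EXPIRATION_DAYS = 2555  # roughly seven years


def storage_class_at(age_days: int) -> str:
    """Return the storage class the rule targets for an object of this age."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days >= EXPIRATION_DAYS:
        return "EXPIRED"
    current = "STANDARD"
    for threshold, storage_class in TRANSITIONS:
        if age_days >= threshold:
            current = storage_class
    return current


if __name__ == "__main__":
    for age in (0, 30, 120, 400, 2555):
        print(age, storage_class_at(age))
```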
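# Cross-region replication (requires versioning on both buckets)
# Rules that use Filter also need a Priority, and replicating KMS-encrypted
# objects needs SourceSelectionCriteria alongside ReplicaKmsKeyID
aws s3api put-bucket-replication \
--bucket prod-app-assets-123456789012 \
--replication-configuration '{
"Role": "arn:aws:iam::123456789012:role/s3-replication-role",
"Rules": [{
"ID": "replicate-to-dr",
"Status": "Enabled",
"Priority": 1,
"Filter": {},
"SourceSelectionCriteria": { "SseKmsEncryptedObjects": { "Status": "Enabled" } },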
"Destination": {
"Bucket": "arn:aws:s3:::prod-app-assets-dr-123456789012",
"StorageClass": "STANDARD_IA",
"EncryptionConfiguration": {
"ReplicaKmsKeyID": "arn:aws:kms:us-east-1:123456789012:key/xyz789"
}
},
"DeleteMarkerReplication": { "Status": "Enabled" }
}]
}'
# Generate presigned URL (15-minute expiry)
aws s3 presign s3://prod-app-assets-123456789012/reports/Q4-2025.pdf \
--expires-in 900
EBS Volume Types
| Type | Use Case | Max IOPS | Max Throughput | Notes |
|---|---|---|---|---|
| gp3 | General purpose SSD | 16,000 | 1,000 MiB/s | Baseline of 3,000 IOPS and 125 MiB/s included; IOPS and throughput are provisioned independently of size |
| io2 Block Express | Critical databases | 256,000 | 4,000 MiB/s | 99.999% durability; sub-millisecond latency |
| st1 | Throughput-intensive HDD | 500 | 500 MiB/s | Big data, log processing; cannot be boot volume |
| sc1 | Cold HDD (infrequent access) | 250 | 250 MiB/s | Lowest cost; cold data archives; cannot be boot volume |
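Because gp3 IOPS and throughput are set independently, it is worth checking a combination before provisioning it. The sketch below uses the gp3 maximums from the table plus two provisioning ratios that are assumptions here (at most 500 IOPS per GiB and 0.25 MiB/s per provisioned IOPS); verify them against the current EBS documentation. `gp3_problems` is a hypothetical helper.

```python
GP3_MAX_IOPS = 16_000        # from the table above
GP3_MAX_THROUGHPUT = 1_000   # MiB/s, from the table above
GP3_BASELINE_IOPS = 3_000    # included with every gp3 volume
GP3_BASELINE_THROUGHPUT = 125  # MiB/s
MAX_IOPS_PER_GIB = 500       # assumed provisioning ratio
MAX_MIBPS_PER_IOPS = 0.25    # assumed provisioning ratio


def gp3_problems(size_gib: int, iops: int, throughput: int) -> list[str]:
    """Return a list of human-readable problems; empty means the combination looks valid."""
    problems = []
    if not GP3_BASELINE_IOPS <= iops <= GP3_MAX_IOPS:
        problems.append(f"iops must be between {GP3_BASELINE_IOPS} and {GP3_MAX_IOPS}")
    if iops > max(GP3_BASELINE_IOPS, size_gib * MAX_IOPS_PER_GIB):
        problems.append(f"{iops} IOPS is too high for a {size_gib} GiB volume")
    if not GP3_BASELINE_THROUGHPUT <= throughput <= GP3_MAX_THROUGHPUT:
        problems.append(
            f"throughput must be between {GP3_BASELINE_THROUGHPUT} and {GP3_MAX_THROUGHPUT} MiB/s"
        )
    if throughput > max(GP3_BASELINE_THROUGHPUT, iops * MAX_MIBPS_PER_IOPS):
        problems.append(f"{throughput} MiB/s needs more than {iops} provisioned IOPS")
    return problems


if __name__ == "__main__":
    print(gp3_problems(200, 6000, 500))   # a 200 GiB volume with 6,000 IOPS and 500 MiB/s
    print(gp3_problems(8, 6000, 125))     # too many IOPS for a small volume
```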
# Create encrypted gp3 volume with custom IOPS
aws ec2 create-volume \
--volume-type gp3 \
--size 200 \
--iops 6000 \
--throughput 500 \
--encrypted \
--kms-key-id arn:aws:kms:ap-southeast-1:123456789012:key/abc123 \
--availability-zone ap-southeast-1a \
--tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=prod-db-data}]'
# Create snapshot with retention tag
aws ec2 create-snapshot \
--volume-id vol-0abc123 \
--description "Daily backup $(date +%Y-%m-%d)" \
--tag-specifications 'ResourceType=snapshot,Tags=[{Key=Retention,Value=30d}]'
EFS — Elastic File System
# Create EFS with encryption and performance mode
aws efs create-file-system \
--encrypted \
--kms-key-id arn:aws:kms:ap-southeast-1:123456789012:key/abc123 \
--performance-mode generalPurpose \
--throughput-mode elastic \
--tags Key=Name,Value=prod-efs
# Create mount targets (one per AZ)
for subnet in subnet-1a subnet-1b subnet-1c; do
aws efs create-mount-target \
--file-system-id fs-0abc123 \
--subnet-id $subnet \
--security-groups sg-efs-mount
done
# Create access point (for EKS persistent volumes)
aws efs create-access-point \
--file-system-id fs-0abc123 \
--posix-user Uid=1000,Gid=1000 \
--root-directory 'Path=/data/app,CreationInfo={OwnerUid=1000,OwnerGid=1000,Permissions=755}'
Security
IAM — Least Privilege Policy Examples
# S3 read-only access to specific bucket and prefix
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "ListBucketWithPrefix",
"Effect": "Allow",
"Action": ["s3:ListBucket"],
"Resource": "arn:aws:s3:::prod-app-assets-123456789012",
"Condition": {
"StringLike": { "s3:prefix": ["reports/*"] }
}
},
{
"Sid": "ReadObjectsInPrefix",
"Effect": "Allow",
"Action": ["s3:GetObject", "s3:GetObjectVersion"],
"Resource": "arn:aws:s3:::prod-app-assets-123456789012/reports/*"
}
]
}
# EKS pod-level IAM via IRSA (IAM Roles for Service Accounts)
# 1. Create OIDC provider for cluster
eksctl utils associate-iam-oidc-provider \
--cluster prod-cluster \
--approve
# 2. Create role with trust policy scoped to specific SA
aws iam create-role \
--role-name prod-app-sa-role \
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.ap-southeast-1.amazonaws.com/id/ABCD1234"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"oidc.eks.ap-southeast-1.amazonaws.com/id/ABCD1234:sub": "system:serviceaccount:default:my-app-sa",
"oidc.eks.ap-southeast-1.amazonaws.com/id/ABCD1234:aud": "sts.amazonaws.com"
}
}
}]
}'
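The trust policy above is easy to mistype because the OIDC issuer appears three times and the `sub` claim has a fixed `system:serviceaccount:<namespace>:<name>` shape. As an illustration, this sketch generates the same document from its four inputs; `irsa_trust_policy` is a hypothetical helper, and the issuer is passed without the `https://` prefix.

```python
import json


def irsa_trust_policy(account_id: str, oidc_issuer: str, namespace: str, service_account: str) -> dict:
    """Build a trust policy that lets exactly one Kubernetes service account assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_issuer}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    f"{oidc_issuer}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                    f"{oidc_issuer}:aud": "sts.amazonaws.com",
                }
            },
        }],
    }


if __name__ == "__main__":
    policy = irsa_trust_policy(
        "123456789012",
        "oidc.eks.ap-southeast-1.amazonaws.com/id/ABCD1234",
        "default",
        "my-app-sa",
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON can be saved to a file and passed to `aws iam create-role` with `--assume-role-policy-document file://...` instead of an inline string.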
KMS — Key Management Service
# Create a symmetric customer managed key
aws kms create-key \
--description "Production application data encryption key" \
--key-usage ENCRYPT_DECRYPT \
--key-spec SYMMETRIC_DEFAULT \
--tags TagKey=Environment,TagValue=prod
# Create key alias
aws kms create-alias \
--alias-name alias/prod-app-key \
--target-key-id arn:aws:kms:ap-southeast-1:123456789012:key/abc123
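# Grant cross-account access via key policy
# (put-key-policy needs a key ID or key ARN; alias names are not accepted)
aws kms put-key-policy \
--key-id arn:aws:kms:ap-southeast-1:123456789012:key/abc123 \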
--policy-name default \
--policy '{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "Enable IAM Root Access",
"Effect": "Allow",
"Principal": { "AWS": "arn:aws:iam::123456789012:root" },
"Action": "kms:*",
"Resource": "*"
},
{
"Sid": "Allow Cross Account Usage",
"Effect": "Allow",
"Principal": { "AWS": "arn:aws:iam::999888777666:role/app-role" },
"Action": ["kms:Decrypt", "kms:GenerateDataKey"],
"Resource": "*"
}
]
}'
# Envelope encryption example (encrypt data key with CMK)
# 1. Generate data key
aws kms generate-data-key \
--key-id alias/prod-app-key \
--key-spec AES_256
# 2. Decrypt data key when needed
aws kms decrypt \
--ciphertext-blob fileb://encrypted-data-key.bin \
--key-id alias/prod-app-key
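The two steps above leave a gap: `generate-data-key` prints JSON with base64-encoded `Plaintext` and `CiphertextBlob` fields, while the `decrypt` call reads raw bytes from `encrypted-data-key.bin`. A minimal sketch of the glue in between, assuming the CLI output was captured as a string; the helper names are hypothetical.

```python
import base64
import json


def split_data_key(cli_output: str) -> tuple[bytes, bytes]:
    """Decode the plaintext and encrypted copies of the data key from the CLI JSON."""
    response = json.loads(cli_output)
    plaintext_key = base64.b64decode(response["Plaintext"])
    encrypted_key = base64.b64decode(response["CiphertextBlob"])
    if len(plaintext_key) != 32:
        raise ValueError("an AES_256 data key should be 32 bytes")
    return plaintext_key, encrypted_key


def save_encrypted_key(encrypted_key: bytes, path: str = "encrypted-data-key.bin") -> None:
    """Store only the encrypted copy; this is the file the decrypt step reads."""
    with open(path, "wb") as handle:
        handle.write(encrypted_key)
```

The plaintext key should be used to encrypt the data locally and then discarded from memory; only the encrypted copy is stored next to the data.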
Secrets Manager
# Create a secret
aws secretsmanager create-secret \
--name prod/myapp/db-credentials \
--description "Production database credentials" \
--secret-string '{"username":"admin","password":"S3cur3P@ssw0rd","host":"db.example.com","port":5432}' \
--kms-key-id alias/prod-app-key \
--tags Key=Environment,Value=prod
# Retrieve secret value
aws secretsmanager get-secret-value \
--secret-id prod/myapp/db-credentials \
--query SecretString \
--output text | python3 -m json.tool
# Enable automatic rotation (requires a Lambda rotation function)
aws secretsmanager rotate-secret \
--secret-id prod/myapp/db-credentials \
--rotation-lambda-arn arn:aws:lambda:ap-southeast-1:123456789012:function:SecretsManagerRotation \
--rotation-rules AutomaticallyAfterDays=30
# Python boto3 — retrieve secret in application code
import boto3
import json


def get_secret(secret_name: str, region: str = "ap-southeast-1") -> dict:
    """Fetch a secret and parse its JSON payload."""
    client = boto3.client("secretsmanager", region_name=region)
    response = client.get_secret_value(SecretId=secret_name)
    return json.loads(response["SecretString"])


creds = get_secret("prod/myapp/db-credentials")
db_host = creds["host"]
db_pass = creds["password"]
Security Services Overview
# Enable GuardDuty (threat detection)
aws guardduty create-detector \
--enable \
--finding-publishing-frequency FIFTEEN_MINUTES \
--features '[{"Name":"S3_DATA_EVENTS","Status":"ENABLED"},{"Name":"EKS_AUDIT_LOGS","Status":"ENABLED"},{"Name":"EBS_MALWARE_PROTECTION","Status":"ENABLED"}]'
# Enable Security Hub with standards
aws securityhub enable-security-hub \
--enable-default-standards
aws securityhub batch-enable-standards \
--standards-subscription-requests \
'[{"StandardsArn":"arn:aws:securityhub:ap-southeast-1::standards/cis-aws-foundations-benchmark/v/1.4.0"},
{"StandardsArn":"arn:aws:securityhub:ap-southeast-1::standards/aws-foundational-security-best-practices/v/1.0.0"}]'
# Enable CloudTrail (management events are logged by default; data events need a separate put-event-selectors call)
aws cloudtrail create-trail \
--name prod-trail \
--s3-bucket-name prod-cloudtrail-logs-123456789012 \
--include-global-service-events \
--is-multi-region-trail \
--enable-log-file-validation \
--kms-key-id alias/prod-app-key
aws cloudtrail start-logging --name prod-trail
# AWS Config: create the configuration recorder (a delivery channel and start-configuration-recorder are still needed before it records)
aws configservice put-configuration-recorder \
--configuration-recorder '{
"name": "default",
"roleARN": "arn:aws:iam::123456789012:role/config-role",
"recordingGroup": {
"allSupported": true,
"includeGlobalResourceTypes": true
}
}'
Monitoring
CloudWatch
# Create CloudWatch alarm for high CPU
aws cloudwatch put-metric-alarm \
--alarm-name "prod-ec2-high-cpu" \
--alarm-description "CPU at or above 80% for 10 minutes (2 x 5-minute periods)" \
--metric-name CPUUtilization \
--namespace AWS/EC2 \
--statistic Average \
--period 300 \
--threshold 80 \
--comparison-operator GreaterThanOrEqualToThreshold \
--evaluation-periods 2 \
--dimensions Name=AutoScalingGroupName,Value=prod-asg \
--alarm-actions arn:aws:sns:ap-southeast-1:123456789012:prod-alerts \
--ok-actions arn:aws:sns:ap-southeast-1:123456789012:prod-alerts
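The interaction of `--period`, `--evaluation-periods` and the comparison operator is easier to see in code. This is a deliberately simplified sketch of the evaluation for the alarm above: it fires only when the two most recent 5-minute averages are both at or above the threshold. It ignores missing-data handling and "M out of N" settings, and `alarm_state` is a hypothetical helper.

```python
PERIOD_SECONDS = 300
EVALUATION_PERIODS = 2
THRESHOLD = 80.0


def alarm_state(datapoints: list[float],
                threshold: float = THRESHOLD,
                evaluation_periods: int = EVALUATION_PERIODS) -> str:
    """Evaluate GreaterThanOrEqualToThreshold over the most recent periods."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    return "ALARM" if all(value >= threshold for value in recent) else "OK"


if __name__ == "__main__":
    print(alarm_state([50.0, 85.0, 90.0]))  # both recent periods breach
    print(alarm_state([85.0, 70.0]))        # latest period recovered
    print("minimum time to alarm:", PERIOD_SECONDS * EVALUATION_PERIODS, "seconds")
```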
# CloudWatch Logs Insights — query examples
# Top 10 error messages in last hour
aws logs start-query \
--log-group-name /aws/eks/prod-cluster/application \
--start-time $(date -d '1 hour ago' +%s) \
--end-time $(date +%s) \
--query-string '
fields @timestamp, @message
| filter @message like /ERROR/
| stats count(*) as error_count by @message
| sort error_count desc
| limit 10
'
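# P99 latency from ALB access logs (ALB writes access logs to S3; this assumes they are forwarded to this log group with these field names)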
aws logs start-query \
--log-group-name /aws/elasticloadbalancing/prod-alb \
--start-time $(date -d '1 hour ago' +%s) \
--end-time $(date +%s) \
--query-string '
fields @timestamp, target_processing_time
| filter elb_status_code >= 200
| stats pct(target_processing_time, 99) as p99_latency,
pct(target_processing_time, 95) as p95_latency,
avg(target_processing_time) as avg_latency
by bin(5m)
'
# Container Insights: install the CloudWatch Observability add-on on EKS
aws eks create-addon \
--cluster-name prod-cluster \
--addon-name amazon-cloudwatch-observability \
--addon-version v1.7.0-eksbuild.1
# Create dashboard
aws cloudwatch put-dashboard \
--dashboard-name prod-overview \
--dashboard-body file://dashboard.json
AWS X-Ray: Enable distributed tracing by adding the X-Ray SDK to your application and running the X-Ray daemon as a sidecar container in EKS pods. Use aws xray get-service-graph to retrieve the service graph for a time range, which shows service dependencies and helps identify bottlenecks.