GCP Cost Optimization
GCP Pricing Models
GCP offers four distinct pricing models for Compute Engine resources. Understanding how they interact, and which workloads fit each, is critical to building an efficient commitment portfolio.
On-Demand (Pay-as-you-go)
Per-second billing (minimum 1 minute) for most VM types. No commitment required. This is the baseline price used as the reference for all discount calculations. Custom machine types are also available at On-Demand pricing — useful for precisely matching workload requirements without paying for unused CPU or memory.
Sustained Use Discounts (SUD) — Automatic
GCP automatically applies SUDs when you run eligible VMs for more than 25% of the month. Discounts increase with usage, reaching up to 30% for N1 VMs (up to 20% for N2 and C2) at full-month usage, with no action required. SUDs apply per project, per region, and are calculated on aggregate vCPU and memory consumed, not per individual VM. This means even VM instances that are created and destroyed throughout the month can accumulate SUD benefit if total resource hours are high enough.
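The tiered accrual can be sketched in a few lines. The incremental rates below (each quarter of the month billed at 100%, 80%, 60%, then 40% of the base rate, the N1 shape) are illustrative; verify against current pricing before relying on them:

```python
# Sketch of how sustained use discounts accrue for N1 VMs. Assumed
# incremental rates: each quarter of the month is billed at 100%, 80%,
# 60%, then 40% of the base rate, yielding 30% off for full-month use.
def sud_effective_discount(usage_fraction: float) -> float:
    """Return the effective discount for a given fraction of the month used."""
    tier_rates = [1.0, 0.8, 0.6, 0.4]  # N1 shape; N2/C2 tiers are shallower (20% max)
    billed = 0.0
    remaining = usage_fraction
    for rate in tier_rates:
        in_tier = min(remaining, 0.25)
        billed += in_tier * rate
        remaining -= in_tier
        if remaining <= 0:
            break
    return 1 - billed / usage_fraction if usage_fraction > 0 else 0.0
```

For example, running a VM for half the month yields a 10% effective discount, not 15%, because the deeper tiers are never reached.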
Committed Use Discounts (CUDs)
Purchase a commitment for 1 year or 3 years in exchange for a deeper discount beyond SUD. CUDs come in two flavors: Resource-based (commit to specific vCPU and memory in a region) and Flexible (commit to a spend level in $/hour). See the dedicated section below for a full comparison.
Preemptible VMs / Spot VMs
Preemptible VMs (the original offering) and Spot VMs (newer, no maximum 24-hour runtime limit) provide discounts of 60–91% by using spare GCP capacity. Google can terminate these VMs with a 30-second shutdown notice. Spot VMs are the recommended choice for new workloads as they have no maximum runtime restriction and support the same machine types.
Committed Use Discounts (CUDs)
Resource-Based vs Flexible CUDs
| Attribute | Resource-Based CUD | Flexible CUD |
|---|---|---|
| What you commit | Specific vCPU and memory in a region | $/hour of compute spend in a region |
| Max discount (1yr) | Up to 37% vs On-Demand (note: CUD-covered resources are not eligible for SUD) | Up to 28% |
| Max discount (3yr) | Up to 55% vs On-Demand (up to 70% for memory-optimized types) | Up to 46% |
| Machine family flexibility | Low — bound to specific machine series (e.g., N2) | High — applies across general-purpose and compute-optimized series (N2, C2, C3, and more) |
| Applies to | Compute Engine VMs (specific series) | Most Compute Engine machine types |
| Best for | Stable, predictable workloads on a known machine series | Mixed or evolving machine families; GKE node pools |
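A quick way to sanity-check a commitment: a resource-based CUD is billed whether or not the committed resources run, so the break-even utilization is simply one minus the discount. A minimal sketch, assuming a 37% one-year discount and a 730-hour month:

```python
# A resource-based CUD is billed for every hour of the term, used or not.
# At an assumed 37% discount, the commitment beats On-Demand once expected
# utilization of the committed size exceeds 63%.
def cud_breakeven_utilization(discount: float) -> float:
    """Minimum fraction of the commitment you must actually use to break even."""
    return 1.0 - discount

def monthly_cost(on_demand_rate: float, hours_used: float, discount: float,
                 committed: bool, hours_in_month: float = 730) -> float:
    """Committed cost accrues every hour; On-Demand only for hours used."""
    if committed:
        return on_demand_rate * (1 - discount) * hours_in_month
    return on_demand_rate * hours_used
```

In practice, commit only to the portion of your fleet whose utilization you can forecast above the break-even line, and leave the volatile remainder on On-Demand or Spot.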
Purchasing a CUD
# Purchase a 1-year resource-based CUD for N2 vCPUs in us-central1
gcloud compute commitments create my-n2-commitment \
--region=us-central1 \
--plan=12-month \
--resources=vcpu=32,memory=128GB \
--type=general-purpose-n2
# List existing commitments and their utilization
gcloud compute commitments list \
--filter="region:us-central1" \
--format="table(name,region,plan,status,startTimestamp,endTimestamp)"
Preemptible and Spot VMs
Key Differences
| Attribute | Preemptible VM | Spot VM |
|---|---|---|
| Maximum runtime | 24 hours (hard limit) | No maximum runtime |
| Shutdown notice | 30 seconds | 30 seconds |
| Pricing | ~60–91% off On-Demand (legacy Preemptible VMs are now billed at Spot prices) | Dynamic pricing, ~60–91% off On-Demand; can change at most once every 30 days |
| Recommendation | Legacy; use for existing workloads only | Preferred for new workloads; no 24-hr restart penalty |
| Live migration | Not supported | Not supported |
Shutdown Script for Graceful Termination
#!/bin/bash
# /etc/spot-shutdown.sh
# Set as the instance's shutdown-script metadata key.
# GCP calls this script when the VM receives a preemption notice.
echo "Spot VM preemption notice received at $(date)" >> /var/log/spot-shutdown.log
# Stop application gracefully
systemctl stop myapp
# Flush in-progress work to GCS (example: copy work-in-progress files)
gsutil -m cp /var/app/in-progress/* gs://my-bucket/checkpoints/
# Optionally detach an unmanaged instance group from the backend service.
# NOTE: remove-backend detaches the ENTIRE group, not just this VM; instances
# in a managed instance group should instead drain via failing health checks.
INSTANCE_NAME=$(curl -s -H "Metadata-Flavor: Google" \
"http://metadata.google.internal/computeMetadata/v1/instance/name")
ZONE=$(curl -s -H "Metadata-Flavor: Google" \
"http://metadata.google.internal/computeMetadata/v1/instance/zone" | awk -F/ '{print $NF}')
gcloud compute backend-services remove-backend my-backend-service \
--instance-group=my-instance-group \
--instance-group-zone="$ZONE" \
--global 2>/dev/null || true
echo "Graceful shutdown complete" >> /var/log/spot-shutdown.log
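In addition to the shutdown script, a long-running worker can poll the metadata server's preempted flag and checkpoint proactively. The endpoint path and Metadata-Flavor header follow the documented GCE metadata conventions; the polling helper itself is a sketch:

```python
# Poll the GCE metadata server's "preempted" flag from inside the VM.
# The endpoint returns the literal string TRUE once a preemption notice
# has been issued, FALSE otherwise.
import urllib.request

PREEMPTED_URL = ("http://metadata.google.internal/computeMetadata/v1/"
                 "instance/preempted")

def parse_preempted(body: str) -> bool:
    """The endpoint returns the literal string TRUE or FALSE."""
    return body.strip().upper() == "TRUE"

def check_preempted() -> bool:
    """Query the metadata server; only works from inside a GCE VM."""
    req = urllib.request.Request(PREEMPTED_URL,
                                 headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(req, timeout=2) as resp:
        return parse_preempted(resp.read().decode())
```

Checkpointing as soon as this flag flips TRUE buys slightly more time than waiting for the shutdown script alone, since the 30-second window includes script execution.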
Managed Instance Group with Spot VMs (Terraform)
resource "google_compute_instance_template" "spot_template" {
name_prefix = "spot-worker-"
machine_type = "n2-standard-4"
disk {
source_image = "debian-cloud/debian-12"
auto_delete = true
boot = true
disk_size_gb = 50
}
network_interface {
network = "default"
subnetwork = var.subnet_self_link
}
scheduling {
preemptible = true
automatic_restart = false
provisioning_model = "SPOT"
instance_termination_action = "STOP"
}
metadata = {
shutdown-script = file("${path.module}/scripts/spot-shutdown.sh")
}
lifecycle {
create_before_destroy = true
}
}
resource "google_compute_instance_group_manager" "spot_mig" {
name = "spot-worker-mig"
base_instance_name = "spot-worker"
zone = var.zone
version {
instance_template = google_compute_instance_template.spot_template.id
}
target_size = 10
auto_healing_policies {
health_check = google_compute_health_check.default.id
initial_delay_sec = 120
}
}
Compute Engine Rightsizing with GCP Recommender
GCP Recommender continuously analyzes VM CPU and memory metrics and generates rightsizing recommendations. Recommendations are available in the console under "Recommendations Hub" and via the Recommender API and gcloud CLI.
# List VM rightsizing recommendations for a project in a specific zone
gcloud recommender recommendations list \
--project=my-project \
--location=asia-southeast1-b \
--recommender=google.compute.instance.MachineTypeRecommender \
--format="table(
name.basename(),
content.overview.resourceName,
content.overview.recommendation,
primaryImpact.costProjection.cost.units,
stateInfo.state
)"
# List idle VM recommendations (candidates for shutdown or deletion)
gcloud recommender recommendations list \
--project=my-project \
--location=asia-southeast1-b \
--recommender=google.compute.instance.IdleResourceRecommender \
--format="table(
name.basename(),
content.overview.resourceName,
primaryImpact.costProjection.cost.units,
stateInfo.state
)"
# Mark a recommendation as claimed (indicate you are acting on it)
gcloud recommender recommendations mark-claimed \
RECOMMENDATION_ID \
--project=my-project \
--location=asia-southeast1-b \
--recommender=google.compute.instance.MachineTypeRecommender \
--etag=ETAG_VALUE \
--state-metadata=priority=high
GCS Cost Optimization
Storage Classes
| Storage Class | Use Case | Min Storage Duration | Retrieval Cost | Approx. Cost/GB/mo |
|---|---|---|---|---|
| Standard | Frequently accessed, hot data | None | Free | $0.020 |
| Nearline | Accessed less than once per month | 30 days | $0.01/GB | $0.010 |
| Coldline | Accessed less than once per quarter | 90 days | $0.02/GB | $0.004 |
| Archive | Long-term retention, rarely accessed | 365 days | $0.05/GB | $0.0012 |
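Using the approximate prices in the table above (illustrative figures, not a pricing quote), the Standard-vs-Nearline decision reduces to how often the data is read. A rough per-GB model:

```python
# Break-even sketch between Standard and Nearline, using the approximate
# prices from the table: Standard $0.020/GB-mo with free reads;
# Nearline $0.010/GB-mo plus $0.01/GB retrieved.
def monthly_cost_per_gb(storage_rate: float, retrieval_rate: float,
                        reads_per_month: float) -> float:
    """Total monthly cost per GB: storage plus retrieval charges."""
    return storage_rate + retrieval_rate * reads_per_month

def standard(reads_per_month: float) -> float:
    return monthly_cost_per_gb(0.020, 0.00, reads_per_month)

def nearline(reads_per_month: float) -> float:
    return monthly_cost_per_gb(0.010, 0.01, reads_per_month)

# Nearline wins below ~1 full read of the data per month; Standard wins above.
```

The same arithmetic applies to Coldline and Archive with steeper retrieval rates, which is why the lifecycle rules below only demote objects after sustained inactivity.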
GCS Lifecycle Rule Example
{
"lifecycle": {
"rule": [
{
"action": {
"type": "SetStorageClass",
"storageClass": "NEARLINE"
},
"condition": {
"age": 30,
"matchesStorageClass": ["STANDARD"]
}
},
{
"action": {
"type": "SetStorageClass",
"storageClass": "COLDLINE"
},
"condition": {
"age": 90,
"matchesStorageClass": ["NEARLINE"]
}
},
{
"action": {
"type": "SetStorageClass",
"storageClass": "ARCHIVE"
},
"condition": {
"age": 365,
"matchesStorageClass": ["COLDLINE"]
}
},
{
"action": {
"type": "Delete"
},
"condition": {
"age": 2555,
"isLive": false
}
}
]
}
}
# Apply lifecycle configuration to a GCS bucket
gsutil lifecycle set lifecycle.json gs://my-bucket
# Verify the policy was applied
gsutil lifecycle get gs://my-bucket
BigQuery Cost Optimization
BigQuery costs are driven primarily by query processing (bytes scanned) and storage. On-demand pricing charges per byte scanned; slot reservations provide predictable capacity-based pricing.
On-Demand vs Slot Reservations
| Model | Pricing Basis | Best For | Risk |
|---|---|---|---|
| On-Demand | $6.25 per TiB of bytes scanned | Exploratory analysis, low query volume, unpredictable workloads | A single poorly written query scanning petabytes can create an unexpectedly large bill |
| Slot Reservations (Standard) | Flat hourly rate per slot; 100-slot minimum | Consistent, high-volume query workloads | Unused slots are wasted; must model query concurrency to size correctly |
| Slot Reservations (Enterprise) | Monthly or annual commitment; autoscaling available | Production BI platforms, large data teams with SLA requirements | Higher commitment; requires accurate forecasting |
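To size a reservation, compare expected on-demand scan charges against the flat slot cost. The $6.25/TiB rate comes from the table above; the $0.04/slot-hour Standard-edition rate below is an assumption, so check current pricing before committing:

```python
# Break-even sketch: on-demand scan charges vs a flat slot reservation.
# Assumed rates: $6.25/TiB scanned (on-demand), $0.04/slot-hour (Standard
# edition, assumption), 730 hours per month.
def on_demand_monthly(tib_scanned: float, rate_per_tib: float = 6.25) -> float:
    """Monthly on-demand cost for a given scan volume."""
    return tib_scanned * rate_per_tib

def reservation_monthly(slots: int, rate_per_slot_hour: float = 0.04,
                        hours: float = 730) -> float:
    """Monthly cost of an always-on slot reservation."""
    return slots * rate_per_slot_hour * hours

# A 100-slot baseline (~$2,920/mo at these assumed rates) breaks even
# around ~467 TiB scanned per month under on-demand pricing.
```

Below that scan volume, on-demand is cheaper; above it, a reservation wins, provided the slots also deliver acceptable query concurrency.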
Partitioned Tables
Partitioning a table on a DATE or TIMESTAMP column, or on an INTEGER range, allows BigQuery to skip entire partitions that do not match the query's filter — dramatically reducing bytes scanned and therefore cost.
-- Create a partitioned and clustered table for optimal cost
CREATE TABLE `my_project.my_dataset.events`
PARTITION BY DATE(event_timestamp)
CLUSTER BY user_id, event_type
OPTIONS (
require_partition_filter = TRUE,
partition_expiration_days = 365
) AS
SELECT * FROM `my_project.my_dataset.events_raw`;
-- Query that benefits from partition pruning (scans only 2026-03-28 partition)
SELECT
user_id,
COUNT(*) AS event_count
FROM `my_project.my_dataset.events`
WHERE DATE(event_timestamp) = '2026-03-28'
AND event_type = 'purchase'
GROUP BY user_id;
-- Estimate bytes processed before running a query (dry run)
-- Run with --dry_run flag via CLI:
-- bq query --dry_run --use_legacy_sql=false 'SELECT ...'
Query Cost Estimation
# Estimate bytes processed for a query before executing it
bq query \
--dry_run \
--use_legacy_sql=false \
'SELECT user_id, COUNT(*) FROM `my_project.dataset.events`
WHERE DATE(event_timestamp) = "2026-03-28"
GROUP BY user_id'
# Output: "Query successfully validated. Assuming the tables are not modified,
# running this query will process X bytes."
# Cost estimate = X bytes / 2**40 (bytes per TiB) * $6.25
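The arithmetic above can be wrapped in a helper that converts a dry run's byte count into an estimated on-demand charge (assuming the $6.25/TiB rate):

```python
# Convert a dry run's "will process X bytes" figure into an estimated
# on-demand charge, assuming $6.25 per TiB (1 TiB = 2**40 bytes).
def estimated_query_cost_usd(bytes_processed: int,
                             rate_per_tib: float = 6.25) -> float:
    """Estimated on-demand cost for a query scanning the given byte count."""
    return bytes_processed / 2**40 * rate_per_tib
```

Wiring this into CI (fail the build if an estimated query cost exceeds a threshold) is a cheap guardrail against accidental full-table scans.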
Avoid SELECT * on large tables. Always select only the columns you need: BigQuery's columnar storage means you are billed for the bytes read in each referenced column, so selecting 3 columns from a 200-column table can cut scanned bytes by well over 90%.
GKE Cost Optimization
Cluster Autoscaler
Enable Cluster Autoscaler on all node pools to automatically add nodes when pending pods cannot be scheduled and remove underutilized nodes. Combined with Spot node pools for non-critical workloads, this is typically the single largest GKE cost reduction lever.
# Enable autoscaling on a node pool (min 1, max 10 nodes)
gcloud container clusters update my-cluster \
--enable-autoscaling \
--min-nodes=1 \
--max-nodes=10 \
--node-pool=default-pool \
--region=asia-southeast1
# Enable the Vertical Pod Autoscaler (VPA) on the cluster
gcloud container clusters update my-cluster \
--enable-vertical-pod-autoscaling \
--region=asia-southeast1
Spot Node Pools in Terraform
resource "google_container_node_pool" "spot_pool" {
name = "spot-node-pool"
cluster = google_container_cluster.primary.name
location = var.region
node_count = null # managed by autoscaler
autoscaling {
min_node_count = 0
max_node_count = 20
}
node_config {
machine_type = "n2-standard-4"
disk_size_gb = 50
disk_type = "pd-standard"
spot = true # Use Spot VMs for this node pool
oauth_scopes = [
"https://www.googleapis.com/auth/cloud-platform"
]
# GKE automatically adds the cloud.google.com/gke-spot=true node label
# when spot = true, so only custom labels need to be declared here.
labels = {
  environment = "prod"
  team        = "platform"
}
taint {
key = "cloud.google.com/gke-spot"
value = "true"
effect = "NO_SCHEDULE"
}
}
management {
auto_repair = true
auto_upgrade = true
}
}
# In your Kubernetes workload spec, tolerate the Spot taint:
# tolerations:
# - key: "cloud.google.com/gke-spot"
# operator: "Equal"
# value: "true"
# effect: "NO_SCHEDULE"
Node Pool Rightsizing
- Use Workload Identity and minimize DaemonSet resource requests to avoid reserving large amounts of node capacity for infrastructure pods.
- Enable Node Auto-Provisioning (NAP) to let GKE automatically select the optimal machine type and size for pending pods.
- Review node utilization with kubectl top nodes and the GKE workload metrics in Cloud Monitoring.
- Set explicit requests and limits on all containers; missing resource requests cause the scheduler to place pods sub-optimally and inflate node count.
Network Egress Costs
GCP's network pricing is tiered by destination. Designing your architecture to minimize cross-region and internet egress is essential for data-intensive workloads.
| Transfer Path | Approximate Cost | Notes |
|---|---|---|
| Same region, same zone (internal IP) | Free | Always use internal IPs for intra-region traffic |
| Same region, different zone (internal IP) | $0.01/GB | Charged in both directions ($0.02/GB round-trip) |
| Cross-region within GCP (Americas) | ~$0.01–0.02/GB | Route-dependent; check GCP pricing calculator for specific pairs |
| Egress to internet | ~$0.08–0.12/GB (Americas/EMEA) | Tiered by monthly volume and destination continent; a small free-tier allowance (1 GB/month) applies |
| Egress via Cloud CDN to internet | ~$0.02–0.06/GB | Tiered; significantly cheaper than VM direct egress at scale |
| GCS to Compute Engine (same region) | Free | No egress charge for GCS-to-GCE in the same region |
| BigQuery to Compute Engine (same region) | Free (Storage API) | BigQuery Storage Read API in the same region incurs no egress |
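The table's rates can be turned into a rough estimator (figures are the approximations above, not a pricing quote):

```python
# Rough egress cost estimator using the approximate per-GB rates from the
# table above. The CDN rate is a midpoint of the tiered ~$0.02-0.06 range.
RATES_PER_GB = {
    "intra_zone": 0.00,   # same region, same zone, internal IP
    "cross_zone": 0.01,   # same region, different zone; charged each direction
    "internet":   0.08,   # direct VM egress, low end of the tiered range
    "cdn":        0.04,   # assumed midpoint of Cloud CDN's tiered pricing
}

def egress_cost(gb: float, path: str, round_trip: bool = False) -> float:
    """Estimate transfer cost; round_trip doubles paths billed both ways."""
    cost = gb * RATES_PER_GB[path]
    return cost * 2 if round_trip else cost
```

Even the $0.01/GB cross-zone rate adds up for chatty services: a replicated database shipping 10 TB/month across zones pays roughly $200/month in round-trip transfer alone.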
Using Cloud CDN to Reduce Egress
# Enable Cloud CDN on a backend bucket (serving static assets from GCS)
gcloud compute backend-buckets create my-static-assets \
--gcs-bucket-name=my-static-bucket \
--enable-cdn \
--cache-mode=CACHE_ALL_STATIC
# Enable CDN on an existing HTTP(S) load balancer backend service
gcloud compute backend-services update my-backend-service \
--enable-cdn \
--cache-mode=CACHE_ALL_STATIC \
--global
# Check CDN cache hit rate in Cloud Monitoring
# Use the metric: loadbalancing.googleapis.com/https/request_count
# Split by: cache_result (HIT, MISS, BYPASS)
Cloud SQL Cost Optimization
Rightsizing Cloud SQL Instances
GCP Recommender includes Cloud SQL idle and overprovisioned instance recommenders, though they are less mature than the Compute Engine ones. Monitor CPU, memory, and disk I/O via Cloud Monitoring and rightsize when average CPU utilization is consistently below roughly 20%. Use gcloud sql instances patch with a new --tier to change machine type; a brief restart is required.
Committed Use Discounts for Cloud SQL
Cloud SQL supports CUDs for instances running on shared-core or dedicated-core machine types. A 1-year commitment delivers approximately 25% savings; a 3-year commitment delivers approximately 52% savings compared to On-Demand for the same instance type.
High Availability Only for Production
Cloud SQL HA (regional instances with standby) roughly doubles the instance cost because GCP provisions a hot standby in a second zone. Enforce a policy via Terraform variable: set availability_type = "REGIONAL" for production and availability_type = "ZONAL" for staging and development environments.
resource "google_sql_database_instance" "primary" {
name = "my-postgres-${var.environment}"
database_version = "POSTGRES_15"
region = var.region
settings {
tier = var.environment == "prod" ? "db-custom-4-15360" : "db-custom-2-7680"
availability_type = var.environment == "prod" ? "REGIONAL" : "ZONAL"
backup_configuration {
  enabled                        = true
  start_time                     = "02:00"
  point_in_time_recovery_enabled = var.environment == "prod"
  backup_retention_settings {
    retained_backups = var.environment == "prod" ? 14 : 3
  }
}
insights_config {
  query_insights_enabled = var.environment == "prod"
}
}
deletion_protection = var.environment == "prod"
}
Budget and Billing Alerts
GCP Budget alerts are project- or billing-account-scoped. They can trigger Pub/Sub notifications for programmatic responses in addition to email alerts.
# Create a monthly budget alert for a GCP project
gcloud billing budgets create \
--billing-account=BILLING_ACCOUNT_ID \
--display-name="Monthly Project Budget - my-project" \
--budget-amount=5000USD \
--calendar-period=month \
--projects=projects/my-project \
--threshold-rule=percent=0.5,basis=current-spend \
--threshold-rule=percent=0.8,basis=current-spend \
--threshold-rule=percent=1.0,basis=current-spend \
--threshold-rule=percent=1.0,basis=forecasted-spend \
--notifications-rule-pubsub-topic=projects/my-project/topics/billing-alerts \
--notifications-rule-disable-default-iam-recipients=false
# List all budgets for a billing account
gcloud billing budgets list \
--billing-account=BILLING_ACCOUNT_ID \
--format="table(name,displayName,amount.specifiedAmount.units,thresholdRules)"
Wire the billing-alerts Pub/Sub topic to a Cloud Function that posts to your team's Slack channel with a direct link to the GCP Cost Management console. For budget overruns above 100%, trigger an automatic page via PagerDuty.
Cost Allocation with Labels
GCP uses labels (key-value pairs) for cost allocation rather than the tag system used by AWS. Labels applied to Compute Engine VMs, GCS buckets, BigQuery datasets, and other resources appear in billing exports.
Recommended Label Strategy
| Label Key | Example Values | Purpose |
|---|---|---|
| environment | prod, staging, dev, sandbox | Separate environment spend; enforce HA/non-HA policies |
| team | platform, backend, data, security | Chargeback and showback by engineering team |
| project | customer-portal, data-lake | Attribute spend to product initiatives |
| cost-center | cc-1001, cc-2003 | Align with finance billing codes |
| managed-by | terraform, manual, gke | Track IaC coverage; identify drift-prone resources |
# Apply labels to a Compute Engine instance
gcloud compute instances add-labels my-vm \
--zone=asia-southeast1-b \
--labels=environment=prod,team=platform,project=customer-portal,cost-center=cc-1001
# Apply labels to a GCS bucket
gsutil label ch \
-l environment:prod \
-l team:data \
-l project:data-lake \
gs://my-bucket
# Find unlabeled Compute Engine instances (missing 'environment' label)
gcloud compute instances list \
--format="table(name,zone,status,labels)" \
--filter="NOT labels.environment:*"
Billing Export to BigQuery
Exporting GCP billing data to BigQuery enables powerful custom analysis using SQL. Enable the export in the Billing console under Billing Export → Standard Usage Cost and Detailed Usage Cost.
-- Top 10 cost drivers by service for the current month
SELECT
service.description AS service,
SUM(cost) + SUM(IFNULL((SELECT SUM(c.amount)
FROM UNNEST(credits) c), 0)) AS total_net_cost_usd
FROM `my_project.billing_export.gcp_billing_export_v1_XXXXXX`
WHERE
DATE(_PARTITIONTIME) >= DATE_TRUNC(CURRENT_DATE(), MONTH)
GROUP BY service
ORDER BY total_net_cost_usd DESC
LIMIT 10;
-- Cost by label (team) for the current month
SELECT
(SELECT value FROM UNNEST(labels) WHERE key = 'team') AS team,
SUM(cost) AS total_cost_usd
FROM `my_project.billing_export.gcp_billing_export_v1_XXXXXX`
WHERE
DATE(_PARTITIONTIME) >= DATE_TRUNC(CURRENT_DATE(), MONTH)
GROUP BY team
ORDER BY total_cost_usd DESC;
-- Identify resources with no 'environment' label
-- (resource.name requires the Detailed Usage Cost export)
SELECT
  resource.name AS resource,
service.description AS service,
SUM(cost) AS cost_usd
FROM `my_project.billing_export.gcp_billing_export_v1_XXXXXX`
WHERE
DATE(_PARTITIONTIME) >= DATE_TRUNC(CURRENT_DATE(), MONTH)
AND NOT EXISTS (
SELECT 1 FROM UNNEST(labels) l WHERE l.key = 'environment'
)
GROUP BY resource, service
HAVING cost_usd > 10
ORDER BY cost_usd DESC;
Recommender API Integration
The GCP Recommender API provides programmatic access to all recommendation types. Use it to build automated cost governance workflows that integrate with your engineering processes.
# List all active recommendations across all recommenders in a project/location
gcloud recommender recommendations list \
--project=my-project \
--location=asia-southeast1-b \
--recommender=google.compute.instance.MachineTypeRecommender \
--filter="stateInfo.state=ACTIVE" \
--format=json | python3 -c "
import json, sys
recs = json.load(sys.stdin)
for r in recs:
name = r.get('content', {}).get('overview', {}).get('resourceName', 'N/A')
savings = r.get('primaryImpact', {}).get('costProjection', {}).get('cost', {}).get('units', 'N/A')
print(f'Resource: {name} | Monthly savings: \${savings}')
"
# List idle persistent disk recommendations
gcloud recommender recommendations list \
--project=my-project \
--location=asia-southeast1 \
--recommender=google.compute.disk.IdleResourceRecommender \
--format="table(
content.overview.resourceName,
primaryImpact.costProjection.cost.units,
stateInfo.state
)"
# Mark a recommendation as succeeded after implementing it
gcloud recommender recommendations mark-succeeded \
RECOMMENDATION_ID \
--project=my-project \
--location=asia-southeast1-b \
--recommender=google.compute.instance.MachineTypeRecommender \
--etag=ETAG_VALUE \
--state-metadata=implementedBy=terraform,ticket=INFRA-1234