GCP Cost Optimization

Google Cloud differentiates itself with automatic Sustained Use Discounts, flexible Committed Use Discounts, and native cost intelligence tools. This guide covers GCP-specific pricing mechanics and practical optimization techniques for Compute Engine, GCS, BigQuery, GKE, and Cloud SQL.

GCP Pricing Models

GCP offers four distinct pricing models for Compute Engine resources. Understanding how they interact, and which ones can actually be combined, is critical to building an efficient commitment portfolio.

On-Demand (Pay-as-you-go)

Per-second billing (minimum 1 minute) for most VM types. No commitment required. This is the baseline price used as the reference for all discount calculations. Custom machine types are also available at On-Demand pricing — useful for precisely matching workload requirements without paying for unused CPU or memory.
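
As a sketch, a custom machine type can be requested directly at instance creation; the instance name, zone, and sizes below are placeholders:

# Create an N2 VM with a custom shape: 6 vCPUs and 24 GB of memory
gcloud compute instances create my-custom-vm \
  --zone=asia-southeast1-b \
  --custom-vm-type=n2 \
  --custom-cpu=6 \
  --custom-memory=24GB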

Sustained Use Discounts (SUD) — Automatic

GCP automatically applies SUDs when you use an eligible VM (N1, N2, or C2 series) for more than 25% of the month. Discounts increase with usage, reaching up to 30% for N1 and up to 20% for N2 and C2 at full-month usage, with no action required. For example, an N1 VM running for half the month earns an effective discount of roughly 10% on that usage. SUDs apply per project and per region, and are calculated on aggregate vCPU and memory consumed, not per individual VM. This means even VM instances that are created and destroyed throughout the month can accumulate SUD benefit if total resource hours are high enough.

Committed Use Discounts (CUDs)

Purchase a commitment for 1 year or 3 years in exchange for a deeper discount beyond SUD. CUDs come in two flavors: Resource-based (commit to specific vCPU and memory in a region) and Flexible (commit to a spend level in $/hour). See the dedicated section below for a full comparison.

Preemptible VMs / Spot VMs

Preemptible VMs (the original offering) and Spot VMs (newer, no maximum 24-hour runtime limit) provide discounts of 60–91% by using spare GCP capacity. Google can terminate these VMs with a 30-second shutdown notice. Spot VMs are the recommended choice for new workloads as they have no maximum runtime restriction and support the same machine types.
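
A minimal sketch of launching a Spot VM from the CLI; the instance name, zone, and machine type are placeholders:

# Create a Spot VM that is stopped, not deleted, on preemption
gcloud compute instances create spot-worker-1 \
  --zone=asia-southeast1-b \
  --machine-type=n2-standard-4 \
  --provisioning-model=SPOT \
  --instance-termination-action=STOP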

Committed Use Discounts (CUDs)

Resource-Based vs Flexible CUDs

| Attribute | Resource-Based CUD | Flexible CUD |
|---|---|---|
| What you commit | Specific vCPU and memory in a region | $/hour of compute spend in a region |
| Max discount (1yr) | Up to 37% (does not stack with SUD) | Up to 28% |
| Max discount (3yr) | Up to 57% for most machine types; up to 70% for memory-optimized | Up to 46% |
| Machine family flexibility | Low; bound to a specific machine series (e.g., N2) | High; applies across N2, C2, M2, C3, and more |
| Applies to | Compute Engine VMs (specific series) | Most Compute Engine machine types |
| Best for | Stable, predictable workloads on a known machine series | Mixed or evolving machine families; GKE node pools |

Purchasing a CUD

# Purchase a 1-year resource-based CUD for N2 vCPUs in us-central1
gcloud compute commitments create my-n2-commitment \
  --region=us-central1 \
  --plan=12-month \
  --resources=vcpu=32,memory=128GB \
  --type=general-purpose-n2

# List existing commitments and their utilization
gcloud compute commitments list \
  --filter="region:us-central1" \
  --format="table(name,region,plan,status,startTimestamp,endTimestamp)"
Strategy: Use Resource-Based CUDs to cover your stable N2 baseline. Note that Sustained Use Discounts do not apply to usage covered by a commitment; they still apply automatically to eligible usage above it. Use Flexible CUDs for GKE clusters where the node pool machine type may change over time. Never commit more vCPU/memory than your lowest observed monthly consumption.

Preemptible and Spot VMs

Key Differences

| Attribute | Preemptible VM | Spot VM |
|---|---|---|
| Maximum runtime | 24 hours (hard limit) | No maximum runtime |
| Shutdown notice | 30 seconds | 30 seconds |
| Pricing | Charged at Spot prices (60–91% off On-Demand) | Dynamic; 60–91% off On-Demand |
| Recommendation | Legacy; use for existing workloads only | Preferred for new workloads; no 24-hr restart penalty |
| Live migration | Not supported | Not supported |

Shutdown Script for Graceful Termination

#!/bin/bash
# /etc/spot-shutdown.sh
# Set as the instance's shutdown-script metadata key.
# GCP calls this script when the VM receives a preemption notice.

echo "Spot VM preemption notice received at $(date)" >> /var/log/spot-shutdown.log

# Stop application gracefully
systemctl stop myapp

# Flush in-progress work to GCS (example: copy work-in-progress files)
gsutil -m cp /var/app/in-progress/* gs://my-bucket/checkpoints/

# Deregister from backend service (if not behind a managed instance group)
INSTANCE_NAME=$(curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/name")

ZONE=$(curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/zone" | awk -F/ '{print $NF}')

gcloud compute backend-services remove-backend my-backend-service \
  --instance-group=my-instance-group \
  --instance-group-zone="$ZONE" \
  --global 2>/dev/null || true

echo "Graceful shutdown complete" >> /var/log/spot-shutdown.log

Managed Instance Group with Spot VMs (Terraform)

resource "google_compute_instance_template" "spot_template" {
  name_prefix  = "spot-worker-"
  machine_type = "n2-standard-4"

  disk {
    source_image = "debian-cloud/debian-12"
    auto_delete  = true
    boot         = true
    disk_size_gb = 50
  }

  network_interface {
    network    = "default"
    subnetwork = var.subnet_self_link
  }

  scheduling {
    preemptible        = true
    automatic_restart  = false
    provisioning_model = "SPOT"
    instance_termination_action = "STOP"
  }

  metadata = {
    shutdown-script = file("${path.module}/scripts/spot-shutdown.sh")
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "google_compute_instance_group_manager" "spot_mig" {
  name               = "spot-worker-mig"
  base_instance_name = "spot-worker"
  zone               = var.zone

  version {
    instance_template = google_compute_instance_template.spot_template.id
  }

  target_size = 10

  auto_healing_policies {
    health_check      = google_compute_health_check.default.id
    initial_delay_sec = 120
  }
}

Compute Engine Rightsizing with GCP Recommender

GCP Recommender continuously analyzes VM CPU and memory metrics and generates rightsizing recommendations. Recommendations are available in the console under "Recommendations Hub" and via the Recommender API and gcloud CLI.

# List VM rightsizing recommendations for a project in a specific zone
gcloud recommender recommendations list \
  --project=my-project \
  --location=asia-southeast1-b \
  --recommender=google.compute.instance.MachineTypeRecommender \
  --format="table(
    name.basename(),
    content.overview.resourceName,
    content.overview.recommendation,
    primaryImpact.costProjection.cost.units,
    stateInfo.state
  )"

# List idle VM recommendations (candidates for shutdown or deletion)
gcloud recommender recommendations list \
  --project=my-project \
  --location=asia-southeast1-b \
  --recommender=google.compute.instance.IdleResourceRecommender \
  --format="table(
    name.basename(),
    content.overview.resourceName,
    primaryImpact.costProjection.cost.units,
    stateInfo.state
  )"

# Mark a recommendation as claimed (indicate you are acting on it)
gcloud recommender recommendations mark-claimed \
  RECOMMENDATION_ID \
  --project=my-project \
  --location=asia-southeast1-b \
  --recommender=google.compute.instance.MachineTypeRecommender \
  --etag=ETAG_VALUE \
  --state-metadata=priority=high
Automation tip: Use the Recommender API in a Cloud Function triggered by a Pub/Sub schedule to automatically create GitHub issues or Jira tickets for each new high-priority recommendation. This brings cost recommendations into your engineering workflow without requiring engineers to check the console.
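
A minimal sketch of that plumbing, assuming the issue-creation logic lives in a function named recommendation-digest; all names here are placeholders:

# Weekly trigger: Cloud Scheduler -> Pub/Sub -> Cloud Function
gcloud pubsub topics create recommender-schedule

gcloud scheduler jobs create pubsub recommender-weekly \
  --schedule="0 9 * * 1" \
  --topic=recommender-schedule \
  --message-body="run" \
  --location=asia-southeast1

gcloud functions deploy recommendation-digest \
  --runtime=python311 \
  --entry-point=main \
  --trigger-topic=recommender-schedule \
  --region=asia-southeast1 \
  --source=./digest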

GCS Cost Optimization

Storage Classes

| Storage Class | Use Case | Min Storage Duration | Retrieval Cost | Approx. Cost/GB/mo |
|---|---|---|---|---|
| Standard | Frequently accessed, hot data | None | Free | $0.020 |
| Nearline | Accessed less than once per month | 30 days | $0.01/GB | $0.010 |
| Coldline | Accessed less than once per quarter | 90 days | $0.02/GB | $0.004 |
| Archive | Long-term retention, rarely accessed | 365 days | $0.05/GB | $0.0012 |
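
Lifecycle rules (next section) handle transitions going forward; for a one-off migration of existing objects you can rewrite them in place. A sketch with placeholder paths:

# Change the storage class of existing objects to Nearline
gsutil -m rewrite -s nearline gs://my-bucket/logs/2025/**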

GCS Lifecycle Rule Example

{
  "lifecycle": {
    "rule": [
      {
        "action": {
          "type": "SetStorageClass",
          "storageClass": "NEARLINE"
        },
        "condition": {
          "age": 30,
          "matchesStorageClass": ["STANDARD"]
        }
      },
      {
        "action": {
          "type": "SetStorageClass",
          "storageClass": "COLDLINE"
        },
        "condition": {
          "age": 90,
          "matchesStorageClass": ["NEARLINE"]
        }
      },
      {
        "action": {
          "type": "SetStorageClass",
          "storageClass": "ARCHIVE"
        },
        "condition": {
          "age": 365,
          "matchesStorageClass": ["COLDLINE"]
        }
      },
      {
        "action": {
          "type": "Delete"
        },
        "condition": {
          "age": 2555,
          "isLive": false
        }
      }
    ]
  }
}
# Apply lifecycle configuration to a GCS bucket
gsutil lifecycle set lifecycle.json gs://my-bucket

# Verify the policy was applied
gsutil lifecycle get gs://my-bucket

BigQuery Cost Optimization

BigQuery costs are driven primarily by query processing (bytes scanned) and storage. On-demand pricing charges per byte scanned; slot reservations provide predictable capacity-based pricing.

On-Demand vs Slot Reservations

| Model | Pricing Basis | Best For | Risk |
|---|---|---|---|
| On-Demand | $6.25 per TiB of bytes scanned | Exploratory analysis, low query volume, unpredictable workloads | A single poorly written query scanning petabytes can create an unexpectedly large bill |
| Slot Reservations (Standard) | Flat hourly rate per slot; 100-slot minimum | Consistent, high-volume query workloads | Unused slots are wasted; must model query concurrency to size correctly |
| Slot Reservations (Enterprise) | Monthly or annual commitment; autoscaling available | Production BI platforms, large data teams with SLA requirements | Higher commitment; requires accurate forecasting |
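
A sketch of creating a reservation and routing a project's queries to it with the bq CLI; the admin project, reservation name, and slot count are placeholders, and Enterprise edition is assumed:

# Create a 100-slot Enterprise reservation in the US multi-region
bq mk --project_id=admin-project --location=US \
  --reservation --edition=ENTERPRISE --slots=100 prod-reservation

# Route a project's query jobs to that reservation
bq mk --project_id=admin-project --location=US \
  --reservation_assignment \
  --reservation_id=admin-project:US.prod-reservation \
  --job_type=QUERY --assignee_type=PROJECT --assignee_id=my-project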

Partitioned Tables

Partitioning a table on a DATE or TIMESTAMP column, or on an INTEGER range, allows BigQuery to skip entire partitions that do not match the query's filter — dramatically reducing bytes scanned and therefore cost.

-- Create a partitioned and clustered table for optimal cost
CREATE TABLE `my_project.my_dataset.events`
PARTITION BY DATE(event_timestamp)
CLUSTER BY user_id, event_type
OPTIONS (
  require_partition_filter = TRUE,
  partition_expiration_days = 365
) AS
SELECT * FROM `my_project.my_dataset.events_raw`;

-- Query that benefits from partition pruning (scans only 2026-03-28 partition)
SELECT
  user_id,
  COUNT(*) AS event_count
FROM `my_project.my_dataset.events`
WHERE DATE(event_timestamp) = '2026-03-28'
  AND event_type = 'purchase'
GROUP BY user_id;

-- Estimate bytes processed before running a query (dry run)
-- Run with --dry_run flag via CLI:
-- bq query --dry_run --use_legacy_sql=false 'SELECT ...'

Query Cost Estimation

# Estimate bytes processed for a query before executing it
bq query \
  --dry_run \
  --use_legacy_sql=false \
  'SELECT user_id, COUNT(*) FROM `my_project.dataset.events`
   WHERE DATE(event_timestamp) = "2026-03-28"
   GROUP BY user_id'

# Output: "Query successfully validated. Assuming the tables are not modified,
# running this query will process X bytes."
# Cost estimate = X bytes / 2^40 * $6.25 (On-Demand is billed per TiB)
Warning: Avoid SELECT * on large tables in BigQuery. Always select only the columns you need. BigQuery bills for the bytes in the columns actually read (columnar storage), so selecting 3 narrow columns from a 200-column table can cut scanned bytes, and therefore cost, by well over 90%.
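
To see the effect directly, dry-run the same table with and without column pruning and compare the reported byte counts; the table name is a placeholder:

# Full-width scan estimate
bq query --dry_run --use_legacy_sql=false \
  'SELECT * FROM `my_project.dataset.events`'

# Narrow scan estimate: only the columns you actually need
bq query --dry_run --use_legacy_sql=false \
  'SELECT user_id, event_type FROM `my_project.dataset.events`'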

GKE Cost Optimization

Cluster Autoscaler

Enable Cluster Autoscaler on all node pools to automatically add nodes when pending pods cannot be scheduled and remove underutilized nodes. Combined with Spot node pools for non-critical workloads, this is typically the single largest GKE cost reduction lever.

# Enable autoscaling on a node pool (min 1, max 10 nodes;
# counts apply per zone for regional clusters)
gcloud container clusters update my-cluster \
  --enable-autoscaling \
  --min-nodes=1 \
  --max-nodes=10 \
  --node-pool=default-pool \
  --region=asia-southeast1

# Enable the Vertical Pod Autoscaler (VPA) on the cluster
gcloud container clusters update my-cluster \
  --enable-vertical-pod-autoscaling \
  --region=asia-southeast1

Spot Node Pools in Terraform

resource "google_container_node_pool" "spot_pool" {
  name       = "spot-node-pool"
  cluster    = google_container_cluster.primary.name
  location   = var.region
  node_count = null  # managed by autoscaler

  autoscaling {
    min_node_count = 0
    max_node_count = 20
  }

  node_config {
    machine_type = "n2-standard-4"
    disk_size_gb = 50
    disk_type    = "pd-standard"

    spot = true  # Use Spot VMs for this node pool

    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]

    labels = {
      # GKE adds the cloud.google.com/gke-spot=true node label automatically
      # for Spot pools; reserved-prefix labels should not be set manually.
      environment = "prod"
      team        = "platform"
    }

    taint {
      key    = "cloud.google.com/gke-spot"
      value  = "true"
      effect = "NO_SCHEDULE"
    }
  }

  management {
    auto_repair  = true
    auto_upgrade = true
  }
}

# In your Kubernetes workload spec, tolerate the Spot taint:
# tolerations:
# - key: "cloud.google.com/gke-spot"
#   operator: "Equal"
#   value: "true"
#   effect: "NO_SCHEDULE"

Node Pool Rightsizing

  • Minimize DaemonSet resource requests so infrastructure pods do not reserve large amounts of node capacity.
  • Enable Node Auto Provisioning (NAP) to let GKE automatically select the optimal machine type and size for pending pods; see the sketch after this list.
  • Review node utilization using kubectl top nodes and the GKE workload metrics in Cloud Monitoring.
  • Set explicit requests and limits on all containers; missing resource requests cause the scheduler to place pods sub-optimally and inflate node count.
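
A minimal sketch of enabling NAP with cluster-wide resource ceilings; the limits shown are illustrative, not recommendations:

# Let GKE create right-sized node pools automatically, within ceilings
gcloud container clusters update my-cluster \
  --enable-autoprovisioning \
  --min-cpu=1 --max-cpu=200 \
  --min-memory=1 --max-memory=800 \
  --region=asia-southeast1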

Network Egress Costs

GCP's network pricing is tiered by destination. Designing your architecture to minimize cross-region and internet egress is essential for data-intensive workloads.

| Transfer Path | Approximate Cost | Notes |
|---|---|---|
| Same region, same zone (internal IP) | Free | Always use internal IPs for intra-region traffic |
| Same region, different zone (internal IP) | $0.01/GB | Charged in each direction ($0.02/GB round-trip) |
| Cross-region within GCP (Americas) | ~$0.01–0.02/GB | Route-dependent; check the GCP pricing calculator for specific pairs |
| Egress to internet (first 1 TB/mo) | ~$0.085–0.12/GB (Americas/EMEA, tier-dependent) | First 1 GB/month free; pricing varies by network tier and destination continent |
| Egress via Cloud CDN to internet | ~$0.02–0.06/GB | Tiered; significantly cheaper than VM direct egress at scale |
| GCS to Compute Engine (same region) | Free | No egress charge for GCS-to-GCE in the same region |
| BigQuery to Compute Engine (same region) | Free (Storage API) | BigQuery Storage Read API in the same region incurs no egress |

Using Cloud CDN to Reduce Egress

# Enable Cloud CDN on a backend bucket (serving static assets from GCS)
gcloud compute backend-buckets create my-static-assets \
  --gcs-bucket-name=my-static-bucket \
  --enable-cdn \
  --cache-mode=CACHE_ALL_STATIC

# Enable CDN on an existing HTTP(S) load balancer backend service
gcloud compute backend-services update my-backend-service \
  --enable-cdn \
  --cache-mode=CACHE_ALL_STATIC \
  --global

# Check CDN cache hit rate in Cloud Monitoring
# Use the metric: loadbalancing.googleapis.com/https/request_count
# Split by: cache_result (HIT, MISS, BYPASS)

Cloud SQL Cost Optimization

Rightsizing Cloud SQL Instances

Cloud SQL also has Recommender-based insights (for example, idle and overprovisioned instance recommenders), though coverage is narrower than for Compute Engine. Monitor CPU, memory, and disk I/O via Cloud Monitoring and rightsize manually when average CPU utilization is consistently below 20%. Use the gcloud sql instances patch command to change the machine tier; a brief restart is required.
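
A sketch of a manual downsize with gcloud; the instance name and target tier are placeholders, and the instance restarts briefly:

# Downsize from 4 vCPU / 15 GB (db-custom-4-15360) to 2 vCPU / 7.5 GB
gcloud sql instances patch my-postgres-prod \
  --tier=db-custom-2-7680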

Committed Use Discounts for Cloud SQL

Cloud SQL supports CUDs for instances running on shared-core or dedicated-core machine types. A 1-year commitment delivers approximately 25% savings; a 3-year commitment delivers approximately 52% savings compared to On-Demand for the same instance type. For a $1,000/month instance, that works out to roughly $750/month on a 1-year commitment and $480/month on a 3-year commitment.

High Availability Only for Production

Cloud SQL HA (regional instances with standby) roughly doubles the instance cost because GCP provisions a hot standby in a second zone. Enforce a policy via Terraform variable: set availability_type = "REGIONAL" for production and availability_type = "ZONAL" for staging and development environments.

resource "google_sql_database_instance" "primary" {
  name             = "my-postgres-${var.environment}"
  database_version = "POSTGRES_15"
  region           = var.region

  settings {
    tier = var.environment == "prod" ? "db-custom-4-15360" : "db-custom-2-7680"

    availability_type = var.environment == "prod" ? "REGIONAL" : "ZONAL"

    backup_configuration {
      enabled                        = true
      start_time                     = "02:00"
      point_in_time_recovery_enabled = var.environment == "prod"
      retained_backups               = var.environment == "prod" ? 14 : 3
    }

    insights_config {
      query_insights_enabled = var.environment == "prod"
    }
  }

  deletion_protection = var.environment == "prod"
}

Budget and Billing Alerts

GCP Budget alerts are project- or billing-account-scoped. They can trigger Pub/Sub notifications for programmatic responses in addition to email alerts.

# Create a monthly budget alert for a GCP project
gcloud billing budgets create \
  --billing-account=BILLING_ACCOUNT_ID \
  --display-name="Monthly Project Budget - my-project" \
  --budget-amount=5000USD \
  --calendar-period=month \
  --projects=projects/my-project \
  --threshold-rule=percent=0.5,basis=CURRENT_SPEND \
  --threshold-rule=percent=0.8,basis=CURRENT_SPEND \
  --threshold-rule=percent=1.0,basis=CURRENT_SPEND \
  --threshold-rule=percent=1.0,basis=FORECASTED_SPEND \
  --notifications-rule-pubsub-topic=projects/my-project/topics/billing-alerts \
  --notifications-rule-disable-default-iam-recipients=false

# List all budgets for a billing account
gcloud billing budgets list \
  --billing-account=BILLING_ACCOUNT_ID \
  --format="table(name,displayName,amount.specifiedAmount.units,thresholdRules)"
Automation: Connect the billing-alerts Pub/Sub topic to a Cloud Function that posts to your team's Slack channel with a direct link to the GCP Cost Management console. For budget overruns above 100%, trigger an automatic page via PagerDuty.
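
Before wiring up Slack, it can help to inspect a raw notification by hand; a sketch using a throwaway pull subscription (name is a placeholder):

# Attach a pull subscription to the budget topic and pull one message
gcloud pubsub subscriptions create billing-alerts-debug \
  --topic=projects/my-project/topics/billing-alerts

gcloud pubsub subscriptions pull billing-alerts-debug --auto-ack --limit=1
# The payload is JSON including fields such as costAmount, budgetAmount,
# and alertThresholdExceeded.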

Cost Allocation with Labels

GCP uses labels (key-value pairs) for cost allocation rather than the tag system used by AWS. Labels applied to Compute Engine VMs, GCS buckets, BigQuery datasets, and other resources appear in billing exports.

Recommended Label Strategy

| Label Key | Example Values | Purpose |
|---|---|---|
| environment | prod, staging, dev, sandbox | Separate environment spend; enforce HA/non-HA policies |
| team | platform, backend, data, security | Chargeback and showback by engineering team |
| project | customer-portal, data-lake | Attribute spend to product initiatives |
| cost-center | cc-1001, cc-2003 | Align with finance billing codes |
| managed-by | terraform, manual, gke | Track IaC coverage; identify drift-prone resources |

# Apply labels to a Compute Engine instance
gcloud compute instances add-labels my-vm \
  --zone=asia-southeast1-b \
  --labels=environment=prod,team=platform,project=customer-portal,cost-center=cc-1001

# Apply labels to a GCS bucket
gsutil label ch \
  -l environment:prod \
  -l team:data \
  -l project:data-lake \
  gs://my-bucket

# Find unlabeled Compute Engine instances (missing 'environment' label)
gcloud compute instances list \
  --format="table(name,zone,status,labels)" \
  --filter="NOT labels.environment:*"

Billing Export to BigQuery

Exporting GCP billing data to BigQuery enables powerful custom analysis using SQL. Enable the export in the Billing console under Billing Export → Standard Usage Cost and Detailed Usage Cost.

-- Top 10 cost drivers by service for the current month
SELECT
  service.description AS service,
  SUM(cost) + SUM(IFNULL((SELECT SUM(c.amount)
    FROM UNNEST(credits) c), 0)) AS total_net_cost_usd
FROM `my_project.billing_export.gcp_billing_export_v1_XXXXXX`
WHERE
  DATE(_PARTITIONTIME) >= DATE_TRUNC(CURRENT_DATE(), MONTH)
GROUP BY service
ORDER BY total_net_cost_usd DESC
LIMIT 10;

-- Cost by label (team) for the current month
SELECT
  (SELECT value FROM UNNEST(labels) WHERE key = 'team') AS team,
  SUM(cost) AS total_cost_usd
FROM `my_project.billing_export.gcp_billing_export_v1_XXXXXX`
WHERE
  DATE(_PARTITIONTIME) >= DATE_TRUNC(CURRENT_DATE(), MONTH)
GROUP BY team
ORDER BY total_cost_usd DESC;

-- Identify resources with no 'environment' label
-- (resource.name is only populated in the Detailed Usage Cost export)
SELECT
  resource.name AS resource,
  service.description AS service,
  SUM(cost) AS cost_usd
FROM `my_project.billing_export.gcp_billing_export_v1_XXXXXX`
WHERE
  DATE(_PARTITIONTIME) >= DATE_TRUNC(CURRENT_DATE(), MONTH)
  AND NOT EXISTS (
    SELECT 1 FROM UNNEST(labels) l WHERE l.key = 'environment'
  )
GROUP BY resource, service
HAVING cost_usd > 10
ORDER BY cost_usd DESC;

Recommender API Integration

The GCP Recommender API provides programmatic access to all recommendation types. Use it to build automated cost governance workflows that integrate with your engineering processes.

# List active machine-type recommendations and print estimated monthly savings
gcloud recommender recommendations list \
  --project=my-project \
  --location=asia-southeast1-b \
  --recommender=google.compute.instance.MachineTypeRecommender \
  --filter="stateInfo.state=ACTIVE" \
  --format=json | python3 -c "
import json, sys
recs = json.load(sys.stdin)
for r in recs:
    name = r.get('content', {}).get('overview', {}).get('resourceName', 'N/A')
    # costProjection units are negative when the recommendation saves money
    savings = r.get('primaryImpact', {}).get('costProjection', {}).get('cost', {}).get('units', 'N/A')
    print(f'Resource: {name} | Monthly savings: \${savings}')
"

# List idle persistent disk recommendations
gcloud recommender recommendations list \
  --project=my-project \
  --location=asia-southeast1 \
  --recommender=google.compute.disk.IdleResourceRecommender \
  --format="table(
    content.overview.resourceName,
    primaryImpact.costProjection.cost.units,
    stateInfo.state
  )"

# Mark a recommendation as succeeded after implementing it
gcloud recommender recommendations mark-succeeded \
  RECOMMENDATION_ID \
  --project=my-project \
  --location=asia-southeast1-b \
  --recommender=google.compute.instance.MachineTypeRecommender \
  --etag=ETAG_VALUE \
  --state-metadata=implementedBy=terraform,ticket=INFRA-1234
Best practice: Schedule a weekly Cloud Function that calls the Recommender API, aggregates all active recommendations with potential savings greater than $50/month, and posts a digest to your team's cost channel. This keeps cost optimization continuously visible without requiring manual console checks.