GCP Cost Optimization
GCP Pricing Models
GCP offers four distinct pricing models for Compute Engine resources. Understanding how they interact, and which workloads fit each, is critical to building an efficient commitment portfolio.
On-Demand (Pay-as-you-go)
Per-second billing (minimum 1 minute) for most VM types. No commitment required. This is the baseline price used as the reference for all discount calculations. Custom machine types are also available at On-Demand pricing — useful for precisely matching workload requirements without paying for unused CPU or memory.
Sustained Use Discounts (SUD) — Automatic
GCP automatically applies SUDs when you run eligible VMs for more than 25% of the month. Discounts increase with usage, reaching up to 30% for N1 VMs (up to 20% for N2 and C2) at full-month usage, with no action required. SUDs apply per project, per region, and are calculated on aggregate vCPU and memory consumed, not per individual VM. This means even VM instances that are created and destroyed throughout the month can accumulate SUD benefit if total resource hours are high enough.
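The tiered accrual can be sketched in a few lines. The incremental rates below (each quarter of the month billed at 100%, 80%, 60%, then 40% of the base rate, the N1 shape) are illustrative; verify against current pricing before relying on them:

```python
# Sketch of how sustained use discounts accrue for N1 VMs. Assumed
# incremental rates: each quarter of the month is billed at 100%, 80%,
# 60%, then 40% of the base rate, yielding 30% off for full-month use.
def sud_effective_discount(usage_fraction: float) -> float:
    """Return the effective discount for a given fraction of the month used."""
    tier_rates = [1.0, 0.8, 0.6, 0.4]  # N1 shape; N2/C2 tiers are shallower (20% max)
    billed = 0.0
    remaining = usage_fraction
    for rate in tier_rates:
        in_tier = min(remaining, 0.25)
        billed += in_tier * rate
        remaining -= in_tier
        if remaining <= 0:
            break
    return 1 - billed / usage_fraction if usage_fraction > 0 else 0.0
```

For example, running a VM for half the month yields a 10% effective discount, not 15%, because the deeper tiers are never reached.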
Committed Use Discounts (CUDs)
Purchase a commitment for 1 year or 3 years in exchange for a deeper discount beyond SUD. CUDs come in two flavors: Resource-based (commit to specific vCPU and memory in a region) and Flexible (commit to a spend level in $/hour). See the dedicated section below for a full comparison.
Preemptible VMs / Spot VMs
Preemptible VMs (the original offering) and Spot VMs (newer, no maximum 24-hour runtime limit) provide discounts of 60–91% by using spare GCP capacity. Google can terminate these VMs with a 30-second shutdown notice. Spot VMs are the recommended choice for new workloads as they have no maximum runtime restriction and support the same machine types.
Committed Use Discounts (CUDs)
Resource-Based vs Flexible CUDs
| Attribute | Resource-Based CUD | Flexible CUD |
|---|---|---|
| What you commit | Specific vCPU and memory in a region | $/hour of compute spend in a region |
| Max discount (1yr) | Up to 37% vs On-Demand (note: CUD-covered resources are not eligible for SUD) | Up to 28% |
| Max discount (3yr) | Up to 55% vs On-Demand (up to 70% for memory-optimized types) | Up to 46% |
| Machine family flexibility | Low — bound to specific machine series (e.g., N2) | High — applies across general-purpose and compute-optimized series (N2, C2, C3, and more) |
| Applies to | Compute Engine VMs (specific series) | Most Compute Engine machine types |
| Best for | Stable, predictable workloads on a known machine series | Mixed or evolving machine families; GKE node pools |
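A quick way to sanity-check a commitment: a resource-based CUD is billed whether or not the committed resources run, so the break-even utilization is simply one minus the discount. A minimal sketch, assuming a 37% one-year discount and a 730-hour month:

```python
# A resource-based CUD is billed for every hour of the term, used or not.
# At an assumed 37% discount, the commitment beats On-Demand once expected
# utilization of the committed size exceeds 63%.
def cud_breakeven_utilization(discount: float) -> float:
    """Minimum fraction of the commitment you must actually use to break even."""
    return 1.0 - discount

def monthly_cost(on_demand_rate: float, hours_used: float, discount: float,
                 committed: bool, hours_in_month: float = 730) -> float:
    """Committed cost accrues every hour; On-Demand only for hours used."""
    if committed:
        return on_demand_rate * (1 - discount) * hours_in_month
    return on_demand_rate * hours_used
```

In practice, commit only to the portion of your fleet whose utilization you can forecast above the break-even line, and leave the volatile remainder on On-Demand or Spot.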
Purchasing a CUD
# Purchase a 1-year resource-based CUD for N2 vCPUs in us-central1
gcloud compute commitments create my-n2-commitment \
--region=us-central1 \
--plan=12-month \
--resources=vcpu=32,memory=128GB \
--type=general-purpose-n2
# List existing commitments and their utilization
gcloud compute commitments list \
--filter="region:us-central1" \
--format="table(name,region,plan,status,startTimestamp,endTimestamp)"
Preemptible and Spot VMs
Key Differences
| Attribute | Preemptible VM | Spot VM |
|---|---|---|
| Maximum runtime | 24 hours (hard limit) | No maximum runtime |
| Shutdown notice | 30 seconds | 30 seconds |
| Pricing | ~60–91% off On-Demand (legacy Preemptible VMs are now billed at Spot prices) | Dynamic pricing, ~60–91% off On-Demand; can change at most once every 30 days |
| Recommendation | Legacy; use for existing workloads only | Preferred for new workloads; no 24-hr restart penalty |
| Live migration | Not supported | Not supported |
Shutdown Script for Graceful Termination
#!/bin/bash
# /etc/spot-shutdown.sh
# Set as the instance's shutdown-script metadata key.
# GCP calls this script when the VM receives a preemption notice.
echo "Spot VM preemption notice received at $(date)" >> /var/log/spot-shutdown.log
# Stop application gracefully
systemctl stop myapp
# Flush in-progress work to GCS (example: copy work-in-progress files)
gsutil -m cp /var/app/in-progress/* gs://my-bucket/checkpoints/
# Optionally detach an unmanaged instance group from the backend service.
# NOTE: remove-backend detaches the ENTIRE group, not just this VM; instances
# in a managed instance group should instead drain via failing health checks.
INSTANCE_NAME=$(curl -s -H "Metadata-Flavor: Google" \
"http://metadata.google.internal/computeMetadata/v1/instance/name")
ZONE=$(curl -s -H "Metadata-Flavor: Google" \
"http://metadata.google.internal/computeMetadata/v1/instance/zone" | awk -F/ '{print $NF}')
gcloud compute backend-services remove-backend my-backend-service \
--instance-group=my-instance-group \
--instance-group-zone="$ZONE" \
--global 2>/dev/null || true
echo "Graceful shutdown complete" >> /var/log/spot-shutdown.log
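In addition to the shutdown script, a long-running worker can poll the metadata server's preempted flag and checkpoint proactively. The endpoint path and Metadata-Flavor header follow the documented GCE metadata conventions; the polling helper itself is a sketch:

```python
# Poll the GCE metadata server's "preempted" flag from inside the VM.
# The endpoint returns the literal string TRUE once a preemption notice
# has been issued, FALSE otherwise.
import urllib.request

PREEMPTED_URL = ("http://metadata.google.internal/computeMetadata/v1/"
                 "instance/preempted")

def parse_preempted(body: str) -> bool:
    """The endpoint returns the literal string TRUE or FALSE."""
    return body.strip().upper() == "TRUE"

def check_preempted() -> bool:
    """Query the metadata server; only works from inside a GCE VM."""
    req = urllib.request.Request(PREEMPTED_URL,
                                 headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(req, timeout=2) as resp:
        return parse_preempted(resp.read().decode())
```

Checkpointing as soon as this flag flips TRUE buys slightly more time than waiting for the shutdown script alone, since the 30-second window includes script execution.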
Managed Instance Group with Spot VMs (Terraform)
resource "google_compute_instance_template" "spot_template" {
name_prefix = "spot-worker-"
machine_type = "n2-standard-4"
disk {
source_image = "debian-cloud/debian-12"
auto_delete = true
boot = true
disk_size_gb = 50
}
network_interface {
network = "default"
subnetwork = var.subnet_self_link
}
scheduling {
preemptible = true
automatic_restart = false
provisioning_model = "SPOT"
instance_termination_action = "STOP"
}
metadata = {
shutdown-script = file("${path.module}/scripts/spot-shutdown.sh")
}
lifecycle {
create_before_destroy = true
}
}
resource "google_compute_instance_group_manager" "spot_mig" {
name = "spot-worker-mig"
base_instance_name = "spot-worker"
zone = var.zone
version {
instance_template = google_compute_instance_template.spot_template.id
}
target_size = 10
auto_healing_policies {
health_check = google_compute_health_check.default.id
initial_delay_sec = 120
}
}
Compute Engine Rightsizing with GCP Recommender
GCP Recommender continuously analyzes VM CPU and memory metrics and generates rightsizing recommendations. Recommendations are available in the console under "Recommendations Hub" and via the Recommender API and gcloud CLI.
# List VM rightsizing recommendations for a project in a specific zone
gcloud recommender recommendations list \
--project=my-project \
--location=asia-southeast1-b \
--recommender=google.compute.instance.MachineTypeRecommender \
--format="table(
name.basename(),
content.overview.resourceName,
content.overview.recommendation,
primaryImpact.costProjection.cost.units,
stateInfo.state
)"
# List idle VM recommendations (candidates for shutdown or deletion)
gcloud recommender recommendations list \
--project=my-project \
--location=asia-southeast1-b \
--recommender=google.compute.instance.IdleResourceRecommender \
--format="table(
name.basename(),
content.overview.resourceName,
primaryImpact.costProjection.cost.units,
stateInfo.state
)"
# Mark a recommendation as claimed (indicate you are acting on it)
gcloud recommender recommendations mark-claimed \
RECOMMENDATION_ID \
--project=my-project \
--location=asia-southeast1-b \
--recommender=google.compute.instance.MachineTypeRecommender \
--etag=ETAG_VALUE \
--state-metadata=priority=high
GCS Cost Optimization
Storage Classes
| Storage Class | Use Case | Min Storage Duration | Retrieval Cost | Approx. Cost/GB/mo |
|---|---|---|---|---|
| Standard | Frequently accessed, hot data | None | Free | $0.020 |
| Nearline | Accessed less than once per month | 30 days | $0.01/GB | $0.010 |
| Coldline | Accessed less than once per quarter | 90 days | $0.02/GB | $0.004 |
| Archive | Long-term retention, rarely accessed | 365 days | $0.05/GB | $0.0012 |
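Using the approximate prices in the table above (illustrative figures, not a pricing quote), the Standard-vs-Nearline decision reduces to how often the data is read. A rough per-GB model:

```python
# Break-even sketch between Standard and Nearline, using the approximate
# prices from the table: Standard $0.020/GB-mo with free reads;
# Nearline $0.010/GB-mo plus $0.01/GB retrieved.
def monthly_cost_per_gb(storage_rate: float, retrieval_rate: float,
                        reads_per_month: float) -> float:
    """Total monthly cost per GB: storage plus retrieval charges."""
    return storage_rate + retrieval_rate * reads_per_month

def standard(reads_per_month: float) -> float:
    return monthly_cost_per_gb(0.020, 0.00, reads_per_month)

def nearline(reads_per_month: float) -> float:
    return monthly_cost_per_gb(0.010, 0.01, reads_per_month)

# Nearline wins below ~1 full read of the data per month; Standard wins above.
```

The same arithmetic applies to Coldline and Archive with steeper retrieval rates, which is why the lifecycle rules below only demote objects after sustained inactivity.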
GCS Lifecycle Rule Example
{
"lifecycle": {
"rule": [
{
"action": {
"type": "SetStorageClass",
"storageClass": "NEARLINE"
},
"condition": {
"age": 30,
"matchesStorageClass": ["STANDARD"]
}
},
{
"action": {
"type": "SetStorageClass",
"storageClass": "COLDLINE"
},
"condition": {
"age": 90,
"matchesStorageClass": ["NEARLINE"]
}
},
{
"action": {
"type": "SetStorageClass",
"storageClass": "ARCHIVE"
},
"condition": {
"age": 365,
"matchesStorageClass": ["COLDLINE"]
}
},
{
"action": {
"type": "Delete"
},
"condition": {
"age": 2555,
"isLive": false
}
}
]
}
}
# Apply lifecycle configuration to a GCS bucket
gsutil lifecycle set lifecycle.json gs://my-bucket
# Verify the policy was applied
gsutil lifecycle get gs://my-bucket
BigQuery Cost Optimization
BigQuery costs are driven primarily by query processing (bytes scanned) and storage. On-demand pricing charges per byte scanned; slot reservations provide predictable capacity-based pricing.
On-Demand vs Slot Reservations
| Model | Pricing Basis | Best For | Risk |
|---|---|---|---|
| On-Demand | $6.25 per TiB of bytes scanned | Exploratory analysis, low query volume, unpredictable workloads | A single poorly written query scanning petabytes can create an unexpectedly large bill |
| Slot Reservations (Standard) | Flat hourly rate per slot; 100-slot minimum | Consistent, high-volume query workloads | Unused slots are wasted; must model query concurrency to size correctly |
| Slot Reservations (Enterprise) | Monthly or annual commitment; autoscaling available | Production BI platforms, large data teams with SLA requirements | Higher commitment; requires accurate forecasting |
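To size a reservation, compare expected on-demand scan charges against the flat slot cost. The $6.25/TiB rate comes from the table above; the $0.04/slot-hour Standard-edition rate below is an assumption, so check current pricing before committing:

```python
# Break-even sketch: on-demand scan charges vs a flat slot reservation.
# Assumed rates: $6.25/TiB scanned (on-demand), $0.04/slot-hour (Standard
# edition, assumption), 730 hours per month.
def on_demand_monthly(tib_scanned: float, rate_per_tib: float = 6.25) -> float:
    """Monthly on-demand cost for a given scan volume."""
    return tib_scanned * rate_per_tib

def reservation_monthly(slots: int, rate_per_slot_hour: float = 0.04,
                        hours: float = 730) -> float:
    """Monthly cost of an always-on slot reservation."""
    return slots * rate_per_slot_hour * hours

# A 100-slot baseline (~$2,920/mo at these assumed rates) breaks even
# around ~467 TiB scanned per month under on-demand pricing.
```

Below that scan volume, on-demand is cheaper; above it, a reservation wins, provided the slots also deliver acceptable query concurrency.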
Partitioned Tables
Partitioning a table on a DATE or TIMESTAMP column, or on an INTEGER range, allows BigQuery to skip entire partitions that do not match the query's filter — dramatically reducing bytes scanned and therefore cost.
-- Create a partitioned and clustered table for optimal cost
CREATE TABLE `my_project.my_dataset.events`
PARTITION BY DATE(event_timestamp)
CLUSTER BY user_id, event_type
OPTIONS (
require_partition_filter = TRUE,
partition_expiration_days = 365
) AS
SELECT * FROM `my_project.my_dataset.events_raw`;
-- Query that benefits from partition pruning (scans only 2026-03-28 partition)
SELECT
user_id,
COUNT(*) AS event_count
FROM `my_project.my_dataset.events`
WHERE DATE(event_timestamp) = '2026-03-28'
AND event_type = 'purchase'
GROUP BY user_id;
-- Estimate bytes processed before running a query (dry run)
-- Run with --dry_run flag via CLI:
-- bq query --dry_run --use_legacy_sql=false 'SELECT ...'
Query Cost Estimation
# Estimate bytes processed for a query before executing it
bq query \
--dry_run \
--use_legacy_sql=false \
'SELECT user_id, COUNT(*) FROM `my_project.dataset.events`
WHERE DATE(event_timestamp) = "2026-03-28"
GROUP BY user_id'
# Output: "Query successfully validated. Assuming the tables are not modified,
# running this query will process X bytes."
# Cost estimate = X bytes / 2**40 (bytes per TiB) * $6.25
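The arithmetic above can be wrapped in a helper that converts a dry run's byte count into an estimated on-demand charge (assuming the $6.25/TiB rate):

```python
# Convert a dry run's "will process X bytes" figure into an estimated
# on-demand charge, assuming $6.25 per TiB (1 TiB = 2**40 bytes).
def estimated_query_cost_usd(bytes_processed: int,
                             rate_per_tib: float = 6.25) -> float:
    """Estimated on-demand cost for a query scanning the given byte count."""
    return bytes_processed / 2**40 * rate_per_tib
```

Wiring this into CI (fail the build if an estimated query cost exceeds a threshold) is a cheap guardrail against accidental full-table scans.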
Avoid SELECT * on large tables. Always select only the columns you need: BigQuery's columnar storage means you are billed for the bytes read in each referenced column, so selecting 3 columns from a 200-column table can cut scanned bytes by well over 90%.
GKE Cost Optimization
Cluster Autoscaler
Enable Cluster Autoscaler on all node pools to automatically add nodes when pending pods cannot be scheduled and remove underutilized nodes. Combined with Spot node pools for non-critical workloads, this is typically the single largest GKE cost reduction lever.
# Enable autoscaling on a node pool (min 1, max 10 nodes)
gcloud container clusters update my-cluster \
--enable-autoscaling \
--min-nodes=1 \
--max-nodes=10 \
--node-pool=default-pool \
--region=asia-southeast1
# Enable the Vertical Pod Autoscaler (VPA) on the cluster
gcloud container clusters update my-cluster \
--enable-vertical-pod-autoscaling \
--region=asia-southeast1
Spot Node Pools in Terraform
resource "google_container_node_pool" "spot_pool" {
name = "spot-node-pool"
cluster = google_container_cluster.primary.name
location = var.region
node_count = null # managed by autoscaler
autoscaling {
min_node_count = 0
max_node_count = 20
}
node_config {
machine_type = "n2-standard-4"
disk_size_gb = 50
disk_type = "pd-standard"
spot = true # Use Spot VMs for this node pool
oauth_scopes = [
"https://www.googleapis.com/auth/cloud-platform"
]
# GKE automatically adds the cloud.google.com/gke-spot=true node label
# when spot = true, so only custom labels need to be declared here.
labels = {
  environment = "prod"
  team        = "platform"
}
taint {
key = "cloud.google.com/gke-spot"
value = "true"
effect = "NO_SCHEDULE"
}
}
management {
auto_repair = true
auto_upgrade = true
}
}
# In your Kubernetes workload spec, tolerate the Spot taint:
# tolerations:
# - key: "cloud.google.com/gke-spot"
# operator: "Equal"
# value: "true"
# effect: "NO_SCHEDULE"
Node Pool Rightsizing
- Use Workload Identity and minimize DaemonSet resource requests to avoid reserving large amounts of node capacity for infrastructure pods.
- Enable Node Auto-Provisioning (NAP) to let GKE automatically select the optimal machine type and size for pending pods.
- Review node utilization with kubectl top nodes and the GKE workload metrics in Cloud Monitoring.
- Set explicit requests and limits on all containers; missing resource requests cause the scheduler to place pods sub-optimally and inflate node count.
Network Egress Costs
GCP's network pricing is tiered by destination. Designing your architecture to minimize cross-region and internet egress is essential for data-intensive workloads.
| Transfer Path | Approximate Cost | Notes |
|---|---|---|
| Same region, same zone (internal IP) | Free | Always use internal IPs for intra-region traffic |
| Same region, different zone (internal IP) | $0.01/GB | Charged in both directions ($0.02/GB round-trip) |
| Cross-region within GCP (Americas) | ~$0.01–0.02/GB | Route-dependent; check GCP pricing calculator for specific pairs |
| Egress to internet | ~$0.08–0.12/GB (Americas/EMEA) | Tiered by monthly volume and destination continent; a small free-tier allowance (1 GB/month) applies |
| Egress via Cloud CDN to internet | ~$0.02–0.06/GB | Tiered; significantly cheaper than VM direct egress at scale |
| GCS to Compute Engine (same region) | Free | No egress charge for GCS-to-GCE in the same region |
| BigQuery to Compute Engine (same region) | Free (Storage API) | BigQuery Storage Read API in the same region incurs no egress |
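The table's rates can be turned into a rough estimator (figures are the approximations above, not a pricing quote):

```python
# Rough egress cost estimator using the approximate per-GB rates from the
# table above. The CDN rate is a midpoint of the tiered ~$0.02-0.06 range.
RATES_PER_GB = {
    "intra_zone": 0.00,   # same region, same zone, internal IP
    "cross_zone": 0.01,   # same region, different zone; charged each direction
    "internet":   0.08,   # direct VM egress, low end of the tiered range
    "cdn":        0.04,   # assumed midpoint of Cloud CDN's tiered pricing
}

def egress_cost(gb: float, path: str, round_trip: bool = False) -> float:
    """Estimate transfer cost; round_trip doubles paths billed both ways."""
    cost = gb * RATES_PER_GB[path]
    return cost * 2 if round_trip else cost
```

Even the $0.01/GB cross-zone rate adds up for chatty services: a replicated database shipping 10 TB/month across zones pays roughly $200/month in round-trip transfer alone.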
Using Cloud CDN to Reduce Egress
# Enable Cloud CDN on a backend bucket (serving static assets from GCS)
gcloud compute backend-buckets create my-static-assets \
--gcs-bucket-name=my-static-bucket \
--enable-cdn \
--cache-mode=CACHE_ALL_STATIC
# Enable CDN on an existing HTTP(S) load balancer backend service
gcloud compute backend-services update my-backend-service \
--enable-cdn \
--cache-mode=CACHE_ALL_STATIC \
--global
# Check CDN cache hit rate in Cloud Monitoring
# Use the metric: loadbalancing.googleapis.com/https/request_count
# Split by: cache_result (HIT, MISS, BYPASS)
Cloud SQL Cost Optimization
Rightsizing Cloud SQL Instances
GCP Recommender includes Cloud SQL idle and overprovisioned instance recommenders, though they are less mature than the Compute Engine ones. Monitor CPU, memory, and disk I/O via Cloud Monitoring and rightsize when average CPU utilization is consistently below roughly 20%. Use gcloud sql instances patch with a new --tier to change machine type; a brief restart is required.
Committed Use Discounts for Cloud SQL
Cloud SQL supports CUDs for instances running on shared-core or dedicated-core machine types. A 1-year commitment delivers approximately 25% savings; a 3-year commitment delivers approximately 52% savings compared to On-Demand for the same instance type.
High Availability Only for Production
Cloud SQL HA (regional instances with standby) roughly doubles the instance cost because GCP provisions a hot standby in a second zone. Enforce a policy via Terraform variable: set availability_type = "REGIONAL" for production and availability_type = "ZONAL" for staging and development environments.
resource "google_sql_database_instance" "primary" {
name = "my-postgres-${var.environment}"
database_version = "POSTGRES_15"
region = var.region
settings {
tier = var.environment == "prod" ? "db-custom-4-15360" : "db-custom-2-7680"
availability_type = var.environment == "prod" ? "REGIONAL" : "ZONAL"
backup_configuration {
  enabled                        = true
  start_time                     = "02:00"
  point_in_time_recovery_enabled = var.environment == "prod"
  backup_retention_settings {
    retained_backups = var.environment == "prod" ? 14 : 3
  }
}
insights_config {
  query_insights_enabled = var.environment == "prod"
}
}
deletion_protection = var.environment == "prod"
}
Budget and Billing Alerts
GCP Budget alerts are project- or billing-account-scoped. They can trigger Pub/Sub notifications for programmatic responses in addition to email alerts.
# Create a monthly budget alert for a GCP project
gcloud billing budgets create \
--billing-account=BILLING_ACCOUNT_ID \
--display-name="Monthly Project Budget - my-project" \
--budget-amount=5000USD \
--calendar-period=month \
--projects=projects/my-project \
--threshold-rule=percent=0.5,basis=current-spend \
--threshold-rule=percent=0.8,basis=current-spend \
--threshold-rule=percent=1.0,basis=current-spend \
--threshold-rule=percent=1.0,basis=forecasted-spend \
--notifications-rule-pubsub-topic=projects/my-project/topics/billing-alerts \
--notifications-rule-disable-default-iam-recipients=false
# List all budgets for a billing account
gcloud billing budgets list \
--billing-account=BILLING_ACCOUNT_ID \
--format="table(name,displayName,amount.specifiedAmount.units,thresholdRules)"
Wire the billing-alerts Pub/Sub topic to a Cloud Function that posts to your team's Slack channel with a direct link to the GCP Cost Management console. For budget overruns above 100%, trigger an automatic page via PagerDuty.
Cost Allocation with Labels
GCP uses labels (key-value pairs) for cost allocation rather than the tag system used by AWS. Labels applied to Compute Engine VMs, GCS buckets, BigQuery datasets, and other resources appear in billing exports.
Recommended Label Strategy
| Label Key | Example Values | Purpose |
|---|---|---|
| environment | prod, staging, dev, sandbox | Separate environment spend; enforce HA/non-HA policies |
| team | platform, backend, data, security | Chargeback and showback by engineering team |
| project | customer-portal, data-lake | Attribute spend to product initiatives |
| cost-center | cc-1001, cc-2003 | Align with finance billing codes |
| managed-by | terraform, manual, gke | Track IaC coverage; identify drift-prone resources |
# Apply labels to a Compute Engine instance
gcloud compute instances add-labels my-vm \
--zone=asia-southeast1-b \
--labels=environment=prod,team=platform,project=customer-portal,cost-center=cc-1001
# Apply labels to a GCS bucket
gsutil label ch \
-l environment:prod \
-l team:data \
-l project:data-lake \
gs://my-bucket
# Find unlabeled Compute Engine instances (missing 'environment' label)
gcloud compute instances list \
--format="table(name,zone,status,labels)" \
--filter="NOT labels.environment:*"
Billing Export to BigQuery
Exporting GCP billing data to BigQuery enables powerful custom analysis using SQL. Enable the export in the Billing console under Billing Export → Standard Usage Cost and Detailed Usage Cost.
-- Top 10 cost drivers by service for the current month
SELECT
service.description AS service,
SUM(cost) + SUM(IFNULL((SELECT SUM(c.amount)
FROM UNNEST(credits) c), 0)) AS total_net_cost_usd
FROM `my_project.billing_export.gcp_billing_export_v1_XXXXXX`
WHERE
DATE(_PARTITIONTIME) >= DATE_TRUNC(CURRENT_DATE(), MONTH)
GROUP BY service
ORDER BY total_net_cost_usd DESC
LIMIT 10;
-- Cost by label (team) for the current month
SELECT
(SELECT value FROM UNNEST(labels) WHERE key = 'team') AS team,
SUM(cost) AS total_cost_usd
FROM `my_project.billing_export.gcp_billing_export_v1_XXXXXX`
WHERE
DATE(_PARTITIONTIME) >= DATE_TRUNC(CURRENT_DATE(), MONTH)
GROUP BY team
ORDER BY total_cost_usd DESC;
-- Identify resources with no 'environment' label
-- (resource.name requires the Detailed Usage Cost export)
SELECT
  resource.name AS resource,
service.description AS service,
SUM(cost) AS cost_usd
FROM `my_project.billing_export.gcp_billing_export_v1_XXXXXX`
WHERE
DATE(_PARTITIONTIME) >= DATE_TRUNC(CURRENT_DATE(), MONTH)
AND NOT EXISTS (
SELECT 1 FROM UNNEST(labels) l WHERE l.key = 'environment'
)
GROUP BY resource, service
HAVING cost_usd > 10
ORDER BY cost_usd DESC;
Recommender API Integration
The GCP Recommender API provides programmatic access to all recommendation types. Use it to build automated cost governance workflows that integrate with your engineering processes.
# List all active recommendations across all recommenders in a project/location
gcloud recommender recommendations list \
--project=my-project \
--location=asia-southeast1-b \
--recommender=google.compute.instance.MachineTypeRecommender \
--filter="stateInfo.state=ACTIVE" \
--format=json | python3 -c "
import json, sys
recs = json.load(sys.stdin)
for r in recs:
name = r.get('content', {}).get('overview', {}).get('resourceName', 'N/A')
savings = r.get('primaryImpact', {}).get('costProjection', {}).get('cost', {}).get('units', 'N/A')
print(f'Resource: {name} | Monthly savings: \${savings}')
"
# List idle persistent disk recommendations
gcloud recommender recommendations list \
--project=my-project \
--location=asia-southeast1 \
--recommender=google.compute.disk.IdleResourceRecommender \
--format="table(
content.overview.resourceName,
primaryImpact.costProjection.cost.units,
stateInfo.state
)"
# Mark a recommendation as succeeded after implementing it
gcloud recommender recommendations mark-succeeded \
RECOMMENDATION_ID \
--project=my-project \
--location=asia-southeast1-b \
--recommender=google.compute.instance.MachineTypeRecommender \
--etag=ETAG_VALUE \
--state-metadata=implementedBy=terraform,ticket=INFRA-1234