Last quarter, our Snowflake bill hit $47,000 a month. For a 12-person data team running roughly 200 pipelines and serving dashboards to 300 internal users, that felt excessive. Our CFO agreed. I was given a straightforward mandate: cut costs without breaking anything. Three months later, our bill sits at $18,600. That is a 60.4% reduction, and our p95 query latency actually improved by 14%. Here is exactly how we did it.
Understanding Where the Money Goes
Before touching anything, we needed to understand our spend. Snowflake charges primarily for compute (credits consumed by virtual warehouses) and storage. In our case, compute was 89% of the bill. Storage was almost a rounding error. That immediately told us where to focus.
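You can reproduce that split with back-of-envelope arithmetic before running a single query. A minimal sketch, assuming on-demand list rates of roughly $3.00 per credit and $23 per TB-month of storage (both vary by edition, region, and contract), with figures close to ours:

```python
def bill_split(credits_per_month: float, storage_tb: float,
               usd_per_credit: float = 3.00, usd_per_tb_month: float = 23.0):
    """Estimate the compute/storage split of a monthly Snowflake bill.
    Rates are illustrative list prices; substitute your contracted rates."""
    compute = credits_per_month * usd_per_credit
    storage = storage_tb * usd_per_tb_month
    total = compute + storage
    return compute, storage, round(compute / total * 100, 1)

# Back-of-envelope figures consistent with our bill:
# ~13,900 credits/month and ~225 TB stored -> compute is ~89% of spend
compute_usd, storage_usd, compute_pct = bill_split(13_900, 225)
```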
The first thing I did was run a cost attribution query that every Snowflake customer should have bookmarked:
```sql
-- Credit consumption by warehouse over the last 30 days
SELECT
    warehouse_name,
    SUM(credits_used) AS total_credits,
    SUM(credits_used) * 3.00 AS estimated_cost_usd,
    ROUND(SUM(credits_used) / (SELECT SUM(credits_used)
                               FROM snowflake.account_usage.warehouse_metering_history
                               WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
                              ) * 100, 1) AS pct_of_total
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY total_credits DESC;
```
The results were eye-opening. We had 14 warehouses. Three of them accounted for 72% of our spend. One warehouse named ANALYTICS_XL was burning $11,200 per month and was used by exactly two dashboards. That is the kind of thing you only find when you look.
Warehouse Sizing: Bigger Is Not Better
The single biggest mistake I see teams make with Snowflake cost optimization is oversizing warehouses. There is a persistent myth that a larger warehouse makes every query faster. For most analytical queries, that is simply not true. Each size increase doubles the warehouse's compute cluster, which helps queries that scan and shuffle massive amounts of data in parallel, but a simple aggregation over a few million rows runs just as fast on an X-Small as it does on a Large.
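The pricing makes the break-even explicit: each size step doubles the credit rate (1 credit/hour for X-Small, 2 for Small, 4 for Medium, 8 for Large, 16 for X-Large), so a query must finish at least twice as fast on the next size up just to cost the same. A sketch, assuming $3.00 per credit:

```python
# Standard credits-per-hour rates by warehouse size
CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}

def query_cost(size: str, runtime_sec: float, usd_per_credit: float = 3.00) -> float:
    """Cost of one query run, ignoring the 60-second minimum for simplicity."""
    return round(CREDITS_PER_HOUR[size] * runtime_sec / 3600 * usd_per_credit, 4)

# An I/O-bound query that does NOT speed up on bigger hardware:
# the same 10-second runtime costs 8x more on a Large than on an X-Small.
xs_cost = query_cost("XSMALL", 10)
lg_cost = query_cost("LARGE", 10)
```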
We audited every warehouse by comparing its size to the actual query workload:
```sql
-- Find oversized warehouses: queries that don't need the horsepower
SELECT
    warehouse_name,
    warehouse_size,
    COUNT(*) AS query_count,
    AVG(total_elapsed_time) / 1000 AS avg_duration_sec,
    AVG(bytes_scanned) / POWER(1024, 3) AS avg_gb_scanned,
    MEDIAN(partitions_scanned) AS median_partitions_scanned
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('day', -14, CURRENT_TIMESTAMP())
  AND execution_status = 'SUCCESS'
  AND warehouse_name IS NOT NULL
GROUP BY warehouse_name, warehouse_size
ORDER BY avg_gb_scanned ASC;
```
We discovered that 9 of our 14 warehouses were at least two sizes too large. Our REPORTING_LARGE warehouse had a median scan of 1.2 GB per query. An X-Small warehouse handles that in under 3 seconds. We were paying for a Large.
The downsizing was straightforward:
```sql
-- Downsize warehouses to match actual workload
ALTER WAREHOUSE REPORTING_LARGE SET WAREHOUSE_SIZE = 'XSMALL';
ALTER WAREHOUSE REPORTING_LARGE RENAME TO REPORTING_XS;

ALTER WAREHOUSE ANALYTICS_XL SET WAREHOUSE_SIZE = 'SMALL';
ALTER WAREHOUSE ANALYTICS_XL RENAME TO ANALYTICS_SM;

ALTER WAREHOUSE ETL_MEDIUM SET WAREHOUSE_SIZE = 'SMALL';
ALTER WAREHOUSE ETL_MEDIUM RENAME TO ETL_SM;
```
Impact: Warehouse downsizing alone saved us $14,800 per month. That was 31% of the original bill from a change that took one afternoon.
Auto-Suspend and Auto-Resume Tuning
Snowflake charges by the second, with a 60-second minimum. Every second a warehouse is running and idle, you are burning credits. The default auto-suspend timeout is 10 minutes (600 seconds). For most workloads, that is far too generous.
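To see why the default hurts, it helps to model a warehouse wake-up: you pay for the busy seconds, then for the full idle window until auto-suspend fires, with a 60-second floor. A toy model under those assumptions (each wake-up is assumed to end with a full idle timeout, and rates are $3.00/credit list prices):

```python
def session_cost(credits_per_hour: float, busy_sec: float,
                 auto_suspend_sec: int, usd_per_credit: float = 3.00) -> float:
    """Cost of one warehouse wake-up: billed per second with a 60 s minimum,
    and credits keep burning until the auto-suspend timer fires."""
    billed_sec = max(60, busy_sec + auto_suspend_sec)
    return billed_sec / 3600 * credits_per_hour * usd_per_credit

# One 5-second query per wake-up, 100 wake-ups/day on an X-Small:
daily_600 = 100 * session_cost(1, 5, 600)  # default 10-minute suspend
daily_60  = 100 * session_cost(1, 5, 60)   # tuned 60-second suspend
# Almost all of the default-timeout spend is idle time.
```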
We analyzed idle time per warehouse to set appropriate suspend timeouts:
```sql
-- Analyze gaps between queries to determine optimal auto-suspend
WITH query_gaps AS (
    SELECT
        warehouse_name,
        start_time,
        LAG(end_time) OVER (
            PARTITION BY warehouse_name ORDER BY start_time
        ) AS prev_end_time,
        DATEDIFF('second', LAG(end_time) OVER (
            PARTITION BY warehouse_name ORDER BY start_time
        ), start_time) AS gap_seconds
    FROM snowflake.account_usage.query_history
    WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
      AND warehouse_name IS NOT NULL
      AND execution_status = 'SUCCESS'
)
SELECT
    warehouse_name,
    COUNT(*) AS total_queries,
    PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY gap_seconds) AS p50_gap,
    PERCENTILE_CONT(0.90) WITHIN GROUP (ORDER BY gap_seconds) AS p90_gap,
    PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY gap_seconds) AS p99_gap
FROM query_gaps
WHERE gap_seconds IS NOT NULL AND gap_seconds > 0
GROUP BY warehouse_name
ORDER BY warehouse_name;
```
For most of our warehouses, the p90 gap between queries was under 30 seconds. Setting auto-suspend to 60 seconds meant we were only paying for one extra minute of idle time in the worst case, instead of ten minutes.
```sql
-- Aggressive auto-suspend for all warehouses
ALTER WAREHOUSE REPORTING_XS SET AUTO_SUSPEND = 60 AUTO_RESUME = TRUE;
ALTER WAREHOUSE ANALYTICS_SM SET AUTO_SUSPEND = 60 AUTO_RESUME = TRUE;
ALTER WAREHOUSE ETL_SM SET AUTO_SUSPEND = 120 AUTO_RESUME = TRUE;
ALTER WAREHOUSE DASHBOARD_XS SET AUTO_SUSPEND = 60 AUTO_RESUME = TRUE;
ALTER WAREHOUSE TRANSFORM_SM SET AUTO_SUSPEND = 60 AUTO_RESUME = TRUE;
```
For ETL warehouses that run on a schedule, we went even further. We explicitly suspend the warehouse at the end of each pipeline run in our Airflow DAGs:
```python
# In our Airflow DAG: explicitly suspend warehouse after pipeline completes
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

suspend_warehouse = SnowflakeOperator(
    task_id='suspend_etl_warehouse',
    sql="ALTER WAREHOUSE ETL_SM SUSPEND;",
    snowflake_conn_id='snowflake_prod',
    trigger_rule='all_done',  # runs even if upstream fails
)

# Task ordering: extract >> transform >> load >> suspend_warehouse
```
Impact: Auto-suspend tuning saved us approximately $3,400 per month. Small numbers per warehouse, but they compound quickly across 14 warehouses running 24/7.
Query Optimization: Clustering Keys and Search Optimization
Reducing warehouse sizes only works if queries are efficient enough to run on smaller hardware. We invested two weeks into query optimization, and it paid for itself immediately.
Clustering Keys
Our largest fact table, events, had 4.2 billion rows. Queries filtered almost exclusively by event_date and customer_id, but the table had no clustering key defined. Snowflake's automatic micro-partitioning does a decent job, but when your table grows past a few billion rows, explicit clustering keys make a dramatic difference in partition pruning.
```sql
-- Check current clustering depth (higher = worse pruning)
SELECT SYSTEM$CLUSTERING_INFORMATION('analytics.events', '(event_date, customer_id)');

-- Define clustering key on the most-filtered columns
ALTER TABLE analytics.events CLUSTER BY (event_date, customer_id);

-- Monitor automatic reclustering activity and its credit cost
-- (this function reports credits and bytes reclustered; re-run
--  SYSTEM$CLUSTERING_INFORMATION to watch depth improve)
SELECT
    table_name,
    SUM(credits_used) AS credits_used,
    SUM(num_bytes_reclustered) / POWER(1024, 3) AS gb_reclustered
FROM TABLE(information_schema.automatic_clustering_history(
    date_range_start => DATEADD('hour', -24, CURRENT_TIMESTAMP()),
    table_name => 'analytics.events'))
GROUP BY table_name;
```
After reclustering completed (about 18 hours for our table), the average partition scan for date-filtered queries dropped from 2,340 partitions to 87 partitions. Queries that took 45 seconds now took 3 seconds. More importantly, they consumed far fewer credits because less data was scanned.
Search Optimization Service
For point lookups -- queries filtering on high-cardinality columns like user_id or order_id -- we enabled the Search Optimization Service. This is not free (it costs additional storage and compute for maintenance), but for our lookup-heavy dashboard queries, it was a net positive.
```sql
-- Enable search optimization for point-lookup patterns
ALTER TABLE analytics.events ADD SEARCH OPTIMIZATION
    ON EQUALITY(customer_id, session_id);

ALTER TABLE orders.line_items ADD SEARCH OPTIMIZATION
    ON EQUALITY(order_id), SUBSTRING(product_name);

-- Verify the service is being maintained and see what it costs
SELECT * FROM TABLE(information_schema.search_optimization_history(
    date_range_start => DATEADD('day', -7, CURRENT_TIMESTAMP())));
```
Impact: Clustering keys and search optimization together reduced average query scan volume by 73%, which directly translated to being able to run on smaller warehouses. Estimated savings: $4,200 per month.
Eliminating Warehouse Sprawl
We had 14 warehouses. After our audit, we consolidated down to 6. The problem with warehouse sprawl is not just the direct cost -- it is that each warehouse has its own auto-suspend timer, its own idle overhead, and its own minimum billing increment. Fourteen warehouses each idling for 5 minutes a day is 70 minutes of wasted compute, multiplied by various sizes.
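The idle overhead converts to dollars at each warehouse's hourly credit rate. A toy calculation for a hypothetical 14-warehouse fleet (the size mix is illustrative, not our exact inventory), assuming $3.00 per credit:

```python
# Standard credits-per-hour rates by warehouse size
CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}

def idle_cost_per_month(fleet, idle_min_per_day: float,
                        usd_per_credit: float = 3.00, days: int = 30) -> float:
    """Monthly cost of daily idle minutes spread across a fleet of warehouses."""
    credits_per_min = sum(CREDITS_PER_HOUR[size] for size in fleet) / 60
    return round(credits_per_min * idle_min_per_day * days * usd_per_credit, 2)

# Hypothetical 14-warehouse fleet, each idling 5 minutes per day
fleet = ["LARGE"] * 3 + ["MEDIUM"] * 4 + ["SMALL"] * 4 + ["XSMALL"] * 3
monthly_idle_usd = idle_cost_per_month(fleet, 5)
```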
Our consolidation strategy was simple: group workloads by their characteristics, not by team ownership.
| Before (14 warehouses) | After (6 warehouses) | Rationale |
|---|---|---|
| ANALYTICS_XL, REPORTING_LARGE, BI_MEDIUM | QUERY_SM | All interactive analytics, similar scan profiles |
| ETL_MEDIUM, TRANSFORM_MEDIUM, LOAD_SMALL | ETL_SM | Scheduled batch, can share and queue |
| DASHBOARD_LARGE, EXEC_DASHBOARD_MED | DASHBOARD_XS | Dashboards use result cache heavily |
| DEV_LARGE, STAGING_MEDIUM | DEV_XS | Dev/test should never be large |
| ML_XLARGE | ML_MEDIUM | ML training needs more, but not XL |
| INGESTION_SMALL, CDC_SMALL | INGEST_XS | Small continuous loads, minimal compute |
```sql
-- Consolidate by migrating users and dropping redundant warehouses
-- Step 1: Reassign users to the new consolidated warehouses
ALTER USER data_analyst_1 SET DEFAULT_WAREHOUSE = 'QUERY_SM';
ALTER USER data_analyst_2 SET DEFAULT_WAREHOUSE = 'QUERY_SM';

-- Step 2: Update role defaults
ALTER ROLE ANALYTICS_ROLE SET DEFAULT_WAREHOUSE = 'QUERY_SM';
ALTER ROLE ETL_ROLE SET DEFAULT_WAREHOUSE = 'ETL_SM';

-- Step 3: Suspend and drop the old warehouses
-- (ANALYTICS_XL and REPORTING_LARGE were renamed during the earlier
--  downsizing pass, so we drop them under their new names)
ALTER WAREHOUSE ANALYTICS_SM SUSPEND;
DROP WAREHOUSE ANALYTICS_SM;
ALTER WAREHOUSE REPORTING_XS SUSPEND;
DROP WAREHOUSE REPORTING_XS;
```
Impact: Fewer warehouses meant less idle time, fewer minimum-billing events, and simpler management. Estimated savings: $2,600 per month.
Result Caching: The Free Optimization
Snowflake caches query results for 24 hours. If the same query runs again and the underlying data has not changed, the result is returned instantly at zero credit cost. This is completely free, and many teams accidentally defeat it.
The most common way to break result caching is using non-deterministic functions:
```sql
-- BAD: never hits the result cache, because CURRENT_TIMESTAMP() is
-- evaluated at execution time
SELECT customer_id, SUM(amount)
FROM orders
WHERE order_date >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY customer_id;

-- GOOD: a fixed date boundary; the cache can serve repeats for 24h
SELECT customer_id, SUM(amount)
FROM orders
WHERE order_date >= '2026-01-01'
GROUP BY customer_id;

-- GOOD for dashboards: truncate to date so the query text is stable
-- within a day (note: CURRENT_DATE is still a runtime function, so
-- verify cache reuse in your own QUERY_HISTORY; the most reliable
-- pattern is having the BI layer render a literal date)
SELECT customer_id, SUM(amount)
FROM orders
WHERE order_date >= DATEADD('day', -30, CURRENT_DATE)
GROUP BY customer_id;
```
We also found that some of our BI tool configurations were adding unique session identifiers as query comments, which broke the cache for every single execution. Removing those comments immediately improved our cache hit rate.
```sql
-- Check your result cache hit rate
SELECT
    DATE_TRUNC('day', start_time) AS query_date,
    COUNT_IF(bytes_scanned = 0 AND partitions_scanned = 0) AS cache_hits,
    COUNT(*) AS total_queries,
    ROUND(cache_hits / total_queries * 100, 1) AS cache_hit_pct
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('day', -14, CURRENT_TIMESTAMP())
  AND warehouse_name IS NOT NULL
  AND execution_status = 'SUCCESS'
GROUP BY query_date
ORDER BY query_date DESC;
```
Our cache hit rate went from 12% to 41% after fixing these patterns. For a dashboard warehouse processing 8,000 queries per day, that meant roughly 2,300 additional queries served for free every day.
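The arithmetic behind that estimate is simple enough to encode, which is handy when justifying the fix to stakeholders:

```python
def extra_cached_queries(daily_queries: int, old_hit_rate: float,
                         new_hit_rate: float) -> int:
    """Additional queries per day served from the result cache,
    i.e. at zero credit cost, after a hit-rate improvement."""
    return round(daily_queries * (new_hit_rate - old_hit_rate))

# 8,000 dashboard queries/day, hit rate improved from 12% to 41%
free_per_day = extra_cached_queries(8000, 0.12, 0.41)
```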
Impact: Improved result caching saved approximately $1,800 per month, entirely by fixing query patterns. Zero infrastructure changes.
Materialized Views vs. Pre-Aggregated Tables
We had several views that aggregated large tables on the fly. Every time a dashboard refreshed, it re-scanned billions of rows. Snowflake materialized views can handle this, but they come with limitations and maintenance costs. For many of our use cases, a simple pre-aggregated table refreshed by our ETL pipeline was cheaper and more flexible.
```sql
-- Materialized view: good for simple aggregations that Snowflake can maintain
CREATE MATERIALIZED VIEW analytics.daily_revenue_mv AS
SELECT
    DATE_TRUNC('day', order_date) AS revenue_date,
    region,
    product_category,
    COUNT(*) AS order_count,
    SUM(amount) AS total_revenue,
    AVG(amount) AS avg_order_value
FROM orders.completed_orders
GROUP BY 1, 2, 3;
```
```sql
-- Pre-aggregated table: better for complex logic or when you need full control
CREATE OR REPLACE TABLE analytics.daily_metrics AS
WITH daily AS (
    SELECT
        DATE_TRUNC('day', event_date) AS metric_date,
        customer_segment,
        COUNT(DISTINCT user_id) AS unique_users,
        COUNT(DISTINCT CASE WHEN event_type = 'purchase' THEN user_id END) AS buyers,
        SUM(CASE WHEN event_type = 'purchase' THEN revenue END) AS total_revenue
    FROM analytics.events
    GROUP BY 1, 2
)
SELECT
    daily.*,
    -- Window functions like this are not supported in materialized views;
    -- computing them over the aggregated CTE keeps the logic unambiguous
    SUM(total_revenue) OVER (
        PARTITION BY customer_segment
        ORDER BY metric_date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS rolling_7d_revenue
FROM daily;
```
The decision framework we used: if the aggregation is simple (GROUP BY, SUM, COUNT) and the source table changes incrementally, use a materialized view. If it involves window functions, complex joins, or you want to control refresh timing, use a table refreshed by your pipeline. Both approaches avoid re-scanning raw data on every dashboard load.
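That decision framework fits in a few lines (the function and its flags are ours, purely illustrative, not any Snowflake API):

```python
def choose_asset(simple_aggregation: bool, incremental_source: bool,
                 needs_window_fns_or_joins: bool, control_refresh_timing: bool) -> str:
    """Encode the materialized-view vs. pre-aggregated-table decision above."""
    if needs_window_fns_or_joins or control_refresh_timing:
        # MVs can't express this logic, or you need to own the refresh schedule
        return "pre-aggregated table (pipeline refresh)"
    if simple_aggregation and incremental_source:
        # Snowflake can maintain this incrementally for you
        return "materialized view"
    return "pre-aggregated table (pipeline refresh)"
```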
Impact: Moving five heavy aggregations out of live queries and into pre-computed assets saved $1,600 per month in reduced query compute.
Zero-Copy Cloning for Dev Environments
Our development environment was a full copy of production data, refreshed weekly. That meant a weekly ETL job that scanned and copied terabytes of data, consuming significant compute credits and doubling our storage. Snowflake zero-copy cloning eliminated this entirely.
```sql
-- Replace weekly full-copy refresh with zero-copy clone
-- Old approach: INSERT OVERWRITE from prod tables (expensive, slow)
-- New approach: clone the entire database in seconds at zero compute cost

-- Drop the old dev database
DROP DATABASE IF EXISTS analytics_dev;

-- Create a zero-copy clone (metadata operation, near-instant, no compute)
CREATE DATABASE analytics_dev CLONE analytics_prod;

-- Privileges are not carried over to the clone, so re-grant access
GRANT USAGE ON DATABASE analytics_dev TO ROLE DEV_ROLE;
GRANT USAGE ON ALL SCHEMAS IN DATABASE analytics_dev TO ROLE DEV_ROLE;
GRANT SELECT ON ALL TABLES IN DATABASE analytics_dev TO ROLE DEV_ROLE;
```
Zero-copy clones share the underlying storage until data diverges. A clone of our 3.8 TB production database completed in 11 seconds and consumed zero credits. The old full-copy approach took 4 hours and cost roughly $180 per refresh.
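The numbers line up with the savings we report later: four weekly refreshes at roughly $180 of copy compute each is exactly what the clones eliminated:

```python
def monthly_refresh_savings(cost_per_refresh_usd: float,
                            refreshes_per_month: int = 4) -> float:
    """Compute spend eliminated by replacing full-copy dev refreshes with
    zero-copy clones, which are metadata-only and burn no warehouse credits."""
    return cost_per_refresh_usd * refreshes_per_month

# ~$180 per full-copy refresh, weekly cadence -> ~$720/month saved
saved = monthly_refresh_savings(180)
```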
We scheduled weekly clones via a simple task:
```sql
-- Automated weekly dev refresh using zero-copy clone
CREATE OR REPLACE TASK admin.refresh_dev_clone
    WAREHOUSE = INGEST_XS
    SCHEDULE = 'USING CRON 0 6 * * MON America/New_York'
AS
BEGIN
    DROP DATABASE IF EXISTS analytics_dev;
    CREATE DATABASE analytics_dev CLONE analytics_prod;
    GRANT USAGE ON DATABASE analytics_dev TO ROLE DEV_ROLE;
    GRANT USAGE ON ALL SCHEMAS IN DATABASE analytics_dev TO ROLE DEV_ROLE;
    GRANT SELECT ON ALL TABLES IN DATABASE analytics_dev TO ROLE DEV_ROLE;
END;

-- Tasks are created in a suspended state; remember to start it
ALTER TASK admin.refresh_dev_clone RESUME;
```
Impact: Eliminating full data copies for dev environments saved $720 per month in compute plus reduced storage costs.
Resource Monitors and Alerts
None of the above optimizations matter if costs can creep back up unnoticed. We set up resource monitors as hard guardrails and Snowflake alerts for early warning.
```sql
-- Resource monitor: hard cap per warehouse group
CREATE RESOURCE MONITOR etl_monitor
    WITH CREDIT_QUOTA = 800
    FREQUENCY = MONTHLY
    START_TIMESTAMP = IMMEDIATELY
    TRIGGERS
        ON 75 PERCENT DO NOTIFY
        ON 90 PERCENT DO NOTIFY
        ON 100 PERCENT DO SUSPEND;

ALTER WAREHOUSE ETL_SM SET RESOURCE_MONITOR = etl_monitor;
ALTER WAREHOUSE INGEST_XS SET RESOURCE_MONITOR = etl_monitor;

-- Account-level monitor as the last line of defense
CREATE RESOURCE MONITOR account_guard
    WITH CREDIT_QUOTA = 6500
    FREQUENCY = MONTHLY
    START_TIMESTAMP = IMMEDIATELY
    TRIGGERS
        ON 80 PERCENT DO NOTIFY
        ON 95 PERCENT DO SUSPEND_IMMEDIATE;

ALTER ACCOUNT SET RESOURCE_MONITOR = account_guard;
```
```sql
-- Alert: notify when any single query burns unusual cloud-services credits
-- (QUERY_HISTORY only exposes cloud-services credits per query; for full
--  per-query warehouse compute, look at QUERY_ATTRIBUTION_HISTORY)
CREATE OR REPLACE ALERT admin.expensive_query_alert
    WAREHOUSE = INGEST_XS
    SCHEDULE = '60 MINUTE'
    IF (EXISTS (
        SELECT 1
        FROM snowflake.account_usage.query_history
        WHERE start_time >= DATEADD('hour', -1, CURRENT_TIMESTAMP())
          AND credits_used_cloud_services > 1.5  -- ~$4.50 at $3/credit
          AND execution_status = 'SUCCESS'
    ))
    THEN
        CALL SYSTEM$SEND_EMAIL(
            'cost_alerts',  -- an existing notification integration
            'data-team@company.com',
            'Snowflake: Expensive query detected',
            'A query consumed more than 1.5 credits in the last hour. Check QUERY_HISTORY.'
        );
```
Resource monitors are non-negotiable. Without them, a single runaway query or a misconfigured pipeline can wipe out a month of savings overnight. We learned this the hard way when a junior engineer accidentally ran a cross-join on two billion-row tables on an XL warehouse. That single query cost $340.
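That figure is easy to sanity-check: an X-Large burns 16 credits per hour, so at $3.00 per credit the cross-join must have run for roughly seven hours before anyone noticed.

```python
XL_CREDITS_PER_HOUR = 16  # standard rate for an X-Large warehouse

def runaway_hours(cost_usd: float, credits_per_hour: float = XL_CREDITS_PER_HOUR,
                  usd_per_credit: float = 3.00) -> float:
    """Back out how long a runaway query ran from what it ended up costing."""
    return round(cost_usd / (credits_per_hour * usd_per_credit), 1)

# The $340 incident: roughly 7 hours on an XL at $48/hour
hours = runaway_hours(340)
```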
The Full Before and After
Here is our complete cost breakdown, showing the monthly spend before and after each optimization:
| Optimization | Monthly Savings | Effort (Days) | Risk Level |
|---|---|---|---|
| Warehouse downsizing (14 warehouses) | $14,800 | 1 | Low |
| Query optimization (clustering + search opt) | $4,200 | 10 | Low |
| Auto-suspend tuning (600s to 60s) | $3,400 | 0.5 | None |
| Warehouse consolidation (14 to 6) | $2,600 | 2 | Medium |
| Result cache improvements | $1,800 | 1 | None |
| Materialized views + pre-aggregation | $1,600 | 3 | Low |
| Zero-copy cloning for dev | $720 | 0.5 | None |
| Total | $29,120 | 18 | |
We went from $47,000/month to $18,600/month, a realized saving of $28,400 (the per-item estimates above total slightly more because some savings overlap). Annualized, that is $340,800. The entire effort took 18 engineering days spread across three months, with zero downtime and zero user-facing regressions.
Common Money-Wasting Patterns to Watch For
After going through this exercise, I compiled a list of the most expensive mistakes I see data teams make on Snowflake. If any of these sound familiar, you are likely overspending:
- One warehouse per team. Organizational boundaries are not workload boundaries. A marketing analyst and a finance analyst running similar queries do not need separate Large warehouses. Group by workload profile, not org chart.
- Default auto-suspend of 10 minutes. Change this immediately on every warehouse. 60 seconds is appropriate for interactive workloads. For scheduled ETL, explicitly suspend at the end of the pipeline.
- Using XLARGE for everything "just in case." Start with X-Small. Scale up only when you have evidence that queries are bottlenecked on compute, not I/O or network. In our experience, 80% of workloads run perfectly on X-Small or Small.
- Running dev/staging on production-sized warehouses. Development queries are exploratory and rarely need to be fast. An X-Small warehouse with a 60-second auto-suspend is fine for dev. Zero-copy cloning solves the data freshness problem.
- Not monitoring credit consumption. If you do not have resource monitors set up today, stop reading and go set them up. It takes five minutes and it is the single most important guardrail against runaway costs.
- Ignoring the QUERY_HISTORY view. The `snowflake.account_usage.query_history` view is a goldmine. Query it weekly. Sort by `credits_used_cloud_services` descending. You will always find something to fix.
- Using CURRENT_TIMESTAMP() in dashboard queries. This defeats result caching. Use `CURRENT_DATE` or a fixed date parameter instead. For dashboards that refresh every 15 minutes, this alone can cut dashboard warehouse costs by 30-50%.
- Full data copies for non-production environments. If you are running INSERT INTO ... SELECT FROM production tables to refresh staging or dev, you are paying twice: once for the compute to copy and once for the storage. Zero-copy cloning costs nothing.
A Python Script to Automate the Audit
I wrote a small script that we run monthly to flag potential waste. It connects to Snowflake, queries the account usage views, and outputs a report. Here is the core of it:
```python
import snowflake.connector


def run_cost_audit(conn):
    cursor = conn.cursor()

    # 1. Identify warehouses with low utilization
    cursor.execute("""
        SELECT
            warehouse_name,
            SUM(credits_used) AS credits_30d,
            COUNT(DISTINCT DATE_TRUNC('hour', start_time)) AS active_hours,
            ROUND(active_hours / (30 * 24) * 100, 1) AS utilization_pct
        FROM snowflake.account_usage.warehouse_metering_history
        WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
        GROUP BY warehouse_name
        HAVING utilization_pct < 10
        ORDER BY credits_30d DESC
    """)
    low_util = cursor.fetchall()
    if low_util:
        print("=== LOW UTILIZATION WAREHOUSES (< 10%) ===")
        for row in low_util:
            print(f"  {row[0]}: {row[1]:.1f} credits, {row[3]}% utilized")

    # 2. Find queries scanning excessive data for tiny result sets
    cursor.execute("""
        SELECT
            query_id,
            warehouse_name,
            ROUND(bytes_scanned / POWER(1024, 3), 2) AS gb_scanned,
            rows_produced,
            total_elapsed_time / 1000 AS duration_sec,
            SUBSTR(query_text, 1, 120) AS query_preview
        FROM snowflake.account_usage.query_history
        WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
          AND bytes_scanned > 10 * POWER(1024, 3)  -- > 10 GB
          AND rows_produced < 100
          AND execution_status = 'SUCCESS'
        ORDER BY bytes_scanned DESC
        LIMIT 20
    """)
    heavy_scans = cursor.fetchall()
    if heavy_scans:
        print("\n=== FULL-SCAN SUSPECTS (>10GB scanned, <100 rows returned) ===")
        for row in heavy_scans:
            print(f"  {row[1]} | {row[2]} GB | {row[3]} rows | {row[5]}")

    # 3. Check result cache hit rate
    cursor.execute("""
        SELECT
            ROUND(COUNT_IF(bytes_scanned = 0) / COUNT(*) * 100, 1)
        FROM snowflake.account_usage.query_history
        WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
          AND warehouse_name IS NOT NULL
          AND execution_status = 'SUCCESS'
    """)
    cache_rate = cursor.fetchone()[0]
    print(f"\n=== RESULT CACHE HIT RATE (7-day): {cache_rate}% ===")
    if cache_rate < 20:
        print("  WARNING: Cache hit rate is very low. Check for non-deterministic functions.")

    cursor.close()


if __name__ == "__main__":
    conn = snowflake.connector.connect(
        account="your_account",
        user="your_user",
        authenticator="externalbrowser",
        warehouse="INGEST_XS",
        role="ACCOUNTADMIN",
    )
    run_cost_audit(conn)
    conn.close()
```
We run this on the first Monday of every month. It has caught regressions three times already -- once when a new pipeline was accidentally using a Large warehouse, and twice when dashboard changes broke result caching.
Lessons Learned
Snowflake cost optimization is not a one-time project. It is an ongoing discipline. The changes we made in the first month delivered 80% of the savings. The remaining 20% came from iterating, monitoring, and catching regressions over the following two months.
If I had to distill our experience into three priorities for anyone looking to reduce their Snowflake bill, it would be these:
- Right-size your warehouses first. This is almost always the biggest lever. Start small, measure, and only scale up when queries are genuinely compute-bound.
- Set up resource monitors before you do anything else. You cannot optimize what you do not measure, and you cannot protect savings you do not guard.
- Make cost visibility part of your engineering culture. We added a Snowflake cost dashboard to our weekly data team standup. When engineers can see the cost of their queries, behavior changes naturally.
The cheapest credit is the one you never consume. Before scaling up a warehouse, always ask: can I make this query scan less data instead?
Our $47,000 monthly bill was not the result of any single bad decision. It was the accumulation of hundreds of small defaults left unquestioned: warehouses created at Medium "to be safe," auto-suspend left at 10 minutes "because that is what it was," dev environments running full copies "because that is how we have always done it." Questioning those defaults saved us $340,000 a year. It will likely save you too.