An honest review of Snowflake Cortex AI after 3 months in production: LLM functions, vector search, ML forecasting, cost analysis, and where it falls short.
Rust for Data Engineering: Why Polars, DataFusion, and Delta-rs Are Just the Beginning
A data engineer's honest exploration of Rust in the data ecosystem: Polars, DataFusion, Arrow-rs, Delta-rs, PyO3, performance benchmarks, and learning curve.
Prompt Engineering for Data Pipelines: Using LLMs to Clean, Classify, and Enrich Data
A hands-on guide to integrating LLMs into ETL pipelines for data classification, cleaning, and enrichment with Python code, cost analysis, and fallback patterns.
The Modern Data Stack Is Dead. Here's What Replaced It.
The Modern Data Stack promised simplicity but delivered cost explosions and tool sprawl. Here's what actually replaced it in 2026 and why consolidation wins.
Building Real-Time Dashboards That Don't Crush Your Database
Practical strategies for real-time analytics dashboards that stay fast under load: materialized views, caching layers, semantic layers, and push architectures.
ClickHouse vs Apache Druid vs StarRocks: Picking Your Real-Time Analytics Engine
A hands-on comparison of ClickHouse, Apache Druid, and StarRocks for real-time analytics with benchmarks, SQL examples, and a decision framework.
Kubernetes for ML Workloads: A Practical Guide to GPU Scheduling, Ray, and Kubeflow
A battle-tested guide to running ML workloads on Kubernetes: GPU scheduling with the NVIDIA device plugin, distributed training, Ray, Kubeflow, and cost control.
Great Expectations vs Soda vs dbt Tests: Choosing Your Data Quality Framework
A practical comparison of Great Expectations, Soda Core, and dbt tests for data quality validation, with real code examples and integration guidance.
LLM Inference Optimization: From 200ms to 50ms Per Token
A hands-on guide to cutting LLM inference latency 4x with quantization, KV cache tricks, vLLM, speculative decoding, and GPU selection.
Data Mesh Two Years Later: What Actually Worked and What Failed
After two years of data mesh implementation with 300 engineers, here's an honest retrospective on what worked, what failed, and why a hybrid approach won.
Python 3.13 for Data Engineers: Free-Threading, JIT, and What Actually Matters
A practical look at Python 3.13's free-threading (no-GIL), experimental JIT, and new typing features through the lens of real data engineering workloads.
Building a Feature Store from Scratch: Why We Skipped Feast and Built Our Own
A hands-on guide to building a custom ML feature store with Iceberg, Redis, and Python when Feast and Tecton don't fit your architecture.