Articles — Page 2 | Data & ML Engineering

Snowflake

Snowflake Cortex AI: Hands-On Review After 3 Months in Production

Eugine the Great|Feb 17, 2026|16 min read

An honest review of Snowflake Cortex AI after 3 months in production: LLM functions, vector search, ML forecasting, cost analysis, and where it falls short.

Rust for Data Engineering: Why Polars, DataFusion, and Delta-rs Are Just the Beginning

Eugine the Great|Feb 14, 2026|14 min read

A data engineer's honest exploration of Rust in the data ecosystem: Polars, DataFusion, Arrow-rs, Delta-rs, PyO3, performance benchmarks, and the learning curve.

Prompt Engineering for Data Pipelines: Using LLMs to Clean, Classify, and Enrich Data

Eugine the Great|Feb 11, 2026|20 min read

A hands-on guide to integrating LLMs into ETL pipelines for data classification, cleaning, and enrichment with Python code, cost analysis, and fallback patterns.

The Modern Data Stack Is Dead. Here's What Replaced It.

Eugine the Great|Feb 8, 2026|16 min read

The Modern Data Stack promised simplicity but delivered cost explosions and tool sprawl. Here's what actually replaced it in 2026, and why consolidation wins.

Building Real-Time Dashboards That Don't Crush Your Database

Eugine the Great|Feb 5, 2026|20 min read

Practical strategies for real-time analytics dashboards that stay fast under load: materialized views, caching layers, semantic layers, and push architectures.

ClickHouse vs Apache Druid vs StarRocks: Picking Your Real-Time Analytics Engine

Eugine the Great|Feb 2, 2026|15 min read

A hands-on comparison of ClickHouse, Apache Druid, and StarRocks for real-time analytics with benchmarks, SQL examples, and a decision framework.

Kubernetes for ML Workloads: A Practical Guide to GPU Scheduling, Ray, and KubeFlow

Eugine the Great|Jan 30, 2026|14 min read

A battle-tested guide to running ML workloads on Kubernetes: GPU scheduling with NVIDIA device plugin, distributed training, Ray, KubeFlow, and cost control.

Great Expectations vs Soda vs dbt Tests: Choosing Your Data Quality Framework

Eugine the Great|Jan 27, 2026|16 min read

A practical comparison of Great Expectations, Soda Core, and dbt tests for data quality validation, with real code examples and integration guidance.

LLM Inference Optimization: From 200ms to 50ms Per Token

Eugine the Great|Jan 24, 2026|18 min read

A hands-on guide to cutting LLM inference latency 4x with quantization, KV cache tricks, vLLM, speculative decoding, and GPU selection.

Data Mesh Two Years Later: What Actually Worked and What Failed

