Structured Logging 101 for Python & Kubernetes

How to stop grepping random strings in production at 2 a.m.


Introduction: Why Your Logs Are Lying to You

If your current “logging strategy” is print() plus a few logger.info("Something happened"), you don’t have a logging strategy.

In Python apps running on Kubernetes, unstructured logs quickly become useless:

  • You can’t reliably filter by request ID or user ID.
  • You can’t correlate logs across services.
  • Dashboards and alerts based on text search are fragile and noisy.

Structured logging fixes this by turning every log line into data — typically JSON — that your log stack (Loki, Elasticsearch, OpenSearch, CloudWatch, Datadog, etc.) can parse, filter, and aggregate.

This article is a practical, opinionated guide to Structured Logging 101 for Python & Kubernetes: what it is, how to wire it up, and what to avoid so you don’t drown in JSON noise.


1. What Is Structured Logging?

Traditional logging:

[2025-11-26 10:30:01] INFO User logged in: user_id=42

This looks readable, but for your log system it’s just a blob of text.

Structured logging:

{"ts":"2025-11-26T10:30:01Z","level":"INFO","event":"user_login","user_id":42,"session_id":"abc123","service":"auth-api"}

Key differences:

  • Machine-parsable: fields like user_id are real keys, not text fragments.
  • Consistent shape: every log line shares a schema (level, timestamp, service, etc.).
  • Queryable: you can run “give me all event=user_login where user_id=42 with level=ERROR in the last 15 minutes”.

In Kubernetes, structured logs are critical because:

  • All containers write to stdout/stderr, and a cluster-level log agent ships those streams to a central backend.
  • Without structure, your observability stack becomes regex hell.
  • With structure, you can build real dashboards and alerts without brittle text searches.
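
To make “queryable” concrete before any log backend is involved: once each line is JSON, filtering becomes a data operation, not a regex. A tiny, purely illustrative Python sketch:

import json

# Two JSON log lines, as they would appear on stdout (illustrative data)
raw_lines = [
    '{"ts":"2025-11-26T10:30:01Z","level":"INFO","event":"user_login","user_id":42}',
    '{"ts":"2025-11-26T10:31:07Z","level":"ERROR","event":"user_login","user_id":42}',
]

records = [json.loads(line) for line in raw_lines]

# "All event=user_login where user_id=42 with level=ERROR" becomes a filter, not a grep
errors = [
    r for r in records
    if r["event"] == "user_login" and r["user_id"] == 42 and r["level"] == "ERROR"
]
print(errors)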

2. Logging Architecture in Kubernetes (High-Level)

The typical flow in a Python app on Kubernetes:

  1. Your app writes structured logs to stdout/stderr (JSON lines).
  2. The container runtime writes those streams to log files on the node (managed by the kubelet).
  3. A log agent (sidecar or DaemonSet) collects those logs:
    • e.g. Fluent Bit / Fluentd / Vector / Promtail.
  4. The agent forwards logs to a backend:
    • e.g. Loki, Elasticsearch/OpenSearch, Cloud Logging, Datadog, Splunk.
  5. You query, visualize, and alert in Grafana/Kibana/your log UI.

Key takeaway:
If your Python logs are already structured JSON, the rest of this pipeline becomes much simpler and more powerful.


3. Python: Making Logging Structured (Not a Mess)

3.1 Core Ideas

  • Use the standard logging module (don’t reinvent).
  • Add a JSON formatter.
  • Always log key-value pairs, not interpolated text.
  • Standardize fields: service, env, request_id, correlation_id, user_id, etc.

3.2 Minimal Structured Logger (Without Extra Libraries)

This is a small, “roll-your-own” JSON formatter. Good to understand, even if you later switch to a library.

import json
import logging
import sys
from datetime import datetime, timezone

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        log = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }

        # Attach extra fields (from logger.<level>(..., extra={"extra_fields": {...}}))
        extra_fields = record.__dict__.get("extra_fields")
        if extra_fields:
            log.update(extra_fields)

        return json.dumps(log, separators=(",", ":"))  # compact JSON

class ContextAdapter(logging.LoggerAdapter):
    # The default LoggerAdapter.process() would overwrite the per-call extra dict,
    # silently dropping fields like order_id. This subclass merges both.
    def process(self, msg, kwargs):
        fields = dict(self.extra["extra_fields"])
        fields.update(kwargs.get("extra", {}).get("extra_fields", {}))
        kwargs["extra"] = {"extra_fields": fields}
        return msg, kwargs

def get_logger(name: str, service: str, env: str) -> logging.LoggerAdapter:
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False

    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)

    # Static context lives on the adapter; per-call extras are merged in process()
    return ContextAdapter(logger, {"extra_fields": {"service": service, "env": env}})

# Usage
logger = get_logger(__name__, service="payments-api", env="prod")

logger.info("Payment created", extra={"extra_fields": {"order_id": 123, "amount": 49.90}})

What’s going on:

  • JsonFormatter converts a LogRecord into a JSON dict and dumps it.
  • ContextAdapter (a small LoggerAdapter subclass) injects static fields (service, env) and merges them with per-call fields.
  • extra={"extra_fields": {...}} lets you add per-log structured data.
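
With the snippet above, the usage line emits one compact JSON line, roughly like this (the timestamp will differ, isoformat() renders UTC as +00:00 rather than Z, and logger is the module name, e.g. __main__ when run as a script):

{"ts":"2025-11-26T10:30:01+00:00","level":"INFO","logger":"__main__","message":"Payment created","service":"payments-api","env":"prod","order_id":123,"amount":49.9}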

4. Correlation IDs & Request Context (The Real Value)

The biggest win in microservices: trace a single request across services.

You do that with:

  • correlation_id or trace_id
  • request_id
  • Sometimes span_id (if using tracing)

4.1 Example: FastAPI Middleware Injecting correlation_id

import uuid
from fastapi import FastAPI, Request
from starlette.middleware.base import BaseHTTPMiddleware

from my_logging import get_logger  # from previous snippet

app = FastAPI()
logger = get_logger(__name__, service="orders-api", env="prod")

class CorrelationMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        corr_id = request.headers.get("X-Correlation-ID", str(uuid.uuid4()))
        request.state.correlation_id = corr_id

        # Log request start
        logger.info(
            "request_start",
            extra={"extra_fields": {
                "event": "request_start",
                "method": request.method,
                "path": request.url.path,
                "correlation_id": corr_id,
            }},
        )

        response = await call_next(request)
        response.headers["X-Correlation-ID"] = corr_id
        return response

app.add_middleware(CorrelationMiddleware)

@app.get("/orders/{order_id}")
async def get_order(order_id: str, request: Request):
    logger.info(
        "fetch_order",
        extra={"extra_fields": {
            "event": "fetch_order",
            "order_id": order_id,
            "correlation_id": request.state.correlation_id,
        }},
    )
    return {"order_id": order_id}

Now you can query logs by correlation_id in your log backend and see:

  • request_start in gateway
  • fetch_order in orders-api
  • charge_customer in payments-api

All tied to the same ID.
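
For those downstream entries to actually share the ID, each service has to forward the header on its outbound calls. A minimal sketch using httpx (the payments URL and function name are illustrative, not part of the snippets above):

import httpx

async def charge_customer(order_id: str, correlation_id: str) -> dict:
    # Forward the incoming correlation ID so payments-api logs the same value
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "http://payments-api/charges",  # illustrative URL
            json={"order_id": order_id},
            headers={"X-Correlation-ID": correlation_id},
        )
        resp.raise_for_status()
        return resp.json()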


5. Kubernetes: How to Wire It Up Correctly

5.1 Container Logging Basics

To play nice with Kubernetes:

  • Write logs to stdout/stderr, not local files.
  • Log one JSON object per line.
  • Don’t prepend timestamps or levels yourself if your formatter already emits them.

Deployment snippet (key part is env and no custom log path):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: orders-api
  template:
    metadata:
      labels:
        app: orders-api
    spec:
      containers:
        - name: orders-api
          image: my-registry/orders-api:1.0.0
          env:
            - name: LOG_LEVEL
              value: INFO
            - name: ENV
              value: prod
          ports:
            - containerPort: 8000

If you use a log agent (Fluent Bit, Promtail, etc.), configure it to:

  • Treat each line as JSON.
  • Add Kubernetes metadata (namespace, pod, container, node).
  • Forward to your backend.

5.2 Log Fields: App vs Kubernetes

A rough separation of responsibilities:

  • App (Python): service, env, event, user_id, order_id, correlation_id, message, level
  • K8s/Agent: kubernetes.namespace, kubernetes.pod, kubernetes.container, host, cluster

Don’t duplicate Kubernetes metadata in your app logs; let the agent inject it.
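
After the agent enriches a record, a single line in the backend might look roughly like this (field names follow the table above and are illustrative; real agents use slightly different key names, e.g. kubernetes.namespace_name in Fluent Bit):

{"ts":"2025-11-26T10:30:01Z","level":"ERROR","event":"payment_failed","order_id":123,"service":"payments-api","env":"prod","kubernetes.namespace":"payments","kubernetes.pod":"payments-api-7f9cbbd9cd-x2x4q","kubernetes.container":"payments-api","cluster":"prod-eu-1"}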


6. Using a Library Instead of DIY (Recommended)

DIY JSON formatter is fine for learning. In real projects, use a mature library for:

  • Better performance.
  • Structured context management.
  • Less boilerplate.

Two common Python choices:

6.1 structlog (very popular)

import logging
import sys

import structlog

logging.basicConfig(format="%(message)s", stream=sys.stdout, level=logging.INFO)

structlog.configure(
    processors=[
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.dict_tracebacks,
        structlog.processors.JSONRenderer(),
    ],
    context_class=dict,
    logger_factory=structlog.stdlib.LoggerFactory(),
)

logger = structlog.get_logger(service="orders-api", env="prod")

logger.info("order_created", order_id=123, amount=49.9, currency="USD")

This prints a JSON object with all those fields, plus a timestamp and level; structlog puts the positional message under the event key.
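
Roughly (exact field order and timestamp will differ):

{"service": "orders-api", "env": "prod", "order_id": 123, "amount": 49.9, "currency": "USD", "event": "order_created", "level": "info", "timestamp": "2025-11-26T10:30:01.123456Z"}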

6.2 loguru (batteries-included)

from loguru import logger
import sys

logger.remove()
logger.add(sys.stdout, serialize=True)  # JSON

logger = logger.bind(service="orders-api", env="prod")
logger.bind(order_id=123, amount=49.9).info("order_created")

My blunt advice:

  • If you’re already deep in standard logging, consider structlog.
  • If you’re starting fresh, loguru is easier, but be mindful of integration with older code.
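
If you go the structlog route, its contextvars support pairs nicely with the correlation-ID middleware from section 4. A sketch, assuming you add structlog.contextvars.merge_contextvars as the first processor in the configuration above:

import uuid
import structlog

# Bind per-request context once (e.g. in middleware); later log calls in the
# same request context pick it up automatically via merge_contextvars
structlog.contextvars.clear_contextvars()
structlog.contextvars.bind_contextvars(correlation_id=str(uuid.uuid4()))

logger = structlog.get_logger(service="orders-api", env="prod")
logger.info("fetch_order", order_id="42")  # correlation_id is attached automatically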

7. Log Levels, Sampling, and Avoiding Noise

Unstructured logging is bad.
But structured noise is worse — now you have expensive, high-quality noise.

7.1 Basic Rules

  • DEBUG: detailed per-step state; use heavily in dev, sampled or off in prod.
  • INFO: important business events (order created, payment failed).
  • WARNING: unexpected but handled situations (retry scheduled).
  • ERROR: operation failed, user likely impacted.
  • CRITICAL: system severely unhealthy.
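
The level itself should usually come from configuration, not code. A small sketch that reads the LOG_LEVEL environment variable set in the Deployment from section 5 (defaulting to INFO):

import logging
import os

level_name = os.environ.get("LOG_LEVEL", "INFO").upper()
logging.getLogger().setLevel(getattr(logging, level_name, logging.INFO))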

7.2 Don’t Log the Following (or Be Extremely Careful)

  • PII / secrets: emails, tokens, passwords, card numbers.
  • Huge payloads: large JSON bodies, binary blobs.
  • Per-item logs inside tight loops: log aggregates instead.
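
For the first point, it helps to redact known-sensitive keys before they ever reach the formatter. A minimal sketch (the key list is an assumption; adapt it to your own schema):

SENSITIVE_KEYS = {"password", "token", "authorization", "card_number", "email"}  # assumed names

def redact(fields: dict) -> dict:
    # Replace values of sensitive keys; keep everything else untouched
    return {k: "[REDACTED]" if k.lower() in SENSITIVE_KEYS else v for k, v in fields.items()}

# Usage: logger.info("user_login", extra={"extra_fields": redact({"user_id": 42, "email": "a@b.c"})})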

Example of aggregated logging instead of spamming:

def process_batch(items):
    success = 0
    failure = 0

    for item in items:
        try:
            handle(item)
            success += 1
        except Exception as exc:
            failure += 1
            # maybe DEBUG one representative error, in the same extra_fields style
            logger.debug(
                "item_failed",
                extra={"extra_fields": {"event": "item_failed", "error": str(exc), "item_id": item.id}},
            )

    logger.info(
        "batch_processed",
        extra={"extra_fields": {
            "event": "batch_processed",
            "success": success,
            "failure": failure,
            "total": len(items),
        }},
    )

8. Structured Logs vs Metrics vs Traces

If you’re in Kubernetes and care about reliability, you should understand the split:

  • Logs = detailed event history. Great for debugging single incidents.
  • Metrics = numeric time series (latency, error rate, throughput). Great for alerts & SLOs.
  • Traces = end-to-end request paths across services.

Structured logs bridge the gap:

  • You can derive ad-hoc metrics from logs (e.g., count event=payment_failed).
  • You can pivot traces/logs using trace_id and span_id.
  • You can debug weird edge cases metrics/traces can’t fully explain.

If you don’t have time for full OpenTelemetry yet, at least:

  • Put correlation_id in logs.
  • Standardize event names.
  • Make logs JSON and parseable.

You’ll thank yourself later.


9. Common Pitfalls (That Will Bite You in Kubernetes)

Let’s be blunt about the usual failures:

  1. Mixing formats
    Half your logs JSON, half plain text → parsing breaks, dashboards lie.
  2. Logging in local time
    Always log in UTC. Always. You’re in Kubernetes; your pods are everywhere.
  3. Logging to files inside containers
    Containers are ephemeral. Logs get lost, sidecars can’t see them, and you fight file paths. Use stdout/stderr.
  4. Embedding stack traces as plain strings
    Prefer structured error fields (error.type, error.message, error.stack), or use library support; see the sketch after this list.
  5. No schema discipline
    Today you log user_id, tomorrow userid, the next day userId. Your queries and dashboards become a joke. Pick a naming convention and enforce it.
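
A minimal sketch of pitfall 4 done right, using the extra_fields logger from section 3 (charge(order) is just a placeholder for any operation that can fail):

import traceback

try:
    charge(order)  # placeholder for the failing operation
except Exception as exc:
    logger.error(
        "charge_failed",
        extra={"extra_fields": {
            "event": "charge_failed",
            "error.type": type(exc).__name__,
            "error.message": str(exc),
            "error.stack": traceback.format_exc(),
        }},
    )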

10. Summary & Takeaways

If you’re running Python services on Kubernetes, structured logging is not a “nice to have”; it’s table stakes.

Key takeaways:

  • Make every log line a JSON object with consistent fields.
  • Use standard Python logging + JSON formatter or a library like structlog or loguru.
  • In Kubernetes, log to stdout/stderr and let the platform/agent handle shipping.
  • Always include correlation IDs for cross-service tracing.
  • Be ruthless about log level discipline, sampling, and avoiding sensitive or massive data.

You don’t need a perfect observability stack to start. But you do need to stop treating logs as text blobs and start treating them as data.



