Descriptors (reusable, validated attributes)

When to use

  • You want reusable validation/coercion for settings (e.g., DSNs, positive ints).
  • Multiple classes should share the same attribute behavior (not copy/paste properties).
  • You need on-assign hooks (coerce type, validate range, mask secrets, lazy-load once).

Avoid when a one-off @property is enough or you’d prefer schema libs (pydantic/attrs).

Diagram (text)

IngestConfig.batch_size  ──> Validated descriptor
   set/get                └─ parses & validates; stores per-instance value

Python example (≤40 lines, type-hinted)

Reusable Validated descriptor; used for batch_size and dsn.

from __future__ import annotations
from typing import Callable, Any

Parse = Callable[[Any], Any]

class Validated:
    def __init__(self, parse: Parse): self.parse = parse
    def __set_name__(self, owner, name): self._slot = f"__{name}"
    def __get__(self, obj, owner=None):
        if obj is None: return self
        try: return obj.__dict__[self._slot]
        except KeyError: raise AttributeError(self._slot) from None
    def __set__(self, obj, value):
        obj.__dict__[self._slot] = self.parse(value)

def parse_batch(v: Any) -> int:
    i = int(v)
    if not (1 <= i <= 1_000_000): raise ValueError("batch_size out of range")
    return i

def parse_dsn(v: Any) -> str:
    s = str(v)
    if not s.startswith("snowflake://"): raise ValueError("bad dsn")
    return s

class IngestConfig:
    batch_size: int = Validated(parse_batch)  # type: ignore[assignment]
    dsn: str = Validated(parse_dsn)          # type: ignore[assignment]
    def __init__(self, *, batch_size: int, dsn: str):
        self.batch_size = batch_size
        self.dsn = dsn

Tiny pytest (cements it)

def test_descriptor_validates_and_coerces():
    cfg = IngestConfig(batch_size="200", dsn="snowflake://acct/db")
    assert cfg.batch_size == 200
    cfg.batch_size = 500
    assert cfg.batch_size == 500

def test_descriptor_rejects_bad_values():
    cfg = IngestConfig(batch_size=10, dsn="snowflake://ok")
    import pytest
    with pytest.raises(ValueError): cfg.batch_size = 0
    with pytest.raises(ValueError): IngestConfig(batch_size=1, dsn="postgres://nope")

Trade-offs & pitfalls

  • Pros: Central, reusable rules; runs on every assignment; no copy/paste of @property logic.
  • Cons: Less obvious than simple attributes; debugging involves descriptor protocol.
  • Pitfalls:
    • Forgetting __set_name__ (e.g., hardcoding one storage key) → multiple descriptors on a class clobber each other's values.
    • Storing state on the descriptor (shared!) instead of per-instance dict.
    • Mixing with @dataclass field generation can be tricky—prefer plain classes or exclude fields.
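A minimal sketch of the shared-state pitfall, using hypothetical Broken and Counter classes: storing the value on the descriptor object itself means every instance of the owning class reads the last value written by any of them.

```python
class Broken:
    """Descriptor that wrongly stores state on itself (one object shared class-wide)."""
    def __set__(self, obj, value):
        self.value = value  # bug: lives on the descriptor, not on obj
    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        return self.value

class Counter:
    count = Broken()

a, b = Counter(), Counter()
a.count = 1
b.count = 2
assert a.count == 2  # surprise: a sees b's value
```

The fix is exactly what Validated does above: key the value into obj.__dict__ so each instance owns its own copy.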

Pythonic alternatives

  • @property for one-off validation.
  • Pydantic/attrs for rich, declarative validation & defaults.
  • functools.cached_property for lazy, compute-once attributes (use when you just need caching).
  • Dataclasses + __post_init__ for lightweight checks.
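A sketch of the dataclass + __post_init__ alternative (the class name IngestConfigDC is illustrative): validation runs once at construction, so later assignments go unchecked — the key trade-off versus the descriptor, which re-validates on every assignment.

```python
from dataclasses import dataclass

@dataclass
class IngestConfigDC:
    batch_size: int
    dsn: str
    def __post_init__(self) -> None:
        # coerce and validate once, at construction time only
        self.batch_size = int(self.batch_size)
        if not (1 <= self.batch_size <= 1_000_000):
            raise ValueError("batch_size out of range")
        if not str(self.dsn).startswith("snowflake://"):
            raise ValueError("bad dsn")
```

Note that cfg.batch_size = 0 would succeed silently here; the descriptor version rejects it.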

Mini exercise

Implement a Lazy descriptor:

Lazy(factory: Callable[[Any], T])
  • On first __get__, call factory(self_obj), store result in the instance dict, then return it.
  • Add an invalidate(self_obj) helper to clear the cached value.
  • Use it to lazy-load a table schema via a provided fetch_schema(client, table).
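One possible shape for the exercise (a sketch, not the only answer; fake_fetch_schema and Table are stand-ins for the fetch_schema(client, table) setup):

```python
from typing import Any, Callable

class Lazy:
    def __init__(self, factory: Callable[[Any], Any]):
        self.factory = factory
    def __set_name__(self, owner, name):
        self._slot = f"__lazy_{name}"
    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        if self._slot not in obj.__dict__:            # compute once
            obj.__dict__[self._slot] = self.factory(obj)
        return obj.__dict__[self._slot]
    def invalidate(self, obj) -> None:
        obj.__dict__.pop(self._slot, None)            # clear cached value

def fake_fetch_schema(table: "Table") -> str:
    table.loads += 1                                  # count factory calls
    return f"schema:{table.name}"

class Table:
    schema = Lazy(fake_fetch_schema)
    def __init__(self, name: str):
        self.name = name
        self.loads = 0
```

Usage: after two reads of t.schema, t.loads is still 1 (cached); Table.schema.invalidate(t) forces a refetch on the next access.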

Checks (quick checklist)

  • Descriptor stores per-instance data in obj.__dict__, not on the descriptor.
  • __set_name__ assigns a unique private slot name.
  • Parsing/validation raises clear errors; coercion is explicit.
  • Works across multiple classes (reusability proven).
  • Prefer pydantic/attrs when you need rich schemas and error messages.