# Descriptors (reusable, validated attributes)
## When to use
- You want reusable validation/coercion for settings (e.g., DSNs, positive ints).
- Multiple classes should share the same attribute behavior (not copy/paste properties).
- You need on-assign hooks (coerce type, validate range, mask secrets, lazy-load once).
Avoid when a one-off `@property` is enough, or when you'd prefer schema libs (pydantic/attrs).
## Diagram (text)

```
IngestConfig.batch_size ──> Validated descriptor
        set/get             └─ parses & validates; stores per-instance value
```
## Python example (≤40 lines, type-hinted)

A reusable `Validated` descriptor, used for both `batch_size` and `dsn`:

```python
from __future__ import annotations
from typing import Callable, Any

Parse = Callable[[Any], Any]

class Validated:
    def __init__(self, parse: Parse):
        self.parse = parse

    def __set_name__(self, owner, name):
        self._slot = f"__{name}"

    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        return obj.__dict__[self._slot]

    def __set__(self, obj, value):
        obj.__dict__[self._slot] = self.parse(value)

def parse_batch(v: Any) -> int:
    i = int(v)
    if not (1 <= i <= 1_000_000):
        raise ValueError("batch_size out of range")
    return i

def parse_dsn(v: Any) -> str:
    s = str(v)
    if not s.startswith("snowflake://"):
        raise ValueError("bad dsn")
    return s

class IngestConfig:
    batch_size: int = Validated(parse_batch)  # type: ignore[assignment]
    dsn: str = Validated(parse_dsn)  # type: ignore[assignment]

    def __init__(self, *, batch_size: int, dsn: str):
        self.batch_size = batch_size
        self.dsn = dsn
```
## Tiny pytest (cements it)

```python
import pytest

def test_descriptor_validates_and_coerces():
    cfg = IngestConfig(batch_size="200", dsn="snowflake://acct/db")
    assert cfg.batch_size == 200
    cfg.batch_size = 500
    assert cfg.batch_size == 500

def test_descriptor_rejects_bad_values():
    cfg = IngestConfig(batch_size=10, dsn="snowflake://ok")
    with pytest.raises(ValueError):
        cfg.batch_size = 0
    with pytest.raises(ValueError):
        IngestConfig(batch_size=1, dsn="postgres://nope")
```
## Trade-offs & pitfalls

- Pros: central, reusable rules; runs on every assignment; no copy/paste of `@property` logic.
- Cons: less obvious than simple attributes; debugging involves the descriptor protocol.
- Pitfalls:
  - Forgetting `__set_name__` → clobbered storage names.
  - Storing state on the descriptor (shared across all instances!) instead of in the per-instance dict.
  - Mixing with `@dataclass` field generation can be tricky; prefer plain classes or exclude descriptor fields.
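The shared-state pitfall is easy to demonstrate. A minimal sketch (class names here are illustrative, not from the example above): the buggy descriptor stores the value on itself, so every instance sees the last assignment.

```python
class BadCounter:
    """Buggy: stores the value on the descriptor object, which is
    shared by every instance of the owning class."""
    def __set__(self, obj, value):
        self.value = value  # one slot for ALL instances!
    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        return self.value

class GoodCounter:
    """Correct: per-instance storage keyed by a name from __set_name__."""
    def __set_name__(self, owner, name):
        self._slot = f"__{name}"
    def __set__(self, obj, value):
        obj.__dict__[self._slot] = value  # per-instance storage
    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        return obj.__dict__[self._slot]

class Bad:
    n = BadCounter()

class Good:
    n = GoodCounter()

a, b = Bad(), Bad()
a.n, b.n = 1, 2
print(a.n)  # 2 -- b's assignment clobbered a's value

c, d = Good(), Good()
c.n, d.n = 1, 2
print(c.n)  # 1 -- values stay independent
```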
## Pythonic alternatives

- `@property` for one-off validation.
- Pydantic/attrs for rich, declarative validation & defaults.
- `functools.cached_property` for lazy, compute-once attributes (use when you just need caching).
- Dataclasses + `__post_init__` for lightweight checks.
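The last two alternatives can be combined in one short sketch (the `dsn_host` property is a made-up example of a compute-once attribute, not part of the descriptor example above):

```python
from dataclasses import dataclass
from functools import cached_property

@dataclass
class Config:
    batch_size: int
    dsn: str

    def __post_init__(self) -> None:
        # Lightweight one-off checks; no reusable descriptor machinery.
        self.batch_size = int(self.batch_size)
        if not (1 <= self.batch_size <= 1_000_000):
            raise ValueError("batch_size out of range")
        if not self.dsn.startswith("snowflake://"):
            raise ValueError("bad dsn")

    @cached_property
    def dsn_host(self) -> str:
        # Computed on first access, then cached on the instance.
        return self.dsn.removeprefix("snowflake://").split("/")[0]

cfg = Config(batch_size="200", dsn="snowflake://acct/db")
print(cfg.batch_size, cfg.dsn_host)  # 200 acct
```

The trade-off: `__post_init__` runs only at construction time, so later assignments (`cfg.batch_size = 0`) bypass validation, which is exactly what a descriptor prevents.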
## Mini exercise

Implement a `Lazy` descriptor: `Lazy(factory: Callable[[Any], T])`.

- On first `__get__`, call `factory(self_obj)`, store the result in the instance dict, then return it.
- Add an `invalidate(self_obj)` helper to clear the cached value.
- Use it to lazy-load a table schema via a provided `fetch_schema(client, table)`.
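One possible sketch of the exercise (the real `fetch_schema(client, table)` is assumed to exist in your codebase; a counting stub stands in for it here):

```python
from typing import Any, Callable, Generic, TypeVar

T = TypeVar("T")

class Lazy(Generic[T]):
    """Compute-once attribute: calls factory(obj) on first access."""
    def __init__(self, factory: Callable[[Any], T]):
        self.factory = factory

    def __set_name__(self, owner, name) -> None:
        self._slot = f"__{name}"

    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        if self._slot not in obj.__dict__:  # first access only
            obj.__dict__[self._slot] = self.factory(obj)
        return obj.__dict__[self._slot]

    def invalidate(self, obj) -> None:
        obj.__dict__.pop(self._slot, None)  # drop the cached value

calls: list[str] = []

def fetch_schema(obj) -> dict:  # stub standing in for the real fetcher
    calls.append(obj.table)
    return {"table": obj.table, "cols": ["id", "ts"]}

class Table:
    schema = Lazy(fetch_schema)
    def __init__(self, table: str):
        self.table = table

t = Table("events")
t.schema; t.schema                 # factory runs once, then cached
print(len(calls))                  # 1
Table.schema.invalidate(t)         # clear the cache
t.schema                           # recomputed on next access
print(len(calls))                  # 2
```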
## Checks (quick checklist)

- Descriptor stores per-instance data in `obj.__dict__`, not on the descriptor.
- `__set_name__` assigns a unique private slot name.
- Parsing/validation raises clear errors; coercion is explicit.
- Works across multiple classes (reusability proven).
- Prefer pydantic/attrs when you need rich schemas and error messages.
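The reusability check can be exercised with a minimal sketch; a trimmed-down `Validated` is redefined so the snippet is self-contained, and the two owning classes are made up for illustration:

```python
from typing import Any, Callable

class Validated:
    def __init__(self, parse: Callable[[Any], Any]):
        self.parse = parse
    def __set_name__(self, owner, name):
        self._slot = f"__{name}"
    def __get__(self, obj, owner=None):
        return self if obj is None else obj.__dict__[self._slot]
    def __set__(self, obj, value):
        obj.__dict__[self._slot] = self.parse(value)

def positive_int(v: Any) -> int:
    i = int(v)
    if i <= 0:
        raise ValueError("must be positive")
    return i

# The same descriptor serves two unrelated classes.
class Pool:
    size = Validated(positive_int)

class Retry:
    attempts = Validated(positive_int)

p, r = Pool(), Retry()
p.size, r.attempts = "8", 3  # coerced and validated on assignment
print(p.size, r.attempts)    # 8 3
```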