Prototype (clone a configured “template” and tweak it)

When to use

  • You have a template config (ETL job, connector settings) and need many similar variants.
  • Creating from scratch is noisy/error-prone; cloning + small overrides is cleaner.
  • You want copies not tied to each other (no accidental shared mutation).

Avoid when each instance is very different or a small builder function is simpler.

Diagram (text)

Client ──> PrototypeRegistry ── create(key, overrides) ──> New ETLJob
             ▲                         (deep copy + replace)
          stores
         "templates"

Step-by-step idea

  1. Define a template object with sane defaults.
  2. Put it in a registry (by name).
  3. When needed, deep copy the template and override fields.
  4. Use the new object; changing it won’t affect others.

Python example (≤40 lines, type-hinted)

from __future__ import annotations
from dataclasses import dataclass, field, replace
from copy import deepcopy

@dataclass
class ETLJob:
    name: str
    source: str
    dest: str
    batch_size: int = 1000
    options: dict[str, str] = field(default_factory=dict)

class JobPrototypes:
    def __init__(self) -> None:
        self._items: dict[str, ETLJob] = {}
    def register(self, key: str, job: ETLJob) -> None:
        self._items[key] = job
    def create(self, key: str, **overrides: object) -> ETLJob:
        # An unregistered key raises KeyError.
        base = deepcopy(self._items[key])     # clone owns its nested state
        return replace(base, **overrides)     # new instance with overrides applied

Tiny pytest (cements it)

def test_prototype_clones_are_independent():
    base = ETLJob("daily_base", "raw.events", "stage.events", options={"codec": "gzip"})
    reg = JobPrototypes()
    reg.register("daily", base)

    # Passing options= replaces the whole dict, so eu does not keep "codec".
    eu = reg.create("daily", name="daily_eu", dest="stage.events_eu", options={"region": "eu"})
    us = reg.create("daily", name="daily_us")  # keeps "codec": "gzip" from the deep copy
    eu.options["extra"] = "x"

    assert eu.name == "daily_eu" and us.name == "daily_us"
    assert "extra" not in us.options               # no shared dicts
    assert base.options == {"codec": "gzip"}       # base unchanged

Trade-offs & pitfalls

  • Pros: Fast to spin up many variants; less duplication; good for test fixtures and job families.
  • Cons: A registry adds indirection; “copy vs reference” bugs are subtle.
  • Pitfalls:
    • Mutable defaults (e.g., options={}) → dataclasses reject these with a ValueError; always use field(default_factory=dict).
    • Forgetting deepcopy for nested structures → replace() copies references only, so edits leak across clones.
    • Too many “almost the same” templates → consider a Builder or config file.
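
The deepcopy pitfall above can be reproduced in a few lines. This is a minimal sketch using a trimmed-down ETLJob (two fields only, an assumption to keep it short; Python 3.9+ for dict[str, str]). It shows that replace() alone reuses the template's options dict, while deepcopy followed by replace() gives the clone its own.

```python
from copy import deepcopy
from dataclasses import dataclass, field, replace


@dataclass
class ETLJob:  # trimmed-down version, for illustration only
    name: str
    options: dict[str, str] = field(default_factory=dict)


# Without deepcopy: replace() builds a new object but reuses the same dict.
base = ETLJob("daily_base", options={"codec": "gzip"})
shallow = replace(base, name="daily_eu")
shallow.options["region"] = "eu"
leaked = "region" in base.options          # the edit reached the template

# With deepcopy first: the clone owns an independent dict.
base2 = ETLJob("daily_base", options={"codec": "gzip"})
safe = replace(deepcopy(base2), name="daily_us")
safe.options["region"] = "us"
isolated = "region" not in base2.options   # the template is untouched
```

Here leaked and isolated both end up True, which is the difference the registry's create method relies on.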

Pythonic alternatives

  • dataclasses.replace(obj, field=val) without a registry if you already have a base instance.
  • Pydantic: model.copy(update={...}, deep=True) in v1, model.model_copy(update={...}, deep=True) in v2 (update values are not re-validated). attrs: attrs.evolve(obj, field=val).
  • Frozen dataclasses + copy: @dataclass(frozen=True) blocks attribute reassignment; nested dicts and lists stay mutable unless you use immutable containers.
  • Builder pattern: if construction rules are complex, prefer a builder over many templates.
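
As a rough illustration of the frozen-dataclass alternative, the sketch below uses a hypothetical FrozenJob class (not part of the example above) and wraps options in types.MappingProxyType so the mapping is read-only too. Under those assumptions a plain replace() is enough, because sharing a read-only mapping between variants is harmless.

```python
from collections.abc import Mapping
from dataclasses import FrozenInstanceError, dataclass, field, replace
from types import MappingProxyType


@dataclass(frozen=True)
class FrozenJob:  # hypothetical name, for illustration only
    name: str
    source: str
    dest: str
    batch_size: int = 1000
    # Read-only view; the wrapped dict is never exposed to callers.
    options: Mapping[str, str] = field(
        default_factory=lambda: MappingProxyType({})
    )


base = FrozenJob(
    "daily_base", "raw.events", "stage.events",
    options=MappingProxyType({"codec": "gzip"}),
)

# No deepcopy needed: nothing reachable from base can be mutated.
eu = replace(base, name="daily_eu", dest="stage.events_eu")

try:
    eu.name = "oops"          # attribute assignment is rejected
    blocked = False
except FrozenInstanceError:
    blocked = True
```

The trade-off is that every change, including a change to options, has to go through replace() with a freshly built mapping.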

Mini exercise

Add create_merged(self, key, **overrides) that deep merges dict fields like options (update keys instead of replacing the whole dict). Write a test showing base {"codec":"gzip"} merged with {"region":"eu"} → {"codec":"gzip","region":"eu"}.
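
One possible shape for a solution, offered as a sketch rather than the reference answer: the deep_merge helper and the resolved variable are names invented here, and the merge rule (override keys win, nested dicts merge recursively, everything else is replaced) is an assumption about what "deep merge" should mean. Python 3.9+ is assumed for dict[str, str].

```python
from copy import deepcopy
from dataclasses import dataclass, field, replace
from typing import Any


@dataclass
class ETLJob:
    name: str
    source: str
    dest: str
    batch_size: int = 1000
    options: dict[str, str] = field(default_factory=dict)


def deep_merge(base: dict, extra: dict) -> dict:
    """Return a new dict; keys in extra win, nested dicts merge recursively."""
    merged = dict(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


class JobPrototypes:
    def __init__(self) -> None:
        self._items: dict[str, ETLJob] = {}

    def register(self, key: str, job: ETLJob) -> None:
        self._items[key] = job

    def create_merged(self, key: str, **overrides: Any) -> ETLJob:
        base = deepcopy(self._items[key])  # never touch the template
        resolved: dict[str, Any] = {}
        for name, value in overrides.items():
            current = getattr(base, name, None)
            if isinstance(current, dict) and isinstance(value, dict):
                resolved[name] = deep_merge(current, value)  # merge dict fields
            else:
                resolved[name] = value                       # plain override
        return replace(base, **resolved)


base = ETLJob("daily_base", "raw.events", "stage.events", options={"codec": "gzip"})
reg = JobPrototypes()
reg.register("daily", base)
eu = reg.create_merged("daily", name="daily_eu", options={"region": "eu"})
```

With this sketch eu.options holds both "codec" and "region", while base.options still holds only "codec".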

Checks (quick checklist)

  • Templates use default_factory for mutable fields.
  • Cloning uses deepcopy to avoid shared nested state.
  • Field overrides applied with replace (or .copy(update=...)).
  • Tests prove clones don’t affect each other or the base.
  • Consider Builder/config files if templates proliferate.