Flyweight (share heavy, read-only stuff instead of duplicating it)
When to use
- You create lots of similar objects that would each hold the same big immutable data (schemas, regexes, dictionaries).
- You want to cut memory and startup time by reusing that shared data.
- Example: many PII-masking steps all need the same compiled regexes.
Avoid when the shared thing is mutable (risk of cross-talk) or you only have a handful of objects.
Diagram (text)
PIIMasker ──> get_pattern("email") ──> returns shared compiled regex
                   ▲
     many instances reuse the same flyweight (compiled regex object)
Python example (≤40 lines, type-hinted)
Concrete case: share compiled regexes across many calls/maskers.
from __future__ import annotations

import re
from dataclasses import dataclass
from functools import lru_cache

PATTERNS: dict[str, str] = {
    "email": r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
    "phone": r"\+?\d[\d\- ]{7,}\d",
}


@lru_cache(maxsize=32)
def get_pattern(name: str) -> re.Pattern[str]:
    # Flyweight factory: compile once per name, then hand out the same object.
    return re.compile(PATTERNS[name])


@dataclass(frozen=True)
class PIIMasker:
    repl: str = "***"

    def mask(self, text: str, rules: list[str]) -> str:
        out = text
        for rule in rules:
            out = get_pattern(rule).sub(self.repl, out)  # shared flyweight
        return out


# usage
masker = PIIMasker()
clean = masker.mask("contact: a@b.com, phone +1-202-555-0101", ["email", "phone"])
Tiny pytest (cements it)
def test_flyweight_shares_compiled_regex():
    a = get_pattern("email")
    b = get_pattern("email")
    assert a is b  # same shared object
    m = PIIMasker(repl="[redacted]")
    s = m.mask("a@b.com and +1 202 555 0101", ["email", "phone"])
    assert "[redacted]" in s and "@" not in s
Trade-offs & pitfalls
- Pros: Less memory, faster repeated use; consistent behavior; cacheable construction.
- Cons: Indirection layer; cache size/eviction to think about.
- Pitfalls:
  - Mutability: flyweights must be read-only; don’t stash state on them.
  - Cache policy: unbounded caches can grow; too small can thrash.
  - Scope confusion: per-process cache only; each process/container has its own.
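The mutability pitfall is easiest to see with a shared dict. A minimal sketch, assuming a hypothetical lookup table (`_RAW_CODES`) and two hypothetical factories (`unsafe_codes`, `safe_codes`) that are not part of the example above: the first hands out a mutable dict, the second wraps the same data in a read-only `types.MappingProxyType` view.

```python
from __future__ import annotations

from functools import lru_cache
from types import MappingProxyType
from typing import Mapping

# Hypothetical lookup data standing in for any heavy shared table.
_RAW_CODES = {"US": "+1", "GB": "+44"}


@lru_cache(maxsize=None)
def unsafe_codes() -> dict[str, str]:
    # Every caller receives the SAME mutable dict: cross-talk risk.
    return dict(_RAW_CODES)


@lru_cache(maxsize=None)
def safe_codes() -> Mapping[str, str]:
    # Same sharing, but callers only get a read-only view.
    return MappingProxyType(dict(_RAW_CODES))


# One caller "customizes" the shared dict; every later caller sees the change.
unsafe_codes()["US"] = "oops"
leaked = unsafe_codes()["US"]

# The read-only view rejects the same write with a TypeError.
try:
    safe_codes()["US"] = "oops"  # type: ignore[index]
    write_blocked = False
except TypeError:
    write_blocked = True
```

Compiled regexes are already effectively read-only, which is why they make comfortable flyweights; a wrapper like this matters mainly for shared dicts, lists, and similar containers.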
Pythonic alternatives
- functools.lru_cache (used above) or cachetools.TTLCache for size/TTL control.
- The re module’s own cache (exists, but you get less control; explicit caches are clearer).
- Module-level singletons (constants) for tiny shared data.
- Weak refs (weakref.WeakValueDictionary) if you want entries to disappear when unused.
Mini exercise
Add a maxsize knob: replace the hardcoded @lru_cache(maxsize=32) with a small factory that returns a cached get_pattern with a configurable size. Write a test that proves evictions occur when the cache is small.
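One possible starting point, not the only shape a solution could take: a hypothetical factory `make_get_pattern` that wraps the lookup in an lru_cache of the requested size. Evictions are observed through `cache_info()` rather than `a is b`, since the re module's own cache may return the same compiled object even after this cache has dropped it.

```python
from __future__ import annotations

import re
from functools import lru_cache

PATTERNS: dict[str, str] = {
    "email": r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
    "phone": r"\+?\d[\d\- ]{7,}\d",
}


def make_get_pattern(maxsize: int = 32):
    """Return a get_pattern function backed by its own cache of the given size."""

    @lru_cache(maxsize=maxsize)
    def get_pattern(name: str) -> re.Pattern[str]:
        return re.compile(PATTERNS[name])

    return get_pattern


# A deliberately tiny cache so evictions are easy to observe.
tiny = make_get_pattern(maxsize=1)
tiny("email")  # miss: cache now holds "email"
tiny("phone")  # miss: "email" is evicted to make room
tiny("email")  # miss again, because "email" was evicted
info = tiny.cache_info()
```

Each call to the factory produces an independent cache, so a test can build a small one without disturbing the module-level default.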
Checks (quick checklist)
- Shared object is immutable/read-only.
- A factory/cache returns the shared instance for the same key.
- Clear eviction/size policy (or accept “process-lifetime” cache).
- No per-request state stored on the flyweight.
- Tests prove identity reuse (a is b) and correct behavior.




