Typing Turns Python Architecture Into a Contract

How to use modern Python typing, protocols, dataclasses, and payload types to stop raw dictionaries from becoming the hidden architecture of a production system.

By Jovani Pink | June 14, 2026 | 7 min | Platform & AI Engineering

Outcome focus: Gave Python teams a boundary pattern for converting untrusted payloads into typed domain objects before service logic, repositories, and agent tools can depend on them.

The bug looked like a missing key.

It was actually a missing architecture.

An event payload arrived from a partner API with "customer_id" set to null. The endpoint accepted it because the validation code only checked that the key existed. The service layer passed the raw dictionary through two helpers. A repository converted the value into a SQL parameter. The model scoring job recorded the row as "unknown customer" instead of rejecting it. Nobody noticed until a downstream report showed a small but stubborn cluster of impossible rows.

The failure was not that Python is dynamic.

The failure was that the system never stopped being a bag of strings.

Modern Python typing is valuable because it gives architecture something concrete to hold. It lets us say: outside this boundary, data is untrusted; inside this boundary, the service works with domain objects; dependencies are behavior contracts, not inheritance chains; repositories return known shapes; tool calls and event handlers cannot quietly invent new fields.

The typing guide for libraries frames types as part of a package's user experience. I would push the same idea inside applications. A service boundary without types is still a public API. It is just undocumented.

Type the Boundary First

I do not start by annotating every internal variable.

I start at ingress.

If a payload arrives as JSON, the first typed shape can be a TypedDict, a Pydantic model, a msgspec struct, or a small validator function that produces a dataclass. The exact validation library is less important than the rule: do not let raw untrusted dictionaries become the internal representation.

For trusted domain records, the standard library dataclasses module is enough more often than teams expect.

from dataclasses import dataclass
from typing import NotRequired, TypedDict
 
 
class CustomerEventPayload(TypedDict):
    type: str
    customer_id: str
    occurred_at: str
    plan: NotRequired[str]
 
 
@dataclass(frozen=True, slots=True, kw_only=True)
class CustomerId:
    value: str
 
    def __post_init__(self) -> None:
        if not self.value:
            raise ValueError("customer id cannot be empty")
 
 
@dataclass(frozen=True, slots=True, kw_only=True)
class CustomerEvent:
    type: str
    customer_id: CustomerId
    occurred_at_iso: str
    plan: str | None = None
 
 
def parse_customer_event(payload: CustomerEventPayload) -> CustomerEvent:
    return CustomerEvent(
        type=payload["type"],
        customer_id=CustomerId(value=payload["customer_id"]),
        occurred_at_iso=payload["occurred_at"],
        plan=payload.get("plan"),
    )

This does not make runtime validation disappear. TypedDict is a static type over dictionaries; it does not validate a random JSON object at runtime. The boundary still needs runtime checks when input is untrusted. But once the boundary returns CustomerEvent, service code can stop wondering whether "customer_id" exists, whether it can be None, and whether it has been normalized.

That is the architectural payoff.

Prefer Shapes Over Families

The inheritance-heavy version of this system usually starts innocently.

There is a base client. Then a partner client. Then a special partner client. Then a retrying partner client. Then a retrying partner client with audit logging and a slightly different response shape. Six months later, every test needs a subclass, every subclass inherits behavior it does not want, and the "base" class is mostly a memorial to old assumptions.

Modern Python gives us a cleaner way to express a dependency: a Protocol.

from typing import Protocol
 
 
class CustomerRepository(Protocol):
    def save_event(self, event: CustomerEvent) -> None: ...
 
 
class RiskScorer(Protocol):
    def score(self, event: CustomerEvent) -> float: ...
 
 
@dataclass(slots=True)
class CustomerEventService:
    repository: CustomerRepository
    scorer: RiskScorer
 
    def ingest(self, payload: CustomerEventPayload) -> float:
        event = parse_customer_event(payload)
        risk = self.scorer.score(event)
        self.repository.save_event(event)
        return risk

No base class is required. A real repository, fake repository, in-memory test double, or adapter around another service can satisfy the protocol if it has the right methods.

The tradeoff is that protocols leave less structure visible at runtime. A subclass hierarchy is easy to inspect in an IDE tree; a protocol relies on static checking and tests to prove that an implementation fits. I still prefer the protocol for application boundaries because it describes the behavior the service needs instead of forcing the dependency into a family tree.

If runtime checks are needed, @runtime_checkable exists. I use it sparingly. The stronger habit is to keep protocols near the services that consume them and let the type checker verify wiring.
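One reason to be sparing, shown as a short sketch with hypothetical classes: an isinstance check against a runtime-checkable protocol only confirms that the named members exist. It does not compare signatures or return types, so a badly shaped implementation still passes.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class RiskScorer(Protocol):
    def score(self, event: object) -> float: ...


class GoodScorer:
    def score(self, event: object) -> float:
        return 0.1


class WrongSignature:
    # Has a method named score, but the wrong parameters and return type.
    def score(self) -> str:
        return "high"


class NoScore:
    pass


print(isinstance(GoodScorer(), RiskScorer))      # True
print(isinstance(WrongSignature(), RiskScorer))  # True: only the name is checked
print(isinstance(NoScore(), RiskScorer))         # False
```

A static checker would reject WrongSignature where a RiskScorer is expected, which is why the wiring check is better left to it.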

Use Dataclasses for Values, Not Everything

Dataclasses are excellent for domain values, commands, result objects, and internal records. The combination I reach for most often is:

@dataclass(frozen=True, slots=True, kw_only=True)
class ScoringDecision:
    customer_id: CustomerId
    risk_score: float
    reason_codes: tuple[str, ...]

frozen=True makes accidental mutation noisy. slots=True reduces per-instance overhead and blocks accidental attribute creation. kw_only=True makes constructor calls harder to scramble when fields are added later.

The mistake is using a dataclass as every possible thing at once: API request, domain entity, database row, queue message, and UI response. That looks efficient until one layer needs a field the others must never see.

Separate the shapes when the boundary changes.

@dataclass(frozen=True, slots=True, kw_only=True)
class CustomerEventRow:
    customer_id: str
    event_type: str
    occurred_at_iso: str
    risk_score: float
 
 
def to_row(event: CustomerEvent, risk_score: float) -> CustomerEventRow:
    return CustomerEventRow(
        customer_id=event.customer_id.value,
        event_type=event.type,
        occurred_at_iso=event.occurred_at_iso,
        risk_score=risk_score,
    )

The conversion feels like extra code. It is also where policy lives. What gets persisted? What gets redacted? What gets normalized? What is allowed to leave the service? A one-object-fits-all design hides those decisions.

Pattern Matching Belongs at Dispatch Boundaries

Structural pattern matching is not a prettier switch.

It earns its keep when the shape of the object determines the path. The pattern matching specification made that explicit: matching is about patterns succeeding or failing against a subject.

Event dispatch is a good fit.

def route_event(payload: CustomerEventPayload) -> str:
    match payload:
        case {"type": "created", "customer_id": customer_id} if customer_id:
            return "create_customer"
        case {"type": "plan_changed", "customer_id": customer_id, "plan": plan} if customer_id and plan:
            return "update_plan"
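        case {"type": "created" | "plan_changed" as event_type}:
            # Known type, but a required field was missing or empty.
            raise ValueError(f"malformed {event_type} event")
        case {"type": event_type}:
            raise ValueError(f"unsupported customer event: {event_type}")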
        case _:
            raise ValueError("malformed customer event")

The pitfall is using match deep inside business logic because it feels new. If the code is branching on one enum value, a dictionary dispatch may be clearer. If the code is unpacking a nested command or payload, match can remove a lot of manual key checks.
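For the single-enum case, a dictionary dispatch might look like the sketch below. EventType, the two handlers, and HANDLERS are hypothetical names; the point is only that the mapping from value to behavior is plain data.

```python
from collections.abc import Callable
from enum import Enum


class EventType(Enum):
    CREATED = "created"
    PLAN_CHANGED = "plan_changed"


def handle_created(customer_id: str) -> str:
    return f"create_customer:{customer_id}"


def handle_plan_changed(customer_id: str) -> str:
    return f"update_plan:{customer_id}"


# One enum value, one handler: the whole routing table is visible at a glance.
HANDLERS: dict[EventType, Callable[[str], str]] = {
    EventType.CREATED: handle_created,
    EventType.PLAN_CHANGED: handle_plan_changed,
}


def dispatch(event_type: EventType, customer_id: str) -> str:
    return HANDLERS[event_type](customer_id)
```

Unknown type strings fail earlier, when EventType("...") is constructed at the boundary, so dispatch itself never needs a fallback branch.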

Deferred Annotations Change Imports, Not Design

Python 3.14's deferred annotation semantics, implemented through PEP 649 and PEP 749, reduce a class of forward-reference and import-time problems. The 3.14 release notes describe annotations being stored in annotate functions and evaluated when needed.

This helps real projects. It does not remove the need for boundaries.

I would still keep domain types in a domain module, protocols close to consumers, adapters outside the domain, and heavy runtime imports away from pure model definitions. Deferred annotations make type expression easier to write. They do not make circular architecture good.

A Boundary Contract

The artifact I want in a Python service is a short boundary contract. It can live in docs, but it should be reflected in code.

| Boundary     | Input shape    | Internal shape    | Runtime validation | Static enforcement       |
|--------------|----------------|-------------------|--------------------|--------------------------|
| HTTP request | JSON dict      | command dataclass | request validator  | route signature          |
| Queue event  | message body   | event dataclass   | decoder/validator  | TypedDict and parser     |
| Repository   | SQL row        | domain projection | row mapper         | return type              |
| Agent tool   | tool args JSON | command object    | schema validator   | protocol and result type |
| External API | response JSON  | adapter DTO       | client parser      | TypedDict or model       |

The mistake is allowing one boundary to skip the conversion because it is "internal." Internal data leaks. A queue becomes a public contract. A model feature table becomes another team's dependency. A helper becomes a library. If the shape matters, type it where it crosses the line.

The Quality Gate

For new Python code, the gate I want is boring and strict:

# pyproject.toml
[tool.pyright]
pythonVersion = "3.14"
typeCheckingMode = "strict"
include = ["src", "tests"]
 
[tool.ruff.lint]
select = ["E", "F", "I", "B", "UP", "SIM"]
 
[tool.ruff.format]
quote-style = "double"

If the team prefers mypy, use mypy in strict mode. If the team is piloting ty, keep it visible in the editor or an advisory CI job until the repo proves the checker covers the patterns it relies on. Fast feedback is wonderful. Correct feedback is the production gate.

The close is simple: type the edges first. Then type the domain objects. Then type the dependencies by behavior. After that, the implementation code becomes easier to read because the architecture is already in the signatures.
