Outcome focus: Reader can map a Python workflow to the right state-machine library, distinguish statechart formalism from durable execution, and know where to start contributing to xstate-python with file paths and named missing features.
Tags: state machines, python, xstate-python, langgraph, agentic ai
Part 3 of 4. Part 1: When the State Chart Pays Off. Part 2: XState, Actors, and What the Stately Argument Actually Buys. Part 4: State Machines in Go, Elixir, Swift, and Zig.
A Python agent harness ran a three-step workflow: read a document, ask the user for approval, run a tool. The harness was written as await read_doc() then await wait_for_human() then await run_tool() inside an async function. The worker restarted between the human-approval step and the tool-run step. The harness resumed by re-running the function from the top, which meant read_doc was called again (cheap), wait_for_human returned immediately because the approval was already on file (free), and run_tool ran a second time. The tool had partial-completion side effects on the first run. The user got a duplicated artifact and a confused log.
The bug was not in the model or the prompt. The bug was that a long-running workflow was modeled as a function instead of a state machine, and the resume path did not know which step had already completed. The async function had implicit state, the worker restart turned the implicit state into corrupted behavior, and there was no place in the code that named "we just finished step two and have not started step three" as a state.
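To make the missing state concrete, here is a minimal plain-Python sketch of the same three steps with the current step named and persisted. The `Step` enum, the `NEXT_STEP` table, and the JSON file layout are assumptions for illustration, not code from the harness.

```python
import json
from enum import Enum
from pathlib import Path


class Step(str, Enum):
    READ_DOC = "read_doc"
    AWAIT_APPROVAL = "await_approval"
    RUN_TOOL = "run_tool"
    DONE = "done"


# Each step names the step that follows it once it completes.
NEXT_STEP = {
    Step.READ_DOC: Step.AWAIT_APPROVAL,
    Step.AWAIT_APPROVAL: Step.RUN_TOOL,
    Step.RUN_TOOL: Step.DONE,
}


def load_step(path: Path) -> Step:
    """Return the persisted step, or the first step if nothing was saved."""
    if path.exists():
        return Step(json.loads(path.read_text())["step"])
    return Step.READ_DOC


def run_workflow(path: Path, handlers: dict) -> Step:
    """Run from the persisted step, saving progress after every step."""
    step = load_step(path)
    while step is not Step.DONE:
        handlers[step]()  # perform the work for the current step
        step = NEXT_STEP[step]
        path.write_text(json.dumps({"step": step.value}))  # checkpoint
    return step
```

A restart now resumes at the recorded step instead of at the top of the function. A crash between a handler finishing and the checkpoint write would still repeat that one step, which is the window a durable runtime is built to close.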
This post is about the Python ecosystem of state-machine and durable-execution libraries that solve different parts of this problem, where each one fits, and what it means to contribute to xstate-python, the Python port of XState that I work on.
Four Libraries, Four Different Problems
The Python landscape has at least four serious entrants in this space, and they solve four different problems. The mistake teams make is picking based on community size or stars, not based on which problem they actually have.
| Library | Solves | Does not solve |
|---|---|---|
| transitions (pytransitions/transitions) | In-memory state machines with hierarchy, parallel regions, callbacks. Most-used Python FSM. | Durability across worker restart. Statechart visualization that round-trips to other ecosystems. |
| python-statemachine (fgmacedo/python-statemachine) | Pythonic declarative API, hierarchy, parallel, history, async, Mermaid diagrams. Newer entrant with v3 in 2026. | Same durability gap. Different API from XState, not a port. |
| xstate-python (statelyai/xstate-python, with my fork at JovaniPink/xstate-python) | Statechart formalism that round-trips with the XState TypeScript ecosystem. SCXML compatibility. | Pre-1.0; many XState 5 features missing. Not a durable runtime. |
| LangGraph (langchain-ai/langgraph) | Durable execution. Checkpointing across restarts. Human-in-the-loop interrupts. Resumable agent workflows. | Statechart formalism. Hierarchical states. Parallel regions. |
| Temporal Python SDK (temporalio/sdk-python) | Industrial-grade durable workflows. Signals, queries, timers. Production-tested at scale. | Statechart formalism. Light-touch deployment (Temporal needs a server). |
A team picking the right tool for the agent-harness bug above wants durability first and statechart formalism second. That points to LangGraph or Temporal, not to a state-machine library. A team picking the right tool for a UI workflow with hierarchy and parallel regions wants the formalism, not the durability. That points to transitions or python-statemachine (or xstate-python, with caveats). The mistake is to pick one tool and try to make it solve both problems.
What xstate-python Actually Is
xstate-python is a community port of XState to Python, hosted under the Stately organization. As of April 2026 it has around 200 stars, 13 open issues, one open pull request, and no published releases. The repository structure uses Poetry for dependency management and pytest for the test suite. The README describes it as "work in progress." The fork I work on shares the upstream commit history and is meant to be a place to land contributions before they are upstreamed.
The library implements the XState 4 era machine model: a Machine class instantiated from a JSON-shaped configuration, a transition method that takes a state and an event and returns the next state, and an interpreter that drives the machine. The features that exist work; the features that are missing are most of XState 5.
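The machine model described above is easy to picture as a pure function over a JSON-shaped configuration. The sketch below is an illustrative stand-in written for this post, not the xstate-python API; the `LIGHT` config and the `transition` signature are assumptions.

```python
# Illustrative stand-in, not the xstate-python API.
LIGHT = {
    "id": "light",
    "initial": "green",
    "states": {
        "green": {"on": {"TIMER": "yellow"}},
        "yellow": {"on": {"TIMER": "red"}},
        "red": {"on": {"TIMER": "green"}},
    },
}


def transition(config: dict, state: str, event: str) -> str:
    """Return the next state value; unknown events leave the state unchanged."""
    handlers = config["states"][state].get("on", {})
    return handlers.get(event, state)
```

The function holds no state of its own: the caller passes the current state in and gets the next one back, which is what makes the configuration portable between runtimes.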
What works today, roughly:
- The `Machine` constructor and the JSON configuration format (states, transitions, context, actions, guards).
- `transition(state, event)` returning a new state value.
- Initial state access through the interpreter.
- SCXML compatibility tested through the SCION Test Framework.
- A small set of examples in the `examples/` directory.
What is missing or partial, roughly:
- The XState 5 `setup({ types, actors, actions, guards }).createMachine(...)` API.
- The actor model with `createActor`, `invoke`, and spawned children.
- Async support throughout (only basic synchronous transitions today).
- Pydantic or `typing.Protocol` integration for typed context and events.
- Persistence and resumability (the durable execution gap).
- A model-based testing analog of `@xstate/test`.
The honest framing is that xstate-python sits in a niche: it is for teams who want statechart formalism in Python and want the artifact to round-trip with the XState TypeScript ecosystem (sketch in Stately Studio, generate JSON, run in both runtimes). For teams who only want statecharts in Python, python-statemachine or transitions is more polished. For teams who want durable execution, LangGraph or Temporal is the right tool.
The Agent Harness Failure, Mapped to State Machines
Return to the agent harness that re-fired the tool. The honest fix is not a state-machine library. The honest fix is durable execution. LangGraph models the workflow as a graph where each node is a step, and the framework checkpoints state between nodes so that a worker restart resumes from the last checkpoint instead of from the top. The human-in-the-loop pattern uses an explicit interrupt that pauses the graph at a node, persists state, and resumes when the approval arrives.
A LangGraph version of the workflow looks roughly like this:
```python
import sqlite3
from typing import TypedDict

# SqliteSaver ships in the separate langgraph-checkpoint-sqlite package.
from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.graph import END, StateGraph


class WorkflowState(TypedDict):
    document_id: str
    content: str | None
    approval: bool | None
    tool_result: dict | None


def fetch_document(document_id: str) -> str:
    # Placeholder: stands in for the real document loader.
    return f"contents of {document_id}"


def invoke_tool(content: str) -> dict:
    # Placeholder: stands in for the real, side-effecting tool call.
    return {"ok": True}


def read_doc(state: WorkflowState) -> WorkflowState:
    content = fetch_document(state["document_id"])
    return {**state, "content": content}


def wait_for_approval(state: WorkflowState) -> WorkflowState:
    # Interrupt point. The graph pauses after this node, persists state, and waits.
    return state


def run_tool(state: WorkflowState) -> WorkflowState:
    if not state.get("approval"):
        raise RuntimeError("attempted tool run without approval")
    result = invoke_tool(state["content"])
    return {**state, "tool_result": result}


graph = StateGraph(WorkflowState)
graph.add_node("read_doc", read_doc)
graph.add_node("wait_for_approval", wait_for_approval)
graph.add_node("run_tool", run_tool)
graph.add_edge("read_doc", "wait_for_approval")
graph.add_edge("wait_for_approval", "run_tool")
graph.add_edge("run_tool", END)
graph.set_entry_point("read_doc")

# The checkpointer is the durability mechanism. With it, a worker restart
# resumes from the last completed node, not from the entry point.
# In recent langgraph-checkpoint-sqlite releases, from_conn_string is a context
# manager, so this sketch passes a sqlite3 connection to the constructor instead.
conn = sqlite3.connect("checkpoints.db", check_same_thread=False)
checkpointer = SqliteSaver(conn)
runnable = graph.compile(checkpointer=checkpointer, interrupt_before=["run_tool"])
```

The interrupt at run_tool is the protection. The graph pauses after wait_for_approval completes, persists the state, and waits for the user to send a resume signal with the approval payload. The worker can restart, the process can die, the host can be redeployed, and the graph picks up where it left off because the state lives in the checkpointer, not in an in-memory async function.
This is what the post on the SaaS stack for LLM-assisted product development means by "graduate to LangGraph when durable state, interrupts, retries, or human review become real requirements." The graduation is not optional once the workflow has any of those properties. A non-durable harness will bite the team eventually.
The state-machine question is orthogonal. LangGraph is a graph runtime, not a statechart formalism. The graph above is sequential; it does not have hierarchy, parallel regions, or guard conditions in the XState sense. For workflows that also need that formalism, the right shape is to combine the two: an XState-style machine that names the states, transitions, and guards, and a LangGraph runtime (or Temporal) that gives the machine durability across restarts. The combination is non-trivial because LangGraph's model is graph-of-functions and XState's model is hierarchical-statechart, and the impedance mismatch is real.
A Contribution Roadmap for xstate-python
The user-facing case for contributing to xstate-python is that the niche is real and underserved. Teams that want XState's formalism in Python today have to either run XState in TypeScript and message back to Python, or accept that python-statemachine and transitions give a different API that does not round-trip with Stately Studio. A working xstate-python that supports the XState 5 API is genuinely useful.
The repo's open issues name some of the missing pieces. Here is a concrete starter roadmap for a new contributor, ordered by what a reasonable first PR looks like.
First PR: typed event payloads with Pydantic. The current machine accepts {"type": "EVENT_NAME", ...} shapes as plain dicts. The XState 5 equivalent uses TypeScript discriminated unions. The Python equivalent is a Pydantic discriminated union or a typing.Protocol. Start in xstate/event.py (or wherever the event type lives), add a Pydantic model for events, and update the test suite to exercise the typed path. The PR is small, the test surface is clear, and it is a building block for everything that follows.
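As a rough picture of what a typed event path could look like, here is a standard-library sketch that uses frozen dataclasses with a `type` discriminator in place of a Pydantic discriminated union. The event names (`Approve`, `Reject`) and the `parse_event` helper are hypothetical and do not exist in xstate-python.

```python
from dataclasses import dataclass
from typing import Literal, Union


@dataclass(frozen=True)
class Approve:
    approver: str
    type: Literal["APPROVE"] = "APPROVE"


@dataclass(frozen=True)
class Reject:
    reason: str
    type: Literal["REJECT"] = "REJECT"


Event = Union[Approve, Reject]

# The "type" field is the discriminator that selects the event class.
_EVENT_CLASSES = {"APPROVE": Approve, "REJECT": Reject}


def parse_event(raw: dict) -> Event:
    """Turn a plain {"type": ..., ...} dict into a typed event."""
    try:
        cls = _EVENT_CLASSES[raw["type"]]
    except KeyError:
        raise ValueError(f"unknown event type: {raw.get('type')!r}") from None
    payload = {key: value for key, value in raw.items() if key != "type"}
    return cls(**payload)
```

Dataclasses reject missing or unexpected fields but do not validate field types at runtime. That extra validation is the part a Pydantic model would add.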
Second PR: the setup() API. XState 5 introduced setup({ types, actors, actions, guards }).createMachine(...) as the recommended entry point. Port this to Python as a setup function that returns a builder object whose create_machine method consumes the typed configuration. The Python idiom for the types parameter is debatable; my preference is Pydantic for context and events, with typing.Protocol for action and guard signatures. This is a larger PR because it touches the public API; coordinate with upstream first.
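One possible shape for that builder, sketched with plain dicts so it runs without dependencies. Everything here (`MachineBuilder`, the `guard` and `actions` keys on a transition, the returned dict) is an assumption for discussion, not a proposal the maintainers have accepted.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

Guard = Callable[[dict, dict], bool]   # (context, event) -> allowed?
Action = Callable[[dict, dict], None]  # (context, event) -> side effect


@dataclass
class MachineBuilder:
    guards: dict[str, Guard] = field(default_factory=dict)
    actions: dict[str, Action] = field(default_factory=dict)

    def create_machine(self, config: dict[str, Any]) -> dict[str, Any]:
        """Check that every named guard and action is registered."""
        for name, node in config["states"].items():
            for event, target in node.get("on", {}).items():
                spec = target if isinstance(target, dict) else {"target": target}
                guard = spec.get("guard")
                if guard is not None and guard not in self.guards:
                    raise KeyError(f"{name}.{event}: unknown guard {guard!r}")
                for action in spec.get("actions", []):
                    if action not in self.actions:
                        raise KeyError(f"{name}.{event}: unknown action {action!r}")
        return {**config, "guards": self.guards, "actions": self.actions}


def setup(*, guards=None, actions=None) -> MachineBuilder:
    return MachineBuilder(guards=dict(guards or {}), actions=dict(actions or {}))


builder = setup(
    guards={"is_approved": lambda context, event: bool(context.get("approval"))}
)
machine = builder.create_machine({
    "initial": "waiting",
    "states": {
        "waiting": {"on": {"RESUME": {"target": "running", "guard": "is_approved"}}},
        "running": {},
    },
})
```

The value of the two-step call is that name resolution fails at build time: a config that references an unregistered guard raises before the machine ever runs.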
Third PR: createActor and the interpreter. The current interpreter is synchronous and takes the machine plus an initial state. The XState 5 actor model has createActor(machineLogic) returning an actor object with start(), send(), subscribe(), and getSnapshot(). Port this surface, with attention to how Python's asyncio interacts with the actor's internal queue. This PR is the foundation for everything async.
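The surface is small enough to sketch synchronously before asyncio enters the picture. The `Actor` class below is a hypothetical Python shape with snake_case names, driven by any `(state, event) -> state` function; holding pre-start events in the mailbox is a choice made for this sketch.

```python
from collections import deque
from typing import Callable


class Actor:
    """Hypothetical Python shape for an XState 5 style actor surface."""

    def __init__(self, transition: Callable[[str, str], str], initial: str):
        self._transition = transition
        self._state = initial
        self._mailbox: deque = deque()
        self._listeners: list = []
        self._started = False
        self._processing = False

    def start(self) -> "Actor":
        self._started = True
        self._drain()  # deliver events that were sent before start()
        return self

    def send(self, event: str) -> None:
        self._mailbox.append(event)
        if self._started:
            self._drain()

    def subscribe(self, listener: Callable[[str], None]) -> Callable[[], None]:
        self._listeners.append(listener)
        return lambda: self._listeners.remove(listener)

    def get_snapshot(self) -> str:
        return self._state

    def _drain(self) -> None:
        if self._processing:  # a listener called send(); the outer loop handles it
            return
        self._processing = True
        try:
            while self._mailbox:
                event = self._mailbox.popleft()
                self._state = self._transition(self._state, event)
                for listener in list(self._listeners):
                    listener(self._state)
        finally:
            self._processing = False
```

The mailbox plus the `_processing` flag gives run-to-completion semantics: an event sent from inside a listener is queued and handled after the current transition finishes. An asyncio version would replace the deque with an `asyncio.Queue` and a consumer task.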
Fourth PR: invoked actors with fromPromise and fromCallback. XState 5's actor logic creators (fromPromise, fromCallback, fromObservable) make it cheap to invoke async work from a state. Python's analogs are coroutines, async iterators, and asyncio.Queue. Implement at least fromPromise (taking a coroutine factory) and the invoke field on a state node so that onDone and onError transitions work.
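The core of a `fromPromise` analog is mapping a coroutine's outcome to one of two transitions. A minimal asyncio sketch follows; `invoke_promise` and its parameters are hypothetical names, and a real port would route the result through the interpreter instead of returning a tuple.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def invoke_promise(
    factory: Callable[[], Awaitable[Any]],
    on_done: str,
    on_error: str,
) -> tuple[str, Any]:
    """Await the coroutine and report which transition the parent should take."""
    try:
        output = await factory()
    except Exception as error:  # a failed coroutine maps to the onError target
        return on_error, error
    return on_done, output
```

Taking a factory instead of a coroutine object matters: a coroutine can only be awaited once, so re-entering the invoking state needs a fresh one each time. Cancellation is left to propagate because `asyncio.CancelledError` does not derive from `Exception`.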
Fifth PR: spawned actors and the actor reference type. The dynamic-actor case from Part 2 needs spawn in actions and an ActorRef type that the parent's context can hold. This is a larger refactor because it interacts with the lifecycle code and with serialization (spawned actors should not survive restarts unless explicitly persisted).
Standalone PR (any time): Pydantic-typed context with assign helpers. XState's assign is assign({ field: ({ context, event }) => newValue }). The Python equivalent should accept a callable receiving a typed context and event and returning a partial update. This is independent of the larger setup work and can land first.
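A sketch of that helper using a frozen dataclass as the typed context, with Pydantic swapped out for the standard library so the example has no dependencies. The `Context` fields and the `record_failure` action are invented for illustration.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable, Mapping, Optional


@dataclass(frozen=True)
class Context:
    retries: int = 0
    last_error: Optional[str] = None


def assign(
    updater: Callable[[Context, Mapping[str, Any]], Mapping[str, Any]],
) -> Callable[[Context, Mapping[str, Any]], Context]:
    """Wrap an updater that returns a partial update into a context action."""

    def action(context: Context, event: Mapping[str, Any]) -> Context:
        # replace() copies the context and rejects unknown field names.
        return replace(context, **updater(context, event))

    return action


record_failure = assign(
    lambda context, event: {
        "retries": context.retries + 1,
        "last_error": event["message"],
    }
)
```

Returning a new context instead of mutating the old one keeps each snapshot independent, which is the property persistence would later rely on.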
Standalone PR: model-based testing. XState ships @xstate/test which generates tests from the machine. A Python analog would walk the state graph and emit pytest fixtures or parametrized test cases for every transition path. This is a research-shaped PR and is the right place to start if the contributor cares about testing infrastructure.
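The graph walk at the heart of that idea fits in a few lines. The sketch below assumes a flat, JSON-shaped config with string targets (no hierarchy or guards) and returns one event sequence per transition; `transition_paths` and the `APPROVAL` machine are hypothetical.

```python
from collections import deque

APPROVAL = {
    "initial": "draft",
    "states": {
        "draft": {"on": {"SUBMIT": "review"}},
        "review": {"on": {"APPROVE": "done", "REJECT": "draft"}},
        "done": {},
    },
}


def transition_paths(config: dict) -> list:
    """Return, for each reachable transition, the shortest event path that takes it."""
    initial = config["initial"]
    # Breadth-first search records the shortest event path to each reachable state.
    shortest = {initial: ()}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for event, target in config["states"][state].get("on", {}).items():
            if target not in shortest:
                shortest[target] = shortest[state] + (event,)
                queue.append(target)
    # One test case per transition: reach its source state, then fire its event.
    return [
        shortest[state] + (event,)
        for state in shortest
        for event in config["states"][state].get("on", {})
    ]
```

The returned list can be passed to `pytest.mark.parametrize` so that each transition becomes its own test case.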
For each of these, the right first move is to open an issue against the upstream repo describing the proposed API and asking for review before writing the implementation. The maintainers have a vision for the library's API; landing a PR without that conversation tends to produce throwaway work. The fork at JovaniPink/xstate-python is a place to prototype the implementation and rebase against upstream when consensus emerges.
Where xstate-python Sits in the Decision Tree
A reader who has read this far might still be uncertain about whether to use xstate-python or one of the alternatives. The decision tree, rendered in plain operator language:
- If the workflow needs durable execution across worker restarts, use LangGraph or Temporal. Combine with a state-machine library inside the graph nodes if the formalism is also needed.
- If the workflow needs statechart formalism (hierarchy, parallel, guards) but is in-memory and short-lived, use `python-statemachine` for a Pythonic declarative API or `transitions` for the largest community and the most extensions.
- If the workflow needs statechart formalism and round-trip compatibility with the XState TypeScript ecosystem (Stately Studio, JSON-defined machines that run in both Python and TypeScript runtimes), use xstate-python and contribute to the gaps. This is a small niche today and a more useful one with each PR that lands.
- If the workflow does not need any of the above, pure Python with explicit `if/elif` on a state field is fine. The discipline is the chart, not the library.
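For the last branch of the tree, the library-free version can be as small as this. The state and event names are made up for the agent-harness example.

```python
def next_state(state: str, event: str) -> str:
    """Explicit transitions written as if/elif; unknown pairs are rejected."""
    if state == "reading" and event == "DOC_READ":
        return "awaiting_approval"
    elif state == "awaiting_approval" and event == "APPROVED":
        return "running_tool"
    elif state == "awaiting_approval" and event == "REJECTED":
        return "cancelled"
    elif state == "running_tool" and event == "TOOL_DONE":
        return "done"
    raise ValueError(f"no transition from {state!r} on {event!r}")
```

Raising on an unlisted pair is the important line: it turns an impossible transition into a loud failure instead of a silent no-op.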
The post on evaluating multi-agent workflows names failure classes (handoff loss, retry drift, completion ambiguity, latency spiral) that are exactly the failures a state-machine plus durable-execution stack prevents. The post on sandboxed agents and production automation describes agents as runtime systems with a control plane, state, recovery, and explicit execution boundaries. Both posts are gesturing at the architecture this post names. The next time those failure classes show up in a code review, the right fix is usually not "add an idempotency key" or "add a retry limit"; the right fix is to model the workflow as a state machine with explicit transitions and a durable runtime under it.
Close
The state-machine question and the durable-execution question are different questions, and Python's ecosystem has different libraries for each. xstate-python sits at the intersection for teams that want both formalism and TypeScript-ecosystem compatibility, and the contribution roadmap above is the path to making it useful enough to ship in production. The first PR is small (typed event payloads). The second is meaningful (the setup API). The third is structural (createActor and the interpreter). A new contributor who lands the first three has changed the library from "work in progress" to "usable for non-trivial work."
Part 4 finishes the series by stepping out of Python and TypeScript to look at how Go, Elixir, Swift, and Zig express state machines, and where the runtime guarantees of each language change which idioms are honest. The actor-model conversation in particular gets sharper in Elixir, where the BEAM gives you supervision for free and the formalism is a runtime feature, not a library choice.