Find the bugs your tests miss. Before they become incidents.
Avoid outages, failed launches, and midnight debugging.
I'm Dr. Thomas Pani. I help platform, infrastructure, SRE, and protocol teams find failure paths ordinary tests rarely cover in critical stateful software systems: retries, stale reads, crashes, partial failure, recovery paths, and broken invariants.
Find the bugs your tests miss. Before they become incidents.
Avoid outages, failed launches, and midnight debugging.
I'm Dr. Thomas Pani. I help platform, infrastructure, SRE, and protocol teams find failure paths ordinary tests rarely cover in critical stateful software systems: retries, stale reads, crashes, partial failure, recovery paths, and broken invariants.
No SaaS deployment required.
Engagements can work from design docs, or a controlled repository (under NDA if needed).
Who This Is For
-
1
You own a critical stateful system. A control plane, scheduler, reconciliation loop, database, payment or fiscal workflow, protocol, smart contract, or critical infrastructure.→ Make the critical behavior explicit.
-
2
A reliability failure would be expensive. A bug that causes financial loss, security exposure, data loss or inconsistency, downtime, missed SLAs, or reputation loss.→ Find risky paths while they are cheap to fix.
-
3
Existing tests cover known cases. Known examples pass. Rare schedules, stale reads, retries, crashes, partial failure, and recovery paths are still blind spots.→ Generate them instead.
-
4
Critical invariants are implicit. The "must never happen" properties are scattered across tribal knowledge, design docs, postmortems, review checklists, or AI prompts.→ Turn them into runnable tests.
Common starting points: during design or before launch, when state, ordering, or failure behavior is risky; after an incident; or before a migration, audit, or external review.
Common targets: control planes, schedulers, reconciliation loops, databases, payment flows, protocols, smart contracts, and stateful cloud infrastructure.
Scoped Engagements for Critical Systems
First engagements usually start with one critical subsystem.
Choose a pilot when you want runnable checks connected to your system. Choose a review when you need an independent risk assessment before a launch, migration, audit, or external review.
Critical Subsystem Pilot
Find failure paths in one critical subsystem and turn them into runnable checks. Evaluate the method before wider adoption.
For: Teams that own a critical subsystem where failure would mean downtime, data inconsistency, missed SLAs, or a long incident investigation.
Work: Identify invariants, model key state transitions and failure modes, generate traces, and connect findings to implementation tests or review artifacts.
You get: Technical findings, a runnable model/spec, walkthrough with your team, and an adoption recommendation.
Critical Subsystem Review
Get an independent targeted review before a launch, migration, audit, or external review. Leave with risks and mitigations.
For: Teams that need evidence before a launch, migration, audit, or external review.
Work: Review the design, failure modes, invariants, tests, and operational assumptions.
You get: Written findings, risk areas, suggested tests or invariants, and recommended mitigations.
Example Engagement: Aztec Labs Governance Protocol
2025Five-week verification engagement for Aztec Network’s on-chain governance protocol before it became critical infrastructure. The key risk was cross-contract behavior across voting rounds, upgrades, staking, slashing, and time-dependent phases.
In five weeks we turned the protocol rules into a runnable specification, checked 125 invariants and 992 verification conditions, and reported findings the Aztec team addressed. The work gave the team concrete, repeatable evidence about protocol behavior and a reusable review artifact.
Low-Friction Engagements
- Starts from one subsystem.
- Remote-first, with on-site workshops or kickoffs when useful.
- Written scope and deliverables before work begins.
- Fixed-fee available, priced after a short scoping call.
- Can run under NDA.
- No SaaS deployment required.
- I can work from the relevant code or design docs without production access.
- Output is engineering-readable and stays as an artifact.
Workshops and Training
In addition to technical engagements, I also run workshops on automating expected behavior checks in complex software systems.
Hands-on
- model critical behavior in a workshop system
- generate failure scenarios
- turn generated behavior into tests
- leave with next steps for candidate subsystems
DevConf 2026 · Brno, CZ
Find Distributed-Systems Bugs Before Production
How System Verification Works
Better reliability does not come from "more tests" alone.
Hand-written and AI-generated tests routinely miss critical behaviors. The work is to make the behavior explicit, generate hard paths, and check them against something the team can review and run again.
People call this
The important part: the spec is runnable code that produces tests and goes into CI.
What this looks like
Model behavior
Write an executable specification of key behavior. The result is an executable reference for how the system should behave.
Generate traces
Generate scenarios your team would never write by hand, from quick randomized runs to deep systematic exploration (“model checking”).
Check the system
Replay generated scenarios against the real system as tests. Then keep the spec as a design artifact, onboarding docs, review aid, test oracle, and CI check.
How this fits with other testing Show details
Executable specs are not a replacement for the rest of your testing stack. They strengthen software verification work where the missing piece is a behavioral oracle and early exploration of rare schedules.
Unit / Integration Tests
Good for known cases, regressions, and fixed interactions between components. Passing tests prove only the paths you wrote down.
BDD / E2E Scenarios
Good for user-facing flows, acceptance criteria, and fixed cross-component workflows. Scenario coverage is sparse; timing and failure interleavings are easy to miss.
PBT / Fuzzing
Good at finding unexpected inputs missed by example-based tests. Often tuned toward input bugs; specs add feedback on explored behavior and invariants.
Executable Specs + MBT
Good for exploring unknown behavior, clarifying design, and generating checks. Keep the spec connected to implementation tests and CI.
Deterministic Simulation
Good for real-system behavior under faults, timing constraints, and complex schedules. High setup cost, but strong reproducibility.
Chaos / Fault Injection
Useful for resilience validation under production-like pressure. Feedback comes late, and exploration is not systematic by default.
Why Me?
For 10+ years, I have worked both on practical verification tooling and on applying that tooling to real systems, across academia, industry, and consulting. The work below is about improving trust and reliability in stateful systems where small mistakes matter.
Aztec Labs: Governance Protocol
2025Formalized and checked 125 "must never happen" properties across a multi-contract governance system: voting, upgrades, permissions, and execution paths.
Ethereum Foundation: 3-Slot Finality
2024Formal modeling and verification of accountability, a fundamental safety property, in Ethereum blockchain 3-slot finality consensus.
Apalache: Model Checking for TLA+ and Quint
CurrentMaintainer of the symbolic model checker used for protocol and infrastructure verification work.
Quint: Executable TLA+-Style Specs
PastPreviously co-developed Quint from its early stages at Informal Systems, bringing TypeScript-like syntax to TLA+-style specs.
Book a 30-minute fit call
Email me with 2–3 sentences about the subsystem, the risk, and the timeline. No preparation is needed.
Based in Vienna, Austria.
Available across Europe and internationally.
I typically respond within one business day.
Email me to book a fit call- Good fit: real testing pain in a critical stateful system, before launch, after an incident, before a migration, or before an external review.
- Not a fit: more unit tests, broad code coverage, generic security scanning, or a document nobody will run.
- First call: we check fit and a small next step. For a pilot or review, we later agree on scope, deliverables, timeline, and required access.