DEALta

Stateful Review Orchestration for Multi-Team Workflows

Detects what changed in a document, routes review tasks to the right teams, and tracks which prior approvals are no longer safe to trust after later revisions.

Stateful Across Rounds · Multi-Agent Orchestration · Eval-First · Langfuse Traced

Built from experience coordinating a supplier agreement across Legal, Finance, Commercial, Product/Tech, and Customer Support.


A supplier agreement under renegotiation

Three rounds of review. Five functions involved. Here's what changed in the final round — and why it matters.

The deal

Nexus, a travel platform, is finalising a supplier agreement with StayLink, an accommodation provider. The agreement governs commission structure, payment terms, liability, and content obligations.

In v2, five business functions reviewed the agreement. Legal, Finance, and Commercial each granted conditional approval — with specific terms that had to hold for their sign-off to remain valid.

v3 arrived. Four clauses changed. Three of those conditions were broken.

Why it's hard

Finance approved commission terms assuming mutual control over review timing. v3 shifted that control unilaterally to StayLink.

Legal approved liability terms assuming a symmetric cap. v3 introduced an asymmetry tied to booking volumes.

No single reviewer had visibility across all three broken approvals at once. DEALta detects the breaks, routes them to the right functions, and escalates before anyone proceeds.

This is the coordination problem DEALta is designed to solve.

Orchestration trace

Six specialised agents, orchestrated via LangGraph. Each writes to shared typed state — decisions accumulate, nothing passes outside the graph.
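The shared-state pattern can be sketched in plain Python (a stand-in for the LangGraph graph; the agent names, state fields, and clause strings below are illustrative, not DEALta's actual schema):

```python
from typing import TypedDict, List

class ReviewState(TypedDict):
    # Shared typed state: every agent reads it and appends to it;
    # decisions accumulate and nothing passes outside the graph.
    changes: List[str]
    invalidated: List[str]
    escalations: List[str]

def change_detection(state: ReviewState) -> ReviewState:
    # Illustrative: a detected v2 -> v3 clause delta (hardcoded here)
    state["changes"].append("liability cap: symmetric -> volume-tiered")
    return state

def invalidation(state: ReviewState) -> ReviewState:
    # Re-check prior conditional approvals against the detected changes
    for change in state["changes"]:
        if "liability cap" in change:
            state["invalidated"].append("Legal v2 approval")
    return state

def run_pipeline(state: ReviewState) -> ReviewState:
    # Agents run in sequence, each writing back into the shared state
    for agent in (change_detection, invalidation):
        state = agent(state)
    return state

state = run_pipeline({"changes": [], "invalidated": [], "escalations": []})
print(state["invalidated"])  # ['Legal v2 approval']
```

In the real system each step would be a LangGraph node over this kind of typed state; the point is that downstream agents see upstream decisions without any side channel.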

Detected changes

Invalidated approvals

Sign-offs granted in v2. Re-evaluated against v3 changes — all three breached their stated conditions.
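One way to model a conditional approval so it can be re-evaluated mechanically against a new version (field names and the clause key are hypothetical):

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class ConditionalApproval:
    team: str
    # Predicate that must hold over the current clauses
    # for the sign-off to remain valid
    condition: Callable[[Dict[str, str]], bool]

# Legal's v2 sign-off assumed a symmetric liability cap
legal = ConditionalApproval(
    team="Legal",
    condition=lambda clauses: clauses["liability_cap"] == "symmetric",
)

# v3 introduced an asymmetry tied to booking volumes
v3_clauses = {"liability_cap": "tiered_by_booking_volume"}

def still_valid(approval: ConditionalApproval, clauses: Dict[str, str]) -> bool:
    return approval.condition(clauses)

print(still_valid(legal, v3_clauses))  # False -> invalidated, route back to Legal
```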

Policy violations

Compound risks

Risks that no single change creates alone — only visible when changes are analysed in combination.
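A minimal sketch of that combination check (the clause names and the risky pairing are hypothetical examples, not DEALta's rule set):

```python
def has_compound_risk(changed_clauses):
    # Hypothetical rule: neither change alone is a risk, but the pair is —
    # e.g. StayLink gaining review-timing control AND an asymmetric cap.
    risky_combos = [
        {"review_timing_control", "liability_cap"},
    ]
    changed = set(changed_clauses)
    return any(combo <= changed for combo in risky_combos)

print(has_compound_risk(["liability_cap"]))                           # False
print(has_compound_risk(["review_timing_control", "liability_cap"]))  # True
```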

Escalation queue

Items that require a human decision before the contract can proceed.

Required sign-off status

Current approval gate status across business functions.

How the system works

Contract v(n-1) + v(n) + Prior Approvals → 6 agents → escalation-ready decision pack

Change Detection: detects clause deltas
Invalidation: checks stale approvals
Routing: routes team review
Policy Check: checks policy rules
Dependency: finds cross-clause risk
Decision Pack: assembles recommendation

Deterministic recommendation logic. LLM writes one paragraph. Everything structural is Python.
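"Everything structural is Python" could look like the gate below: the verdict is a pure function of the accumulated findings, and the LLM only writes the accompanying paragraph (the labels and rules are a hypothetical sketch):

```python
def recommend(invalidated_approvals, policy_violations, compound_risks):
    # Deterministic gate: the decision is plain Python, never an LLM call
    if invalidated_approvals or policy_violations:
        return "ESCALATE"  # broken sign-offs block the contract
    if compound_risks:
        return "REVIEW"    # cross-clause risks need a human look
    return "PROCEED"

print(recommend(["Legal", "Finance"], [], []))  # ESCALATE
print(recommend([], [], ["cap+timing"]))        # REVIEW
print(recommend([], [], []))                    # PROCEED
```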

Evaluation + Observability

Ground truth written before agents, not after.

Evaluation

Stateful invalidation: 13/13
Change detection: 100%
Routing accuracy: 89%
Policy compliance: 100%
Compound risk detection: 2/2
Decision pack structure: 6/6
Narrative faithfulness: PASS (LLM-as-judge)


Observability

Langfuse pipeline waterfall

6 agent spans with per-step cost, latency, and I/O visibility

Pipeline audit record

Pipeline audit: inputs, outputs, metadata

Agent-level detail

Agent detail: prompt, response, tokens

$0.0017/run · 58s total · 6 traced spans · gpt-4o-mini as eval judge

Version comparison

What changed between the v2 review round and this v3 escalation.

Performance metrics

Token usage, latency, and estimated cost per agent. Full run on gpt-4o-mini.

Agent | Time (s) | Input tok | Output tok | Est. cost