Milestone
Wk 1–4 Foundation
Wk 5–8 P1 Core Logic
Wk 9–12 P1 UI + Review
Wk 13–16 P2 Core Engine
Wk 17–20 P2 UI + Ship
Phase
P1 · OTIF Compliance Copilot
P2 · Exception Brain
Phase 0
Foundation
Weeks 1–4
Now
Shared Platform — Both Prototypes
  • Architecture decisions (10 ADRs written) ADR
  • PostgreSQL + pgvector schema designed DB
  • Synthetic seed data generators written Seed
  • Data context & lineage maps documented Docs
  • FastAPI project scaffold + Alembic migrations API
  • JWT auth + multi-tenant middleware (tenant_id) Auth
  • Docker Compose dev environment Infra
  • Seed data loaded to local PostgreSQL Test
  • pgvector index + RAG document ingest pipeline RAG
  • "Tenant bleed" integration test suite Test
Phase 1
P1
Core Logic
Weeks 5–8
Next
Compliance Shield — Intelligence Engine
  • OTIFEngine class — check_otif() pure Python Core
  • MCP Tool Server 1: TMS data retrieval MCP
  • MCP Tool Server 2: WMS shipped qty + label data MCP
  • MCP Tool Server 3: Retailer portal chargebacks MCP
  • Celery + Redis async job queue for analysis Queue
  • Claude Agent integration — dispute draft v1 Claude
  • evidence_chain mandatory field (Pydantic) AI
  • ChargebackStateMachine — all 5 state transitions Core
Exception Brain — Pending
Architecture & schema design
running in parallel with P1 build
Phase 2
P1
UI & Review
Weeks 9–12
Queued
Compliance Shield — Frontend
  • OTIF Dashboard — shipment list with verdict filters UI
  • Chargeback detail view + evidence chain display UI
  • AI dispute draft generation UI (async polling) AI
  • Human review gate — approve/edit/reject flow Review
  • Chat copilot sidebar (RAG-grounded Q&A) Chat
  • Multi-retailer routing guide upload + indexing RAG
Exception Brain — Design & Prototype
  • Per-source adapter interfaces designed Design
  • PriorityScorer formula + config schema Design
  • P2 exception UI wireframes (Ops Console) UI
Phase 3
P2
Core Engine
Weeks 13–16
Queued
Compliance Shield — Hardening
  • P1 performance optimisation + caching Perf
  • Audit logging — all chargeback transitions Audit
  • Pilot client onboarding (Walmart US seed) Pilot
Exception Brain — Full Engine
  • 5 source adapters (TMS, WMS, ERP, Telematics, EDI) Adapters
  • APScheduler 15-min ingestion cron Cron
  • PriorityScorer — configurable weights per client Core
  • TriageAgent — Claude tool-calling every 15 min Claude
  • SummaryAgent — daily S&OP brief via Claude Claude
  • pending_notifications queue + Slack/Teams delivery Notify
  • agent_lock deduplication (30-min TTL) Lock
  • 90-day event pruning cron Cron
Phase 4
P2 UI +
Demo Ship
Weeks 17–20
Queued
Compliance Shield — Demo Ready
  • AWS ECS production deployment (ME-South-1) Deploy
  • Investor demo environment with realistic seed data Demo
  • SSO/OIDC enterprise auth (Okta integration) Auth
Exception Brain — UI + Ship
  • Ops Console — exception queue with P1/P2/P3 filter UI
  • Exception detail view + action tracking timeline UI
  • Daily S&OP brief feed — HTML + Slack rendered Brief
  • Client-configurable scoring weights UI Config
  • AWS ECS production deployment Deploy
  • Full investor demo — 4-week live data simulation Demo
Build risks · Mitigation plan
High
Claude API latency for real-time triage
Async job queue prevents blocking. Triage runs on 15-min cycle, not real-time. SLA: results within 2 min of ingestion window close.
High
Multi-tenant data isolation failure
tenant_id enforced at ORM layer with required() validator. "Tenant bleed" integration test checks 5 cross-tenant query patterns before every deploy.
Medium
Retailer portal schema changes break P1
MCP adapters versioned per retailer. Schema validation on ingest returns UNCLASSIFIED rather than crashing. Monthly schema monitoring job.
Medium
Claude hallucination in dispute drafts
Mandatory evidence_chain field. Pydantic rejects output without cited chunks. Human review gate before any submission. No auto-submit path.
Low
pgvector RAG retrieval quality
Namespace by retailer_id + effective_date. Cosine similarity threshold configured per retailer. Monthly embedding quality audit on test queries.
Low
Claude API cost overrun at scale
Prompt caching for routing guide context (stable). Token limits enforced per call. Cost alerting at $50/client/month threshold. Budget caps per tenant.