AI Agent Scoping Assessment · Thirty minutes
The honest grade. The four numbers your CFO will ask for. No deck.
Thirty minutes with a senior consultant. We grade your AI agent program against the Maturity Model and the seven capability axes, and we tell you what it would take to operate it. You walk away with a one-page picture and an honest answer to the question your board keeps asking: is this working, and at what cost?
What the thirty minutes covers
Maturity grade
Where you actually sit on the five-level ladder
We map your current state against the AI Agent Maturity Model: Curiosity, Pilot, Limited Production, Operated Production, Operating Model. Most teams overrate themselves by 1.5 levels; this conversation tells you which level you are actually on.
Capability scorecard
A 0-3 score on each of the seven axes
Decision boundaries, observability, evaluation, governance, ROI math, integration, and human-in-the-loop. We tell you which axis scores lowest and why that axis is the bottleneck, as in the sketch below.
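To make the bottleneck logic concrete, here is a minimal sketch. The axis names come from the scorecard above; the scores are hypothetical placeholders, not a real grading.

```python
# Illustrative sketch only: hypothetical 0-3 scores, not a real grading.
scores = {
    "decision boundaries": 2,
    "observability": 1,
    "evaluation": 0,
    "governance": 2,
    "ROI math": 1,
    "integration": 2,
    "human-in-the-loop": 2,
}

# The lowest-scoring axis is the bottleneck: the program operates
# no better than its weakest capability allows.
bottleneck = min(scores, key=scores.get)
print(f"Bottleneck axis: {bottleneck} (score {scores[bottleneck]}/3)")
```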
CFO ROI conversation
The four numbers that close a budget
Hours displaced × loaded rate. Revenue or pipeline contribution. Quality or risk delta. Cost to operate. We name them, ground them in your data, and tell you which one you should not claim yet.
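As a worked illustration of how those four numbers roll up, here is a minimal sketch. Every figure is a hypothetical placeholder, not a benchmark; substitute your own data.

```python
# Illustrative only: hypothetical inputs, not benchmarks.
# The four numbers a CFO will ask for, rolled into a net annual figure.

hours_displaced_per_year = 4_000    # agent-handled work no longer done by staff
loaded_hourly_rate = 85.0           # fully loaded cost per hour (salary + overhead)
pipeline_contribution = 120_000.0   # revenue or pipeline attributed to the agent
quality_risk_delta = 30_000.0       # value of error/risk reduction; often the
                                    # number you should not claim yet
cost_to_operate = 180_000.0         # inference, tooling, evaluation, and the
                                    # humans in the loop

labor_value = hours_displaced_per_year * loaded_hourly_rate
gross_value = labor_value + pipeline_contribution + quality_risk_delta
net_value = gross_value - cost_to_operate

print(f"Labor displacement: ${labor_value:,.0f}")
print(f"Gross value:        ${gross_value:,.0f}")
print(f"Net annual value:   ${net_value:,.0f}")
```

The point of the exercise is the net line: if it only turns positive on the number you cannot yet ground in data, that is the claim to hold back.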
Next 90 days
A no-deck plan you can act on Monday
The two or three moves that lift you one capability level. We tell you which one is fastest, which one is cheapest, and which one is highest leverage. You decide which to fund.
What it is not
- A vendor pitch. We do not sell tools, and we do not chase model releases.
- A 40-slide deck. The conversation produces a one-page summary you keep.
- A "discovery call" disguised as a sales meeting. The grade is the deliverable, not a follow-up appointment.
- A free demo of our agents. We are graders, not vendors. The output is your picture, not ours.
Who this is for
CIOs, COOs, operating partners, and heads of AI who have an agent in production (or close to it) and want an outside grade. The conversation is most useful when there is at least one workflow the agent already touches, and at least one question your board has asked that you have not answered cleanly yet.
If you are pre-pilot and exploring whether to start, that is a different conversation. Start with the Maturity Model pillar and the newsletter, and come back when you have a Level 1 program you want graded.
More in this area
Articles, talks, guides, case studies, and reference artifacts that show up on the same kinds of engagements.
- Operating Model
The AI Agent Maturity Model
Five levels, seven capability axes, and the four numbers your CFO will ask for. The honest picture of what it takes to run AI agents in production, and how most teams overrate themselves by 1.5 levels.
Read →
- Whitepaper
Starting AI Adoption: A Sequence for Mid-Market Engineering Teams
The order of operations we use with mid-market engineering teams that have been told to ship AI and do not know where to start. Six stages, named exit criteria, the anti-patterns that predict failure, and the first-90-days view that ties architecture, evaluation, and model economics into a coherent adoption sequence.
Read →
- Whitepaper
Evaluation Before Shipping: How to Test an AI Application Before It Hits Production
The release-gate playbook for AI features. Covers the five evaluation dimensions, how to build a lean golden set, where LLM-as-judge is trustworthy and where it lies, rollout mechanics with named exit criteria, and the regression suite that keeps a shipped AI feature from quietly rotting in production.
Read →
- Whitepaper
Choosing the Right Model (and Knowing When to Switch)
A practical framework for matching LLM model tier to task. Covers the four axes (capability, latency, cost, reliability), cascade routing patterns that cut cost 60 to 80 percent without measurable quality loss, switching costs you did not plan for, and the worked economics at 10K, 100K, and 1M decisions per day.
Read →
- Whitepaper
Workflow or Agent? A Decision Framework Before You Architect Anything
Most production 'agents' are workflows that overshot. This paper distinguishes deterministic LLM pipelines from autonomous agents, names the four questions that decide which one to build, and covers the failure modes specific to each path. Includes the 'earned autonomy' principle for promoting workflows to agents only after instrumentation justifies it.
Read →
- Whitepaper
The Case for Investing in Testing: A Board-Level Argument for Enterprise Test-Function Capability
Enterprise organizations regularly face the question of whether to invest in their test-function capability: in hiring, in tooling, in automation infrastructure, in process maturity. The question is often answered by default rather than by analysis, and the default is under-investment relative to the economic case. This whitepaper presents the board-level argument for investing in testing, structured around the four business outcomes that robust testing produces, the cost curve that makes early investment asymmetrically valuable, and the organizational patterns that distinguish companies that treat testing as strategic from those that treat it as overhead.
Read →
Where this leads
Services and products that typically come next.
- Service · AI
AI & Data Governance
Building AI systems that work in production: architecture, governance, and coverage of the failure modes that prototypes hide.
Learn more →
- Solution
Risk Reduction & Clear Decisions
Quality programs and decision frameworks that shift risk discussions from anecdote to evidence.
Learn more →
- Tool · AI
Goomni
AI voice agent for inbound coverage: appointment scheduling, FAQ handling, intake. Deployed on our own line and for clients.
Learn more →