AI + agents in small business. What changed, what works, what breaks.
A field guide to AI and agents for non-technical veterans and early-stage founders. Delivered at the STRIVE Forum in April 2026. Slides, takeaways, and the two case studies behind the talk.
Abstract
AI keeps changing. The flaws do not.
Most AI advice is written for teams that do not exist in your business. No platform team. No safety reviewer. No QA. You are the one catching the flaws.
This talk skips the hype and walks through the properties of these tools that never go away, the small set of habits that keep you safe, and a Monday drill to start this week.
You will leave with a simple Green / Yellow / Red rule for every task, a single drill to run for two weeks, and a clear view of where these tools help and where they cost you.
Outline
What the talk covers, in order.
What changed in the last 18 months
Large language models stopped being a demo and started being a tool. What actually shifted: cost, quality, and who can pick them up. What did not shift: the failure modes.
The flaws that never go away
Hallucinations, confident wrong answers, drift, silent context loss, prompt injection, and the inability to say "I don't know." Every one of these will still be true in 2030. The upgrade cycle does not fix them.
Green / Yellow / Red
A rule for every task. Green: bad answer costs you nothing. Yellow: bad answer costs you money or time. Red: bad answer creates liability. The color decides how much you verify before shipping.
One Monday drill
Pick one task. Run it with AI for two weeks. Measure time saved and errors caught. Keep what works, drop what does not. The point is not to deploy AI — it is to learn which tasks at your business are actually improved by it.
What this looks like in production
Two real deployments from Rex Black, Inc. engagements: an AI-powered veteran services intake replacing rigid phone scripts, and 24/7 AI phone coverage for a regional services provider. Same framework, different scales.
Key takeaways
Four things to remember.
AI has flaws that never go away
No upgrade fixes them. Someone has to catch them every time. At your scale, that someone is you.
Green, Yellow, Red
Green: bad answer costs you nothing. Yellow: bad answer costs you money or time. Red: bad answer creates liability. The color decides how much you verify.
One Monday drill
Pick one task. Run it with AI for two weeks. Measure time saved and errors caught. Keep what works.
You are the verifier
You do not have a QA team. You do not have a compliance officer. You are both. Build the habits now.
Keep reading
Related pieces.
More for this audience
Articles, guides, and case studies tagged for the same readers.
- Whitepaper
Starting AI Adoption: A Sequence for Mid-Market Engineering Teams
The order of operations we use with mid-market engineering teams that have been told to ship AI and do not know where to start. Six stages, named exit criteria, the anti-patterns that predict failure, and the first-90-days view that ties architecture, evaluation, and model economics into a coherent adoption sequence.
Read →
- Whitepaper
Evaluation Before Shipping: How to Test an AI Application Before It Hits Production
The release-gate playbook for AI features. Covers the five evaluation dimensions, how to build a lean golden set, where LLM-as-judge is trustworthy and where it lies, rollout mechanics with named exit criteria, and the regression suite that keeps a shipped AI feature from quietly rotting in production.
Read →
- Whitepaper
Choosing the Right Model (and Knowing When to Switch)
A practical framework for matching model tier to task. Covers the four axes (capability, latency, cost, reliability), cascade routing patterns that cut cost 60 to 80 percent without measurable quality loss, switching costs you did not plan for, and the worked economics at 10K, 100K, and 1M decisions per day.
Read →
- Whitepaper
Workflow or Agent? A Decision Framework Before You Architect Anything
Most production 'agents' are workflows that overshot. This paper distinguishes deterministic LLM pipelines from autonomous agents, names the four questions that decide which one to build, and covers the failure modes specific to each path. Includes the 'earned autonomy' principle for promoting workflows to agents only after instrumentation justifies it.
Read →
- Whitepaper
The Case for Investing in Testing: A Board-Level Argument for Enterprise Test-Function Capability
Enterprise organizations regularly face the question of whether to invest in their test-function capability — in hiring, in tooling, in automation infrastructure, in process maturity. The question is often answered by default rather than by analysis, and the default is under-investment relative to the economic case. This whitepaper presents the board-level argument for investing in testing, structured around three things: the four business outcomes that robust testing produces, the cost curve that makes early investment asymmetrically valuable, and the organizational patterns that distinguish teams that treat testing as strategic from those that treat it as overhead.
Read →
- Whitepaper
Deciding When to Bring in External Help: A Framework for Training, Consulting, Staff Augmentation, and Outsourced Testing
Most enterprise decisions to bring in external testing help succeed or fail based on whether the right form of help was selected, not on whether the particular vendor performed well. This whitepaper covers the four categories of external testing help — training, consulting, staff augmentation, and outsourced testing — and the decision framework that matches each form to the problem it solves, with cost, capability, and exit-cost implications for modern enterprise test programs.
Read →
Where this leads
- Service · AI
AI & Data Governance
Building AI systems that work in production: architecture, governance, and the failure-mode coverage prototypes hide.
Learn more →
- Solution
Risk Reduction & Clear Decisions
Quality programs and decision frameworks that shift risk discussions from anecdote to evidence.
Learn more →
- Tool · AI
Allora
Lead intelligence agent that verifies every claim before it reaches your CRM. Production AI we run ourselves.
Learn more →
Want this talk delivered in-house?
Rex Black, Inc. delivers every talk on this site as a live workshop, a keynote, or a conference session. Tailored to your stack, your team, and your timeline.