Work  /  Internal tooling · AI agents Case study
Internal · Built with Claude Code

Agents that don't think for you, they make you think.

I built four agents for my design team with Claude Code. They don't just hand over answers, they pressure-test the thinking, translate between design, engineering, and the business, and get to a sharper outcome faster.

Synthia, mid-synthesis
M
Synthesize the 8 sessions from the export test.
S
Done, run against our personas. 3 themes, ranked by impact. Top one: 6 of 8 admins hunted for the bulk action. I've drafted the test plan for the fix.
Open synthesisView test plan
Role
Product Designer · builder
Built with
Claude Code
For
Our design team
The cast
Synthia · Leo · Atlas · Scout
Remit
Concept → build → adoption
01 · The stall

Design wasn't the slow part. The gaps were.

The work that stalled my process was rarely the design itself. It was everything around it.

A ticket with no acceptance criteria. A flow I couldn't tell was feasible. A requirement buried on page nine of a PRD. A component nobody had documented. Each one sent me off to chase context, ping an engineer, or guess, and guessing is where rework comes from.

  • Vague ticketsNo acceptance criteria, no definition of done.
  • Feasibility unknownWas this a one-line change or a month? No idea until much later.
  • Requirements missedThe one line in the PRD that changes everything, found too late.
  • Undocumented componentsBuild new, or extend an old one? Nobody could say what would break.
02 · Don't automate the thinking

I didn't want autopilot. I wanted a sharp teammate.

Tools that hand you the answer make you a lazier designer. I'd rubber-stamp whatever came out. So I flipped the brief: the agent does the legwork, but it makes sure I've set the direction first, then it delivers.

Built with Claude Code, each one turns a little context into a real outcome: a synthesis, a timeline, a gap list, a fixed component. A quick check up front, then the work.

M
Write the testing plan for the new export flow.
S
One quick check, what counts as success? … Got it. Plan's ready: 5 tasks, success metrics, and our personas mapped to each.
Open test plan
Fig. 01: a little context in, a real plan out. One quick check, then Synthia maps each task to a persona and hands back the plan.

An agent that does your thinking is just a faster way to be wrong.

The principle behind all four
03 · Meet the team

Four agents. Each asks what I'd rather skip.

They're named, not numbered, because the team talks to them like colleagues. Each owns a corner of the process where I used to lose time.

S
SynthiaUX Research

Already knows our personas cold, that's what makes her fast. Synthesizes findings and drafts testing plans for our internal tools from almost no starting context.

Forces me to answer
What's the one task we're measuring, and what does “done” look like?
L
LeoEngineering

Our in-house engineer. Generates timelines, pressure-tests feasibility, and aligns a design with what's actually buildable.

Forces me to answer
Is this a one-line change or a month? What edge cases haven't you drawn?
A
AtlasProduct

Aligns the work against our PRDs and PR/FAQs, then tells you plainly what the design still misses.

Forces me to answer
The PRD asked for X. Where is it in this flow?
Sc
ScoutDesign System

Reads our component repo and drills into Storybook: checks documentation, estimates the blast radius of changing versus building, solves accessibility violations, and gets the code ready for a clean MR.

Forces me to answer
This already exists. Extend it, or accept what breaks if you fork it?
04 · Too eager

The first versions were too eager.

My early builds over-explained, every answer came wrapped in caveats and clarifying questions. Useful, but exhausting. So I tuned them down: ask only what's needed, then deliver something I can act on.

M
How long to add bulk export to the reports table?
L
Reusing the table component, server-side: about 3 days. Watch the pagination state, budget half a day there. I've split it into tickets.
View ticketsSee assumptions
Fig. 02: Leo, with a timeline you can defend. A real range with the reasoning and risks attached, not just a number.
05 · Designing the questions

The design work was the conversation.

What each agent asks, in what order, when to push and when to back off, that was the craft. And the translation underneath it: Leo speaks engineering, Atlas speaks PRD and stakeholder. After a few rounds I walk into those rooms already fluent in their language.

  • Suggestions, never decisionsThe agent proposes; I decide. I can always override, and it expects me to.
  • It cites its sourceEvery claim points back to the repo, the PRD, or the research, so I can check it.
  • It flags, I judgeScout surfaces the blast radius and the a11y gaps; the call to fix or fork stays mine.
06 · What changed

Faster starts, sharper questions, fewer loops.

The agents didn't take work off my plate so much as change the shape of it. I spend less time chasing context and more time deciding. And the briefs I write are better, because something makes me write them.

4Agents now in the team's daily flow
Fewer“What did you mean?” loops with eng and product
Day oneAn ambiguous ticket gets questioned the moment it lands

The agents don't solve the problem. They make you think.

The whole point
Next case ✦

Self-serve invoices: every invoice one click away.

Read next →