How I became CEO of a synthetic company

May 19, 2026

A quick warning: this is the first proper piece from The Machine Room, my new section on AI, software, systems and the future of work. If you are here purely for transition, surgery, dating apps, grief and emotional devastation, do not panic. Normal service will resume shortly. This is just the bit of my brain that has spent thirty years building systems finally kicking the cellar door open.

I think Henry is either transgender and forgot to get a new name, or Margo’s created a dodgy image but you get the picture.

I have long been obsessed with my own productivity.

Not in the 5am ice bath, journalling, “what are your top three goals for today?” sort of way. That kind of productivity culture has always felt like it was built for the sort of wanker who listens to three self-help audiobooks before breakfast and thinks sticking butter in coffee counts as a personality.

I mean productivity in the more real sense. At one point I was running three contracts and a trading desk simultaneously. I had to be brutally efficient. Offshore developers, algorithms, multiple meetings, it was mental.

Give me a board. Give me lanes. Give me too many things to do. Give me multiple streams of work, a slightly impossible deadline, five screens, seventy browser tabs, a half-written prompt, a software problem, a Substack draft, a business idea, a dog chewing something he absolutely should not be chewing, and I will start moving work around like a caffeinated air traffic controller.

This is not necessarily healthy. But it is how my ADHD brain likes to work.

I have used KanbanFlow for years. Not just for software. For everything. Work tasks, life admin, watering the plants, eating lunch. And yes, I do mean literally eating lunch, because there is every chance my ADHD brain will become fascinated by some tiny implementation detail at 11:30am and then look up at 4:45pm wondering why I feel faint and hate everyone.

Yes, I have got a Botox appointment in the morning. Slightly concerned it’s too close to the hair appointment, but we will see.

I like seeing work move. Backlog. Tomorrow. Today. In process. Done. Lovely. Beautiful. Civilisation. Tiny rectangular proof that chaos can be bullied into shape.

Then AI arrived and made this much, much worse.

Or better.

Possibly both.

Because AI did not turn me into a productivity monster. I was already that. What AI did was give the monster staff.

The synthetic company

At first this was wonderful.

I could dictate rough intent into one model, get it shaped into a clean prompt, paste that into Claude or Mini-Claude inside VS Code, get something implemented, take the output back for review, have Henry challenge the architecture, have Margo sharpen the writing, hand the result to a tester, then send refined instructions back into VS Code.

And yes, I give them all names. Henry is Anthropic’s Claude, working as my architect. Margo is ChatGPT. She leads on some projects. Mini-Claude is the swarm of little Claude instances living inside Visual Studio Code doing implementation work. We also have Maths-Claude, who lives inside Excel, and Pizzazz-Claude, who lives inside PowerPoint.

It would not be uncommon for me to say, on any given day, “Henry, get Maths-Claude to update the figures for readership and give me all of the headline metrics on sources.” And off Henry would go, building a prompt to remote-control Maths-Claude, which is effectively Anthropic’s Claude inside Excel.

Round and round.

It was powerful.

It was also completely ridiculous.

Because after a while I realised I was not really doing software development in the old sense. I was not even using AI as “an assistant”, which is the phrase everybody uses because it sounds reassuring and not like we have all started employing invisible digital interns.

I was managing workers.

One AI was acting like an architect. Another like an implementer. Another like a reviewer. Another like QA. Another was turning rough ideas into cleaner work instructions. Another was helping me decide whether the previous one had done anything stupid.

And there I was, sitting in the middle of all this, thumbing things up and down like the CEO of a small, tireless, slightly insane synthetic company.

I had become ridiculously productive, astronomically productive, but, and here’s the problem...

I had become a meat-based integration layer!!!

Which is not what I wanted. I did not get AI assistants so I could be promoted to cut-and-paste monkey.

The bottleneck moves

The strange thing about AI-assisted work is that the bottleneck moves.

It used to be capacity. There was too much to build and not enough people to build it. That is still true. AI has not magically fixed the world, despite what LinkedIn men in quarter-zips seem determined to believe.

But once you get good at these tools, a different bottleneck appears.

You.

Your judgement. Your attention. Your ability to move context between systems, read the output, decide what matters, generate the next instruction, send it to the right tool, capture the result, review it, preserve the useful artefact, and stop the whole thing wandering off into the nearest ditch.

That is where I found myself.

All these little AI workers, constantly waiting for me.

Not because the work was impossible. Not because every decision needed my genius, although obviously let us not dismiss that possibility too quickly.

They were waiting because the workflow was primitive.

The problem was not that the AIs were useless.

The problem was that I was in the way.

Relay

The idea is simple: stop being the manual message bus between AI systems. Build a cockpit where work exists as actual work, not chat sludge.

Projects.

Missions.

Work Orders.

Artefacts.

Events.

A Dispatch Board.

A Capture Inbox for rough dictated intent wired to the APIs and UIs of the major LLMs I use.

Follow-up Work Orders created from outputs, reviews and decisions.

Because if AI is going to do serious work, the work needs shape. It needs memory, state, roles and boundaries. It needs somewhere to live other than a long vertical chat window full of half-brilliant, half-forgotten nonsense.

A Project is the long-lived area.

Become.

Relay.

Marketing.

Personal admin.

A Mission is a campaign inside it.

A Work Order is a specific delegated task.

An Artefact is the durable output: the prompt, the diff, the review, the decision, the test plan.
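The hierarchy above can be sketched as a handful of nested records. This is only an illustration of the shape, not Relay's actual code: the class names mirror the terms in the text, but every field name (`kind`, `assignee`, `work_orders` and so on) and the `all_artefacts` helper are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Artefact:
    kind: str      # e.g. "prompt", "diff", "review", "decision", "test plan"
    content: str


@dataclass
class WorkOrder:
    title: str
    assignee: str  # the agent the task is delegated to (hypothetical field)
    artefacts: List[Artefact] = field(default_factory=list)


@dataclass
class Mission:
    name: str
    work_orders: List[WorkOrder] = field(default_factory=list)


@dataclass
class Project:
    name: str
    missions: List[Mission] = field(default_factory=list)

    def all_artefacts(self) -> List[Artefact]:
        """Collect every durable output stored anywhere under this project."""
        return [
            artefact
            for mission in self.missions
            for order in mission.work_orders
            for artefact in order.artefacts
        ]


# Build a tiny example tree: one project, one mission, one work order, one artefact.
relay = Project("Relay")
mission = Mission("Capture Inbox")
order = WorkOrder("Draft the schema change", assignee="Mini-Claude")
order.artefacts.append(Artefact("diff", "(diff text goes here)"))
mission.work_orders.append(order)
relay.missions.append(mission)

for artefact in relay.all_artefacts():
    print(artefact.kind, "from project", relay.name)
```

The point of the nesting is that an artefact always has an address: a project, a mission and a work order it came from, rather than a position somewhere in a chat scrollback.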

This matters because AI work disappears far too easily.

It vanishes into chats. It gets buried inside context windows. It sits in one tool when the next tool needs it.

Three hours later I vaguely remember Henry said something useful about the schema change but cannot remember where, Mini-Claude is waiting, Margo is idle, the Chihuahua is attacking Sheepy in the corner, and I am rummaging through chat history like a raccoon in a bin.

That is not an operating model.

It is an organisational design problem

The real breakthrough was when I stopped thinking about this as a chatbot problem.

It is not.

It is an organisational design problem.

Once you have multiple AI agents doing different kinds of useful work, the question is no longer “how do I write a better prompt?”

It becomes:

“What should the company look like?”

Who is the CEO? Who is the CTO? Who owns the product? Who implements? Who reviews? Who tests? Who can approve what? Who gets overruled? Who has merely sounded clever three times in a row and should still not be allowed anywhere near a production database?

We are not quite there with the modelling for that yet, but we will be very, very soon. The company will even have an org chart!

I am the owner.

The CEO, if we are going to use the metaphor properly.

Not because I want to ponce around pretending to be important, although I am not entirely immune to that, obviously. But because the human has to remain the strategic owner.

The direction belongs to me.

The risk appetite belongs to me.

The decision to spend money, send an email, alter production configuration, or make an architectural pivot belongs to me.

Underneath, the roles fall out naturally.

Henry decomposes the work, challenges the architecture, and says “no, Stevie, that is stupid”, which is one of the most valuable services any system can provide.

Mini-Claude implements.

Unit Test Goblin writes tests.

Antagonistic Tester attacks the result.

Margo reviews tone and public-facing language.

Release Clerk packages things up.

Above all of that, I want a governance layer.

Not a literal board of retired men in navy suits looking disappointed over glasses of tepid water. A way of asking: did this agent follow the brief? Did it stay in scope? Did it predict my likely judgement? Did the work reduce my burden or create more cleanup?

Because trust cannot be vibes.

Trust cannot be “Margo wrote a lovely paragraph once, therefore Margo can approve a schema migration.”

No.

Absolutely not.

Margo can become trusted on writing tone. Mini-Claude can become trusted on controlled code diffs and unit tests. Henry can become trusted on architecture and decomposition.

But trust in one category does not transfer to another.

A good writer does not become a database administrator because she used a nice metaphor.

Trust is not a personality trait.

It is a demonstrated, category-specific capability.
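One way to picture category-specific trust is a ledger keyed on the pair of agent and category, so that a good record in one category says nothing about another. The sketch below is a guess at the idea, not a description of Relay: the `TrustLedger` class, the category labels and the two thresholds (five reviewed outcomes, 90% accepted) are all invented for illustration.

```python
from collections import defaultdict


class TrustLedger:
    """Tracks reviewed outcomes per (agent, category). Nothing transfers between categories."""

    def __init__(self, min_reviews: int = 5, min_success_rate: float = 0.9):
        # Both thresholds are arbitrary example values, not recommendations.
        self.min_reviews = min_reviews
        self.min_success_rate = min_success_rate
        self._history = defaultdict(list)  # (agent, category) -> list of bools

    def record(self, agent: str, category: str, accepted: bool) -> None:
        """Store whether a reviewed piece of work was accepted."""
        self._history[(agent, category)].append(accepted)

    def is_trusted(self, agent: str, category: str) -> bool:
        """Trusted only with enough history in this exact category."""
        outcomes = self._history.get((agent, category), [])
        if len(outcomes) < self.min_reviews:
            return False
        return sum(outcomes) / len(outcomes) >= self.min_success_rate


ledger = TrustLedger()
for _ in range(5):
    ledger.record("Margo", "writing tone", accepted=True)

print(ledger.is_trusted("Margo", "writing tone"))      # True
print(ledger.is_trusted("Margo", "schema migration"))  # False
```

Because the lookup key includes the category, Margo's five accepted paragraphs earn her nothing on schema migrations, where she has no history at all.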

What I want to build is agents that earn delegated authority over time.

Not autonomy in the stupid science-fiction sense.

Delegated authority.

That phrase matters.

An employee in a real company is not “autonomous” in some absolute sense. They are authorised to act within a role, a scope, a budget, a risk boundary and an escalation path.

That is what AI agents need.

Not freedom.

Authority.

Bounded, boring, reversible authority.

What this does to agile

At first this was just my own workflow problem.

Then it started to feel bigger.

Because the more I thought about Relay, the more I wondered what this does to agile.

So what the hell happens to agile?

Not in the boring “Agile is dead” way. People have been announcing the death of agile for years, usually immediately before selling you a new framework with a diagram shaped like a distressed octopus.

Agile is not dead.

But some of its assumptions are being disturbed.

Agile was built around human collaboration under uncertainty. A user story is a prompt for a conversation. Talk to each other. Learn as you go. Build working software.

All good things.

But what happens when the “developer” is not quite human?

What happens when some of the work is being done by AI agents that move very quickly, very cheaply, in parallel, without needing lunch, but which are also literal-minded, context-sensitive, overconfident, and occasionally capable of producing beautifully formatted nonsense?

A user story says:

As a user, I want X, so that Y.

Useful.

Human.

Conversational.

A Work Order says:

Here is the objective.

Here is the context.

Here is your role.

Here are the inputs.

Here are the constraints.

Here is what you may do without asking.

Here is what you must not touch.

Here is when to escalate.

Here is the artefact I expect back.

Here is the definition of done.

That is not the same thing.

A Work Order is delegated authority packaged as executable instruction.
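To make "delegated authority packaged as executable instruction" concrete, here is a standalone sketch of a Work Order as a record with one field per line of the list above, plus a method that renders it as plain-text instructions for an agent. The field names and the `render` method are assumptions for illustration; the text does not specify how Relay stores or formats a Work Order.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WorkOrder:
    objective: str
    context: str
    role: str
    inputs: List[str] = field(default_factory=list)
    constraints: List[str] = field(default_factory=list)
    allowed_without_asking: List[str] = field(default_factory=list)
    must_not_touch: List[str] = field(default_factory=list)
    escalate_when: List[str] = field(default_factory=list)
    expected_artefact: str = ""
    definition_of_done: str = ""

    def render(self) -> str:
        """Turn the Work Order into plain-text instructions for an agent."""

        def bullets(items: List[str]) -> str:
            # An empty list is stated explicitly rather than silently omitted.
            return "\n".join(f"- {item}" for item in items) or "- (none)"

        sections = [
            ("Objective", self.objective),
            ("Context", self.context),
            ("Your role", self.role),
            ("Inputs", bullets(self.inputs)),
            ("Constraints", bullets(self.constraints)),
            ("You may do without asking", bullets(self.allowed_without_asking)),
            ("You must not touch", bullets(self.must_not_touch)),
            ("Escalate when", bullets(self.escalate_when)),
            ("Expected artefact", self.expected_artefact),
            ("Definition of done", self.definition_of_done),
        ]
        return "\n\n".join(f"{title}:\n{body}" for title, body in sections)


order = WorkOrder(
    objective="Add unit tests for the capture inbox parser",
    context="The parser turns dictated notes into draft Work Orders.",
    role="Unit Test Goblin",
    allowed_without_asking=["create new test files"],
    must_not_touch=["production configuration"],
    escalate_when=["a test needs a new dependency"],
    expected_artefact="A diff containing the new tests",
    definition_of_done="All new tests pass locally",
)
print(order.render())
```

Keeping the boundaries as separate fields, rather than prose buried in a prompt, means another agent or a plain script can check an output against them later.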

This is not a return to waterfall. The point is not to specify the whole system upfront in some giant doomed requirements tomb. The point is to specify enough for the agent to work safely.

Enough context.

Enough boundary.

Enough authority.

Enough escalation logic.

AI also changes the economics of specification.

Historically, detailed requirements were expensive. Humans had to write them, maintain them, read them, or more commonly pretend to have read them. Agile reacted against that, sensibly.

But the AI can now help write the Work Order, decompose the mission, maintain the artefacts, turn decisions into follow-up tasks, compare outputs against constraints, and red-team the result.

The question is no longer:

“How little can we write down and still have a useful conversation?”

It becomes:

“What level of structure allows AI workers to keep moving safely without making the human do clerical supervision all day?”

That is a different question.

And I think it leads to a different model.

Escalation by exception

The phrase “human in the loop” gets thrown around a lot.

It sounds responsible.

It is often useless.

What loop?

Where?

For what decision?

At what level of risk?

If the human has to approve every tiny step, the human becomes the bottleneck. That is exactly the problem I am trying to escape.

In a real company the CTO does not sit beside a developer approving every line of code. The CEO does not personally inspect every email draft.

People work within delegated authority and escalate when they hit something that matters.

That is the model AI needs.

Escalation for genuine decisions only: architectural pivots, public output, email sending, destructive actions, schema changes, new dependencies, secrets, production configuration, unclear requirements, major disagreement between agents, out-of-scope work, cost implications.

Anything that could make my life measurably worse if an AI got clever in the wrong direction.

Everything else should keep moving.
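Escalation by exception can be sketched as a single check that runs before an agent acts: if the proposed action is out of scope or carries any tag from the list above, it goes to the human; otherwise it proceeds. This is a minimal illustration under stated assumptions. The tag spellings, the `ProposedAction` record and the `needs_human` function are all hypothetical, and a real system would need some reliable way of tagging actions in the first place.

```python
from dataclasses import dataclass, field
from typing import Set

# Categories taken from the list in the text; the tag spellings are invented here.
ESCALATION_CATEGORIES = {
    "architectural_pivot",
    "public_output",
    "email_sending",
    "destructive_action",
    "schema_change",
    "new_dependency",
    "secrets",
    "production_configuration",
    "unclear_requirements",
    "agent_disagreement",
    "cost_implication",
}


@dataclass
class ProposedAction:
    description: str
    tags: Set[str] = field(default_factory=set)
    in_scope: bool = True  # False when the action falls outside the brief


def needs_human(action: ProposedAction) -> bool:
    """Escalate out-of-scope work and anything carrying an escalation tag."""
    if not action.in_scope:
        return True
    return bool(action.tags & ESCALATION_CATEGORIES)


rename = ProposedAction("Rename a local variable")
drop = ProposedAction("Drop the users table", tags={"destructive_action", "schema_change"})

print(needs_human(rename))  # False
print(needs_human(drop))    # True
```

The default here is to keep moving: an action with no escalation tags passes straight through, which is only safe if the tagging errs on the side of caution.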

I do not want autonomy everywhere.

I want autonomy where it is boring, bounded and reversible.

The thing I am actually building

This is why Relay matters to me.

It is not a toy.

It is not some productivity dashboard I am building because I got bored and wanted another excuse to avoid doing my tax return.

Although, in the interests of full disclosure, I am also using it to avoid doing my tax return.

It is a response to a real working pattern that emerged naturally once AI became useful enough.

I found myself acting like a CEO. I found myself thumbing up and down decisions made by different AIs. I found myself building prompts that were less like questions and more like management instructions.

I found myself thinking:

If this worker has already shown that it can make this class of decision correctly, why am I approving it again?

Then:

If another agent can review that worker’s output, and a third can test it, and the result falls within an agreed trust category, why does this need to come back to me at all?

Then:

Oh.

Then:

Oh no.

Then:

I am going to have to build the bloody thing, aren’t I?

Because once you see the company model, it is hard to unsee.

The AI infrastructure should look more like a real organisation: CEO, CTO, product owner, board, managers, workers, reviewers, testers, clerks.

Different roles.

Different authority.

Different trust levels.

Different escalation paths.

Different performance history.

Not because we want to LARP corporate structure. God knows there is enough of that already. But because real organisations evolved those structures for reasons.

Decision-making needs hierarchy.

Execution needs delegation.

Risk needs governance.

Trust needs boundaries.

Memory needs artefacts.

I think a lot of technical people using AI seriously are starting to feel the same shape emerging, even if they have not named it yet.

The assistant model is too small.

The chatbot window is too small.

The copy-and-paste loop is too stupid.

The work wants structure.

The agents want roles.

The human wants leverage without becoming a clerk.

Human-led, agent-amplified

I am not interested in the fantasy where AI agents build everything while humans vanish from the loop.

I like judgement. I like taste. I like architecture. I like deciding what matters.

I do not want to be replaced by the machine.

I want the machine to stop asking me to carry messages between its own limbs.

That is different.

The human sets direction.

The orchestrator decomposes.

The workers execute.

The reviewers challenge.

The testers attack.

The system preserves artefacts.

The human is pulled back in only for decisions that actually deserve a human.

That is not waterfall.

It is not classic agile.

It is not full autonomy.

It is something messier and more interesting.

Agile for a team that never sleeps.

Agile for workers that do not need lunch but do need very clear boundaries.

Agile where prompting starts to look like management, artefacts become organisational memory, and trust is earned by demonstrated judgement rather than vibes.

Somewhere between agile, management, software architecture and a tiny synthetic company full of tireless digital weirdos, I think there is a new operating model trying to be born.

Unfortunately, because I am me, I appear to be building it.

Which is exciting.

And also annoying.

Because one of the great tragedies of being a productivity monster is that every time you successfully automate part of your life, you immediately discover an even larger and more deranged system you now feel morally obliged to build.

Still.

At least the staff never sleep.

Jasmine Lewis

May 20

I work in a very similar field and found this fascinating. We’re looking to keep our development team fairly traditional, as in writing code by hand but using AI to empower our non tech staff with the tools to build their own interfaces.


Michelle Paquette

May 19

I like the overall design of Relay. Reminds me of the Engineering Department and saner bits of Product Marketing back at UstaCorp. (I retired in 2008, long before AI, when I felt I might be aging out of the sooperdooper software staffer role.) I find it interesting that AI agents can take on project management roles over other task specific agents, but it is obvious in hindsight.

The whole architecture of orchestrating agents and task agents linked with standardized protocols and language feels almost biological, reminiscent of how the subunits of a human brain work together.
