wGrow
Vibe Coding May 2026 · 8 min

Vibe coding is the on-ramp. Agentic engineering is the runway.

Karpathy named the move from typing code to specifying intent. After twelve months in production with it, here's what stays, what breaks, and what the actual discipline looks like.

In February 2025, Andrej Karpathy posted a now-famous note about a new kind of coding: “fully give in to the vibes, embrace exponentials, and forget that the code even exists.” He called it vibe coding. By the end of that year Collins Dictionary made it Word of the Year.

Most takes since have been some version of “is vibe coding good or bad.” The framing is wrong. Vibe coding is the on-ramp. The thing you have to learn to do once you’re on the highway is something Karpathy himself has been describing through 2026 as agentic engineering: the discipline of shipping production systems where the human’s job is to specify intent, set guardrails, and review output — not to type the code.

We run a Singapore studio that’s been delivering software for eighteen years. Last year we re-tooled around vibe coding. This year we’ve been forced to grow up about it. Here’s the working set of what we’ve kept, what we threw out, and what the discipline actually looks like.

What stays

The leverage is real. A senior engineer with a working agent loop ships, conservatively, three to six times what they used to. That number sounds like vendor copy, but we measure it against a very specific control: tickets in our existing legacy book, where the requirements and codebase haven’t changed. The gain is largest in well-described domains with strong types and good tests.

Specifying intent is now the bottleneck skill. The hardest part of working with an agent is writing the prompt that produces the right pull request. That skill rhymes with technical writing, with product spec writing, and with old-school code review. We’ve shifted hiring rubrics accordingly.

Reading is more important than writing. Reviewing AI-produced code at speed is a real skill. It involves knowing what to spot-check, what to read carefully, and what to throw out without comment. Most of the productivity gain disappears for engineers who haven’t built that muscle.

What breaks

A vibe-coded prototype shipped to prod is a liability with a cute origin story. This is the failure mode we see most often in the market. A demo gets built in two days and then ships, because the vibes were good. Eight weeks later there is no schema documentation, no test coverage, no eval harness, and a quiet pile of TODOs the agent dropped because the human stopped reading.

Code review at agent speed is harder than it looks. When the agent writes ten PRs an afternoon, the review queue becomes the choke point. Many teams handle this by skimming. Skimming is the new technical debt.

Costs stop being free. An agent loop on a real codebase does not cost what the toy demo did. The bill is the inference cost, plus the engineer time spent running the loop, plus the time spent reviewing what comes out of it, plus the eventual cost of the bug you didn’t catch. Inference is the only line you can shrink mechanically, which is why prompt caching matters.
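To make the four cost lines concrete, here is a minimal sketch of the arithmetic. Every field name and every number you feed it is a hypothetical placeholder, not a real price list or a measured figure; the point is only that the people and risk terms sit in the same sum as the token bill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopCost:
    """Cost of one agent-produced change. All fields are illustrative assumptions."""
    input_tokens: int               # prompt tokens sent over the whole loop
    output_tokens: int              # tokens the model generated
    cached_fraction: float          # share of input tokens served from the prompt cache
    price_in_per_mtok: float        # price per million uncached input tokens
    price_cached_per_mtok: float    # price per million cached input tokens
    price_out_per_mtok: float       # price per million output tokens
    engineer_hours: float           # time spent driving the loop
    review_hours: float             # time spent reviewing the output
    hourly_rate: float              # loaded engineer cost per hour
    escaped_bug_probability: float  # chance a defect slips past review
    escaped_bug_cost: float         # cost of fixing that defect later

    def inference(self) -> float:
        """Token bill, with cached input tokens charged at the cached rate."""
        cached = self.input_tokens * self.cached_fraction
        uncached = self.input_tokens - cached
        return (
            uncached * self.price_in_per_mtok
            + cached * self.price_cached_per_mtok
            + self.output_tokens * self.price_out_per_mtok
        ) / 1_000_000

    def total(self) -> float:
        """Inference plus people time plus the expected cost of an escaped bug."""
        people = (self.engineer_hours + self.review_hours) * self.hourly_rate
        risk = self.escaped_bug_probability * self.escaped_bug_cost
        return self.inference() + people + risk
```

Plugging in your own numbers usually shows the review and escaped-bug terms dominating the token bill, which is the argument against skimming.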

What the discipline looks like

This is the part we wish more agencies would publish. Here’s our working set.

1. The eval harness goes in on day one

If a project doesn’t have an eval harness, it isn’t an agentic project; it’s a demo. The eval harness is the thing that lets you change the prompt or the model and know whether you’ve made the system better or worse. Without one, you are coding in the dark and the codebase is getting bigger every day.

We ship an eval template into every new engagement on day one. It costs roughly a day of engineer time. It saves roughly a quarter.
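For readers who have never seen one, a minimal sketch of the idea follows. It is an illustration under stated assumptions, not the template itself: `EvalCase`, `run_evals`, and `regressions` are hypothetical names, and `system` stands in for whatever prompt-plus-model combination is under test.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True when the output is acceptable


def run_evals(system: Callable[[str], str], cases: list[EvalCase]) -> dict[str, bool]:
    """Run every case against `system` and record pass/fail per case name."""
    results: dict[str, bool] = {}
    for case in cases:
        try:
            results[case.name] = bool(case.check(system(case.prompt)))
        except Exception:
            results[case.name] = False  # a crash counts as a failure
    return results


def regressions(baseline: dict[str, bool], candidate: dict[str, bool]) -> list[str]:
    """Names of cases that passed on the baseline but fail on the candidate."""
    return sorted(
        name for name, passed in baseline.items()
        if passed and not candidate.get(name, False)
    )
```

The workflow is to run the same cases before and after a prompt or model change and block the change if `regressions` returns anything. That one comparison is what turns "it feels better" into "it is better or worse."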

2. Narrow agents, never one big one

Our rule of thumb in the studio:

1 agent : 1 narrow job. 1 crew : 1 outcome. 1 human : 1 final approval.

The single biggest mistake we see new teams make is reaching for a single “ops agent” or “do-everything agent.” You can’t list what those agents are responsible for, you can’t test them, and you can’t reason about them when they go wrong. Splitting one fat agent into three narrow ones is almost always the right move. The eval gets clearer too.
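The rule of thumb can be written down as a shape. The sketch below is illustrative only: `Agent`, `Crew`, and the `approve` callback are hypothetical names, and the agents are plain functions standing in for model calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Agent:
    name: str
    job: Callable[[str], str]  # exactly one narrow transformation


@dataclass(frozen=True)
class Crew:
    outcome: str               # the single outcome this crew exists to deliver
    agents: tuple[Agent, ...]  # narrow agents, run in order

    def run(self, task: str, approve: Callable[[str], bool]) -> Optional[str]:
        """Pipe the task through each agent, then require one human approval."""
        draft = task
        for agent in self.agents:
            draft = agent.job(draft)
        # Nothing leaves the crew unless the approver says yes.
        return draft if approve(draft) else None
```

Because each `Agent` wraps one function, each can get its own eval cases, and a bad result can be traced to the step that produced it.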

3. The human is in the loop, but they’re not in the diff

Vibe coding tells you to forget the code exists. Agentic engineering tells you to remember that the code exists, but to stop touching it character-by-character. The human’s leverage is in setting up the loop, watching the eval, reviewing the output, and stepping in surgically.

4. Guardrails are written by humans before the agent runs

The agent picks moves inside an envelope. The envelope is written by a senior engineer and reviewed like any other safety-critical code. We treat the envelope as a first-class artifact: it gets committed, reviewed, versioned, and tested.
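As a sketch of what an envelope might look like in code, assume the simplest possible policy: the agent may only touch files under named directories, and only a bounded number per change. `Envelope` and its fields are hypothetical; a real envelope would cover far more (commands, network, secrets), but the shape is the same.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Envelope:
    writable_dirs: tuple[str, ...]  # directories the agent is allowed to modify
    max_files_per_change: int       # upper bound on files touched in one change

    def violations(self, changed_files: list[str]) -> list[str]:
        """Return human-readable problems; an empty list means inside the envelope."""
        problems: list[str] = []
        if len(changed_files) > self.max_files_per_change:
            problems.append(
                f"too many files: {len(changed_files)} > {self.max_files_per_change}"
            )
        for name in changed_files:
            path = PurePosixPath(name)
            # Reject parent-directory escapes and anything outside the allowed dirs.
            inside = any(path.is_relative_to(d) for d in self.writable_dirs)
            if ".." in path.parts or not inside:
                problems.append(f"outside envelope: {name}")
        return problems
```

Since the envelope is ordinary code, it can sit in the repository, go through review, and carry its own unit tests like anything else that is safety-critical.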

5. Every public artifact is grounded in raw data

When an agent produces a number — a metric, a report, a status update — that number must be traceable back to a record. No paraphrasing of numerical claims. This is the rule that, more than any other, has saved us from publishing wrong things.
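One way to enforce traceability is to make it structurally impossible to publish a number that has no records behind it. The sketch below assumes a hypothetical `Record` and `Figure` type; the figure is computed from the records rather than typed by the agent, and it carries their IDs with it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    record_id: str
    value: float


@dataclass(frozen=True)
class Figure:
    label: str
    value: float
    source_ids: tuple[str, ...]  # every record the value was computed from


def total(label: str, records: list[Record]) -> Figure:
    """Sum the records into a Figure; refuse to produce a number with no sources."""
    if not records:
        raise ValueError(f"{label}: no source records, refusing to publish a number")
    return Figure(
        label=label,
        value=sum(r.value for r in records),
        source_ids=tuple(r.record_id for r in records),
    )


def render(fig: Figure) -> str:
    """Format the figure together with the record IDs that back it."""
    return f"{fig.label}: {fig.value:g} (sources: {', '.join(fig.source_ids)})"
```

The agent is then allowed to choose which records to aggregate and how to word the sentence around the figure, but never to restate the number itself.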

Where Singapore comes in

In January 2026, IMDA, Singapore’s Infocomm Media Development Authority, published the Model AI Governance Framework for Agentic AI (MGF) at the World Economic Forum, billing it as the world’s first. If you’re building in Singapore, this isn’t optional reading; it’s the procurement standard your government clients are about to hand you. The framework happens to align almost exactly with the discipline above: narrow scope, human approval, eval, audit. If you’ve built that way from day one, MGF compliance costs you nothing extra. If you haven’t, it costs you a re-architecture.

We have a separate, longer piece on reading the MGF clause-by-clause. Short version: build like an adult and you’re fine.

The on-ramp / runway distinction

The reason we keep using the on-ramp metaphor is this: vibe coding is how you get onto the highway. It’s where the energy is, where new builders are entering the craft, and where speed comes from. Nobody who tells you to “just stop vibe coding” is being honest about the shape of the next decade.

But the on-ramp isn’t the highway. The thing you do at speed, with multiple lanes and other vehicles and the law watching, is agentic engineering. It has rules. They are not romantic. They are the rules that let you keep the keys.

— wGrow studio