The Future of Vibe Coding: What Comes After 2026

Priya Sharma, AI Engineering Lead
Kai Nakamura, Tutorial Writer
14 min read

This is the 42nd post in a series I helped shape, and to write this essay on the future of vibe coding I did the thing you should never do with old work. I went back and read the first post.

It went live in January. Four months ago. I read it last night with a glass of wine, expecting nostalgia, and instead got something closer to vertigo. The assumptions in that piece were already wrong. Not subtly wrong. Foundationally wrong. We talked about prompting like it was a craft you mastered over years. We talked about agents like they were a future tense.

Four months. That is the half-life of a thesis right now.

So when people ask me about the future of vibe coding, I have to be honest. The future is not a 2027 problem. It is a six-week-old problem already happening behind us. The job is not predicting what comes next. The job is catching up to what already arrived while we were busy writing about the last thing.

This is the season finale of our 42-post run. Kai is co-writing it with me because he sees the field differently and because, frankly, I am tired of being the only person willing to say "I don't know" out loud. We will tell you what we got right since January, what we got embarrassingly wrong, where the curve is bending, and where we think the whole thing might break. Then Kai will get spicy. Then I will pull the conversation back to the things that do not change no matter how fast the substrate moves.

Here we go.

What we got right and wrong since January

Let me start with the ego-bruising part. We were wrong about a lot.

We got cloud planning right. The /ultraplan pattern, where you spin up a parallel agent in the cloud just to think about your problem before you type a single keystroke, became default behavior by March. I now refuse to start anything non-trivial without it. The cost is twelve cents and a coffee refill. The savings are an afternoon of misdirection. We called this in February and it landed.

We got routines right too. The killer feature for solo developers turned out to be the boring one: a routine that runs at 6 a.m., checks your inbox, drafts replies, runs your linter, queues up the day's PRs. Not the flashy demo agents. The quiet ones. The ones that work while you sleep. If you want the long version of why these matter, the 50 Claude Code tips post has the receipts.
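The shape of a routine like that is simple enough to sketch. Here is a minimal, illustrative version in Python: a named list of steps that runs on a schedule and keeps going even when one step fails. The step functions (`check_inbox`, `run_linter`) are hypothetical stand-ins for real integrations, not any actual tool's API.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Routine:
    """A named task list; each step returns a short status string."""
    name: str
    steps: list[Callable[[], str]] = field(default_factory=list)

    def run(self) -> list[str]:
        results = []
        for step in self.steps:
            try:
                results.append(f"{step.__name__}: {step()}")
            except Exception as exc:
                # One failing step should not kill the whole morning run.
                results.append(f"{step.__name__}: FAILED ({exc})")
        return results

# Hypothetical steps standing in for real inbox/linter/PR integrations.
def check_inbox() -> str:
    return "3 drafts queued"

def run_linter() -> str:
    return "clean"

morning = Routine("6am", [check_inbox, run_linter])
for line in morning.run():
    print(line)
```

The point of the pattern is the error isolation: a routine that dies because the linter hiccuped is a routine you stop trusting, and trust is the whole product.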

We got mobile right. In January, building a working iOS prototype with an agent was a weekend project that ended in tears. By April, it was a Saturday afternoon. The gap between I have an idea and I have a TestFlight link collapsed in a way I still find disorienting.

Now the embarrassing column.

We thought IDEs would die. They did not. Cursor and Windsurf still have a loyal audience that loves the cockpit, the chrome, the tactile feedback of an editor that knows your codebase. Our vibe coding tools comparison tracks how each of these environments has evolved, and the numbers tell a coexistence story, not a winner-take-all story. The terminal-first agent is real and growing, but it did not eat the visual workspace. We were wrong. The IDE is still the cathedral; the terminal is the chapel. Both are full on Sunday.

We thought voice would explode. It did not. It is growing, but slower than the demos suggested. Turns out humans actually like typing when they are thinking. Voice works for capture, for driving, for cooking. It does not work for the long, recursive, "wait, no, change line three" rhythm of building software. I still use voice notes. I do not vibe code by voice. Almost no one I respect does either.

The genuine surprise was custom agents. In January they were a power-user toy. By April, midsize companies had org chart entries titled "Agent Engineer." Four months. From hobby to job title. I cannot think of another technology shift in my career that moved from curiosity to compensation that fast.

Where the hockey stick is bending

Predictions are a confidence game, and the right move is to publish your confidence interval next to your forecast. Otherwise you are just doing astrology with a bigger vocabulary. Here is what I think is happening, with my own calibration attached. Disagree loudly.

Always-on agents (Conway-class)

The pattern of an agent that watches your repo, your inbox, and your incident channel and acts on its own without prompting is moving from research papers into production. Anthropic published a research note on long-horizon agent behavior earlier this year that frames the problem honestly: the hard part is not capability, it is restraint. Knowing when to act and when to leave you alone.

In 2026 these agents still feel weird. You wake up to a Slack message from your own bot saying "I noticed the build failed at 2 a.m., I rolled back the deploy and opened a PR with the fix, please review." You are simultaneously delighted and nervous. By 2027, this will feel like waking up to a dishwasher that finished a cycle. Normal infrastructure.

Confidence: high. Risk: bad agent loops chewing through your bill or your codebase overnight. The horror story of 2027 will be a startup that wakes up to a $40,000 inference bill because their always-on agent got into a fight with another always-on agent. Bet on it.
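The defense against that horror story is boring and mechanical: a gate that every autonomous action has to pass, combining a confidence threshold with hard caps on actions and spend. A minimal sketch, with purely illustrative numbers and a made-up interface (no real agent framework exposes exactly this):

```python
from dataclasses import dataclass

@dataclass
class RestraintGate:
    """Decide whether an always-on agent may act right now.

    All thresholds are illustrative defaults, not recommendations.
    """
    min_confidence: float = 0.9
    max_actions_per_day: int = 5
    max_daily_spend_usd: float = 20.0
    actions_today: int = 0
    spend_today_usd: float = 0.0

    def may_act(self, confidence: float, est_cost_usd: float) -> bool:
        if confidence < self.min_confidence:
            return False  # not sure enough: leave the human alone
        if self.actions_today >= self.max_actions_per_day:
            return False  # possible loop: stop before the bill does
        if self.spend_today_usd + est_cost_usd > self.max_daily_spend_usd:
            return False  # hard budget ceiling, no exceptions
        return True

    def record(self, cost_usd: float) -> None:
        """Call after every action so the caps actually bind."""
        self.actions_today += 1
        self.spend_today_usd += cost_usd

gate = RestraintGate()
print(gate.may_act(confidence=0.95, est_cost_usd=0.50))  # True
print(gate.may_act(confidence=0.60, est_cost_usd=0.50))  # False
```

Two agents fighting each other at 2 a.m. both hit their action caps within minutes under a scheme like this. The $40,000 bill happens to the team that skipped the three boring if-statements.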

Mythos-class models for security and code review

Project Glasswing was the opening salvo. Continuous AI audit of large codebases, running on a cadence, surfacing CVEs and architectural smells before a human ever opens the file. By 2027 I think most enterprise codebases run something like this as baseline hygiene. The way we expect a fire alarm in a building, we will expect a security agent in a repo.

Confidence: high. Risk: false positive fatigue. The history of security tooling is the history of teams ignoring alarms because the alarms cried wolf. AI does not solve this. It accelerates it. The teams that win will be the ones who design signal into their audit pipeline, not the ones who turn on every check.
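Designing signal in mostly means two filters applied before anything reaches a human: a severity floor and deduplication against what has already been reported. Here is a toy sketch; the finding shape (`id`, `severity` on a 0-10 scale) is invented for illustration and matches no real scanner's schema.

```python
from collections import Counter

def triage(findings, min_severity=7, max_repeats=1, seen=None):
    """Surface only high-severity findings the team has not seen yet.

    `findings` is a list of dicts with 'id' and 'severity' keys;
    `seen` counts how many times each finding id was already reported.
    """
    seen = Counter() if seen is None else seen
    surfaced = []
    for f in findings:
        if f["severity"] < min_severity:
            continue  # suppress low-severity noise entirely
        if seen[f["id"]] >= max_repeats:
            continue  # already told the team; do not cry wolf again
        seen[f["id"]] += 1
        surfaced.append(f)
    return surfaced

raw = [
    {"id": "CVE-A", "severity": 9},
    {"id": "CVE-A", "severity": 9},   # duplicate, suppressed
    {"id": "lint-1", "severity": 3},  # below the floor, suppressed
]
print([f["id"] for f in triage(raw)])  # → ['CVE-A']
```

The real versions of these filters are much smarter, but the principle is the same: every alarm a human ignores lowers the value of every future alarm, so the pipeline's job is to spend that attention budget carefully.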

There is a related problem I lose sleep over: patch-the-patch. An AI suggests a fix, another AI reviews it, a third AI deploys it, and now you have three layers of plausible-looking changes nobody fully read. The substrate gets quietly more brittle every week.

Design tool integration (the Figma leak)

You may have seen the leaked Figma roadmap circulating last month. The interesting part was not the new components. It was the seam where designs become directly executable agent specs. Hand a Figma file to an agent, get a working prototype on a real backend, no PM in between as a translation layer.

By 2027 to 2028, I think this is the dominant workflow for early-stage product work. Designers ship apps. Not design files. Apps.

Confidence: medium. The technology is closer than the workflow change. Designers do not currently think in terms of state machines and edge cases, and you can ship a beautiful broken thing very quickly with this loop. The risk is not the absence of capability. The risk is the absence of taste applied to systems, which is a different muscle than taste applied to surfaces.

Enterprise adoption shift

The arc is legible from where I sit. 2025 was pilots and skunkworks, an innovation team in the corner with permission to experiment. 2026 is line-of-business teams quietly using vibe coding for internal tools, often without formal IT approval. 2027 will be the year enterprise IT formally allocates engineering hours to "agent system maintenance" as a line item. Like database administration in the 90s. Suddenly serious. Suddenly everywhere.

Confidence: medium-high. Risk: bureaucracy strangles the wins. The fastest path from amazing to mediocre is a procurement review.

The model performance ceiling question

This is the wildcard for everything else, and the honest answer is: I do not know.

Are we two or three model generations from a plateau? Or ten? The METR productivity paradox research already shows that raw model capability does not translate linearly into developer speed, which complicates the ceiling question further. The scaling laws are still working in 2026, the benchmarks still climb, the cost curves still fall. But the curves are doing different things at different rates. Capability per dollar is improving fast. Capability per training run is improving slower. Capability per researcher hour is hard to measure. The honest macro view from people I trust at the labs is: we have line of sight to one more big jump and then we are guessing.

That uncertainty colors every other prediction in this post. If the ceiling is closer than the consensus thinks, the next two years are mostly about distribution of capability we already have, not new capability. If it is further, all bets get bigger. Latent Space's recent essay on the cost curve is a good read on this if you want a less hand-wavy version.

Kai here: the predictions Priya is too polite to make

I love Priya. I trust her calibration. I also think she is being kind to people who do not deserve kindness, which is to say the part of the industry that has not done its homework.

So here are my spicier predictions. Some I am very confident about. Some I am hedging hard. I will tell you which.

Fifty percent of "junior dev" jobs as currently defined are gone or fundamentally transformed by 2028. Confidence: high. Not because juniors are not valuable. Because the tasks we currently assign to juniors (basic CRUD, glue code, ticket queues) are the exact shape of work agents do well. The juniors who survive will not be doing those tasks. They will be running fleets of agents doing those tasks. That is a different job. The bachelor's-to-employed pipeline as designed in 2018 is broken.

The CS bachelor's degree market shrinks 30% by 2030. Confidence: medium. Universities move slow. But applications to CS programs already softened in the 2025-2026 admissions cycle. If the job market on the other side keeps shifting, the pipeline shrinks. Twenty-year-olds are not stupid. They watch.

At least one major SaaS company gets acquired primarily for its agent fleet, not its product. Confidence: high, on a 24-month timeline. The asset is not the codebase anymore. It is the operational scaffolding around the agents, the prompts, the routines, the institutional knowledge of how this team gets agents to behave. That is acquirable. That is a moat. Watch for the first acquisition where the press release talks about "talent and tooling" and quietly does not mention the product.

"Agentic engineering" becomes a $200k median salary role. Confidence: medium. Compensation is sticky and titles are slow. But the supply-demand gap is real. Read the agentic engineering complete guide if you want the full picture of what this role even is. Karpathy was right when he reframed the term, by the way. Priya and I disagreed about this for weeks. I came around. The full debate is in our coverage of his pivot.

Open source maintenance gets paid. Finally. Confidence: medium-high. The reason is brutal but real: AI audits make the value of well-maintained OSS undeniable in dollar terms. When a Mythos-class scanner can quantify "this library prevents $X in vulnerabilities across Y companies," foundations and enterprises will write the check. The economics finally line up. We covered the broken status quo in the open source tax piece. The fix is coming, but the path through is uglier than the optimists think.

The first major lawsuit over AI-generated code copying private repos lands by 2027. Confidence: high. Not because the labs are reckless. Because the surface area is so large that something will get through, and the discovery process will be public, and the headlines will be ugly. The settlement will set precedent for the next decade. If you are at a frontier lab, your legal team is already having this meeting.

The "agent ops" tooling category becomes a $5B market by 2028. Confidence: medium. Datadog for agents. PagerDuty for agents. Snyk for agents. Each of these gets built. Some of them get bought by the incumbents. Some of them eat the incumbents.

I will be wrong about at least two of those. Probably three. But I will be more right than the people pretending the trend is reversible.

Back to Priya.

What stays the same

Kai is in the future. I want to spend a minute on the present.

Every season finale of every show I love does the same trick: it reminds you what the show was actually about, underneath the plot. The plot was the agents and the models and the routines. The show is, and always was, about how humans build things together.

Here is what does not change.

Taste. Always taste. The ability to look at three working solutions and know which one is right is not a skill agents are about to take from you. If anything, the surplus of working makes taste more valuable, not less. When everyone can build the thing, the question becomes which thing to build. That is taste.

Communication. The senior engineers who win the next decade are the ones who can write a clear specification. Not pretty prose. Clear thinking captured in language. Agents are unforgiving about ambiguity in a way human juniors never were. A vague PRD used to produce a slightly off feature. Now it produces a confidently wrong feature in fifteen minutes. The cost of vague communication just went up, a lot.

Systems thinking. AI is a tool for designers, not a replacement for them. Knowing why a service should be split into three pieces instead of one is still a human judgment. The agent will execute either decision beautifully. You still have to make it.

Trust and relationships. You hire humans you trust. You ship to customers you respect. You build companies with people who pick up the phone at 11 p.m. when something breaks. None of this is being automated. Anyone telling you otherwise is selling something.

The need to ship. AI lowers the cost of building. It does not lower the cost of being wrong. A startup that ships ten wrong features in the time it used to ship two has not gotten faster. It has gotten faster at being wrong. The discipline of shipping the right thing remains the bottleneck. The bottleneck just moved upstream.

If you go back to our definition of vibe coding from January, the through-line is there. Vibe coding was never about giving up on craft. It was about freeing craft from the grunt work that obscured it.

What we should be worried about

I do not want to write a triumphalist finale. The honest map of the next two years includes terrain I genuinely fear.

Hidden monoculture. When 80% of new code starts from similar AI suggestions, mistakes get encoded into the substrate at a level we cannot see. The same null-check pattern, the same auth flow, the same race condition, replicated across millions of repos. The bugs of the 2030s will be inherited from the model checkpoints of 2026. We are baking sourdough with the same starter, and one day we will discover the starter had a problem.

Skill atrophy. The next generation of engineers might never deeply read code. They will write specs, review diffs, accept or reject. The muscle of sitting with a function until you understand it may not get exercised. I do not know what that costs us. I suspect it costs us the people who would have invented the next paradigm, because invention requires the deep read.

Concentration of power. The handful of labs running frontier models accumulate insane leverage. This is structural. There is no friendly version of this. The best we can hope for is multiple frontier labs, open weights at one tier down, and regulatory imagination. The worst version is a cartel. I want to believe in the best version. The historical base rate for "a small group of actors with extreme leverage chooses restraint" is not encouraging.

Verification debt. AI generates faster than we can verify. Code review becomes a rubber stamp. The audit logs say "approved by jane@" and Jane saw the diff for eleven seconds. We are accumulating a debt of unread changes, and the interest payment is the production incident you cannot diagnose because nobody on the team has read this part of the system in nine months.

The shipping but not understanding trap. This is the umbrella worry. Software is becoming a thing we run more than a thing we comprehend. The economic incentives are aligned with shipping. The civilizational incentives are aligned with understanding. The two used to coincide. They are starting to diverge.

What we should be excited about

And yet.

Solo founders shipping products that needed teams of ten. This is happening every week. I know three of them personally. Real revenue. Real users. One person, one laptop, one stubborn idea. If you want to start a company in 2026, the friction to get to a working v1 is the lowest it has ever been in the history of the discipline.

Open source maintenance becoming sustainable. I am going to keep saying this until it is real. The economics are finally lining up. The maintainers who burned out for a decade finally get paid in the next one. I will throw a party.

Software for tiny niches. The thirty-person profession that never got attention from a SaaS founder because the market was too small now gets bespoke tooling. Veterinary anesthesiologists. Civic election officials in counties under 50,000 people. Independent translators of medieval Catalan. The long tail of human work, getting better software, because the cost of building dropped below the cost of caring.

The "tools for thought" era for non-engineers. My mother does not write code. She wrote a small app last month with an agent. It tracks her quilting projects. It is, by professional standards, charmingly bad. By her standards, it is a small revolution. Multiply that by a billion people who never thought of software as a thing they could shape.

Faster medical, scientific, civic software. The categories that were starved of engineering talent for decades, because the talent went to ad tech and finance, are getting attention again. Slowly. Imperfectly. But the math has changed. A small team in a hospital can ship a real tool now. That is enormous.

A specific prediction for May 2027

It is bad form to write a predictions post without putting a number on the table, so here are mine. One year out. Mark this page. Come back in May 2027 and tell me where I missed.

By May 2027, the average solo developer using Claude Code or its successor will ship 3 to 5x more shippable software per week than they did in May 2026. Not 3 to 5x more lines of code. Three to five times more shippable units. Features that go to production. Bugs that get closed.

Junior developer hiring still happens. But the bar shifted. The new entry-level expectation is "you have shipped at least two vibe-coded projects in the wild, with users, and you can talk about what broke." Portfolio over pedigree. We are returning to the apprenticeship model, just with new tools. If you want to understand what that career path looks like in practice, our piece on vibe coding careers maps the emerging roles and the skills that differentiate candidates.

By end of year 2027, more than 50% of the Fortune 500 has at least one routine running in production. Quietly. Often without a press release. The CIO knows. The board does not. The line-of-business owner who set it up gets a small bonus.

The term "vibe coding" is still around in 2027. It softened from a slogan into a category. The term "agentic engineering" is dominant for the professional discipline. Both terms coexist, the way "the web" and "the internet" coexist now.

And the honest hedge: I am wrong about at least one of these. Probably more. Predictions are a way of marking your beliefs publicly so reality can correct you in front of an audience. That is the whole game.

What we are doing this week

Priya: I am refactoring my routines portfolio. I have eleven of them. Three are not earning their keep. They run on a schedule, they cost money, and they have not produced a useful artifact in a month. Retiring them on Friday. The rest get a tune-up. Routines, like gardens, need pruning. The ones that grew well last season are not the ones that will grow well next.

Kai: I am building a custom agent for a CVE I want to patch in an open-source library that nobody else seems to care about. It is unglamorous. The maintainer answered my email in two days, which is itself a small miracle. If the patch lands, I will write about it. If it does not, I will probably also write about it. Either way, the thing gets safer.

Both: we are going to keep reading other people's posts. The field moves faster than any one person can track. The discipline of reading is the one we worry most about losing in ourselves, so we are building it as a habit. An hour a day. Other writers, other repos, other perspectives. If you are not doing this, please start. We are not joking. We do not get to coast.

The season finale

Forty-two posts. January to May. I did not think we would get here.

If you have been with us since the first one, thank you. If you joined at post 17 because you were debugging at 1 a.m. and the search engine sent you to us, thank you. If this is the first piece you are reading and somehow you started at the end, welcome. Go back to post one. It is wrong about most things and that is part of the fun.

The series was never a destination. It was a record of a field changing under our hands while we tried to write about it honestly. Forty-two stops on a moving train. The train keeps going.

What comes next does not get written by us. It gets written by the people building the routines, training the agents, shipping the tiny tools for the tiny audiences, hiring the unusual juniors, refusing the easy answer. You.

We will keep showing up to write about it. We will keep being wrong in interesting ways. We will keep marking our confidence intervals because that is the only way to argue in good faith.

The conversation continues. The next chapter is yours to draft.

See you in the comments.
