The Real Cost of Vibe Coding: 6 Months of API Bills

James Park, Senior Developer Advocate
Omar Hassan, DevTools Analyst
10 min read

Six months. Ten apps. Four App Store submissions. One number nobody publishes honestly: how much it actually cost.

We kept a shared spreadsheet the whole way through because neither of us trusted the vibe coding cost conversations we were reading online. Most of them stop at a screenshot of a dashboard. "I spent $400 on Claude Code this month and it was worth it." Worth what? Compared to what? The receipts get posted, the napkin math never does.

So here is ours. Two builders, one joint workflow, a single Anthropic org on the billing side, and a habit of writing down every feature we shipped next to what it cost to ship it. The goal of this post is not to brag about efficiency or panic about burn. It is to reframe the unit economics entirely, because cost per token is the wrong number. Cost per shipped feature is the only one that pays the bills.

A restaurant does not measure food cost in dollars per gram of flour. It measures food cost as a percentage of the plate that walks out to the table. If you are measuring your AI bill in tokens, you are counting flour.

Let us walk through it.

The Setup

Ten iOS apps across six months, from October 2025 through March 2026. Four made it to App Store submission, two got approved on the first try, one got rejected twice and pulled, and the rest are still in TestFlight purgatory where most honest indie work lives. The apps ranged from a 4-screen habit tracker to a fairly gnarly AI-assisted meal planner with a RevenueCat subscription, SwiftData persistence, and a Siri shortcut extension that absolutely did not want to cooperate.

The stack was deliberately boring. SwiftUI and SwiftData on device, RevenueCat for subscriptions, CloudKit for the one app that needed sync, and the patterns we learned the hard way from the first ten apps. The reason we kept it boring is that every piece of novelty in your stack is a piece of novelty your AI has to reason about, and reasoning costs money. A known pattern is cheap. An unknown pattern is a meter running.

The tools: Claude Code as the main driver, Cursor for drive-by edits during pairing, and occasional Bolt runs for UI exploration on weekends. We stayed mostly on Claude Sonnet for implementation and bumped to Opus when we were stuck. We deployed with the asc CLI we built ourselves to automate App Store Connect, which, notably, was itself vibe-coded in a single afternoon and saved us probably $200 in agent time over the six months, because we stopped paying the model to talk us through the App Store Connect web UI step by step.

We picked these tools after running the full tool comparison back in February. Nothing since has changed our default.

Raw Monthly Bills, In Order

Here are the actual totals. Claude Code usage on the API, Cursor Pro for both of us, Bolt when we used it, plus a small OpenAI bill for one app that used GPT-4o-mini for on-device-ish inference via a wrapper.

| Month | Claude (API + Max) | Cursor (2 seats) | Bolt | OpenAI | Total |
|---|---|---|---|---|---|
| Oct 2025 | $312 | $40 | $0 | $8 | $360 |
| Nov 2025 | $548 | $40 | $0 | $14 | $602 |
| Dec 2025 | $419 | $40 | $20 | $11 | $490 |
| Jan 2026 | $847 | $40 | $0 | $22 | $909 |
| Feb 2026 | $611 | $40 | $20 | $17 | $688 |
| Mar 2026 | $1,210 | $40 | $0 | $31 | $1,281 |
| Total | $3,947 | $240 | $40 | $103 | $4,330 |

Four thousand three hundred and thirty dollars across six months. Call it $722 a month on average, split between two people, which works out to $361 each per month, or roughly twelve dollars a day per person. That is about the price of a decent lunch in Dubai, which feels like a number worth sitting with for a second.
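If you want to poke at those averages yourself, here is the napkin as a throwaway Swift script. The monthly totals are copied straight from the table; the 30-day month and the clean two-way split are simplifications.

```swift
// Sanity check of the averages above. Run with: swift costs.swift
// Monthly totals from the table, Oct 2025 through Mar 2026.
let monthlyTotals: [Double] = [360, 602, 490, 909, 688, 1_281]

let total = monthlyTotals.reduce(0, +)              // 4,330
let perMonth = total / Double(monthlyTotals.count)  // ~722
let perPersonPerMonth = perMonth / 2                // ~361, two of us
let perPersonPerDay = perPersonPerMonth / 30        // ~12, assuming a 30-day month

print(total, perMonth, perPersonPerMonth, perPersonPerDay)
```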

The shape of the curve tells a story. October was cautious. November we got confident and pushed hard on two apps in parallel. December dropped because one of us was traveling and neither of us wanted to be the person expensing $600 of API credits while not shipping. January was the spike that ruined our average: $847 of Claude bills alone. The cause was not ambition. The cause was one feature, rebuilt three times, on one app, over one long weekend. We will get to that.

February recovered because we had learned from January. March went up again because we shipped two submissions back to back, which meant a lot of "the reviewer said X, fix Y without breaking Z" work, and that loop is expensive.

Cost Per Feature Shipped, Not Per Token

Here is where the napkin math gets interesting.

Across the six months we shipped, by our own count, 168 features. We are defining a feature as a unit of work that a user could reasonably describe in one sentence: "add dark mode," "let me export my streak as an image," "remember the last meal I picked so it autofills next time." Not commits. Not tickets. Shipped capabilities that made it into an app a real person could use.

$4,330 divided by 168 features is $25.77 per feature. Call it twenty-six bucks.

A gym measures its real cost per visit, not per member. If you pay $80 a month and go twice, your cost per visit is $40, and the gym is laughing at you. If you pay $80 a month and go twenty times, it is $4, and you are the one laughing. The gym does not care about the headline price. It cares about utilization. Cost per visit is the only honest number, and cost per feature is ours.

Twenty-six dollars a feature. That is the figure we kept coming back to. For context, a mid-senior iOS contractor in a reasonable market bills somewhere between $90 and $150 an hour. If a feature takes a contractor two hours (which is optimistic for almost anything involving SwiftData and a RevenueCat paywall), that is a floor of $180 for the same unit of work. We are paying $26 and putting in our own time as the supervising taste layer.

That ratio is the whole argument. The AI is not free. But it is delivering roughly seven times as much shipped output per dollar as the contractor floor, which is the same kind of leverage a good espresso machine gives a cafe: the beans still cost money, but the throughput per minute changes what your business can be.
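For the skeptical, the same arithmetic in Swift. The contractor figures are the assumptions stated above (low end of the hourly range, two optimistic hours per feature), not market research.

```swift
// Cost per shipped feature, and the contractor comparison from the paragraphs above.
let totalSpend = 4_330.0
let featuresShipped = 168.0
let costPerFeature = totalSpend / featuresShipped           // ≈ $25.77

// Contractor floor: $90/hour (low end of the range), two hours per feature.
let contractorHourly = 90.0
let hoursPerFeature = 2.0
let contractorFloorPerFeature = contractorHourly * hoursPerFeature   // $180

let leverage = contractorFloorPerFeature / costPerFeature   // ≈ 7x output per dollar
print(costPerFeature, contractorFloorPerFeature, leverage)
```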

And then there is the distribution, which is where it gets ugly.

Where The Money Went

If every feature cost $26 we would be writing a very different post. The median feature cost maybe fifteen dollars. But the mean was dragged up hard by a handful of sinkhole days.

The five cheapest apps averaged $11 per feature shipped. These were the ones where we knew the pattern cold: a habit tracker, a workout timer, a recipe saver, a focus timer, and a quiet little journaling app. All SwiftData, all local, all using UI patterns we had built five times already. The AI was basically autocomplete with ambition. A few good prompts per session, clear scope, ship by Sunday.

The three middle apps averaged $28 per feature. These had one novel element each: one used CloudKit sync, one had a complicated onboarding, one had an in-app camera flow with Vision framework edge cases. The novelty tax was real but manageable.

The two expensive apps averaged $61 per feature. One of them ate almost $900 by itself. It was the meal planner, and the sinkhole was the subscription paywall flow, which we rebuilt three times because we kept misremembering how RevenueCat's latest SDK handled introductory offers. The model confidently wrote three versions of wrong code and we confidently accepted them, because at 1am on a Saturday your taste drops faster than your token budget does.

The lesson is not "AI is expensive." The lesson is that the distribution is bimodal. You get the cheap shipped features or you get the sinkhole. There is very little in between. When you feel yourself slipping into the sinkhole (rebuilding the same flow, asking the model to fix what it just broke, accepting diffs without reading them), the only correct move is to stop and walk away. Every additional prompt you send while frustrated is a dollar fifty with a negative expected value.

We did not stop often enough. That is the real lesson of this post.

The One Project That Justified The Entire 6 Months

Out of the ten apps, one made the whole experiment profitable. We are not going to name it because it is still too small to brag about and too important to jinx, but here is the math.

It is a niche productivity app with a $4.99 monthly subscription. It launched in late January. By the end of March it had 340 paying subscribers. Monthly recurring revenue at the end of March: about $1,690. Six-month development cost, allocated honestly: roughly $1,200 of our $4,330 total, because that one app ate a disproportionate share of the agent time.

So the app cost $1,200 to build over six months, and it is now generating about $1,690 a month. At the March run rate, the payback period is under a month. Annualized, that is roughly $20,000 of revenue from a $1,200 build cost, which is the kind of ratio that makes the entire rest of the ledger irrelevant.
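Here is that payback math spelled out in Swift, using the figures above. Note it is gross revenue before Apple's cut, and it assumes the March run rate holds, which is exactly the kind of assumption that jinxes things.

```swift
// Payback math for the one app that carried the portfolio.
// 340 × $4.99 lands a few dollars above the rounded ~$1,690 quoted in the text.
let subscribers = 340.0
let monthlyPrice = 4.99
let grossMRR = subscribers * monthlyPrice          // ≈ $1,697 gross, before Apple's cut

let allocatedBuildCost = 1_200.0                   // this app's share of the $4,330
let paybackMonths = allocatedBuildCost / grossMRR  // ≈ 0.7 months at the March run rate
let annualized = grossMRR * 12                     // ≈ $20k, assuming no growth and no churn

print(grossMRR, paybackMonths, annualized)
```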

The recording studio analogy is the one we keep using. You book a week of studio time for $5,000 and record ten songs. Nine of them go nowhere. One of them licenses to a car commercial and pays for the studio, the rent, and your next album. The cost was never really about the ten songs. It was about buying enough swings to find the one.

That is what the six months were. Nine swings to find one song. The bill makes sense as tuition for the tenth app, not as a line item for any of the other nine.

This is the part that the public cost conversations get systematically wrong. Glide's recent breakdown of vibe coding cost quotes averages per project. Zapier's own vibe coding cost post frames it as a per-user economic decision. Those are both useful, but they both miss the portfolio dynamic. You are not buying one app. You are buying the option to discover which of your ideas actually works, and the cost per option is low enough that the expected value of the portfolio beats the sum of the costs.

What We Would Do Differently To Cut The Bill In Half

If we had known in October what we know now, our six-month total would be closer to $2,100 instead of $4,330. Here is where the fat is, in descending order of impact.

Stop mid-sinkhole. Every single hundred-dollar day we had was a day we knew by hour two that something was off and kept pushing anyway. If we had hard-capped each session at $40 of agent spend, with a forced walk-away when we hit it, we would have saved at least $600 across the six months. That is a third of a rent payment we set on fire because we wanted to "just fix this one thing."

Prompt with the known-good pattern first. When you open a new app and your first prompt is "build me a settings screen," the model starts from a blank slate. When your first prompt is "build me a settings screen using the exact structure of the settings screen in HabitTrack, which is in this repo at this path, with these five fields instead," you save maybe 40% of the tokens and avoid almost all the rebuild-what-you-broke loops. A skilled plumber does not re-invent the joint for every sink. They have three patterns and pick the right one. We should have been that plumber sooner.

Use Sonnet for 90% and Opus only when genuinely stuck. Our January spike was partly because we were using Opus as a default out of laziness. Sonnet can do almost everything we needed at maybe a quarter the cost. Barrack's honest cost math post makes this point well, and in retrospect we should have read it in November.

Write the deployment scripts first. Every hour the model spent "remembering" how to ship to TestFlight was an hour of pure waste. When we finally built the asc CLI in February, our per-submission deployment cost effectively went to zero. We should have done that in week one, not week seventeen.

Batch the review-fix loops. When Apple rejects your binary, the instinct is to fix it right now. The cheaper move is to collect the next day's fresh attention, re-read the reviewer's note properly, and make a single clean pass. Frantic fix-and-resubmit loops at 11pm cost us roughly $180 across two apps and bought nothing but tired mornings.

Half the bill would have been achievable. We are not bitter about the other half. It bought us the taste to know which half was waste.

The Break-Even Thought Experiment

Here is the final napkin, and it is the one we want you to actually run for yourself.

Assume you are shipping 20 features a month at our blended $26-per-feature rate. That is $520 a month. For comparison: one hour of a senior iOS contractor at $150 an hour gets you, optimistically, one small feature. Twenty hours of contractor time in a month at that rate is $3,000.

$520 versus $3,000 for the same shipped output. The AI bill is roughly seventeen percent of the contractor bill. Put differently: vibe coding could have shipped only a fifth of what it actually did and the cost per feature would still roughly match the contractor's. In our experience it shipped far more than that, because the taste-layer bottleneck (us) was the constraint, not the typing-layer bottleneck (the agent).

But the honest version of the math has to include time. The two of us spent, together, maybe 60 hours a week on these apps across the six months. That is about 1,560 hours of human time for $4,330 of tool spend, or roughly $2.77 of tool spend per human hour worked. If you value our time at even a modest $50 an hour, the labor cost was $78,000 and the tool cost was five and a half percent of the labor cost. The tools are not what made this expensive. We are what made this expensive, because we are the most expensive thing in the room by well over an order of magnitude.
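The whole thought experiment fits in a few lines of Swift if you want to swap in your own numbers. Every constant here is one of the assumptions named above, not a measurement.

```swift
// The break-even thought experiment, end to end.
let featuresPerMonth = 20.0
let blendedCostPerFeature = 26.0
let aiMonthlyBill = featuresPerMonth * blendedCostPerFeature            // $520

let contractorHourly = 150.0
let contractorHoursPerMonth = 20.0
let contractorMonthlyBill = contractorHourly * contractorHoursPerMonth  // $3,000

let billRatio = aiMonthlyBill / contractorMonthlyBill                   // ≈ 0.17

// The number no dashboard shows: our own time.
let humanHours = 1_560.0          // two people, ~60 combined hours a week, 26 weeks
let hourlyValueOfOurTime = 50.0   // deliberately modest
let laborCost = humanHours * hourlyValueOfOurTime                       // $78,000
let toolShareOfLabor = 4_330.0 / laborCost                              // ≈ 5.5%

print(billRatio, toolShareOfLabor)
```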

Which brings us to the uncomfortable conclusion we did not want to write. The question is not "can I afford the API bill." For almost any professional reader of this post, you can. The question is "is my attention worth what I am spending it on." The tool bill is the cheap part. The opportunity cost of six months of your best thinking is the expensive part, and no dashboard shows you that number.

So here is what we want to ask you, and this is the actual end of the post. Pull up your last three months of AI tool bills. Count the features you actually shipped against them. Compute your own cost per feature. Is it $10? $50? $200? And more importantly: if you could cut it in half, would you ship twice as much, or would you just pay yourself the difference and ship the same amount? We genuinely do not know which answer is the right one for you. We are still figuring out the right one for us. Tell us what yours looks like. We will be comparing notes in the comments.
