Claude Code /batch: Migrate Entire Codebases in Parallel

Omar Hassan, DevTools Analyst

The ticket had been sitting in our backlog since the Obama administration. Okay, slight exaggeration. Since early 2023. But if you have ever stared down a "migrate 200 React class components to function components with hooks" ticket on a Friday afternoon, you know the feeling. That ticket and I had a relationship. A long, unhappy one. Every quarter, someone on the team would pick it up, grind through twelve components, hit a weird componentDidUpdate edge case, cry quietly into their mechanical keyboard, and push it back to the icebox.

Then Claude Code shipped /batch in v2.1.63. The command landed in late February, and I spent about eight seconds deciding which backlog item to throw at it. I had been tracking the parallel agents infrastructure for months, mostly as a curiosity. Running two or three agents at once was cute. A command that spawns twenty or thirty background agents, each in its own worktree, each opening its own PR? That was a different species of tool.

I want to be honest about how this went, because I have read too many "AI did my job in five minutes" posts that left out the ugly parts. This is a war story, not a sales pitch. We migrated roughly 200 components in 47 minutes of wall-clock time. We also merged a bad PR that broke production for 11 minutes. Both things are true. Both things matter.

If you have a migration you have been avoiding, this is the post I wish someone had written for me before I ran /batch against a production codebase.

What /batch actually does

Think of /batch as an assembly line foreman. You hand it a brief. It walks the factory floor, counts the stations, figures out which parts need which treatments, and assigns one worker to each station. The workers never talk to each other. They never share a workbench. They each finish their part and drop it on the output conveyor, which in this case is a pull request against your main branch.

The mechanics, roughly: /batch takes a natural-language brief and decomposes it into what the official docs call independent units of work. Between 5 and 30 of them, depending on scope. For each unit, Claude Code spins up a background agent inside an isolated git worktree, runs your project bootstrap (install, typecheck, whatever your CLAUDE.md tells it to run), completes the unit, runs tests, and opens a PR tagged with the batch ID. If a unit fails, the agent retries up to twice before surrendering that unit back to you with a detailed failure report.
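The retry rule is easy to picture as a loop. What follows is a toy model under stated assumptions, not Claude Code's implementation: `runUnit`, `UnitResult`, and the error strings are names I made up, and I am reading "retries up to twice" as one initial attempt plus at most two retries.

```typescript
// Toy model of "retry up to twice, then hand the unit back with a report".
// All names here are hypothetical; nothing below is a Claude Code API.
interface UnitResult {
  ok: boolean;
  attempts: number;
  errors: string[]; // one entry per failed attempt, i.e. the failure report
}

// `attempt` returns null on success or an error message on failure.
function runUnit(
  attempt: (attemptNumber: number) => string | null,
  maxRetries = 2,
): UnitResult {
  const errors: string[] = [];
  for (let n = 1; n <= maxRetries + 1; n++) {
    const error = attempt(n);
    if (error === null) {
      return { ok: true, attempts: n, errors };
    }
    errors.push(error);
  }
  // Every attempt failed: surrender the unit with the collected errors.
  return { ok: false, attempts: maxRetries + 1, errors };
}

// A unit that fails typecheck twice and passes on the last allowed attempt.
const flaky = runUnit((n) => (n < 3 ? `typecheck failed on attempt ${n}` : null));

// A unit that never passes and lands back in the human queue.
const broken = runUnit(() => "tests failed");
```

In this model `flaky` succeeds on its third attempt, while `broken` uses all three attempts and comes back with three error entries for a human to read.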

The /batch command is not magic. It is aggressive parallelization on top of infrastructure that already existed. What shipped in v2.1.63 was mostly the decomposition logic, the worktree pooling, and the PR orchestration. The individual agents doing the work are the same agents you have been running locally for a year.

The decomposition step is the interesting part. When you run /batch, Claude reads your brief, scans the relevant parts of the codebase, and produces a plan document before it spawns anything. The plan lists every unit it intends to create, what files each unit will touch, and what success looks like. You approve the plan, or you edit it, or you kill it. Nothing spawns until you say go. This matters. I have watched teammates approve plans without reading them and I have watched those same teammates later wonder why they had 23 open PRs all modifying the same stylesheet.

Assembly line foreman. Not magic foreman. You still have to read the shift plan.
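Reading the shift plan can be partly mechanical. Here is a minimal sketch of the check I wish my teammates had run: list every file that more than one unit intends to touch. The `PlanUnit` shape, the unit IDs, and the component file names are all hypothetical, since the plan is shown here only as terminal output; you would have to adapt the input to whatever form you can get the plan into.

```typescript
// Hypothetical plan shape. The real plan document format is not shown in
// this post, so treat this interface as an assumption.
interface PlanUnit {
  id: string;
  files: string[];
}

// Returns a map of file path -> IDs of the units that plan to touch it,
// keeping only files claimed by two or more units.
function findSharedFiles(units: PlanUnit[]): Map<string, string[]> {
  const owners = new Map<string, string[]>();
  for (const unit of units) {
    for (const file of unit.files) {
      const ids = owners.get(file) ?? [];
      // Guard against the same file being listed twice within one unit.
      if (ids.indexOf(unit.id) === -1) {
        ids.push(unit.id);
      }
      owners.set(file, ids);
    }
  }
  const shared = new Map<string, string[]>();
  owners.forEach((ids, file) => {
    if (ids.length > 1) {
      shared.set(file, ids);
    }
  });
  return shared;
}

// Illustrative data only: two units that both want a shared helper.
const examplePlan: PlanUnit[] = [
  { id: "unit_a", files: ["src/components/PriceTag.tsx", "src/utils/format.ts"] },
  { id: "unit_b", files: ["src/components/DateLabel.tsx", "src/utils/format.ts"] },
  { id: "unit_c", files: ["src/components/UserAvatar.tsx"] },
];

const sharedFiles = findSharedFiles(examplePlan);
```

Any file that shows up in the result is a merge conflict waiting to happen, so it is a candidate for fencing out of the brief or for handling in a dedicated unit before the rest of the batch runs.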

The real migration

Here is the exact command I ran. I have redacted nothing except the repo name.

/batch "Convert all class components in src/components/ to function components with hooks. Preserve existing prop types, existing tests, and existing behavior. Use React.memo for components that currently extend PureComponent. Each component gets its own PR. Do not touch components in src/components/legacy/."

That last sentence, "Do not touch components in src/components/legacy/", saved me about four hours of pain. Always fence the migration. Always.

Claude Code came back 90 seconds later with the plan:

Batch plan: batch_8f2c1a
Units identified: 214
  - 198 class components to migrate
  - 12 PureComponent -> React.memo
  - 4 HOC wrappers requiring manual review (flagged)
Worktree pool: 20 concurrent
Estimated wall time: 35-55 minutes
Estimated tokens: 4.2M
Approve plan? [y/n/edit]

Two hundred fourteen units. Four flagged for manual review, which I appreciated because those were the weird legacy HOCs I had already planned to handle myself. I edited the plan to drop those four (manual work, not batch work), approved the remaining 210, and let it rip.

The wall-clock time ended up being 47 minutes. The /batch dashboard that opens in your terminal is oddly soothing to watch. You see little progress bars stacked vertically, each one representing an agent mid-work. Unit 032: TypingIndicator: running tests. Unit 033: UserAvatar: opening PR. Unit 034: NavBar: retrying after typecheck failure. It is the closest thing to watching a factory floor I have experienced without leaving my apartment.

The napkin math on cost: roughly 4.4M tokens consumed (the estimate was close), which at current pricing landed around $62 for the full run. Compare that to my hourly rate times 60 hours of sequential human migration work. The math is not complicated.
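For anyone who wants the napkin unfolded, here is the arithmetic as a small sketch. The token count, dollar total, PR count, and 60-hour estimate come from this run. The hourly rate is a placeholder I picked for illustration, so swap in your own.

```typescript
// Figures reported for this run.
const tokensUsed = 4_400_000;
const totalCostUsd = 62;
const mergedPrs = 198;
const humanHours = 60;

// Hypothetical placeholder: this post does not state an hourly rate.
const assumedHourlyRateUsd = 100;

// Blended price implied by the run (input and output tokens lumped together).
const usdPerMillionTokens = totalCostUsd / (tokensUsed / 1_000_000);

// Token cost per successful PR.
const usdPerMergedPr = totalCostUsd / mergedPrs;

// What the same work would cost as sequential human hours at the assumed rate.
const humanCostUsd = humanHours * assumedHourlyRateUsd;
const costRatio = humanCostUsd / totalCostUsd;
```

That works out to roughly $14 per million tokens blended and about 31 cents per merged PR. At the assumed $100 per hour, the human route costs on the order of a hundred times more. This counts token spend only, so human review time is a separate line item.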

What happened in those 47 minutes

I made coffee. I answered Slack messages. I opened the PR dashboard in GitHub and watched PRs roll in at a rate of roughly four per minute for the first fifteen minutes, then tapering.

But underneath the calm, things were happening that I want to describe in detail, because the coordination model is the part most people misunderstand about Claude Code migration work.

Each agent ran in a fully isolated worktree under .batch/worktrees/batch_8f2c1a/unit_NNN/. None of the agents could see each other's changes. This is crucial. It means two agents cannot step on each other by editing the same file, because each agent has its own copy of the world. The tradeoff is that if two units logically touch the same file, you are going to get merge conflicts later. More on that in a minute.

Each agent ran my project's standard checks. The repo's CLAUDE.md told it to run pnpm install, pnpm typecheck, and pnpm test -- --related. Agents that hit typecheck failures would attempt to fix them in place. Agents that hit test failures would either adjust the implementation or, if tests were clearly testing implementation details that no longer applied, surface the test for human review rather than modifying it. This behavior is configurable but the default is conservative, which I like.

The four HOC wrappers I dropped from the plan got flagged early in the run as related to other units, and three agents paused themselves waiting for those to resolve. Claude Code detected the dependency automatically by scanning imports. I manually unblocked them by telling the batch controller "proceed without HOC updates, I will handle HOC callers afterward" and the three agents resumed. That interaction alone, the fact that agents can pause and wait on signals rather than just failing, is the thing that separates /batch from a dumb shell script spawning a hundred processes.
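I do not know how the controller implements that import scan internally, but the idea is simple enough to sketch. This is a guess at the shape of the check, not the tool's code: `UnitImports`, `findBlockedUnits`, and every file path below are hypothetical.

```typescript
// Hypothetical input: for each unit, the files its component imports.
interface UnitImports {
  id: string;
  imports: string[];
}

// A unit counts as blocked if it imports any file whose migration is still
// unresolved, for example the HOC wrappers dropped from the plan.
function findBlockedUnits(units: UnitImports[], unresolvedFiles: string[]): string[] {
  return units
    .filter((unit) => unit.imports.some((file) => unresolvedFiles.indexOf(file) !== -1))
    .map((unit) => unit.id);
}

// Illustrative data only; these paths are made up.
const unresolvedHocs = ["src/hoc/withAuth.tsx", "src/hoc/withTheme.tsx"];

const unitsInFlight: UnitImports[] = [
  { id: "unit_a", imports: ["src/utils/format.ts"] },
  { id: "unit_b", imports: ["src/hoc/withAuth.tsx", "src/utils/format.ts"] },
  { id: "unit_c", imports: ["src/hoc/withTheme.tsx"] },
];

const blockedUnits = findBlockedUnits(unitsInFlight, unresolvedHocs);
```

With this sample data, `unit_b` and `unit_c` are reported as blocked. A real implementation would also need to resolve relative import paths and follow transitive imports, which this sketch skips.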

At minute 32, the controller reported 187 PRs open, 8 retrying, 15 remaining. At minute 41, I had 198 green PRs and 12 that had failed twice. At minute 47, the run terminated. Final tally: 198 successful PRs, 12 failed units dropped back to my queue for manual work.

Twelve failures out of 210. Roughly 94% success rate. Better than my personal success rate on a Monday.

Reviewing 200 PRs without losing your mind

This is where most people get stuck. Opening 198 pull requests into a shared review queue is, at first glance, indistinguishable from opening a denial-of-service attack on your engineering team. Your reviewers will quit. Your CI system will cry. Your Slack will turn into an unreadable blur of #pull-requests notifications.

Here is the review strategy that worked. I stole most of it from our 50 Claude Code tips post, then extended it for batch-scale work.

First, I used the --labels flag on the batch command to tag every generated PR with batch:8f2c1a and review:mechanical. Mechanical means the change follows a known pattern that should be reviewed for correctness of the pattern application, not for architectural decisions. Our team already had conventions for mechanical reviews (smaller checklist, faster turnaround, different reviewer rotation).

Second, I grouped PRs by risk class using a quick script that read each PR's diff size and touched files. We ended up with three buckets: tiny (under 30 lines changed, 144 PRs), medium (30 to 150 lines, 48 PRs), and large (over 150 lines, 6 PRs). Tiny got one reviewer and a two-hour SLA. Medium got two reviewers. Large I reviewed myself, carefully, with a coffee.
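The bucketing itself is a few lines. This is a simplified sketch rather than the script I ran: it looks only at lines changed and ignores which files were touched, and the `PrSummary` shape is hypothetical. The boundaries follow the buckets above, with exactly 30 and exactly 150 both counted as medium.

```typescript
type Bucket = "tiny" | "medium" | "large";

// Hypothetical PR record; fetch these fields however your tooling allows.
interface PrSummary {
  number: number;
  linesChanged: number;
}

// tiny: under 30 lines, medium: 30 to 150 lines, large: over 150 lines.
function bucketFor(linesChanged: number): Bucket {
  if (linesChanged < 30) return "tiny";
  if (linesChanged <= 150) return "medium";
  return "large";
}

function groupByBucket(prs: PrSummary[]): Record<Bucket, PrSummary[]> {
  const groups: Record<Bucket, PrSummary[]> = { tiny: [], medium: [], large: [] };
  for (const pr of prs) {
    groups[bucketFor(pr.linesChanged)].push(pr);
  }
  return groups;
}

// Illustrative data only; these PR numbers and sizes are made up.
const grouped = groupByBucket([
  { number: 1, linesChanged: 12 },
  { number: 2, linesChanged: 30 },
  { number: 3, linesChanged: 150 },
  { number: 4, linesChanged: 420 },
]);
```

From there, each bucket maps to its own review policy: one reviewer for tiny, two for medium, and a slow careful read for large.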

Third, and this is the part nobody talks about: I turned on pattern review mode. Instead of reading each PR top-to-bottom, I read the first five PRs in detail to validate the pattern, then spot-checked the rest. If the first five showed Claude Code had correctly handled componentDidMount, componentWillUnmount, and setState callbacks, I trusted that it had handled them correctly in PR 173. Spot checks do not catch everything. I will return to this in the failure section, because we did merge a bad one.

Fourth, we merged in waves. The tiny bucket went first, 20 PRs per wave, with 15 minutes between waves to watch error rates in production. Medium bucket next, 10 per wave. Large bucket last, one at a time, like landing a plane.
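The wave plan is plain chunking. A small sketch, using the bucket sizes from this run and stand-in PR IDs (the numbering 1 to N is just for illustration):

```typescript
// Split a list into consecutive waves of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) {
    throw new Error("wave size must be at least 1");
  }
  const waves: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    waves.push(items.slice(i, i + size));
  }
  return waves;
}

// Stand-in PR IDs 1..count; real PR numbers would come from your tracker.
function fakePrIds(count: number): number[] {
  const ids: number[] = [];
  for (let i = 1; i <= count; i++) {
    ids.push(i);
  }
  return ids;
}

// Bucket sizes from this run, with the wave sizes described above.
const tinyWaves = chunk(fakePrIds(144), 20);
const mediumWaves = chunk(fakePrIds(48), 10);
const largeWaves = chunk(fakePrIds(6), 1);

const totalWaves = tinyWaves.length + mediumWaves.length + largeWaves.length;
```

That gives 8 tiny waves (the last one holding only 4 PRs), 5 medium waves, and 6 single-PR large waves, for 19 merge events in total. Leave room in the schedule for whatever pause you want between them.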

Total review time across three engineers: about 6 hours. Total wall-clock from "I ran /batch" to "all 198 PRs merged to main" was about 9 hours, including lunch and one fire drill about an unrelated incident.

The sample PR descriptions that Claude Code generated were, honestly, better than most human-written ones. Here is one verbatim:

## Migrate NotificationBanner to hooks

- Converted class to function component
- Replaced `componentDidMount` with `useEffect(..., [])`
- Replaced `componentWillUnmount` cleanup with useEffect return
- Replaced `this.setState` with `useState` (3 state fields)
- Preserved existing PropTypes
- Tests unchanged, passing locally

Risk: low
Related units: 087, 112 (consume NotificationBanner)
Batch: batch_8f2c1a, unit 043

Related units. The agent figured out which other components consumed NotificationBanner and flagged them, so the reviewer could check those PRs together. That kind of cross-unit awareness in an isolated-worktree model is what I did not expect and what made me trust the system further than I otherwise would have.

Where it broke

I promised honesty. Here are the four places /batch did not behave the way I wanted.

Shared utility files. Two units both decided, independently, that they needed to refactor a helper function in src/utils/format.ts. Different refactors. Both correct in isolation. Impossible to merge cleanly. I had to manually reconcile. Lesson: fence shared utilities out of the brief, or handle them in a dedicated unit first.

The one production bug. PR 134 migrated a component that used this.setState with a callback to read the updated state. The agent converted it to useState but did not correctly handle the "state update is async and your read happens before it commits" semantics. Tests passed because tests did not cover that specific interaction. It shipped. Broke a signup flow for 11 minutes at 2am. Reverted, fixed manually, re-shipped. Spot checks do not catch semantic bugs. If your migration touches user-facing flows, add integration tests before you batch.
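The timing problem is easier to see in a toy model. To be clear, this is not React and not the code from PR 134. It is a tiny stand-in I wrote that mimics "set now, commit later", with a made-up email value. It contrasts the class-style callback, a naive port that reads immediately after setting, and one possible fix.

```typescript
type Listener<T> = (value: T) => void;

// Toy state cell: set() only queues an update, flush() commits it.
// This imitates deferred commits; it is NOT how React is implemented.
function createDeferredState<T>(initial: T) {
  let committed = initial;
  const queue: Array<{ next: T; afterCommit: Listener<T> | undefined }> = [];
  return {
    read(): T {
      return committed;
    },
    // Like class setState(next, callback): the callback runs after commit.
    set(next: T, afterCommit?: Listener<T>): void {
      queue.push({ next, afterCommit });
    },
    flush(): void {
      for (const entry of queue.splice(0)) {
        committed = entry.next;
        if (entry.afterCommit) {
          entry.afterCommit(entry.next);
        }
      }
    },
  };
}

const submitted: string[] = [];

// Class-style: the read happens inside the post-commit callback.
const classStyle = createDeferredState<string>("");
classStyle.set("ada@example.com", (email) => submitted.push(`class:${email}`));
classStyle.flush();

// Naive port: set, then read right away. The update has not committed yet.
const naivePort = createDeferredState<string>("");
naivePort.set("ada@example.com");
submitted.push(`naive:${naivePort.read()}`);
naivePort.flush();

// One possible fix: hold the next value in a local and use it directly.
const fixedPort = createDeferredState<string>("");
const nextEmail = "ada@example.com";
fixedPort.set(nextEmail);
submitted.push(`fixed:${nextEmail}`);
fixedPort.flush();
```

The naive port submits an empty string because it reads before the commit. In real hooks code the usual remedies are the one shown here, or moving the read into an effect that depends on the state. Either way, a unit test that never exercises the read-after-write path will stay green.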

TypeScript generic preservation. Four components had generic props like Component<T extends Record<string, unknown>>. Three agents preserved the generics correctly. One did not. The TypeScript compiler did not flag it because the call sites used looser types. Found it in review. Fixed manually.
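Here is a stripped-down illustration of how a dropped generic slips past the compiler. The names (`PickerProps`, `pickFirst`, `User`) are hypothetical, and plain functions stand in for components so the sketch runs without React.

```typescript
type Row = Record<string, unknown>;

// Hypothetical props shape, standing in for a generic component's props.
interface PickerProps<T extends Row> {
  items: T[];
  onPick: (item: T) => void;
}

// Generic preserved: T flows from `items` into the `onPick` callback.
function pickFirst<T extends Row>(props: PickerProps<T>): void {
  const first = props.items[0];
  if (first !== undefined) {
    props.onPick(first);
  }
}

// Generic dropped: every item is widened to Row inside the callback.
function pickFirstLoose(props: PickerProps<Row>): void {
  const first = props.items[0];
  if (first !== undefined) {
    props.onPick(first);
  }
}

type User = { id: number; name: string };
const users: User[] = [{ id: 1, name: "Ada" }];

// With the generic, `user` is typed as User, so `user.name` is a string.
const strictNames: string[] = [];
pickFirst({ items: users, onPick: (user) => strictNames.push(user.name) });

// Without it, `row` is only a Row and `row["name"]` is `unknown`. A call
// site that already treats items loosely still compiles, so nothing fails.
const looseNames: string[] = [];
pickFirstLoose({
  items: users,
  onPick: (row) => looseNames.push(String(row["name"])),
});
```

Both calls compile and behave the same at runtime, which is exactly why this kind of regression only shows up when a human reads the diff. What is lost is the type information inside the callback.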

Token cost overrun on one unit. One component had 900 lines of legacy code and somehow consumed almost 10% of the total token budget on its own. The agent kept retrying subtle test failures. At some point you have to pull the ripcord. We did. I migrated that one by hand in 40 minutes.

None of these are dealbreakers. All of them are reasons to supervise the assembly line, not trust it blindly. A foreman who never walks the floor is not a foreman, they are a mascot.

When to use /batch versus when not to

Use /batch when the work is genuinely decomposable. Class to hooks, which is the canonical example described in the official React hooks migration guide. Jest to Vitest. Enzyme to React Testing Library. Moment to date-fns. Old router API to new router API. Adding a null-check to 400 callsites. Renaming a shared constant across a monorepo.

Do not use /batch when work requires cross-cutting architectural decisions. You cannot batch-migrate state management from Redux to Zustand, because every component decision affects every other component decision. You cannot batch-migrate a monorepo from Yarn to pnpm, because the change is fundamentally atomic.

Do not use /batch when your test suite is weak. The whole system depends on tests catching regressions. If your coverage is 40%, the tests can only catch bugs in that 40%, and most of what /batch breaks in the other 60% will ship. Fix the tests first.

Do not use /batch on a Friday afternoon. Not because of the tool. Because you will want to be near a computer when the PRs start landing.

The backlog has gotten smaller

Our team has run seven batch migrations since that first one in late February. Test library swap, two dependency upgrades, a lint rule rollout, a date library migration, an accessibility audit with mechanical fixes, and one failed attempt at a Redux refactor that I called off after the plan preview showed me how wrong my instincts had been.

The backlog has gotten smaller. Not gone. Some things still need a human sitting quietly with a cup of coffee and a 40-inch monitor and an afternoon of uninterrupted thought. But the mechanical work, the "move this whole line of parts to the new assembly station" work, has stopped being the thing we dread on Fridays. It has become the thing we do between meetings.

If you have a migration you have been avoiding, I would love to hear how your first /batch run goes. What you fenced. What broke. What surprised you. The assembly line is humming on a lot of teams now and the interesting patterns are still emerging. I am still figuring this out. So are you. Let's compare notes.
