Block AI Agents from Merging PRs Accidentally

6 min read

Mar 2, 2026

--dangerously-skip-permissions is one of those flags that does exactly what the name says.

You hand Claude Code the keys. No confirmation prompts. No "are you sure?" dialogs. The agent reads files, writes files, runs commands, calls APIs, and pushes code. All of it, uninterrupted. For long-running autonomous sessions, especially when combined with auto mode, it is genuinely powerful. You set a task, walk away, and come back to finished work.

The problem is that "no confirmation prompts" includes gh pr merge.

An AI agent with write access to your GitHub repo and no permission guardrails can create a PR and merge it. In the same session. Without a human ever looking at the diff. This is not a theoretical risk. It is the natural endpoint of an agentic workflow that lacks a merge boundary. If you want a broader checklist of what can go wrong when AI agents ship code unsupervised, the vibe coding security checklist covers the full surface area.

One line of bash fixes it.

The Wrapper That Saves You

Credit to @elvissun for this elegant solution. The idea: create a gh wrapper script that intercepts any pr merge command and exits before it reaches the real GitHub CLI.

Step 1: Create the wrapper at ~/.local/bin/gh

bash

#!/bin/bash
# Wrapper to block AI agents from merging PRs
 
if [[ "$*" == *"pr merge"* ]]; then
    echo "🚫 Blocked: merging requires human approval"
    exit 1
fi
 
exec /opt/homebrew/bin/gh "$@"

Step 2: Add to ~/.bashrc (or ~/.zshrc)

bash

export PATH="$HOME/.local/bin:$PATH"

Then make it executable and reload your shell:

bash

chmod +x ~/.local/bin/gh
source ~/.bashrc

That is the entire fix. Now when any process tries to run gh pr merge, it gets blocked with a clear message before anything touches your repo. Every other gh command passes through to the real binary unchanged.

Test it:

bash

$ gh pr merge 327
🚫 Blocked: merging requires human approval

Works exactly as expected.

Why This Matters More Than You Think

When you run Claude Code with --dangerously-skip-permissions, you are making a deliberate trade: you give up safety rails in exchange for uninterrupted autonomous work. That is a reasonable trade for the right tasks. Refactoring a module, writing tests, scaffolding a feature, updating documentation. These are all operations where interrupting the agent to ask permission would be friction without safety benefit.

Merging is different.

Merging is irreversible in a way that file edits are not. A bad file edit gets caught in review. A merged PR is in your main branch. It may have already triggered a deployment. The CI/CD pipeline has moved on. Rolling it back requires a revert commit, another review cycle, another merge. The blast radius of an accidental merge is an order of magnitude higher than the blast radius of an accidental file edit. The Moltbook breach is a cautionary tale of what happens when an agent-driven workflow ships vulnerable code without human review at the merge boundary.

The wrapper creates a hard boundary. The AI agent can do everything: create branches, write code, open PRs, comment on issues, push commits. The one thing it cannot do is close the loop on its own work without a human in the decision. That asymmetry is exactly right.

The Deeper Problem: GitHub's Permission Model

This wrapper is a fix for a structural gap in how GitHub Personal Access Tokens work today.

PATs currently bundle write and merge into the same permission scope. If an agent has enough access to create a PR, it has enough access to merge one. There is no "create PRs but not merge them" option. That is a real limitation, and it is the reason this wrapper needs to exist in the first place.

The right fix is for GitHub to separate these permissions at the token level. An agent should be able to hold a PAT that allows full repository write access - pushing branches, creating PRs, commenting, managing labels - but explicitly cannot merge. That permission would require a separate human-held credential.

Until that exists, the wrapper is the pragmatic solution. It is not perfect. A sufficiently determined or confused agent could potentially find a workaround. But it handles the accidental case, which is by far the most likely failure mode.

Fitting This Into Your Vibe Coding Workflow

If you are using Claude Code for autonomous development sessions, here is how this fits into a safe setup:

The agent creates the PR. You get notified via GitHub (or Slack, or email, or whatever your setup is). You review the diff, ideally using a multi-agent PR review workflow for thoroughness. You merge it. The agent continues with whatever comes next.

This is the correct division of labor. The AI handles the creative and mechanical work: writing code, writing tests, structuring the commit, writing the PR description. The human handles the judgment call: is this ready to go into main?

That boundary does not slow you down meaningfully. PR review takes minutes. What it prevents is the scenario where an agent, in the course of a long autonomous session, decides that its own work is ready, merges it, and triggers a deployment before you have looked at anything. That scenario is not hypothetical. It is what --dangerously-skip-permissions plus GitHub write access makes possible.

This wrapper also pairs naturally with Claude Code's CLAUDE.md configuration system. You can document in your CLAUDE.md that merge is a human-only operation. The CLAUDE.md instruction tells the AI not to attempt it. The wrapper enforces it at the system level even if the instruction is ignored. Defense in depth. For an even stronger enforcement layer, you can use hooks to intercept git operations before they execute, adding a programmatic check alongside the wrapper.

If you are running Claude Code inside a GitHub Actions CI/CD pipeline, the same principle applies: ensure the action's token lacks merge permissions, and use branch protection rules as the final backstop.

If you use zsh instead of bash, add the same export PATH line to your ~/.zshrc and run source ~/.zshrc. The wrapper script itself works identically regardless of shell.

One More Thing

There is a version of this conversation that treats it as a reason to avoid --dangerously-skip-permissions entirely. I do not think that is the right conclusion.

Autonomous AI coding sessions are genuinely valuable. The vibe coding workflow gets more powerful the fewer times you need to interrupt the agent to grant permissions for routine operations. The flag exists for good reason.

The answer is not to give up the power. It is to understand where the actual risk lives and put a targeted control there. Merging is the risk. The wrapper controls it. Everything else stays fast.

Add it to your .bashrc. You will not regret it.

And GitHub: please give us separate merge permissions in PATs. The community has been asking. The use case is obvious. The agents are ready.

FAQ

What is --dangerously-skip-permissions in Claude Code?

It is a flag that runs Claude Code without confirmation prompts. Normally, Claude Code asks for permission before executing potentially impactful actions like running shell commands or editing files. With --dangerously-skip-permissions, all of those confirmations are skipped, allowing the agent to work autonomously without interruption. It is designed for trusted, automated environments.

How does the gh wrapper block PR merges?

The wrapper is a bash script saved to ~/.local/bin/gh that intercepts all gh commands. If the arguments match pr merge, it prints a blocked message and exits with a non-zero status code before passing anything to the real GitHub CLI. For every other command, it passes through unchanged via exec /opt/homebrew/bin/gh "$@".

Does this break any other gh commands?

No. The wrapper only intercepts commands that contain pr merge in the arguments. All other gh commands, including gh pr create, gh pr view, gh repo, gh issue, and everything else, pass through to the real binary unchanged.

What if I am using zsh instead of bash?

Add the same export PATH="$HOME/.local/bin:$PATH" line to your ~/.zshrc and run source ~/.zshrc. The wrapper script itself works with any shell since it is just a bash script that gets called by the shell, not by the shell itself.

Why does GitHub not already have separate merge permissions in PATs?

Currently, GitHub's PAT permission model bundles repository write access together, meaning a token with write permissions can both create and merge PRs. There is no fine-grained scope that allows PR creation but blocks merging. This is a known gap that becomes more significant as AI agents get write access to repositories. The wrapper is the current workaround until GitHub adds this permission separation.

Should I use this with Claude Code's CLAUDE.md?

Yes, use both. Document in your CLAUDE.md that merging requires human approval, so the agent knows not to attempt it. The wrapper then enforces this at the system level as a hard block, even if the agent were to ignore the instruction. Two layers of protection are better than one.

Is this fix specific to Claude Code?

No. The wrapper blocks any process from running gh pr merge, regardless of what triggered it. It works as a safety net for Cursor, Codex, Goose, or any other AI coding agent that has access to your shell environment and GitHub CLI.

Share on X LinkedIn

Block AI Agents from Merging PRs Accidentally

The Wrapper That Saves You

Why This Matters More Than You Think

The Deeper Problem: GitHub's Permission Model

Fitting This Into Your Vibe Coding Workflow

One More Thing

FAQ

Related posts

50 Claude Code Tips After 6 Months of Daily Use

Claude Code Auto Fix: Stop Babysitting Failing Pull Requests

Claude Code Auto Mode: Safe Automation Guide