# My Claude code review workflow catches things I always miss manually
Manual code review is essential but fatiguing. By the tenth PR of the day, I start approving things I should catch. I built a Claude-based pre-review script that runs before I open the PR in GitHub — it flags security issues, inconsistent API design, and missing error handling that tired human reviewers (including me) routinely miss. Here is the exact workflow.
## What Claude catches that humans miss when tired
After running this for three months, these are the issues Claude reliably finds that slip through human reviews:
- Missing input validation on new API endpoints
- Inconsistent error response shapes across a PR
- N+1 query patterns in new code
- Hardcoded secrets or credentials in examples
- Missing `await` on async calls (TypeScript)
- Race conditions in concurrent code
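Of these, the N+1 pattern is the one I most often wave through when tired, because each line looks innocent in isolation. A self-contained illustration of the shape (using a fake in-memory data layer with a query counter, not a real ORM):

```python
# Illustrative sketch only: a fake data layer with a query counter
# stands in for a real ORM, to show the N+1 shape a reviewer should flag.
query_count = 0
USERS = {1: "alice", 2: "bob"}

def query_user(user_id: int) -> str:
    """Pretend each call is one database round-trip."""
    global query_count
    query_count += 1
    return USERS[user_id]

def query_users(user_ids: set[int]) -> dict[int, str]:
    """One round-trip fetching all requested users at once."""
    global query_count
    query_count += 1
    return {uid: USERS[uid] for uid in user_ids}

posts = [
    {"title": "intro", "author_id": 1},
    {"title": "deep dive", "author_id": 2},
    {"title": "recap", "author_id": 1},
]

# N+1 shape: one author query per post (3 posts -> 3 queries).
n_plus_one = [f"{p['title']} by {query_user(p['author_id'])}" for p in posts]
print(query_count)  # -> 3

# Batched shape: one query for all authors, then a dict lookup.
query_count = 0
authors = query_users({p["author_id"] for p in posts})
batched = [f"{p['title']} by {authors[p['author_id']]}" for p in posts]
print(query_count)  # -> 1
```

Both versions produce the same output; only the query count differs, which is exactly why the pattern hides in a diff.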
## The review script
````python
#!/usr/bin/env python3
# ai-review.py
import subprocess
import sys

import anthropic

client = anthropic.Anthropic()

REVIEW_SYSTEM = """You are a senior software engineer doing a thorough code review.

Review the provided diff for:

1. SECURITY: Injection vulnerabilities, hardcoded secrets, missing auth checks,
   sensitive data in logs, IDOR vulnerabilities
2. CORRECTNESS: Missing error handling, unhandled edge cases, off-by-one errors,
   missing null checks, race conditions
3. PERFORMANCE: N+1 queries, missing indexes implied by new queries,
   unnecessary re-renders (React), blocking I/O in async code
4. API DESIGN: Inconsistent naming, breaking changes, missing validation,
   inconsistent error response shapes
5. TESTS: Missing tests for new behavior, tests that don't cover edge cases

Format your response as:

## Issues Found

For each issue:

**[SEVERITY: HIGH/MEDIUM/LOW]** [Category]: Description
File: `filename:line_number`
```
relevant code snippet
```
Suggestion: How to fix it

## Looks Good

Brief note on what is well-implemented.

If no issues found in a category, skip it.
Focus on real problems, not style preferences."""


def get_diff(base: str = "main") -> str:
    result = subprocess.run(
        ["git", "diff", f"{base}...HEAD", "--unified=5"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def review_diff(diff: str, focus: str | None = None) -> str:
    system = REVIEW_SYSTEM
    if focus:
        system += f"\nFocus extra attention on: {focus}"
    # Truncate large diffs
    if len(diff) > 60_000:
        diff = diff[:60_000] + "\n[Diff truncated at 60K chars]"
    response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=2000,
        system=system,
        messages=[{
            "role": "user",
            "content": f"Review this git diff:\n\n{diff}",
        }],
    )
    return response.content[0].text


if __name__ == "__main__":
    base = sys.argv[1] if len(sys.argv) > 1 else "main"
    focus = sys.argv[2] if len(sys.argv) > 2 else None
    diff = get_diff(base)
    if not diff.strip():
        print("No changes to review")
        sys.exit(0)
    print("Running AI code review...\n")
    review = review_diff(diff, focus)
    print(review)
````
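The hard cut at 60K characters can slice a file's diff in half or drop trailing files entirely. One alternative worth sketching (the `split_diff` helper below is hypothetical, not part of the script above) is to split on `diff --git` boundaries and review file-sized chunks:

```python
def split_diff(diff: str, max_chars: int = 60_000) -> list[str]:
    """Split a unified git diff into chunks on file boundaries so no
    file is cut in half. Chunks stay under max_chars where possible;
    a single oversized file is kept whole rather than split."""
    files: list[str] = []
    current: list[str] = []
    for line in diff.splitlines(keepends=True):
        # "diff --git" starts each per-file section of a unified diff.
        if line.startswith("diff --git ") and current:
            files.append("".join(current))
            current = []
        current.append(line)
    if current:
        files.append("".join(current))

    chunks: list[str] = []
    buf = ""
    for file_diff in files:
        if buf and len(buf) + len(file_diff) > max_chars:
            chunks.append(buf)
            buf = ""
        buf += file_diff
    if buf:
        chunks.append(buf)
    return chunks
```

Each chunk could then go through `review_diff` on its own, with the results concatenated, at the cost of one API call per chunk.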
## Adding it as a git hook
Run the review automatically before every push:
```bash
#!/bin/bash
# .git/hooks/pre-push
echo "Running AI code review..."
python3 ~/bin/ai-review.py main
echo ""
echo "Review complete. Press Enter to continue push, Ctrl+C to cancel."
# Git feeds the refs being pushed to pre-push on stdin, so read the
# confirmation from the terminal instead of stdin.
read -r < /dev/tty
```

Make it executable:

```bash
chmod +x .git/hooks/pre-push
```
## Focused reviews by file type
For PRs with specific concerns, I use focused prompts:
```bash
# Focus on security for auth-related changes
python3 ~/bin/ai-review.py main "authentication, authorization, and JWT handling"

# Focus on database performance for schema changes
python3 ~/bin/ai-review.py main "database query patterns, missing indexes, lock implications"

# Focus on React patterns
python3 ~/bin/ai-review.py main "React hook rules, unnecessary re-renders, missing keys"
```
## Integrating with GitHub Actions
Run reviews automatically on every PR:
```yaml
# .github/workflows/ai-review.yml
name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Run AI review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          pip install anthropic
          # The script takes a base ref and runs git diff itself.
          python3 scripts/ai-review.py origin/main > /tmp/review.txt
          cat /tmp/review.txt
      - name: Post review as comment
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const review = fs.readFileSync('/tmp/review.txt', 'utf8');
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: '## 🤖 AI Code Review\n\n' + review
            });
```
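One caveat on the comment step: GitHub's API rejects issue comment bodies over 65,536 characters (as I understand its documented limit), so a very long review on a big PR fails to post. A small guard you could run over the review text before commenting (`clamp_review` and the margin value are my choices, not part of the workflow above):

```python
# Keep the comment body safely under GitHub's 65,536-character cap,
# leaving headroom for the heading the workflow prepends.
MAX_BODY = 65_000

def clamp_review(text: str, limit: int = MAX_BODY) -> str:
    """Truncate the review with an explicit marker if it is too long."""
    if len(text) <= limit:
        return text
    return text[:limit] + "\n\n[Review truncated]"
```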
## What I learned from three months of this
The most valuable thing the AI review does is not catching bugs; it is prompting me to think about why I made a decision. When Claude flags something as potentially problematic, even if I disagree, the act of articulating why I am right (or realizing I am wrong) improves the code.
The false positive rate is about 20%. Claude will occasionally flag something that is intentional or already handled elsewhere. I scan the review, dismiss the non-issues, and focus on the real findings. It takes about 5 minutes and has caught at least one significant issue per week.