Using Claude to turn complex code into team documentation
April 4, 2026 · Claude · 7 min read


Every engineering team has that file — the one nobody touches because nobody fully understands it. The original author left two years ago. The comments say "optimised for performance" but not what it is optimised for or why. New engineers avoid it. Bugs hide in it. I spent a day using Claude to document our three worst offenders. By the end, we had readable explanations good enough to add to our internal wiki and use during onboarding.

The documentation gap

Engineers are good at writing code. They are less consistent at explaining it. Not out of laziness — the problem is that the mental model lives in the author's head. When you wrote the code, the context was obvious. Six months later, it is not. For the next engineer, it was never obvious.

Claude can bridge this gap because it reads code without preconceptions. It does not know what the function was supposed to do — it only knows what it does. That forces the explanation to start from what is actually there, not what the author intended.

Prompt 1: Explain this function to a senior engineer

The first prompt I use is the most direct:

prompt
Here is a function from our codebase. Explain it to a senior engineer who has not seen it before. Cover:
1. What it does at a high level (1-2 sentences)
2. The algorithm or approach it uses
3. Any non-obvious design decisions and why they might have been made
4. Edge cases it handles (or does not handle)
5. What could go wrong and under what conditions

```python
def process_events(events, window_size=300, overlap=0.2):
    sorted_events = sorted(events, key=lambda e: e['timestamp'])
    windows = []
    i = 0
    while i < len(sorted_events):
        window_start = sorted_events[i]['timestamp']
        window_end = window_start + window_size
        window = [e for e in sorted_events if window_start <= e['timestamp'] < window_end]
        windows.append(window)
        advance = int(window_size * (1 - overlap))
        next_ts = window_start + advance
        while i < len(sorted_events) and sorted_events[i]['timestamp'] < next_ts:
            i += 1
    return windows
```

Claude's output covers the sliding window logic, the semantics of the overlap parameter (20% overlap means consecutive windows share events), and the O(n²) worst case from the inner list comprehension, which scans every event for every window. It also notes that events arriving out of order within a window boundary are handled correctly, but that events with timestamps before the current window start are silently dropped.

That last point — silent data loss on late-arriving events — is exactly the kind of thing nobody documented because the original author knew about it and considered it acceptable. A new engineer would not.

Prompt 2: Explain this to a junior engineer

The same function, different audience:

prompt
Explain the function above to a junior engineer who knows Python but is not familiar with data processing patterns. Use an analogy if it helps. Avoid jargon. Then show a concrete worked example: what happens when you call process_events with this input:

events = [
  {"id": 1, "timestamp": 0},
  {"id": 2, "timestamp": 100},
  {"id": 3, "timestamp": 250},
  {"id": 4, "timestamp": 400},
  {"id": 5, "timestamp": 500},
]
window_size=300, overlap=0.2

This prompt produces a different document: a shorter one that uses the analogy of a sliding spotlight illuminating part of a timeline and walks through each iteration of the loop with the actual values. I use this version in the onboarding wiki. New engineers read it before touching the data pipeline for the first time.
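If you want to sanity-check the walkthrough Claude gives, the function from Prompt 1 is small enough to run directly on that input:

```python
# The function from Prompt 1, reproduced so the walkthrough can be verified.
def process_events(events, window_size=300, overlap=0.2):
    sorted_events = sorted(events, key=lambda e: e['timestamp'])
    windows = []
    i = 0
    while i < len(sorted_events):
        window_start = sorted_events[i]['timestamp']
        window_end = window_start + window_size
        window = [e for e in sorted_events if window_start <= e['timestamp'] < window_end]
        windows.append(window)
        advance = int(window_size * (1 - overlap))  # 240 with the defaults
        next_ts = window_start + advance
        while i < len(sorted_events) and sorted_events[i]['timestamp'] < next_ts:
            i += 1
    return windows

events = [
    {"id": 1, "timestamp": 0},
    {"id": 2, "timestamp": 100},
    {"id": 3, "timestamp": 250},
    {"id": 4, "timestamp": 400},
    {"id": 5, "timestamp": 500},
]
window_ids = [[e["id"] for e in w] for w in process_events(events)]
print(window_ids)  # [[1, 2, 3], [3, 4, 5], [5]]
```

Events 3 and 5 each appear in two consecutive windows: that shared membership is the 20% overlap in action.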

Prompt 3: Architecture explanation from a module

For larger files, I give Claude the whole module and ask for an architectural overview:

prompt
Here is our payments module (450 lines). Generate:
1. A one-paragraph summary of what this module is responsible for
2. A list of the public API functions and what each one does (one sentence each)
3. The key data flow: what comes in, what transformations happen, what goes out
4. External dependencies (libraries, services, databases) it touches
5. Any global state or side effects

Format this as Markdown so I can paste it into our wiki.

The Markdown output goes straight into Confluence or Notion. I do a quick review pass — Claude occasionally misreads intent on ambiguous code — and then it is live documentation.
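When there are many modules to cover, I script this step. Here is a minimal sketch using the Anthropic Python SDK; the model name is a placeholder and the prompt template simply reproduces Prompt 3, so adjust both to your setup:

```python
import pathlib

# Prompt 3 as a template; {name}, {line_count} and {source} are filled per module.
WIKI_PROMPT = """Here is our {name} module ({line_count} lines). Generate:
1. A one-paragraph summary of what this module is responsible for
2. A list of the public API functions and what each one does (one sentence each)
3. The key data flow: what comes in, what transformations happen, what goes out
4. External dependencies (libraries, services, databases) it touches
5. Any global state or side effects

Format this as Markdown so I can paste it into our wiki.

{source}"""

def build_module_prompt(source: str, name: str) -> str:
    """Fill the Prompt 3 template for one module's source code."""
    return WIKI_PROMPT.format(
        name=name, line_count=len(source.splitlines()), source=source
    )

def generate_docs(path: str, model: str = "claude-sonnet-4-20250514") -> str:
    """Send the prompt to Claude and return Markdown (needs ANTHROPIC_API_KEY set)."""
    import anthropic  # imported here so the prompt builder works without the SDK

    source = pathlib.Path(path).read_text()
    client = anthropic.Anthropic()
    message = client.messages.create(
        model=model,  # placeholder model id: use whatever your team standardises on
        max_tokens=4096,
        messages=[
            {"role": "user",
             "content": build_module_prompt(source, pathlib.Path(path).name)},
        ],
    )
    return message.content[0].text
```

The output of `generate_docs` is exactly the Markdown I review before pasting into the wiki.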

Prompt 4: Onboarding guide for a service

When a new engineer joins the team, I generate a targeted onboarding guide:

prompt
A new backend engineer is joining the team and will own the notification service. Here is the service's main file and its config:

[paste code]

Generate a "getting started" guide for them covering:
1. What this service does and why it exists
2. How to run it locally (infer from the code what env vars are needed)
3. The three most important files to read first and why
4. Common tasks they will do regularly (based on the function names)
5. Things that are not obvious from the code that they should know

Section 5 is the most valuable. Claude identifies things like "the retry logic assumes idempotent handlers — if a handler is not idempotent, duplicate deliveries will cause bugs" or "the batch size is hardcoded to 100 but there is a comment suggesting this should be configurable." These are the landmines that only experienced engineers know about. Now they are written down.

Reviewing what Claude generates

Claude gets it right about 85% of the time. The failure modes are:

  • Inferring intent incorrectly: Claude sometimes says "this function does X" when the original intent was Y and X is a bug. Always review explanations of critical paths with the original author if they are still around.
  • Missing business context: Claude cannot know that a specific magic number (say, 86400) is intentionally set to one day because of a legal requirement in a particular market. Add a comment before running the prompt if the code has business context baked in.
  • Overstating completeness: Claude sometimes says "handles all edge cases" when it clearly does not. I always do a final pass asking "what edge cases does this NOT handle?"

The documentation workflow

My process for documenting a complex file:

  1. Paste the file into Claude with a summary prompt (Prompt 3 above)
  2. Review the output, add corrections as follow-up messages
  3. Generate the junior-friendly version for onboarding (Prompt 2)
  4. Ask Claude to identify anything that looks like a bug or unintended behaviour
  5. Paste the final Markdown into the wiki, linking from the file's README

One complex file documented: about an hour. One complex service documented: a day. The return on that investment — measured in fewer "hey, what does this code do?" Slack messages and faster onboarding — pays back in the first month.

The three files nobody wanted to touch? They have been touched twice this quarter. The new engineer who onboarded last month touched one of them in her second week. That would not have happened without the documentation.
