Documentation as Infrastructure

Documentation is infrastructure. It's not a nice-to-have. It's a critical part of how systems work. When documentation is current and accurate, engineers onboard faster, incidents resolve quicker, and systems evolve smoothly. When documentation is outdated, every engineer wastes time deciphering code and making wrong assumptions.

Bad documentation is worse than no documentation. Outdated docs are actively harmful: engineers follow them and hit roadblocks. With no docs, engineers read the code to understand the system, which is slow but at least accurate.

Why This Matters

Documentation is how knowledge scales. One engineer knows how something works. Document it and a hundred engineers know. Without documentation, knowledge lives in individual heads. When people leave, knowledge leaves with them.

Onboarding speed depends on documentation. A new engineer needs to understand the architecture, how to run the system locally, how deployments work, what APIs exist, and how to add features using the established patterns. Without docs, they ask experienced engineers. With docs, they read and ask targeted questions.

Systems evolve with fewer breakages when interfaces are documented. If you change an API contract and the contract is documented, you know what consumers depend on. If there's no documentation, consumers break silently.
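One way to make a contract change visible is to compare the documented response fields before and after the change. The sketch below is only illustrative: `removedFields` and the two field maps are hypothetical names, not part of any tool mentioned in this article.

```javascript
// Hypothetical helper: lists fields that the old documented contract
// promised but the new one no longer provides.
function removedFields(oldContract, newContract) {
  return Object.keys(oldContract).filter((field) => !(field in newContract));
}

// Documented response fields for a user resource (illustrative only).
const v1 = { id: 'string', name: 'string', email: 'string' };
const v2 = { id: 'string', name: 'string' };

console.log(removedFields(v1, v2)); // [ 'email' ]
```

Any field in the result is something a consumer may still depend on, so it is a prompt to check with those consumers before shipping.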

Incident response is faster with good runbooks. During an incident, you don't have time to figure out how things work. You need a document that says "if X happens, do Y."

Types of Documentation

API documentation describes how to use your APIs. What endpoints exist? What parameters do they take? What do they return? What are the error cases? An OpenAPI specification captures these answers in a machine-readable file, and tools such as Swagger can generate it from code annotations and render it as browsable docs.

```yaml
/api/users/{id}:
  get:
    summary: Get a user by ID
    parameters:
      - name: id
        in: path
        required: true
        schema:
          type: string
    responses:
      '200':
        description: User found
        content:
          application/json:
            schema:
              type: object
              properties:
                id: { type: string }
                name: { type: string }
                email: { type: string }
      '404':
        description: User not found
```

Generated API docs stay in sync with code only if generation runs as part of the build or CI. When it does, a change to the endpoint's annotations updates the docs automatically.
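To illustrate why generated docs track their source, here is a minimal sketch that renders Markdown from a spec object. It is not how Swagger or any OpenAPI tool works internally; `renderDocs` and the trimmed-down `spec` object are assumptions made for illustration.

```javascript
// A trimmed-down, OpenAPI-like description of one endpoint (illustrative).
const spec = {
  '/api/users/{id}': {
    get: {
      summary: 'Get a user by ID',
      responses: {
        200: { description: 'User found' },
        404: { description: 'User not found' },
      },
    },
  },
};

// Hypothetical renderer: turns the spec object into Markdown text.
function renderDocs(apiSpec) {
  const lines = [];
  for (const [path, methods] of Object.entries(apiSpec)) {
    for (const [method, operation] of Object.entries(methods)) {
      lines.push(`### ${method.toUpperCase()} ${path}`);
      lines.push(operation.summary);
      for (const [status, response] of Object.entries(operation.responses)) {
        lines.push(`- ${status}: ${response.description}`);
      }
    }
  }
  return lines.join('\n');
}

console.log(renderDocs(spec));
```

Because the Markdown is derived from `spec`, editing the spec and re-running the renderer is the only way to change the docs, which is what keeps the two from drifting apart.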

Architecture documentation describes how systems work together. How do microservices communicate? What databases exist and why? What are the critical paths? How do systems scale?

A README at the repository root is the minimum. More detailed architecture documentation lives in an ARCHITECTURE.md or DESIGN.md file.

```markdown
# System Architecture

## Overview
Our system consists of three main services:
- User Service: handles authentication and profiles
- Order Service: manages orders and payments
- Notification Service: sends emails and SMS

## Data Flow
1. User creates order via API
2. Order Service validates and stores order
3. Order Service publishes OrderCreated event
4. Notification Service subscribes to event and sends confirmation email
```
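The data flow in the sample architecture document can be sketched in a single process. This is a rough illustration, assuming Node's built-in `EventEmitter` as a stand-in for a real message broker; `createOrder`, `orders`, and `sentEmails` are hypothetical names.

```javascript
const { EventEmitter } = require('node:events');

const bus = new EventEmitter(); // stands in for a real message broker
const orders = new Map();       // stands in for the Order Service database
const sentEmails = [];          // stands in for an email provider

// Notification Service: subscribes to OrderCreated and "sends" an email.
bus.on('OrderCreated', (order) => {
  sentEmails.push(`Confirmation for order ${order.id} sent to ${order.email}`);
});

// Order Service: validates, stores, then publishes the event.
function createOrder(order) {
  if (!order.id || !order.email) {
    throw new Error('invalid order');
  }
  orders.set(order.id, order);
  bus.emit('OrderCreated', order);
}

createOrder({ id: 'o-1', email: 'ada@example.com' });
console.log(sentEmails);
// [ 'Confirmation for order o-1 sent to ada@example.com' ]
```

The Order Service never calls the Notification Service directly, which is the property the architecture document is trying to communicate.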

Runbooks are procedures for operations: how to deploy, how to handle common incidents, how to scale systems. They're written for humans under stress, so they need to be clear and step-by-step.

```markdown
# Deployment Runbook

## Prerequisites
- SSH access to production
- Deployed artifact in S3

## Steps
1. SSH into production server: `ssh prod.example.com`
2. Back up current code: `cp -r /app /app.backup.$(date +%s)`
3. Download new code: `wget https://s3.amazonaws.com/artifacts/v2.1.0.tar.gz`
4. Extract into the app directory: `tar -xzf v2.1.0.tar.gz -C /app`
5. Restart service: `systemctl restart app`
6. Verify: `curl http://localhost:3000/health`
```

ADRs (Architectural Decision Records) document why decisions were made. Why did we choose PostgreSQL over MongoDB? Why did we pick microservices over a monolith? ADRs capture the reasoning so future engineers understand the context.

```markdown
# ADR: Use PostgreSQL for primary database

## Status: Accepted

## Context
We need a primary database for user and order data. Considered PostgreSQL, MongoDB, and DynamoDB.

## Decision
Use PostgreSQL.

## Rationale
- ACID guarantees critical for financial data
- Relational schema matches our domain (users, orders, items)
- Mature ecosystem and strong community
- Sequel ORM handles common patterns
- MongoDB's flexibility not needed; schema is stable

## Consequences
- Setup and maintenance slightly more complex than MongoDB
- Schema changes require migrations
- Can't easily do horizontal sharding (not needed at current scale)
```

Onboarding guides walk new engineers through their first week. How do you set up a dev environment? How do you run tests? How do you deploy a change? What conventions exist?

```markdown
# Onboarding Guide

## Day 1: Environment Setup
1. Clone the repo
2. Install dependencies: `npm install`
3. Create .env file from .env.example
4. Run migrations: `npm run migrate`
5. Start dev server: `npm run dev`
6. Verify: open http://localhost:3000

## Day 1: Code Tour
1. Read ARCHITECTURE.md
2. Browse src/features to see how features are structured
3. Open a random file and see if you understand it
4. Ask questions if something is unclear

## Day 2: Make a Change
1. Pick a small issue or feature
2. Create a branch
3. Make changes
4. Run tests and lint
5. Open a pull request
6. Address review feedback
7. Merge and deploy to staging
```

Inline code comments explain non-obvious logic. Not the obvious ("increment counter") but the subtle ("wait for user confirmation because this operation is irreversible").
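The difference is easier to see side by side. In this small sketch, `deleteAccount` and the in-memory `accounts` map are hypothetical and exist only to carry the comments.

```javascript
// In-memory stand-in for an account store (illustrative only).
const accounts = new Map([['u1', { name: 'Ada' }]]);

let attempts = 0;
attempts += 1; // Unhelpful: "increment attempts" only restates the code.

function deleteAccount(id, confirmed) {
  // Helpful: deletion cannot be undone, so require explicit confirmation
  // rather than treating a missing flag as consent.
  if (confirmed !== true) {
    return false;
  }
  return accounts.delete(id);
}

console.log(deleteAccount('u1'));       // false
console.log(deleteAccount('u1', true)); // true
```

The second comment records a constraint that the code alone cannot express, which is why it earns its place.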

Docs-as-Code

Docs should live in your repository alongside code, not in a separate wiki or Confluence space. Code lives in git. Docs should too.

When docs are in the repo:

- They're versioned with code. If you check out an old commit, you get the docs for that version.
- They're reviewed in pull requests, so a reviewer can see when a code change leaves the docs untouched.
- They're stored in plain text (Markdown), making them searchable and mergeable.
- They're accessible to everyone who has repo access.

A basic structure:

```text
repo/
  README.md (overview, how to run locally)
  ARCHITECTURE.md (system design)
  CONTRIBUTING.md (how to contribute)
  CHANGELOG.md (what changed in each version)
  docs/
    API.md (API documentation)
    RUNBOOKS.md (operational procedures)
    ADR/ (architectural decision records)
      001-use-postgres.md
      002-microservices.md
```
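A layout like this can be checked mechanically. The following is a minimal sketch, assuming the four top-level files above are the ones you want to require; `REQUIRED_DOCS` and `missingDocs` are hypothetical names.

```javascript
// Baseline files taken from the layout above (an assumption; adjust freely).
const REQUIRED_DOCS = ['README.md', 'ARCHITECTURE.md', 'CONTRIBUTING.md', 'CHANGELOG.md'];

// Hypothetical helper: returns the required files absent from a file listing.
function missingDocs(repoFiles) {
  const present = new Set(repoFiles);
  return REQUIRED_DOCS.filter((file) => !present.has(file));
}

console.log(missingDocs(['README.md', 'CHANGELOG.md', 'src/index.js']));
// [ 'ARCHITECTURE.md', 'CONTRIBUTING.md' ]
```

In a real check, the listing could come from `fs.readdirSync` on the repository root, with a non-empty result failing the build.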

Tools like MkDocs or Docusaurus can generate a nice website from Markdown docs. Your docs live in code but display as a proper website.

Keeping Docs Current

Keeping docs current is the hardest part, and it matters because out-of-date documentation is worse than none. Strategies that help:

Document alongside code. When you change code, change docs. Make it part of the definition of done.

Review docs in PRs. Just like code reviews, doc changes get reviewed. Someone verifies accuracy.
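Reviewers can get a nudge from automation. The sketch below flags a pull request whose source files changed while no Markdown file did. The `src/` prefix, the rule that `.md` means docs, and the name `needsDocsReview` are all assumptions for illustration.

```javascript
// Hypothetical CI helper: given the paths changed in a pull request,
// report whether source code changed without any documentation change.
function needsDocsReview(changedFiles) {
  const codeChanged = changedFiles.some((file) => file.startsWith('src/'));
  const docsChanged = changedFiles.some((file) => file.endsWith('.md'));
  return codeChanged && !docsChanged;
}

console.log(needsDocsReview(['src/orders.js']));                // true
console.log(needsDocsReview(['src/orders.js', 'docs/API.md'])); // false
console.log(needsDocsReview(['CHANGELOG.md']));                 // false
```

The list of changed files could come from `git diff --name-only` against the target branch. A `true` result works better as a reminder than as a hard failure, since not every code change needs a doc change.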

Make docs easy to update. If docs are hard to find or hard to edit, they won't get updated. Keep them in the repo. Use plain text. Make them discoverable.

Link to docs from code. If an API's documentation lives in a separate file, add a comment in the code pointing to it:

```javascript
// See docs/API.md for GET /api/users/{id} specification
app.get('/api/users/:id', (req, res) => { /* ... */ });
```
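Pointers like this go stale when a doc file is renamed, so it can help to extract them and check them. This is a minimal sketch that assumes the `// See <path>.md` comment convention shown above; `findDocReferences` is a hypothetical helper.

```javascript
// Hypothetical helper: extracts "// See <path>.md" references from source text.
function findDocReferences(source) {
  const pattern = /\/\/\s*See\s+(\S+\.md)\b/g;
  return [...source.matchAll(pattern)].map((match) => match[1]);
}

const source = `
// See docs/API.md for GET /api/users/{id} specification
app.get('/api/users/:id', handler);
// See docs/RUNBOOKS.md before changing the retry policy
`;

console.log(findDocReferences(source));
// [ 'docs/API.md', 'docs/RUNBOOKS.md' ]
```

Each extracted path can then be passed to `fs.existsSync` to confirm the referenced file is still there.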

Flag outdated docs. If you know something's outdated, mark it:

```markdown
> **⚠️ OUT OF DATE**: This API is deprecated. Use the v2 endpoint instead.
```
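Flags are only useful if someone follows up on them. A small scan can list where they are, as in this sketch; it assumes the `OUT OF DATE` marker text from the banner above, and `findOutdatedMarkers` is a hypothetical name.

```javascript
// Hypothetical helper: returns the 1-based numbers of lines carrying the marker.
function findOutdatedMarkers(markdown, marker = 'OUT OF DATE') {
  return markdown
    .split('\n')
    .map((line, index) => ({ line, number: index + 1 }))
    .filter(({ line }) => line.includes(marker))
    .map(({ number }) => number);
}

const doc = [
  '# Users API',
  '> **OUT OF DATE**: This API is deprecated. Use the v2 endpoint instead.',
  'GET /api/users/{id}',
].join('\n');

console.log(findOutdatedMarkers(doc)); // [ 2 ]
```

Running this over every Markdown file on a schedule gives the team a short list of docs that someone has already admitted are wrong.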

Deprecate explicitly. When you change something, document the change in CHANGELOG.md or a migration guide.

Require docs for new features. If a feature doesn't have documentation, it's not done.

Documentation and AI Agents

Documentation becomes even more important in AI-assisted development. When agents write code, explicit documentation helps them understand context and produce code that fits your system.

An agent that reads your architecture docs and runbooks is more likely to generate code that follows those patterns. Documentation is the contract between humans and AI.

At Bitloops, we've found that teams with comprehensive documentation integrate AI-generated code most smoothly. The documentation acts as the "specification" that agents follow.

FAQ

How much documentation is too much?

Enough so a competent engineer can understand and use the system. Not so much that you have ten files describing the same thing. Most systems need: README, ARCHITECTURE, API docs, and operational runbooks. Everything else is context.

Should we generate docs from code or write them manually?

Both. API docs can be generated from code annotations. Architecture and runbooks must be written manually because they require judgment.

How often should we update documentation?

Whenever a code change affects the documentation. Don't batch updates. Update as you go.

Who should write documentation?

The person who knows the system best. Often the engineer who implemented it. Sometimes a tech writer. Sometimes the team collaboratively.

Should we use Confluence or docs in the repo?

Docs in the repo. Confluence is fine for temporary notes and discussions, but authoritative docs should be version-controlled.

How do we handle documentation for internal systems?

Same as external systems. Good documentation helps internal teams too. It may matter even more, because internal systems tend to face less pressure to keep docs updated.

What if documentation is too technical for non-engineers?

Write multiple versions. Technical documentation for engineers. Overview documentation for non-engineers. Both have value.
