Bitloops - Git captures what changed. Bitloops captures why.

Local-First AI Memory Architectures: SQLite + HNSW for Code Context

Keep your AI's memory local—SQLite plus vector indexes on your machine. You get privacy, no vendor lock-in, offline-first access, and full control. The trade-off is worth it: your codebase knowledge stays yours.

15 min read · Updated March 4, 2026 · AI Memory & Reasoning Capture

Opening Definition

A local-first memory architecture stores AI semantic context directly on your machine, scoped to individual repositories, with no network dependency. The typical implementation uses SQLite for structured metadata (what got committed when, who wrote it, what it's about, captured in committed checkpoints) and HNSW (Hierarchical Navigable Small World) vector indexes for semantic similarity search. The entire knowledge store lives in a hidden directory tied to your repository hash, isolated from other projects and inaccessible to cloud services.

This design inverts the default assumption in modern AI systems. Rather than "sync everything to a cloud platform, query from there," it's "keep everything local, only sync if you explicitly choose to." You get privacy by default, work offline without friction, and maintain complete control over your semantic context. The trade-off is that you manage more infrastructure yourself, but for code-specific use cases that cost is small.


Why Local-First Matters

The case for local-first AI memory rests on five practical concerns that most engineers face.

Privacy and control. When you send your codebase's semantic context to a cloud service, you're creating a permanent record of your design decisions, architectural choices, past mistakes, and lessons learned. You're also creating a liability: if the service is breached, that context is exposed. If the service changes terms or gets acquired, your data follows a path you didn't choose. Local-first means you retain complete control. Your semantic memory never leaves your machine unless you decide to back it up or share it.

No network dependency. Cloud services offer uptime guarantees, not 100% uptime. When the service has an outage, your agent loses access to semantic context. When the network is unavailable (offline work, poor connectivity), cloud-backed agents degrade. Local-first means your AI agent works as well offline as online: it's resilient to network failures and doesn't depend on third-party SLAs.

Operational simplicity and cost. Running a cloud service at scale requires versioning, API compatibility, handling different clients, ensuring data consistency across regions. These are real costs, and they get passed to users. Local-first doesn't eliminate these concerns (you still need to manage your local index), but it moves the tradeoffs to where you have more control. You pay in storage space and compute time, not in recurring service fees.

Performance characteristics. Local queries are fast. Querying SQLite and doing vector similarity search in HNSW runs in milliseconds. Cloud queries go over network, add latency, and incur rate limits. For code context retrieval (where agents might need to fetch context many times per session), local-first is simply faster.

Ecosystem lock-in avoidance. Cloud vector database providers (Pinecone, Weaviate, Qdrant) are viable, but they all have different APIs, different pricing models, different data formats. Building your semantic context in a hosted service locks you in. Local-first with an open-source index (HNSW) leaves you free to migrate to a different backend if your needs change.

These concerns aren't theoretical. Real teams struggle with compliance requirements (healthcare, finance), offline work patterns (traveling developers, remote areas), and cost management at scale. Local-first addresses all of them.


The Technical Foundation: SQLite + HNSW

Local-first AI memory for code typically uses a two-tier storage strategy.

SQLite stores the structured data: what was committed when, commit messages, file changes, author information, references to code locations. SQLite is chosen because it's reliable (ACID transactions), queryable (full SQL), self-contained (single file, no server required), and ubiquitous (every machine has it). You can run complex queries to find all commits by a certain author, all changes to a specific file, all commits in a time range. The schema is simple: commits table, code references table, semantic annotations table. For a typical codebase, the SQLite database is small (tens of MB even for large projects).

HNSW (Hierarchical Navigable Small World) is a vector index algorithm that makes semantic similarity search fast. When you ask "find commits similar to this query," the system embeds both the query and the stored commits as vectors, and HNSW finds the nearest neighbors in vector space. HNSW builds a hierarchical graph that remains fast to search even with millions of vectors. For code context this is critical: keyword search (find commits mentioning "rate limiting") is too literal; semantic search (find commits about throttling, backpressure, request queueing) is what agents need.

The specific implementation stores the vector index as a serialized data structure on disk. When your system starts, it loads the index into memory. As new commits arrive, you update the index (append-oriented; you don't modify old vectors, you add new ones). The index includes vectors for:

  • Commit messages
  • Code diffs (summarized into vectors)
  • Code comments extracted from files
  • Design documents and decision records
  • Issue tracker content (if available)
  • Agent reasoning traces (if captured)

The vectors are generated by a semantic encoder (typically a small, open-source embedding model like nomic-embed-text or sentence-transformers). The encoder runs locally; no API calls required.
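To make the encode-and-search loop concrete, here is a minimal sketch. A toy hashed bag-of-words encoder stands in for a real embedding model (such as nomic-embed-text), and a brute-force cosine scan stands in for the HNSW index; all function names here are illustrative, not Bitloops' API.

```python
import hashlib
import math

DIM = 64  # toy dimensionality; real embedding models use hundreds of dims

def embed(text):
    """Toy stand-in for a local embedding model: hash each token
    into a fixed-size bag-of-words vector, then L2-normalize."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        h = int(hashlib.sha256(token.encode()).hexdigest(), 16)
        vec[h % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b))

def search(query, corpus, k=3):
    """Brute-force nearest neighbors; an HNSW index replaces this
    linear scan once the corpus grows large."""
    qv = embed(query)
    ranked = sorted(corpus.items(), key=lambda kv: cosine(qv, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

messages = [
    "implement token bucket for rate limiting",
    "fix typo in README",
    "refactor rate limiting middleware",
]
index = {msg: embed(msg) for msg in messages}
print(search("rate limiting", index, k=2))
```

Swapping the scan for HNSW changes the search cost from linear to roughly logarithmic in corpus size without changing the interface.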


Repository Scoping: Isolation and Boundaries

A crucial design decision: each repository gets its own independent memory store, isolated from others.

The file structure looks like this:

/path/to/repo/.bitloops/
├── repo.hash          # SHA256 of the repo's remote URL or local path
├── memory/
│   ├── semantic.db    # SQLite database
│   ├── vectors.hnsw   # HNSW index
│   ├── embeddings.bin # Cached embeddings
│   └── metadata.json  # Schema version, created date, stats

The repo.hash ensures that if you clone the same repository in two places, both get the same memory store. This prevents duplication and keeps you in sync if you work across multiple checkouts. The hidden directory means the memory store doesn't clutter your working tree and is ignored by version control.
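A sketch of how such a hash-scoped store path could be derived, assuming SHA256 over the remote URL when one exists and the resolved local path otherwise; the function name and exact layout under .bitloops/ are hypothetical.

```python
import hashlib
from pathlib import Path

def memory_store_path(repo_root, remote_url=None):
    """Derive a per-repository memory location. Hashing the remote URL
    (when present) means two checkouts of the same remote share one
    identifier; a repo with no remote falls back to its local path."""
    key = remote_url or str(Path(repo_root).resolve())
    repo_hash = hashlib.sha256(key.encode()).hexdigest()
    return Path(repo_root) / ".bitloops" / repo_hash[:16]

# Two checkouts of the same remote get the same hash directory name.
p1 = memory_store_path("/home/alice/proj", "git@github.com:acme/proj.git")
p2 = memory_store_path("/tmp/proj-checkout2", "git@github.com:acme/proj.git")
print(p1.name == p2.name)
```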

This scoping has important consequences:

  1. No cross-contamination. Work on project A doesn't influence the semantic memory for project B. Each codebase's history and context are separate.
  2. Backup and privacy isolation. You can back up or delete the memory for one project without affecting others. You can share your codebase with a colleague and choose whether to share the semantic memory.
  3. Team or individual mode. A team can build a shared memory store (store the .bitloops directory in a shared location or sync it), or each developer can maintain their own. The architecture supports both patterns.
  4. Reproducibility. The repository hash acts as a stable identifier. If you move the repo or clone it elsewhere, the memory store follows based on the hash, not the filesystem path.
  5. Archival. When you're done with a project, you can archive the entire memory store with it. Years later, when you revisit old code, the semantic context is still there.

Building the Memory Store: Processing Git History

The memory store is built by scanning the full git history and processing commits, diffs, and code.

Initial indexing. When you run bitloops index (or equivalent) for the first time on a repository, the system:

  1. Walks the entire commit graph (all branches, all history)
  2. Extracts commit metadata (timestamp, author, message, file changes)
  3. Extracts code diffs (what changed in each commit)
  4. Scans the working tree for comments and docstrings
  5. Generates semantic embeddings for all extracted content
  6. Stores metadata in SQLite and vectors in HNSW
  7. Records the indexed commit hash, so later runs can resume from where they left off

For a typical medium-sized project (20k commits, 5 years of history), initial indexing takes 5-15 minutes depending on machine speed and the complexity of diffs.
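Step 2 above (extracting commit metadata) can be sketched by parsing `git log` output with git's machine-readable separators (`%x1f` field, `%x1e` record). A canned sample string stands in for a real repository so the sketch runs anywhere.

```python
# Parse the output of:
#   git log --pretty=format:%H%x1f%an%x1f%at%x1f%s%x1e
RS, FS = "\x1e", "\x1f"  # record and field separators

def parse_log(raw):
    """Turn raw git log output into a list of commit records."""
    commits = []
    for record in filter(None, raw.strip().split(RS)):
        commit_hash, author, ts, subject = record.strip().split(FS)
        commits.append({
            "hash": commit_hash,
            "author": author,
            "timestamp": int(ts),
            "message": subject,
        })
    return commits

sample = (
    f"abc123{FS}alice{FS}1700000000{FS}implement token bucket{RS}"
    f"def456{FS}bob{FS}1700003600{FS}fix flaky test{RS}"
)
for c in parse_log(sample):
    print(c["hash"], c["message"])
```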

Prioritization. Not all commits are equally important for semantic memory. The indexing process prioritizes:

  • Recent commits over old ones (recent decisions are more relevant)
  • Commits with long, detailed messages over one-liners
  • Commits that touch critical files (identified by frequency or path patterns)
  • Commits that introduce new functions or classes over those that fix typos

This prioritization isn't just efficiency; it's a semantic choice. Older decisions are less relevant to current work. Detailed commit messages do encode more understanding. By weighting the index toward recent, detailed, impactful commits, the system makes semantic search more useful.
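One way to express that weighting is a scoring function over the signals listed above. The weights and thresholds here are purely illustrative, not Bitloops' actual values.

```python
import time

def priority(commit, critical_paths=("src/core/", "auth/")):
    """Hypothetical priority score: recency decays over roughly a year,
    long commit messages score higher, and touching critical paths
    adds a fixed bonus."""
    age_days = (time.time() - commit["timestamp"]) / 86400
    recency = 1.0 / (1.0 + age_days / 365)
    detail = min(len(commit["message"]) / 200, 1.0)
    critical = 1.0 if any(f.startswith(p) for f in commit["files"]
                          for p in critical_paths) else 0.0
    return 0.5 * recency + 0.3 * detail + 0.2 * critical

recent = {"timestamp": time.time() - 86400, "message": "x" * 300,
          "files": ["src/core/limiter.py"]}
old = {"timestamp": time.time() - 5 * 365 * 86400, "message": "typo",
       "files": ["README.md"]}
print(priority(recent) > priority(old))
```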

Incremental updates. After initial indexing, the system tracks which commits are already indexed. When new commits arrive (you push, pull, merge), a post-commit hook or scheduled task runs bitloops index --incremental, which:

  1. Finds commits since the last indexed commit
  2. Processes only those new commits
  3. Updates the SQLite database and HNSW index
  4. Completes in seconds (since it's not re-processing history)

This keeps the memory store fresh without expensive full re-indexing.
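The incremental step hinges on the `last_indexed_commit` stored in the metadata table. A sketch, assuming the history list comes from `git rev-list --reverse HEAD`; the table shape follows the schema described later in this article.

```python
import sqlite3

def new_commits(db, history):
    """Return commits after the last indexed one. `history` is
    oldest-to-newest. If nothing was indexed yet, or the recorded
    commit vanished (history rewrite), fall back to a full pass."""
    row = db.execute(
        "SELECT last_indexed_commit FROM index_metadata").fetchone()
    if row is None or row[0] not in history:
        return history
    return history[history.index(row[0]) + 1:]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE index_metadata (last_indexed_commit TEXT)")
db.execute("INSERT INTO index_metadata VALUES ('b2')")
print(new_commits(db, ["a1", "b2", "c3", "d4"]))  # only c3, d4 are new
```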


Maintaining the Index: Post-Commit Hooks and Incremental Updates

The memory store must stay synchronized with the actual git history. This is handled through post-commit hooks and periodic updates.

Post-commit hook approach. After each commit, a git hook runs:

#!/bin/bash
# .git/hooks/post-commit
bitloops index --incremental --async

The --async flag means the hook doesn't block the commit (it spawns a background process). The --incremental flag means only the new commit is processed. This keeps latency down (developers don't wait) and keeps the index current.

Scheduled updates. For teams using a shared memory store, you might run:

# Run every 30 minutes
bitloops index --incremental

This polls for new commits and keeps the shared index fresh. If multiple developers commit simultaneously, the incremental indexing handles it (HNSW updates are append-only for new vectors; the algorithm is designed for concurrent updates).

Conflict resolution. Git rebase and force-push can cause issues: commits that were indexed get rewritten, the commit hash changes. The system handles this by detecting when indexed commits are no longer in the current history and marking them as archived (not deleted, archived, so the reasoning is preserved but not active). When old commits reappear (in another branch), they're un-archived. This preserves semantic memory across rebases without corrupting the index.

Storage growth. SQLite and HNSW indexes grow with commit history. A typical project accumulates ~1-2 MB per 1000 commits. A 10-year-old project with 50k commits might have a 50-100 MB memory store. This is cheap; most development machines have plenty of storage. If storage is a concern, the system can prune old vectors (keep structured metadata, archive embeddings), trading retrieval speed for storage.


Querying: How Agents Access Semantic Context

Once indexed, the memory store is queryable. Agents (and humans) can ask questions that retrieve relevant context.

Exact queries. Find commits by author, date range, file path, or commit message keywords:

SELECT c.* FROM commits c
JOIN code_references r ON r.commit_id = c.id
WHERE c.author = 'alice@example.com'
  AND c.timestamp > '2024-01-01'
  AND r.file_path LIKE '%auth%'
ORDER BY c.timestamp DESC;

Semantic queries. Find commits similar to a given question or topic:

Query: "How did we handle rate limiting?"
Result: [
  (commit abc123, message: "implement token bucket for rate limiting", similarity: 0.92),
  (commit def456, message: "discuss backpressure in review", similarity: 0.87),
  (commit ghi789, message: "refactor throttling logic", similarity: 0.85)
]

The system converts the query to a vector, searches HNSW for nearby vectors, and returns results ranked by similarity. This is fast (10-50ms for a million-vector index) and semantically meaningful.

Hybrid queries. Combine structure and semantics:

Find commits touching "rate limiter" files
that are semantically similar to "distributed throttling"
by developers on my team
in the last 6 months

The system narrows by structure (date, author, file), then ranks by semantic similarity. This combines the precision of exact queries with the flexibility of semantic search.
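The two stages can be sketched as SQL narrowing followed by a semantic re-rank. The tiny hand-written vectors here stand in for real embeddings, and the query vector is assumed to be pre-embedded; all data is illustrative.

```python
import sqlite3

# Stage 1: structural filter in SQL. Stage 2: semantic ranking in Python.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE commits (hash TEXT, author TEXT, ts TEXT, msg TEXT)")
db.executemany("INSERT INTO commits VALUES (?,?,?,?)", [
    ("a1", "alice", "2025-09-01", "token bucket rate limiter"),
    ("b2", "alice", "2025-10-01", "fix login redirect"),
    ("c3", "carol", "2025-11-01", "distributed throttling for API"),
])
vectors = {"a1": (0.9, 0.1), "b2": (0.1, 0.9), "c3": (0.8, 0.2)}
query_vec = (1.0, 0.0)  # "distributed throttling", already embedded

candidates = [r[0] for r in db.execute(
    "SELECT hash FROM commits WHERE ts > '2025-08-01' AND author = 'alice'")]
ranked = sorted(candidates, key=lambda h: -sum(
    q * v for q, v in zip(query_vec, vectors[h])))
print(ranked)  # carol's commit never enters the semantic stage
```

Filtering first keeps the vector search over a small candidate set, which is why hybrid queries stay fast even on large histories.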


Tradeoffs: Local-First vs Cloud-Hosted

Both approaches have merit. Understanding the tradeoffs helps you choose.

Local-first (SQLite + HNSW):

  • Pros: Privacy (no network), fast (local queries), offline (works anywhere), cost (no per-query fees), control (your data), simplicity (single file to back up)
  • Cons: Setup (you build the index), storage (your disk), updates (you manage incremental indexing), no built-in sharing (you coordinate team access)

Cloud-hosted (Pinecone, Weaviate, Qdrant, etc.):

  • Pros: Minimal setup (API key, start querying), sharing (multi-user by default), scaling (handled by provider), features (built-in filtering, metadata, analytics)
  • Cons: Privacy (data on their servers), cost (per-query or per-month fees), dependency (service outages), lock-in (API-specific)

Practical guidance:

  • Solo developer or small team, code is proprietary: Local-first. You avoid cloud costs and keep sensitive context private.
  • Large team, many repositories, need cross-team visibility: Cloud-hosted. The management overhead of maintaining local indexes for many teams is substantial, and the shared queryability is valuable.
  • Offline work is critical (traveling, remote areas): Local-first. Cloud-hosted requires network.
  • Rapid experimentation with the memory layer: Local-first. You can modify the indexing strategy or vector embeddings without API compatibility concerns.
  • Compliance-critical (healthcare, finance, government): Local-first by default. Some cloud providers offer compliance certifications, but local gives you auditable control.

Many teams use hybrid approaches: local-first as the default, with optional cloud sync for backup or team access. The architecture should support both.


Implementation Details: The Knowledge Store Schema

Here's what the local memory store actually looks like.

commits table:

id (integer, primary key)
hash (text, unique) — git commit hash
timestamp (datetime)
author (text)
message (text) — full commit message
summary_vector (blob) — embedding vector of the commit message
diff_vector (blob) — embedding vector of the code changes

code_references table:

id (integer, primary key)
commit_id (foreign key)
file_path (text)
function_name (text)
line_number (integer)
code_snippet (text) — context around the change
snippet_vector (blob)

annotations table:

id (integer, primary key)
commit_id (foreign key)
type (text) — 'comment', 'docstring', 'issue_reference', 'adr', etc.
content (text)
content_vector (blob)

index_metadata:

id (integer, primary key)
last_indexed_commit (text) — commit hash where indexing stopped
last_indexed_timestamp (datetime)
vector_count (integer)
embedding_model (text) — name and version of embedding model used
hnsw_params (text) — JSON of HNSW configuration

The vector columns store the raw embedding vectors; the HNSW index holds the graph structure that makes searching them fast. The metadata records when the index was built and with which embedding model, so model upgrades can be detected and handled.
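The four tables above translate directly into executable DDL. Column names follow the article; exact types and constraints in the real implementation may differ.

```python
import sqlite3

# The article's schema as SQLite DDL (illustrative types).
SCHEMA = """
CREATE TABLE commits (
    id INTEGER PRIMARY KEY,
    hash TEXT UNIQUE NOT NULL,
    timestamp DATETIME,
    author TEXT,
    message TEXT,
    summary_vector BLOB,
    diff_vector BLOB
);
CREATE TABLE code_references (
    id INTEGER PRIMARY KEY,
    commit_id INTEGER REFERENCES commits(id),
    file_path TEXT,
    function_name TEXT,
    line_number INTEGER,
    code_snippet TEXT,
    snippet_vector BLOB
);
CREATE TABLE annotations (
    id INTEGER PRIMARY KEY,
    commit_id INTEGER REFERENCES commits(id),
    type TEXT,
    content TEXT,
    content_vector BLOB
);
CREATE TABLE index_metadata (
    id INTEGER PRIMARY KEY,
    last_indexed_commit TEXT,
    last_indexed_timestamp DATETIME,
    vector_count INTEGER,
    embedding_model TEXT,
    hnsw_params TEXT
);
"""
db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
tables = [r[0] for r in db.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)
```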


Integration with Post-Commit Hooks and CI/CD

In practice, memory indexing fits into your development workflow.

Local development:

  • Developer commits code
  • Post-commit hook runs bitloops index --incremental --async
  • Indexing happens in background
  • Developer continues work
  • Next time they query for context, fresh data is available

Team workflow with shared memory:

  • Developer pushes to origin
  • CI/CD runs bitloops index --incremental on the server
  • Updated index is stored in shared location (S3, Git LFS, shared filesystem)
  • Team members periodically pull the latest index
  • All team members get the benefit of aggregated semantic memory

Large codebase with scheduled indexing:

  • Run bitloops index --full nightly (off-peak)
  • Incremental updates happen on commits
  • Nightly full reindex ensures consistency and handles any missed commits
  • If indexing ever gets out of sync, full reindex recovers

The goal is invisible synchronization: developers shouldn't think about the memory store, it should just be current and available.


AI-Native Perspective

Local-first memory is essential for practical AI agents working with code. Agents need fast, reliable access to context. Cloud-hosted memory introduces latency, cost, and privacy concerns that don't make sense for code-specific use cases. An agent analyzing code in a proprietary codebase shouldn't have to route that analysis through a third-party API.

Bitloops' local-first approach (SQLite + HNSW, repository-scoped, incremental updates) recognizes that code context is fundamentally different from general semantic search. It's structured (git history, file relationships), sensitive (proprietary code), and dynamic (constantly changing). A memory architecture designed for code starts with local-first as the default, not as an afterthought.


FAQ

What if my repository is huge (millions of commits)?

Initial indexing takes longer (hours instead of minutes), but it's still feasible. HNSW is designed for millions of vectors. The bottleneck is semantic encoding (generating embeddings), not vector indexing. You can optimize by sampling commits (index every Nth commit initially) or using a faster (less accurate) embedding model. Incremental updates remain fast regardless of total size.

How do I back up the memory store?

Copy the .bitloops directory. It's a self-contained directory with SQLite and HNSW files. You can tar it, push it to S3, or sync it with any backup tool. You might exclude very old indexed commits if storage is tight.

What if I delete a repository and later re-clone it?

If you clone from the same remote, the repo.hash matches and you can restore the memory store from your backup. If the repository has no remote and the fresh clone lives at a different local path, the hash differs and you start with an empty memory store. This is by design: hashing the remote URL or path keeps each repository's memory isolated, preventing context from being mixed across different projects.

Can I share the memory store with my team?

Yes. Store .bitloops in a shared location (S3, Dropbox, shared network drive) or check it into Git (though it's large). Each developer can have their own copy, or a single shared copy. If shared, use file-level locking to avoid concurrent writes to the HNSW index. Better: have a CI/CD process build the index and push updates to the shared location; developers pull periodically.

What embedding model should I use?

For code, popular choices are nomic-embed-text (small, fast, open-source), all-MiniLM-L6-v2 (versatile), or OpenAI's code-search-ada (an API-hosted model, so not local). The model choice trades off speed (smaller is faster), quality (larger can be more accurate), and privacy (open-source models run locally). Start with nomic-embed-text for a good balance.

Can I switch embedding models mid-way?

Yes, but you'll need to re-embed (re-generate vectors for all stored content). The system can handle this: mark vectors as using old model, re-index with new model, gradually replace old vectors with new ones. It's not instant (re-embedding takes time), but it's possible without losing data.

How does incremental indexing handle rebases or force-pushes?

Rebases rewrite commits, changing their hashes. The system detects this (indexed hashes are no longer in the git log) and archives the old data (marks it inactive but preserves it for reasoning). When commits reappear (in another branch or after a recovery), they're un-archived. Force-pushes are handled similarly. This preserves semantic memory across history rewrites without corrupting the index.

Can the local memory store get corrupted?

Unlikely. SQLite has strong ACID guarantees and recovers from crashes. HNSW is append-only in the active index; new data doesn't modify old vectors. If corruption does occur, you can rebuild from git history (re-index). A periodic integrity check (bitloops verify --index) scans for inconsistencies and reports them.


Primary Sources

  • HNSW — hierarchical navigable small world graphs for efficient local nearest-neighbor search.
  • Local-first Software — manifesto for building software that keeps user data local while enabling collaboration.
  • Pro Git Book — reference guide to Git internals and commit structure for understanding history storage.
  • FAISS — Meta's library for fast similarity search and clustering of embeddings at scale.
  • SQLite — self-contained SQL database for local persistent storage of memory and context.
  • Qdrant — vector database with HNSW indexing for semantic search over local memory.

Get Started with Bitloops.

Apply what you learn in these hubs to real AI-assisted delivery workflows with shared context, traceable reasoning, and architecture-aware engineering practices.

curl -sSL https://bitloops.com/install.sh | bash