Pre-Commit and CI Validation for AI Code: The Two-Stage Enforcement Pipeline
Validate early and often. Pre-commit hooks give agents fast feedback. CI catches what slipped through. Together, they form a governance pipeline that stops bad code before it reaches your repo.
The Two-Stage Problem
Your AI agent just generated a feature. You're excited to see it work. But how do you know if it violates your architectural constraints, naming conventions, or security requirements? Wait for CI to fail? That's slow. Ask the agent to regenerate until it passes? That's wasteful.
The answer is a two-stage validation pipeline. Stage one: pre-commit validation that runs locally, right after code generation—fast feedback, immediate fixes. Stage two: comprehensive CI validation that gates code before it reaches main—thorough, auditable, gated.
This is how you make AI code generation productive instead of just fast.
Why Two Stages?
You could use just CI validation—it's comprehensive and happens in a controlled environment. The problem: CI validation is slow. A full suite of architectural checks, security scans, and integration tests might take 5-15 minutes. If your AI agent generates a constraint violation, you wait 10 minutes to find out, then regenerate, and wait again. The feedback loop is glacially slow.
You could use just pre-commit validation—it's fast and local. The problem: you can't check everything locally. You can't run a full integration test suite before commit. You can't access the production database to verify certain constraints. And if you skip checks in pre-commit, violations make it to CI anyway.
Pre-commit and CI together give you the best of both:
Pre-commit: Fast feedback loop. You generate code, check it immediately (takes seconds), find problems, regenerate if needed. The agent can try multiple approaches and you validate each one interactively. It's like pair programming with instant feedback.
CI: Comprehensive validation. The full test suite, security scans, architectural analysis, integration checks—everything that matters but would be too slow to run on every developer's machine. CI is the gate that ensures nothing broken reaches main.
They're not redundant; they're complementary. Pre-commit gives you speed. CI gives you rigor.
What Pre-Commit Validation Checks
Pre-commit validation prioritizes speed. It should run in under 5 seconds on code that's just been generated. This means checking only rules that are fast to evaluate.
Naming conventions. "All domain events end with 'Event'. All queries implement IQuery. All validators end with 'Validator'." These are regex-fast checks. Pre-commit can validate this in milliseconds.
Syntax validation. The code compiles. No obvious syntax errors. This is usually free—you've probably already run a compiler or linter.
Obvious constraint violations. "Classes in the controller package cannot import from the db package." Pre-commit can check this by scanning imports in the generated code. It doesn't need to see the whole dependency graph; just the immediate imports in the new code.
Pattern detection. "Any code that writes to the audit log must use the AuditLogger class, not direct database writes." Pattern matching is fast. Pre-commit can look for database insert statements in audit-related code and flag them.
Missing required elements. "All public API endpoints must have a corresponding test." Does the new endpoint have a test? Pre-commit checks.
Security red flags. "No hardcoded credentials. No eval() calls. No SQL string concatenation." These are token-pattern checks. Pre-commit can catch them in seconds.
Custom rule violations. Whatever rules you've defined as "must not" for generated code. If you have a custom rule that says "billing operations must go through the BillingService," pre-commit can check that the generated code uses BillingService, not direct database calls.
What pre-commit validation doesn't do: it doesn't run the full test suite, doesn't scan the entire codebase for dependency cycles (those need the full graph), doesn't run security penetration testing, doesn't integrate with external services. It can't because it needs to be fast.
The key insight: pre-commit checks rules that are local to the generated code. CI checks rules that require system-wide context.
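The checks above can be sketched as a small script that runs only against the just-generated files. This is a minimal illustration, not a real tool: the rules, globs, and patterns are hypothetical examples of "local to the generated code" checks.

```python
import re
from pathlib import Path

# Illustrative pre-commit rules: (glob pattern, required class-name suffix).
NAMING_RULES = [
    ("domain/events/*.ts", "Event"),
    ("application/queries/*.ts", "Query"),
]

# Token-pattern security red flags (crude, fast, local).
FORBIDDEN_PATTERNS = [
    re.compile(r"eval\("),
    re.compile(r"password\s*=\s*['\"]"),  # naive hardcoded-credential check
]

def check_file(path: Path, text: str) -> list:
    """Run fast, file-local checks; return human-readable violations."""
    violations = []
    for glob, suffix in NAMING_RULES:
        if path.match(glob) and not path.stem.endswith(suffix):
            violations.append(f"{path}: name must end with '{suffix}'")
    for pat in FORBIDDEN_PATTERNS:
        if pat.search(text):
            violations.append(f"{path}: forbidden pattern {pat.pattern!r}")
    return violations
```

Because each check only reads the file in front of it, the whole pass stays in the milliseconds-to-seconds range regardless of codebase size.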
What CI Validation Checks
CI validation is comprehensive. It runs on the entire codebase and can take as long as needed because it gates the main branch. A developer commits code, CI runs (sometimes in parallel to save time), and if anything fails, the commit doesn't merge.
Full architectural validation. Now you can check the entire dependency graph. Are there circular dependencies? Does the new code introduce any? Does it violate module isolation across the entire system? Does it respect layer boundaries even when considering all existing code?
Cross-module impact analysis. The new code touches Module A. Does Module B, which depends on Module A, still work? Have you accidentally broken something you weren't thinking about?
Integration tests. Does the new feature actually work when integrated with the rest of the system? Are there race conditions? Database constraints? API contract violations?
Security scanning. Static analysis for common vulnerabilities. Dependency audits for known CVEs. Credential detection to ensure no secrets leaked. This is comprehensive and needs the full codebase context.
Performance regression analysis. Did the new code introduce a performance issue? CI can run benchmarks and compare to baseline.
Compliance validation. If you operate under compliance constraints (SOX, HIPAA, GDPR), CI can check that code meets those requirements.
Documentation validation. If code requires documentation (APIs, security-sensitive operations), does the generated code include it?
Coverage analysis. Is the new code sufficiently tested? CI knows the coverage baseline and can enforce a minimum.
The pattern: CI checks rules that require system-wide context or comprehensive testing. If pre-commit had to check all this, it would take minutes. CI can afford to take minutes because it's a gate, not a feedback loop.
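Circular-dependency detection is a good example of a check that needs the whole graph, which is why it belongs in CI rather than pre-commit. A minimal sketch, assuming the import graph has already been extracted from the repo (the graph itself is hypothetical):

```python
def find_cycle(graph):
    """Depth-first search over a module import graph.

    graph: dict mapping module name -> list of modules it imports.
    Returns one cycle as a list of modules (first == last), or None.
    """
    visiting, done = set(), set()

    def dfs(node, path):
        visiting.add(node)
        path.append(node)
        for dep in graph.get(node, []):
            if dep in visiting:
                # Found a back edge: slice out the cycle and close it.
                return path[path.index(dep):] + [dep]
            if dep not in done:
                found = dfs(dep, path)
                if found:
                    return found
        visiting.discard(node)
        done.add(node)
        path.pop()
        return None

    for node in graph:
        if node not in done:
            found = dfs(node, [])
            if found:
                return found
    return None
```

A pre-commit hook could not run this honestly: it would have to parse every module in the repository first, which is exactly the system-wide context CI is for.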
Configuring Pre-Commit Validation
Pre-commit validation is typically a script or set of scripts that run locally after code generation. The goal is immediate feedback.
A basic setup looks like:
# Pre-commit validators
validators:
  - name: naming_conventions
    type: regex
    rules:
      - pattern: "Event$"
        targets: "domain/events/*.ts"
      - pattern: "Query$"
        targets: "application/queries/*.ts"
    fail_on_mismatch: true
  - name: import_constraints
    type: static_analysis
    rules:
      - source: "controllers/"
        forbidden_targets: ["db/", "infrastructure/"]
    fail_on_violation: true
  - name: security_patterns
    type: pattern_matching
    rules:
      - forbidden_pattern: "hardcoded.*password"
      - forbidden_pattern: "eval\\("
    fail_on_match: true
  - name: required_tests
    type: file_existence
    rules:
      - if_file_matches: "features/*.ts"
        then_require_file: "features/*.test.ts"
    fail_if_missing: true

Configuration is declarative: define rules, specify which validators to run, and decide whether violations fail the whole validation or just warn. For AI-generated code, you typically fail on violations—if the agent generated something that violates a naming convention, it should regenerate.
The output should be specific. Not "validation failed" but:
FAIL: Naming convention violation in domain/events/UserCreated.ts
Rule: Domain events must end with 'Event'
Found: UserCreated (missing 'Event' suffix)
Fix: Rename to UserCreatedEvent
Context: Domain events are the contracts for cross-domain communication.
The naming convention makes this intent explicit.

The human (or AI) can see exactly what's wrong and why.
Configuring CI Validation
CI validation is more complex because it's comprehensive. It typically involves multiple stages in parallel.
# CI validation pipeline
stages:
  - name: quick_checks
    timeout: 1m
    parallel: true
    validators:
      - lint (syntax, naming)
      - type_checking (if applicable)
      - dependency_audit (known vulnerabilities)
  - name: architecture_validation
    timeout: 5m
    validators:
      - dependency_graph_analysis (circular deps, layer violations)
      - module_isolation_checks
      - architectural_pattern_enforcement
  - name: security_scanning
    timeout: 10m
    parallel: true
    validators:
      - static_security_analysis (SAST)
      - secret_detection
      - dependency_vulnerability_scanning (CVEs)
  - name: testing
    timeout: 20m
    parallel: true
    validators:
      - unit_tests (with coverage threshold)
      - integration_tests
      - contract_tests (if using API contracts)
  - name: quality_gates
    timeout: 5m
    validators:
      - coverage_analysis (minimum 70%)
      - performance_regression (maximum 5% regression)
      - documentation_completeness

Stages run in order. If any stage fails, the pipeline stops and the commit doesn't merge. Independent stages can run in parallel to save time (quick_checks and architecture_validation in parallel; testing can be parallel across multiple test suites).
The key configuration decision: what's a blocking failure vs. a warning? Coverage below 60%—blocking. Coverage between 60% and 70%—warning. A performance regression of 10%—blocking. A regression of 3%—warning. Where you draw these lines depends on your risk tolerance.
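A quality-gate policy like this is just a small decision function. The thresholds below (60% hard coverage floor, 70% target, 5% regression limit) are illustrative assumptions, not universal values:

```python
def evaluate_gates(coverage_pct, regression_pct):
    """Classify each metric as blocking, warning, or clean.

    Assumed policy: coverage below 60% blocks, 60-70% warns;
    a perf regression over 5% blocks, anything positive warns.
    """
    result = {"blocking": [], "warnings": []}
    if coverage_pct < 60:
        result["blocking"].append(f"coverage {coverage_pct}% below hard floor 60%")
    elif coverage_pct < 70:
        result["warnings"].append(f"coverage {coverage_pct}% below target 70%")
    if regression_pct > 5:
        result["blocking"].append(f"perf regression {regression_pct}% exceeds 5% limit")
    elif regression_pct > 0:
        result["warnings"].append(f"perf regression {regression_pct}% (under 5% limit)")
    result["merge_allowed"] = not result["blocking"]
    return result
```

The useful property is that warnings accumulate visibly without stopping the merge, so the team can tighten thresholds over time instead of guessing them up front.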
The Feedback Loop
This is the secret to AI code generation being productive instead of just fast: the feedback loop.
AI agent generates code → Pre-commit validation runs (5 seconds) → Violations surface with context → Agent or human understands why it failed → Regenerate or manually fix → Re-validate (5 seconds) → Passes → Commit → CI validation runs (20 minutes) → Comprehensive checks pass → Merge.
The pre-commit loop is tight: feedback in seconds, not minutes. The agent can try multiple approaches, and each one gets validated immediately. This makes the agent better—it learns your constraints through rapid iteration.
Compare to the alternative: AI agent generates code → You commit it → CI validation runs 20 minutes later → You find out it violates architecture → You regenerate → CI runs again → Repeat. This is slow and demoralizing.
Pre-commit makes the feedback loop tight enough that AI code generation becomes interactive.
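The loop above can be sketched as a small driver. `generate` and `validate` are hypothetical callables standing in for your agent and your pre-commit validators; nothing here is a specific tool's API:

```python
def generation_loop(generate, validate, max_attempts=3):
    """Run the agent until a candidate passes pre-commit validation.

    generate(feedback) -> code string; feedback is None on the first try,
    otherwise the violations from the previous attempt.
    validate(code) -> list of violation strings (empty means pass).
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        code = generate(feedback)      # violations from the last try guide this one
        violations = validate(code)    # fast, local pre-commit checks
        if not violations:
            return code, attempt
        feedback = violations
    raise RuntimeError(f"no passing candidate after {max_attempts} attempts")
```

Because each iteration costs seconds rather than a CI run, the agent can afford several attempts and still beat a single commit-and-wait cycle.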
Practical Example: A Validation Pipeline for Order Management Feature
Your AI agent is implementing a new order cancellation feature. Here's what validation catches at each stage.
Pre-commit (agent generates OrderCancellation use case):
PASS: Naming convention (ends with 'UseCase')
PASS: Import constraints (doesn't import from controllers or infrastructure)
FAIL: Missing test file
Expected: order-cancellation.use-case.test.ts
Context: All use cases must have tests to ensure domain logic is correct.
Fix: Add test file
Agent regenerates, creates the test file.
PASS: All pre-commit checks
Code committed locally.

CI stage 1 - Architecture validation:
PASS: Dependency graph analysis
PASS: Module isolation (order module doesn't import from billing module)
FAIL: Layer violation detected
OrderCancellation -> OrderRepository -> OrderDataSource
But OrderCancellation is in application layer, OrderDataSource is infrastructure.
Violation: Application layer accessing infrastructure layer directly.
Context: Application layer must access infrastructure through repository interfaces,
not concrete implementations.
Fix: OrderCancellation should depend on OrderRepository interface, not concrete implementation.
This is likely an import statement issue—check that you're using the interface, not the class.

Code doesn't merge. The developer (or the AI, if rerunning) sees the specific issue and fixes it.
CI stage 2 - Security scanning:
PASS: No hardcoded credentials
PASS: No SQL injection vulnerabilities
PASS: No unencrypted sensitive data

CI stage 3 - Testing:
PASS: Unit tests (82% coverage)
PASS: Integration tests with database
FAIL: Contract test with PaymentService
OrderCancellationEvent emitted with amount: -50 (negative cancellation amount)
PaymentService expects amount >= 0
Context: Financial amounts must never be negative. This is a domain invariant.
Fix: Ensure OrderCancellation always emits events with non-negative amounts.
Check that the use case correctly calculates the refund amount.

Again, specific context. The developer knows exactly what's wrong, why it's wrong, and which domain rule was violated.
This is the validation pipeline in action: catching issues layer by layer, providing specific feedback at each stage, preventing bad code from reaching main.
Why Both Stages Are Necessary
Some teams skip pre-commit and run only CI. "If it's all checked in CI, why do we need pre-commit?"
The answer is feedback velocity. If you're waiting 20 minutes for CI feedback on code you just generated, you're not going to iterate quickly. You'll generate code, commit it, and either work on something else while CI runs or just move on. This means you're not catching issues early.
With pre-commit, you catch issues in seconds. The feedback is so fast that you iterate interactively. You generate code, see an issue, regenerate, validate, and commit—all in the span of a few minutes. This is fundamentally more productive.
Some teams skip pre-commit and just do careful code review. "If humans review it, do we need validators?"
The answer is consistency and scale. A human can miss a constraint violation if they're reviewing 50 pull requests. A validator never misses it. And with AI agents, you might have dozens or hundreds of AI-generated code outputs per day. Code review doesn't scale. Validators do.
The right answer: use both. Pre-commit for speed and interactivity. CI for rigor and scale.
Integrating with Your AI Agent Workflow
If you're using an AI agent to generate code, the validation pipeline becomes part of the generation workflow.
The agent generates code → Validators run → If pass, output code; if fail, surface violations to agent or user → If agent, agent regenerates and re-validates; if user, user fixes and re-validates.
This creates a feedback loop that improves code quality before it enters the repository. The agent learns your constraints through repeated validation failures and successes. Over time, violations become rarer due to the compounding quality improvement loop, where each caught violation becomes context for future generations.
Tools like Bitloops integrate validation directly into the generation workflow: constraints are first-class, violations surface with context about why the rule exists, and the agent can be directed to regenerate code that violates constraints. This turns AI code generation from "generate something quick" into "generate something that fits our architecture."
The practical upside: you get faster code generation (because constraints prevent bad directions early) and better code quality (because every output is validated before commit).
FAQ
What if pre-commit validation is taking too long?
You're probably checking too much in pre-commit. Move slower checks to CI. Pre-commit should be <5 seconds. If it's taking longer, you're defeating the purpose—fast feedback.
Can pre-commit validators run without a full build?
Yes. Pre-commit validators should be lightweight: regex checks, static analysis of the generated file, pattern matching. They shouldn't require compiling the entire codebase or running the build system.
What if a pre-commit check is incorrect or too strict?
You can disable it for specific cases, but track those exceptions. If you're disabling a check frequently, that's a sign the check is too strict or doesn't match your actual needs. Review and adjust.
How do we handle pre-commit validation for AI agents that don't have a human running validation?
The validation becomes part of the agent's generation loop. Generate code → run validators → if fail, regenerate; if pass, output. This turns the agent into a feedback loop that improves code quality autonomously. The tradeoff: the agent might need multiple attempts to generate passing code, but the code it outputs is vetted.
What if CI validation finds something pre-commit missed?
That's valuable information. It means your pre-commit validators don't cover something important. Either make pre-commit more comprehensive (if it's still fast) or accept that CI will catch it. The goal is to catch obvious violations early; CI catches subtle ones.
Can we parallelize CI validation stages to save time?
Absolutely. Run quick_checks, architecture_validation, and security_scanning in parallel—they don't depend on each other. Testing might depend on architecture validation passing (if you're running integration tests), so order that correctly. Parallel stages can cut total validation time significantly.
What's the minimum set of validations we should run?
Pre-commit: naming conventions, obvious constraint violations, security red flags. CI: full architecture analysis, test coverage, security scanning. You can add more, but these are the floor. Anything less and you're not really validating.
How do we avoid "validation theater"—running validators that nobody pays attention to?
Keep validators focused on rules you actually care about enforcing. Remove validators that don't catch real issues. Make violations obvious and specific. If developers are ignoring validation failures, it's either because the validator is wrong or the rule doesn't matter—fix one or remove the validator.