
Spec-Driven Development

When AI agents can generate thousands of lines of code in seconds, the question is no longer “can we build it?” but “should we build it, and how exactly should it work?” SpecDD puts specification at the center of the SDLC.

Why Now

The bottleneck has moved.

The Agentic Shift

AI coding assistants and agent swarms have fundamentally changed software delivery. Code generation is nearly free. What's expensive now is clarity of intent: knowing exactly what to build, how it should behave, and what constraints it must respect.

Without clear specifications, AI agents produce plausible-looking code that drifts from business requirements. Teams waste cycles on rework. The spec becomes the control plane for agentic engineering.

The New Bottleneck

1. Before AI: Implementation was the bottleneck

Writing code took weeks. Reviews took days. Deployment was manual.

2. After AI: Specification is the bottleneck

Code can be generated in minutes. The constraint is knowing what to generate.

Principles

Six ideas the framework rests on.

01 · Think Before You Build

Writing a spec forces structured thinking about requirements, constraints, edge cases, and acceptance criteria before any code is written. The cheapest time to find a design flaw is on paper.

02 · Shared Understanding

A spec is a contract between author, team, reviewers, AI agents, and stakeholders. It ensures everyone — human and machine — is building toward the same outcome.

03 · End-to-End Traceability

Every code change traces back to a spec, which traces back to a ticket, which traces back to a business objective. This chain is auditable and supports compliance.

04 · Review Before Implementation

Reviewing a spec is 10–50x cheaper than reviewing implemented code. Catching a wrong approach in a spec takes minutes; catching it in production takes days.

05 · Living Documentation

Specs are checked into the repository alongside the code they describe. They evolve with the codebase. They are the single source of truth for why a system works the way it does.

06 · Agent-Ready Artifacts

Well-written specs become prompts for AI agents. Clear requirements with RFC 2119 language (SHALL, SHOULD, MAY) give agents unambiguous instructions to work from.

Maturity Model

Where is your org on the curve?

Orgs move through five levels. Most Series A–C teams land between Emerging and Defined. The first audit I run in a fractional CTO engagement places you on this curve — and names the next level's specific indicators.

Level 2 · Defined

Standard established. Templates exist. Specs required for major changes.

Key Indicators

  • Spec directory in repos
  • Templates available
  • >50% of feature MRs have specs
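The MR-coverage indicator can be checked mechanically against your MR descriptions. A minimal sketch — the `specs/*.md` link convention here is an assumption, not a standard; adjust the pattern to whatever convention your team adopts:

```python
import re

# Hypothetical convention: an MR description references its spec with a
# repo-relative path like `specs/oauth-login.md`.
SPEC_LINK = re.compile(r"\bspecs?/[\w./-]+\.md\b")

def spec_coverage(mr_descriptions: list[str]) -> float:
    """Fraction of merge requests whose description links a spec file."""
    if not mr_descriptions:
        return 0.0
    with_spec = sum(1 for d in mr_descriptions if SPEC_LINK.search(d))
    return with_spec / len(mr_descriptions)

mrs = [
    "Add OAuth login. Spec: specs/oauth-login.md",
    "Fix typo in README",
    "New search endpoint per specs/search-api.md",
    "Bump lodash to 4.17.21",
]
print(f"{spec_coverage(mrs):.0%}")  # prints "50%" — 2 of 4 MRs link a spec
```

Feed it the last quarter's merged MRs and you have a trend line for the >50% indicator.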

The 6-Week Pilot

Proven path from Emerging to Defined

The fastest way to move up a level without boiling the ocean. I've run this playbook across teams from 10 to 200+ engineers.

Weeks 1–2 · Foundation

  • Select 1–2 willing teams
  • Introduce standard and templates
  • Write specs for in-flight features

Weeks 3–4 · Iteration

  • First spec reviews
  • Gather feedback on templates
  • Iterate on process

Weeks 5–6 · Assessment

  • Measure rework rates and review times
  • Compare against baseline
  • Present findings to leadership

The Framework

Everything you need to roll this out.

Not every change needs a full specification. Match effort to risk and complexity. The goal is intentionality, not bureaucracy.

REQUIRED · Always write a spec
  • New feature or capability
  • API contract changes (new or modified endpoints)
  • Database schema migrations
  • Architecture or infrastructure changes
  • Security-sensitive changes (auth, encryption, access control)
  • Cross-team integrations

OPTIONAL · Use judgment
  • Bug fixes (isolated, well-understood) — ticket description may suffice
  • Configuration changes (feature flags, thresholds)
  • Dependency updates (unless breaking changes)
  • Cosmetic/copy changes

SPIKE FIRST · Investigate before specifying

If you're unsure about technical feasibility, write a spike ticket first. Spikes produce a findings document, not a spec. The spec comes after the spike confirms the approach is viable.

Rule of thumb: If you're unsure whether a change requires a spec, err on the side of writing one. A lightweight spec takes 30 minutes. Rework from a misunderstood requirement takes days.

A spec that meets this standard must include specific elements. Specs missing required elements should not be approved.

Required

Ticket Reference

Every spec links to exactly one work item. One-to-one mapping.

Context / Background

Why does this change exist? What problem does it solve?

Requirements

Numbered, testable requirements using RFC 2119 language (SHALL, SHOULD, MAY).

Technical Design

How will this be implemented? Architecture decisions, data models, API contracts.

Acceptance Criteria

Concrete, measurable criteria for completion. Checkboxes that can be verified.

Out of Scope

Explicitly states what this spec does not cover. Prevents scope creep.

Conditional

Dependencies

Required if work depends on other teams or pending work.

Migration / Rollback Plan

Required for database migrations, infrastructure changes.

Security Considerations

Required for changes touching auth, data access, or PII.

Performance Considerations

Required when change may affect latency or throughput.

Alternatives Considered

Recommended. What other approaches were evaluated?

Open Questions

Recommended. Unresolved questions needing input during review.

Requirements should be unambiguous enough that any engineer — or AI agent — would implement them the same way. Use RFC 2119 language for precision.

| Keyword | Meaning | When to Use |
| --- | --- | --- |
| SHALL | Absolute requirement. Non-negotiable. | Core functionality. If this doesn't work, the feature is broken. |
| SHALL NOT | Absolute prohibition. | Security constraints, compliance requirements. |
| SHOULD | Recommended but not blocking. | Best practices, UX improvements that enhance but aren't critical. |
| SHOULD NOT | Discouraged but not prohibited. | Known anti-patterns, legacy approaches. |
| MAY | Optional. Implementation team decides. | Nice-to-haves, future enhancements, optimizations. |

Bad Requirements

  • “The system should be fast”
  • “Handle errors appropriately”
  • “Support authentication”
  • “Be secure”

Good Requirements

  • “The system SHALL return search results within 200ms at p95”
  • “The system SHALL return HTTP 400 with error code INVALID_INPUT when validation fails”
  • “The system SHALL authenticate requests using OAuth2 Bearer tokens”
  • “The system SHALL NOT include user PII in log output”

The Test: If two different engineers (or AI agents) would implement the requirement differently based on their personal interpretation, the requirement is not specific enough.
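Part of that test can be automated. A rough lint sketch — the vague-word list and the RFC 2119 keyword check are illustrative assumptions, not an established tool:

```python
import re

# Flags requirement lines that use vague wording or lack an RFC 2119
# keyword. Extend the word lists for your domain.
RFC2119 = re.compile(r"\b(SHALL NOT|SHALL|SHOULD NOT|SHOULD|MAY)\b")
VAGUE = re.compile(r"\b(fast|appropriately|secure|user[- ]friendly|robust)\b", re.I)

def lint_requirement(req: str) -> list[str]:
    """Return a list of problems; an empty list means the line passes."""
    problems = []
    if not RFC2119.search(req):
        problems.append("no RFC 2119 keyword (SHALL/SHOULD/MAY)")
    vague = VAGUE.search(req)
    if vague:
        problems.append(f"vague wording: {vague.group(0)!r}")
    return problems

print(lint_requirement("The system should be fast"))  # both problems flagged
print(lint_requirement("The system SHALL return search results within 200ms at p95"))  # []
```

A check like this runs cheaply in CI against any file under your spec directory; it catches the "Bad Requirements" above while letting the "Good Requirements" pass.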

Every significant change should be traceable through five links. When something breaks in production, traceability lets you answer: What changed? Why was it changed? Who approved it?

1. Work Item

Ticket in your project tracker

Contains: Business objective, user story, priority, acceptance criteria

Links to: Specification (forward link)

2. Specification

Spec file in repository

Contains: Technical design, requirements, constraints, out of scope

Links to: Ticket (back), Merge Request (forward)

3. Merge Request

PR/MR in your VCS

Contains: Code changes, test additions, review comments, approvals

Links to: Spec (back), Ticket (back), CI Pipeline (forward)

4. CI Pipeline

Build/test pipeline

Contains: Test results, security scan, build artifacts, coverage

Links to: MR (back), Deployment (forward)

5. Deployment

Production release

Contains: Release artifact, environment, timestamp, rollback capability

Links to: CI Pipeline (back), Ticket closing

Bidirectional: Every link connects forward (to the next stage) and backward (to the previous stage). This means you can trace from a ticket to its deployed code, or from a production change back to the business requirement that drove it.
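The five-stage chain can be sketched as a doubly linked structure; in practice the links live in your tracker, repo, and CI system, and the IDs below are hypothetical placeholders:

```python
from dataclasses import dataclass

# Minimal model of the traceability chain: forward and backward pointers
# between consecutive stages.
@dataclass
class Node:
    kind: str            # "ticket", "spec", "mr", "pipeline", "deploy"
    ref: str             # e.g. "PROJ-142" or "specs/oauth-login.md"
    next: "Node | None" = None
    prev: "Node | None" = None

def link(*nodes: Node) -> None:
    """Wire forward and backward pointers between consecutive stages."""
    for a, b in zip(nodes, nodes[1:]):
        a.next, b.prev = b, a

ticket = Node("ticket", "PROJ-142")
spec = Node("spec", "specs/oauth-login.md")
mr = Node("mr", "!318")
pipeline = Node("pipeline", "ci/9917")
deploy = Node("deploy", "release 2024.06.1")
link(ticket, spec, mr, pipeline, deploy)

# Forward: from business requirement to production.
n = ticket
while n:
    print(n.kind, n.ref)
    n = n.next

# Backward: from a production change to the ticket that drove it.
assert deploy.prev.prev.prev.prev is ticket
```

The asserts at the end are the whole point: a broken back-link means a production change you cannot explain.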

Anti-Patterns

The Novel

10+ pages for a simple feature

Match spec length to complexity. A simple CRUD endpoint needs 1–2 pages. If your spec is longer than the code will be, simplify.

The Handwave

"Implement per standard patterns"

Name the specific files, functions, or patterns to follow. Link to them. Don't assume the reader knows what you consider 'standard'.

The Wishlist

50 requirements, no prioritization

Use SHALL/SHOULD/MAY to prioritize. If you have 50 SHALL requirements, your scope is too large — split the spec.

The Solution-First Spec

Describes implementation, skips the problem

Always lead with Context. If your reader doesn't understand why this change exists, they can't evaluate whether your solution is appropriate.

The Clone

Copy-paste from work item ticket

The ticket is the what. The spec is the how. The spec adds technical design, constraints, acceptance criteria, and edge cases that don't belong in a ticket.

The Orphan

Spec exists but nobody references it

Link the spec from your ticket. Reference it in MR descriptions. Keep it updated during implementation. A spec nobody reads is a spec nobody benefits from.

The Fossil

Approved spec never updated when requirements changed

When requirements change, update the spec first, then the code. The spec is the source of truth. Stale specs are worse than no specs.

Missing Out of Scope

No boundaries defined

Always include Out of Scope. It's the most underrated section. It prevents the #1 cause of spec/implementation misalignment: unspoken assumptions about boundaries.

When AI agents are part of your delivery workflow, specs become even more critical. They serve as the instruction set for agent orchestration.

Specs as Prompts

A well-written spec with RFC 2119 requirements becomes a natural prompt for code generation agents. The spec's Requirements section tells the agent exactly what to build. The Technical Design section provides implementation constraints. Out of Scope prevents the agent from over-engineering.
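One way to sketch this, assuming specs use `## ` markdown headings as the sample spec in this playbook does; the section names and prompt wording are illustrative, not a standard:

```python
import re

def sections(spec_md: str) -> dict[str, str]:
    """Split a markdown spec into {heading: body} by its `## ` headings."""
    parts = re.split(r"^## +(.+)$", spec_md, flags=re.M)
    # parts = [preamble, heading, body, heading, body, ...]
    return {h.strip(): b.strip() for h, b in zip(parts[1::2], parts[2::2])}

def agent_prompt(spec_md: str) -> str:
    """Assemble a code-generation prompt from the spec's own sections."""
    s = sections(spec_md)
    return (
        "Implement exactly the following requirements:\n"
        f"{s.get('Requirements', '')}\n\n"
        "Respect these constraints:\n"
        f"{s.get('Technical Design', '')}\n\n"
        "Do NOT implement anything listed here:\n"
        f"{s.get('Out of Scope', '')}"
    )

spec = """## Requirements
REQ-1: The system SHALL support Google OAuth 2.0 authentication.

## Out of Scope
- Enterprise SSO / SAML
"""
print(agent_prompt(spec))
```

Note that Out of Scope is wired in as a negative instruction: the section that stops humans from scope-creeping also stops agents from over-engineering.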

Guardrails for Agents

Specs define boundaries. An agent without a spec will hallucinate requirements, invent edge cases, and add features nobody asked for. The spec is the governor that keeps agent output aligned with actual business needs.

Verifiable Output

The Acceptance Criteria section becomes a checklist for validating agent-generated code. Each criterion should be independently verifiable. Automated tests can be generated directly from the spec's requirements.
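For example, the requirement "The system SHALL NOT include user PII in log output" becomes a directly executable check. A sketch with a hypothetical redacting filter — the filter and logger names are illustrative:

```python
import io
import logging
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

class RedactPII(logging.Filter):
    """Replace email addresses in log messages before they are emitted."""
    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = EMAIL.sub("[redacted]", str(record.msg))
        return True

buf = io.StringIO()
logger = logging.getLogger("app")
logger.addHandler(logging.StreamHandler(buf))
logger.addFilter(RedactPII())
logger.propagate = False
logger.warning("login failed for alice@example.com")

# The acceptance criterion, expressed as assertions:
assert "alice@example.com" not in buf.getvalue()
assert "[redacted]" in buf.getvalue()
```

Each SHALL / SHALL NOT requirement should map to at least one assertion like this; agent-generated code either passes the suite or goes back for another pass.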

Human-in-the-Loop

Spec review remains human. Even when agents write code, humans approve the design. The spec review is the control point where engineering judgment is applied. Implementation can be delegated; design ownership cannot.

The Agentic SDLC

In an AI-augmented workflow, the software development lifecycle shifts:

Human writes spec → Human reviews spec → Agent generates code → Human reviews code → Deploy

Not every spec needs every section. Match effort to risk and complexity.

| Complexity | Spec Time | Sections | Example |
| --- | --- | --- | --- |
| Small | 30–60 min | Context, Requirements, Acceptance, Out of Scope | New utility function, simple endpoint |
| Medium | 1–2 hours | All required + relevant conditional | New feature module, API with multiple endpoints |
| Large | 2–4 hours | All sections including alternatives, migration | New service, architecture change |
| Complex | 4–8 hours | Full spec with sub-specs per component | Platform migration, cross-service integration |

Sample Specifications

Templates you can steal today.

Three production-grade specs from real engagements — a feature, an API, and a migration. Copy them, strip the specifics, and drop them into your repo.

Feature Spec

New user authentication flow with OAuth support

# Feature Spec: OAuth 2.0 Authentication

## Context / Background
The application currently supports only email/password authentication. Users have requested social login options to reduce friction during signup. This spec covers adding Google and GitHub OAuth providers.

## Requirements

**REQ-1:** The system SHALL support Google OAuth 2.0 authentication.
**REQ-2:** The system SHALL support GitHub OAuth 2.0 authentication.
**REQ-3:** The system SHALL link OAuth accounts to existing users if email matches.
**REQ-4:** The system SHALL NOT allow duplicate accounts with the same email.
**REQ-5:** Users SHOULD be able to unlink OAuth providers from settings.
**REQ-6:** The login page MAY display provider-specific branding.

## Technical Design

### Authentication Flow
1. User clicks "Sign in with Google/GitHub"
2. Redirect to provider's OAuth consent screen
3. Provider redirects back with authorization code
4. Backend exchanges code for access token
5. Fetch user profile from provider
6. Create or link user account
7. Issue session token

### Database Changes
- Add `oauth_providers` table:
  - `id` (uuid, primary key)
  - `user_id` (uuid, foreign key to users)
  - `provider` (enum: 'google', 'github')
  - `provider_user_id` (string)
  - `created_at` (timestamp)

### API Endpoints
- `GET /auth/oauth/:provider` - Initiate OAuth flow
- `GET /auth/oauth/:provider/callback` - Handle OAuth callback
- `DELETE /auth/oauth/:provider` - Unlink provider

## Acceptance Criteria
- [ ] User can sign up with Google
- [ ] User can sign up with GitHub
- [ ] Existing user with matching email is linked, not duplicated
- [ ] User can unlink provider if they have password set
- [ ] Error shown if unlinking would leave account with no auth method

## Out of Scope
- Apple Sign In (planned for future iteration)
- Enterprise SSO / SAML
- Multi-factor authentication for OAuth accounts

## Security Considerations
- Store only provider user ID, not access tokens
- Validate OAuth state parameter to prevent CSRF
- Rate limit OAuth callback endpoint
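The spec's "validate OAuth state parameter" item can be sketched as follows. This is a minimal illustration under assumed names, not part of the sample spec; in a real app the state would live in the server-side session store:

```python
import hmac
import secrets

session: dict[str, str] = {}  # stand-in for a per-user session store

def begin_oauth() -> str:
    """Generate an unguessable state and remember it for the callback."""
    state = secrets.token_urlsafe(32)
    session["oauth_state"] = state
    return state  # embedded in the redirect URL to the provider

def handle_callback(returned_state: str) -> bool:
    """Reject the callback unless state matches (constant-time compare)."""
    expected = session.pop("oauth_state", "")
    return bool(expected) and hmac.compare_digest(expected, returned_state)

state = begin_oauth()
assert handle_callback(state) is True    # legitimate callback
assert handle_callback(state) is False   # replay: state was consumed
```

Popping the stored state on first use gives single-use semantics for free, which is exactly what the CSRF protection in the spec calls for.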


Implement SpecDD

Ready to bring intentionality to your engineering org?

I help teams implement Spec-Driven Development as part of their AI transformation. From framework design to team training.