Stage 6: AI Project Practice and Production Deployment

12 core principles for building production-ready LLM-powered software, focusing on the key engineering challenges of taking an AI application from prototype to production.

12 Factor Agents - 12 Principles for Building Reliable LLM Applications

Overview

12 Factor Agents is a principled methodology for building production-grade LLM-powered software, developed by Dex Horthy and the HumanLayer team.

Background and Motivation

Why are these principles needed?

After speaking with over 100 SaaS builders (primarily technical founders), the authors observed that most AI Agent development encounters a "70-80% quality bottleneck":

  1. Rapid Initial Progress: Quickly reaching 70-80% functionality using existing frameworks.
  2. Quality Bottleneck: Discovering that 80% is not good enough for customer-facing features.
  3. Framework Limitations: Breaking past 80% requires reverse-engineering the framework's prompts, flow control, and internals.
  4. Starting from Scratch: Ultimately having to abandon the framework and start over.

Core Insights

  • Most successful production-grade "AI Agents" are not truly that "agentic."
  • They are primarily well-engineered software systems that cleverly leverage LLMs at key points.
  • The fastest way to get high-quality AI software to customers is to adopt small, modular concepts and integrate them into existing products.

12 Core Principles

Factor 1: Natural Language to Tool Calls

The core superpower of LLMs is converting natural language into structured data (JSON). All Agent operations should begin with this fundamental conversion.

// Core Pattern
const response = await llm.generateToolCall(userMessage, availableTools);
// response is structured JSON, not free-form text
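
For instance, a user message like "deploy the backend to staging" might come back as JSON of roughly this shape (the tool and field names are illustrative, not a fixed schema):

{
  "tool": "deploy_service",
  "args": { "service": "backend", "environment": "staging" }
}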

Factor 2: Own your prompts

Treat prompts as critical, version-controlled code assets, not one-off inputs.

  • Manage prompts within your application codebase.
  • Support systematic testing, deployment, and rollback.
  • Ensure consistency, reproducibility, and controlled evolution.
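
A minimal sketch of the idea, with the prompt as an ordinary, testable function in the repo (the file name, version constant, and wording are illustrative):

// prompts/summarize.ts — prompts live in the codebase, reviewed and versioned like code
export const SUMMARIZE_PROMPT_VERSION = "v3";

export function summarizePrompt(email: string): string {
  return [
    "You are an assistant that summarizes emails in two sentences.",
    "Email:",
    email,
  ].join("\n");
}

// Testable like any other function:
// expect(summarizePrompt("...")).toContain("two sentences");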

Factor 3: Own your context window

Actively and intentionally manage the LLM's context window.

  • Ensure the context window only contains information directly relevant to the current task.
  • Prune unnecessary data to prevent "context drift."
  • Improve LLM performance and reduce token usage.
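
One hedged illustration of deliberate context management: keep only the newest messages that fit a token budget. The Message type and the naive token estimate are stand-ins, not part of the methodology.

interface Message { role: string; content: string }

// Rough token estimate; use a real tokenizer in practice.
const countTokens = (m: Message) => Math.ceil(m.content.length / 4);

// Keep the newest messages that fit the budget, preserving their order.
function buildContext(history: Message[], budget: number): Message[] {
  const kept: Message[] = [];
  let used = 0;
  for (const msg of [...history].reverse()) {
    const cost = countTokens(msg);
    if (used + cost > budget) break;
    kept.unshift(msg);
    used += cost;
  }
  return kept;
}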

Factor 4: Tools are just structured outputs

Treat "tools" as structured JSON outputs from the LLM, and validate them.

// A tool call is just JSON validated against the tool's schema
const toolCall = validateToolCall(llmOutput, toolSchema); // reject or repair on mismatch
const result = await executeValidatedTool(toolCall);      // only validated calls run

Factor 5: Unify execution state and business state

Ensure the LLM's internal "execution state" remains consistent with the application's actual "business state."

  • Prevent the LLM from operating based on outdated or incorrect information.
  • Avoid hallucinations or invalid operations.
  • Persist execution state alongside business state.
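
As a sketch of what "unify" can mean in practice, one serializable record can hold both the execution trace and the business data, so they are saved and loaded together and cannot drift apart (all field names here are illustrative):

// One record for both kinds of state, persisted atomically.
interface AgentState {
  threadId: string;
  events: Array<{ type: string; payload: unknown }>;                  // execution state
  order: { id: string; status: "pending" | "approved" | "shipped" };  // business state
}

async function saveState(
  db: { put(key: string, value: string): Promise<void> },
  state: AgentState,
): Promise<void> {
  await db.put(state.threadId, JSON.stringify(state)); // one write, one source of truth
}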

Factor 6: Launch/Pause/Resume with simple APIs

Design LLM agents with clear programmatic interfaces that support lifecycle management.

interface AgentControl {
  launch(config: AgentConfig): Promise<AgentInstance>;
  pause(instanceId: string): Promise<void>;
  resume(instanceId: string): Promise<void>;
}
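
Pause and resume stay this simple precisely because of Factor 12: pausing is persisting the current state, and resuming is feeding that state back into the next step.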

Factor 7: Contact humans with tool calls

Handle human interaction as first-class tool calls.

  • Route high-risk steps for human review.
  • Human feedback returns to the system as structured input.
  • Seamless human-agent collaboration workflows.
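
One common shape for this, using a JSON-Schema-style tool definition (the tool name and fields are illustrative):

// "Ask a human" is just another tool the model can call.
const requestHumanApproval = {
  name: "request_human_approval",
  description: "Pause and ask a human to approve a high-risk action",
  parameters: {
    type: "object",
    properties: {
      action: { type: "string" },
      reason: { type: "string" },
    },
    required: ["action", "reason"],
  },
};

// The human's answer re-enters the loop as a structured tool result, e.g.
// { tool: "request_human_approval", result: { approved: true, comment: "ok" } }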

Factor 8: Own your control flow

Keep control flow in plain code or your own workflow engine.

  • Run explicit OODA loops (Observe-Orient-Decide-Act).
  • Use convergence heuristics instead of nested prompts.
  • Avoid letting the LLM manage complex control flow.
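
A sketch of what owning the loop looks like: the while loop and the switch live in plain TypeScript, and the LLM (stood in for here by an injected decideNextStep function) only picks the next step.

type Step =
  | { intent: "fetch_data"; args: { url: string } }
  | { intent: "request_human_approval"; args: { reason: string } }
  | { intent: "done" };

async function runAgent(
  state: string[],
  decideNextStep: (state: string[]) => Promise<Step>, // your LLM call, returning JSON
): Promise<string[]> {
  while (true) {
    const next = await decideNextStep(state);
    switch (next.intent) {
      case "fetch_data":
        state = [...state, `fetched ${next.args.url}`]; // deterministic side effect
        break;
      case "request_human_approval":
        return [...state, `paused: ${next.args.reason}`]; // exit now, resume later (Factor 6)
      case "done":
        return state;
    }
  }
}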

Factor 9: Compact Errors into Context Window

Compress error information into the next prompt to close the feedback loop.

  • Enable the LLM to learn and recover from errors.
  • Provide structured error information.
  • Support self-healing capabilities.
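
A minimal sketch of the recovery loop (the retry limit and message format are illustrative): compact the failure into the context and try again, escalating when retries run out.

// On failure, compress the error into context and let the model retry.
async function executeWithRecovery(
  call: () => Promise<string>,
  context: string[],
  maxAttempts = 3,
): Promise<string> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Keep what the model needs (the message), not a full stack trace.
      const msg = err instanceof Error ? err.message : String(err);
      context.push(`attempt ${attempt} failed: ${msg}`);
    }
  }
  throw new Error("retries exhausted; escalate to a human (Factor 7)");
}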

Factor 10: Small, Focused Agents

Build small, single-purpose agents rather than large, monolithic chatbots.

// Good practice: Focused, small agents
class EmailSummaryAgent {
  async summarize(email: Email): Promise<Summary> { /* ... */ }
}

class PriorityClassificationAgent {
  async classify(email: Email): Promise<Priority> { /* ... */ }
}

Factor 11: Trigger from anywhere, meet users where they are

Trigger agents from where users are already working: CLI, webhooks, cron, etc.

  • Integrate into existing workflows.
  • Multiple triggering mechanisms.
  • User-friendly access points.
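
A sketch of the pattern, assuming a Node.js runtime: every channel is a thin adapter around one shared entrypoint, so the agent behaves identically however it is triggered.

// One core entrypoint; CLI, cron, webhooks, etc. are thin adapters around it.
async function handleTrigger(source: "cli" | "cron" | "webhook", input: string) {
  console.log(`[${source}] launching agent with: ${input}`);
  // ...invoke the same agent loop regardless of where the trigger came from
}

// CLI trigger
handleTrigger("cli", process.argv[2] ?? "");

// Cron-style trigger (every hour)
setInterval(() => handleTrigger("cron", "hourly digest"), 60 * 60 * 1000);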

Factor 12: Make your agent a stateless reducer

The application manages state, and the Agent remains stateless.

// Stateless Agent Pattern: the application owns the state;
// each step is a pure (state in, state out) transition.
// decideNextStep and applyAction are illustrative helpers.
class StatelessAgent {
  async step(state: State): Promise<State> {
    const action = await decideNextStep(state); // LLM picks the next action
    return applyAction(state, action);          // pure reducer: no hidden internal state
  }
}
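
Because each step only maps the current state to the next one, persisting the state at any point gives you pause, and calling step on the stored state gives you resume, which is exactly what Factor 6's lifecycle APIs rely on.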

Technical Architecture Patterns

Core Loop Simplification

Every Agent essentially boils down to:

const prompt = "Instructions for selecting the next step"; // 1. a prompt
const switchFunction = (json) => routeToFunction(json);    // 2. a switch over the LLM's JSON output
const context = manageWhatLLMSees();                       // 3. a managed context window
const loop = whileNotDone();                               // 4. a loop with an exit condition

Distinction from Traditional DAGs

  • Traditional DAG: Software engineers code every step and edge case.
  • Agent Method: Give the Agent a goal and a set of transformations, letting the LLM decide the path in real-time.
  • Reality: The most successful implementations cleverly use the Agent pattern within a larger deterministic DAG.
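
A sketch of that hybrid shape (all names are illustrative): the pipeline's edges stay deterministic, and the injected agent step owns only the one decision that genuinely needs judgment.

const parseTicket = (t: string) => t.trim().toLowerCase();
const formatReply = (body: string) => `Re: your ticket\n${body}`;

async function handleTicket(
  ticket: string,
  agentStep: (input: string) => Promise<string>, // the LLM-driven step, injected
): Promise<string> {
  const parsed = parseTicket(ticket);    // deterministic edge
  const draft = await agentStep(parsed); // the agent decides the path here only
  return formatReply(draft);             // deterministic edge
}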

Applicable Scenarios

Best Suited Scenarios

  • Structured tasks requiring natural language understanding.
  • Workflows requiring human-agent collaboration.
  • Complex multi-step business processes.
  • Production environments requiring high reliability.

Unsuitable Scenarios

  • Simple deterministic tasks (direct code is better).
  • Mission-critical tasks requiring 100% accuracy.
  • Extremely resource-constrained environments.

Implementation Suggestions

Progressive Adoption

  1. Start Small: Choose one principle relevant to a current challenge.
  2. Implement and Observe: Measure the improvement it brings.
  3. Gradually Add: Then add another principle.
  4. Continuous Optimization: Continuously refine and adjust.

Recommended Tech Stack

  • Language: TypeScript (the author's choice; the principles apply equally in Python and other languages).
  • Key Library: BAML (for schema-aligned parsing).
  • Deployment: works in cloud-native environments, including Kubernetes.

Related Resources

Official Resources

  • GitHub repository: https://github.com/humanlayer/12-factor-agents

Summary

12 Factor Agents provides a systematic approach to building truly production-grade LLM applications. It emphasizes not building more magical frameworks, but applying better software engineering practices to LLM capabilities. Remember the core philosophy: Agents are software; treat them as such, and they will reward you with reliability, maintainability, and capabilities unmatched by competitors.