LangChain Evolution Part 3: Production-Ready Agentic Systems
In Part 1, we built an intelligent agent. In Part 2, we gave it a flexible, graph-based brain with LangGraph. Now, we face the final frontier: making our agent truly production-ready.
A production system isn’t just about getting the right answer; it’s about what happens when things go wrong. How do you handle long-running tasks that might fail? How do you scale from a single agent to a team of specialized agents? And how do you ensure your system is resilient and observable?
This post covers three critical patterns for graduating your LangGraph agent from a prototype to a robust, production-grade system.
The New Challenge: From a Smart Agent to a Resilient System
Our graph from Part 2 is powerful, but it has two major weaknesses in a production environment:
- It’s ephemeral: If the server restarts or the process crashes midway through a 30-minute generation task, all progress is lost. The entire graph has to start from scratch.
- It’s a monolith (again): While the workflow is flexible, the agent itself is still a single, monolithic entity trying to do everything. As we add more tools and complexity, the agent’s “brain” (its routing logic) becomes increasingly convoluted.
We’ll solve these problems by introducing persistence and multi-agent collaboration.
A multi-agent system with persistent state.
1. Persistence and Checkpointing: Never Lose Your Work
Long-running agentic workflows are prone to failure. The LLM might return a malformed response, an external API could time out, or the server could crash. Without persistence, these failures are catastrophic.
In LangGraph, this is handled by a Checkpointer. You can think of it as a production-grade version of the IAgentStateRepository
pattern we designed in Part 1. It automatically saves the graph’s state after every node execution.
How it Works in Practice
Instead of a simple in-memory store, we can build something more robust: a custom checkpointer backed by our existing PrismaStateRepository.
```typescript
import { PrismaClient } from "@prisma/client";
import { BaseCheckpointer, Checkpoint } from "@langchain/langgraph";
import { RunnableConfig } from "@langchain/core/runnables";
import { PrismaStateRepository } from "./persistence/stateRepository"; // Your existing code!

// Create a custom checkpointer that leverages our existing repository
class PrismaCheckpointer extends BaseCheckpointer {
  private repo: PrismaStateRepository;

  constructor(prisma: PrismaClient) {
    super();
    this.repo = new PrismaStateRepository(prisma);
  }

  // Load the state and wrap it in the format LangGraph expects
  async get(config: RunnableConfig): Promise<Checkpoint | undefined> {
    const thread_id = config.configurable?.thread_id;
    if (!thread_id) return undefined;

    const savedState = await this.repo.load(thread_id);
    if (!savedState) return undefined;

    // Reconstruct the checkpoint object from your saved state
    const checkpoint: Checkpoint = {
      v: 1,
      ts: savedState.updatedAt.toISOString(), // Assuming you save this
      channel_values: savedState.channel_values, // The core state
      channel_versions: savedState.channel_versions, // Version markers for each channel
    };
    return checkpoint;
  }

  async put(config: RunnableConfig, checkpoint: Checkpoint): Promise<void> {
    const thread_id = config.configurable?.thread_id;
    if (!thread_id) return;

    // Save the relevant parts of the checkpoint to your Prisma model
    await this.repo.save({
      projectId: thread_id,
      // You would map checkpoint values to your Prisma schema here
      ...checkpoint.channel_values,
      updatedAt: new Date(checkpoint.ts),
    });
  }
}
```
```typescript
// 1. Initialize our robust persistence layer
const prisma = new PrismaClient();
const checkpointer = new PrismaCheckpointer(prisma);

// 2. Compile the graph with the checkpointer
const app = graph.compile({ checkpointer });

// 3. Invoke with a unique ID for the project/thread
const thread = { configurable: { thread_id: "project-abc-123" } };
await app.invoke({ /* ... */ }, thread);
```
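With the checkpointer compiled in, recovering from a failure is just re-invoking with the same thread_id. A minimal sketch, assuming a LangGraph version where passing null as the input resumes from the last saved checkpoint:

```typescript
// Resume after a crash or restart: reuse the same thread_id and pass
// `null` as the input. The checkpointer loads the last saved state and
// the graph continues from where it stopped instead of starting over.
// (Exact resume semantics vary slightly across LangGraph versions.)
const resumed = await app.invoke(null, {
  configurable: { thread_id: "project-abc-123" },
});
```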
The Payoff:
- Resilience: If a step fails, you can fix the issue and resume the graph from the last successful state. No more re-running the entire workflow.
- Long-Running Tasks: You can now build agents that take hours or even days to complete a task, confident that their progress is safe.
- Full Auditability: Every step and state change is now logged in your database, providing a complete audit trail of the agent’s run.
- Asynchronous Workflows: This is the big one. A user can now kick off a task via an API call, and the agent can chug away in the background. But how does the user know when it’s done? This leads us to our next critical pattern…
2. Real-Time Observability with Server-Sent Events (SSE)
A persistent, background agent is great, but from a user’s perspective, it’s a black box. Did it start? Is it stuck? Is it finished? To create a great user experience, we need to stream the agent’s progress back to the frontend in real-time.
This is the perfect use case for Server-Sent Events (SSE). Unlike WebSockets, SSE is a simple, one-way channel from the server to the client, designed specifically for this kind of progress update.
Polling vs. SSE vs. WebSockets
Hooking SSE into LangGraph
We can create an SSE endpoint that taps directly into our agent’s persistent state. Every time the agent’s state is updated in the database (thanks to our PrismaCheckpointer), we can push that update to the client.
```typescript
import { NextRequest } from 'next/server';
import { prisma } from '@/lib/prisma'; // Your prisma client

export async function GET(req: NextRequest, { params }: { params: { thread_id: string } }) {
  const { thread_id } = params;

  const stream = new ReadableStream({
    async start(controller) {
      const encoder = new TextEncoder();

      const sendUpdate = (data: any) => {
        controller.enqueue(encoder.encode(`data: ${JSON.stringify(data)}\n\n`));
      };

      // For simplicity, this example polls the database for changes. In a
      // high-performance production environment, you would replace this with
      // a more efficient mechanism like Postgres LISTEN/NOTIFY, a Redis
      // Pub/Sub channel, or a dedicated message queue to push updates
      // instantly without constant polling.
      const interval = setInterval(async () => {
        const state = await prisma.project.findUnique({ where: { id: thread_id } });
        if (state) {
          sendUpdate({
            currentStep: state.currentStep,
            lastDecision: state.agentDecisions ? state.agentDecisions.slice(-1)[0] : null,
          });
        }
      }, 2000); // Check for updates every 2 seconds

      req.signal.addEventListener('abort', () => {
        clearInterval(interval);
        controller.close();
      });
    },
  });

  return new Response(stream, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache, no-transform',
      'Connection': 'keep-alive',
    },
  });
}
```
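On the frontend, consuming this stream takes only a few lines with the browser’s built-in EventSource. A minimal sketch — the route URL is illustrative and should match wherever you mounted the handler above:

```typescript
// Client-side: subscribe to the agent's progress stream.
// The URL is hypothetical; point it at your actual SSE route.
const threadId = "project-abc-123";
const source = new EventSource(`/api/agents/${threadId}/stream`);

source.onmessage = (event) => {
  const update = JSON.parse(event.data);
  console.log(`Step: ${update.currentStep}`, update.lastDecision);
};

// EventSource reconnects automatically on dropped connections;
// call source.close() once the run is finished.
```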
The Payoff:
- Live User Feedback: The UI can show a live feed of the agent’s actions: “Generating requirements…”, “Analyzing diagrams…”, “Task complete!”.
- Decoupled Architecture: The agent process doesn’t need to know or care about the client connection. It just writes to the database. The SSE endpoint is a separate, read-only view into that state.
- Enhanced Debugging: You can watch the agent’s progress in real-time in your own admin dashboard, making it much easier to spot when things get stuck.
For a deep dive into building robust SSE endpoints in Next.js, check out my complete guide: Real-Time Notifications with Server-Sent Events (SSE) in Next.js.
3. Multi-Agent Collaboration: The Team of Experts
As our agent’s responsibilities grow, it becomes a bottleneck. A single agent trying to be an expert at research, diagramming, coding, and security is an anti-pattern. The solution is to create a team of specialized agents and a master orchestrator agent to delegate tasks.
This is where LangGraph truly shines: a compiled graph can itself be a node inside another graph.
Building a Multi-Agent System
- Define Specialist Agents: Create separate, focused graphs for each specialty.
  - ResearchAgent: An expert at using search tools and synthesizing information.
  - CodeAgent: An expert at writing and reviewing code.
  - DiagramAgent: An expert at creating diagrams from technical specs.
- Create an Orchestrator Agent: This is a higher-level graph whose job is not to do the work, but to delegate it.
```typescript
// In the Orchestrator graph definition

// 1. Create the specialist agents (each is a compiled LangGraph app)
const researchAgent = createResearchAgent();
const codeAgent = createCodeAgent();

// 2. Define nodes that delegate tasks
async function delegateToResearch(state: AgentState): Promise<Partial<AgentState>> {
  const researchResult = await researchAgent.invoke(state.researchTask);
  return { research: researchResult };
}

async function delegateToCode(state: AgentState): Promise<Partial<AgentState>> {
  const codeResult = await codeAgent.invoke(state.codingTask);
  return { code: codeResult };
}

// 3. Add these delegation nodes to the orchestrator graph
graph.addNode("research", delegateToResearch);
graph.addNode("coding", delegateToCode);

// 4. The orchestrator's edges are routers that decide which agent to call next
graph.addConditionalEdges("start", (state) => {
  if (state.nextAction === "research") return "research";
  if (state.nextAction === "coding") return "coding";
  return "end";
});
```
The Payoff:
- Scalability & Maintainability: Each agent is simple and focused. You can update the
CodeAgent
without any risk of breaking theResearchAgent
. - Improved Performance: Specialist agents are better and faster at their specific tasks.
- Parallel Execution: The orchestrator can potentially delegate tasks to multiple agents to work in parallel, dramatically speeding up the overall workflow.
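A minimal sketch of that parallel fan-out, reusing the researchAgent and codeAgent from the orchestrator example above and assuming the two tasks are independent:

```typescript
// Fan out to independent specialists concurrently and merge the results.
// Assumes state.researchTask and state.codingTask don't depend on each
// other's output — otherwise fall back to sequential delegation.
async function delegateInParallel(state: AgentState): Promise<Partial<AgentState>> {
  const [research, code] = await Promise.all([
    researchAgent.invoke(state.researchTask),
    codeAgent.invoke(state.codingTask),
  ]);
  return { research, code };
}
```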
4. Advanced Error Handling & The Strategy Pattern
In a production system, not all tasks are created equal. Generating a simple Mermaid diagram is less critical and requires a different AI model than generating a legally-sensitive security analysis document. A one-size-fits-all approach to tools is inefficient and risky.
This is where the Strategy Pattern comes in, allowing us to configure the “strategy” for each tool at runtime.
From Static Tools to Configurable Strategies
In our first agent, every tool used the same model settings. Now, we can create a ModelFactory that allows us to assign different models, temperatures, or even providers (e.g., GPT-4 for analysis, Claude for writing) to different tools.
```typescript
import { BaseChatModel } from "@langchain/core/language_models/chat_models";

export class ModelFactory {
  // ... constructor and createModel implementation ...

  getPlanningModel(): BaseChatModel {
    // Use a powerful, expensive model for high-level reasoning
    return this.createModel({ modelName: "gpt-4-turbo", temperature: 0.2 });
  }

  getDiagramModel(): BaseChatModel {
    // Use a cheaper, faster model with low temperature for consistent syntax
    return this.createModel({ modelName: "gemini-flash", temperature: 0.1 });
  }

  getRefinementModel(): BaseChatModel {
    // Use a model great at creative writing for refinement tasks
    return this.createModel({ modelName: "claude-3-sonnet", temperature: 0.7 });
  }
}

// When creating agents, we pass in the factory
const modelFactory = new ModelFactory();
const codeAgent = createCodeAgent({ model: modelFactory.getPlanningModel() });
const diagramAgent = createDiagramAgent({ model: modelFactory.getDiagramModel() });
```
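The createModel body is elided above; one plausible shape is a simple dispatch on the model name to the right provider package. A sketch under that assumption — the constructors shown are from @langchain/openai, @langchain/anthropic, and @langchain/google-genai, and option names vary slightly across versions:

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { ChatAnthropic } from "@langchain/anthropic";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { BaseChatModel } from "@langchain/core/language_models/chat_models";

interface ModelConfig {
  modelName: string;
  temperature: number;
}

// Hypothetical body for the elided createModel: pick a provider
// based on the model name prefix. Purely illustrative.
function createModel({ modelName, temperature }: ModelConfig): BaseChatModel {
  if (modelName.startsWith("gpt")) {
    return new ChatOpenAI({ model: modelName, temperature });
  }
  if (modelName.startsWith("claude")) {
    return new ChatAnthropic({ model: modelName, temperature });
  }
  // Fall through to Google for the gemini-* family
  return new ChatGoogleGenerativeAI({ model: modelName, temperature });
}
```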
The Payoff:
- Cost & Performance Optimization: Use the right tool for the job. Fast, cheap models for simple tasks; powerful, expensive models for complex reasoning.
- Increased Quality: Tailor the model and its parameters (like temperature) to the specific needs of the task, yielding much higher-quality results.
- Flexibility: Easily swap out models or providers for A/B testing or to adapt to new state-of-the-art models without rewriting your agent logic.
Conclusion: The Journey to Autonomous Systems
This three-part series has taken us on a journey:
- We first tamed the monolithic prompt by creating a stateful agent.
- We then freed the agent from its hardcoded script by building a dynamic graph.
- Finally, we’ve made the agent production-ready with:
  - Persistent State to ensure resilience.
  - Real-Time Observability via SSE for a great user experience.
  - Multi-Agent Collaboration for scalability and specialization.
  - Configurable Tool Strategies for optimizing cost and quality.
We are no longer just building prompts or even agents. We are designing autonomous systems that can perform complex, long-running tasks with resilience and scalability. The patterns we’ve explored—stateful execution, graph-based workflows, and multi-agent collaboration—are the fundamental building blocks for the next generation of AI applications.
The evolution doesn’t stop here, but you now have the architectural foundation to build truly powerful and intelligent systems with LangChain.
Series Navigation:
- Part 1: From Monolithic Prompts to Intelligent Agents
- Part 2: From Hardcoded Workflows to Dynamic Graphs
- Part 3: Production-Ready Agentic Systems (You are here)
Production-Ready Agent Cheat Sheet
A quick reference for the core patterns needed to build robust, production-grade agentic systems.
1. Core Production Patterns
Pattern | Purpose | Key Idea |
---|---|---|
Persistence (Checkpointers) | Survive crashes and resume long-running tasks by saving state to a database. | Extend BaseCheckpointer to connect to your DB. Use a unique thread_id to track and resume each run. |
Multi-Agent Collaboration | Break down complex problems by delegating tasks to a team of specialized agents. | An “Orchestrator” agent delegates work to “Specialist” agents, which are often graphs themselves. |
Model Strategy Pattern | Optimize cost, speed, and quality by using different models for different tasks. | A ModelFactory provides the right model for the job (e.g., GPT-4 for reasoning, a cheaper model for formatting). |
Resilience & Error Handling | Prevent system failure from transient errors (e.g., API timeouts, rate limits). | Wrap tool calls in resilience patterns like retries (with backoff), fallbacks, and circuit breakers. |
2. Multi-Agent Collaboration Architectures
Pattern | Use Case | How it Works |
---|---|---|
Hierarchical | An orchestrator delegates tasks to specialists. | An orchestrator graph calls specialist graphs as nodes. |
Sequential | Agents work together in a pipeline. | Agent A’s output becomes Agent B’s input. |
Parallel | Multiple agents work at the same time to speed things up. | Use Promise.all() to run agents concurrently and combine their results. |
Competitive | Multiple agents race to find the best solution. | The first agent to complete the task “wins,” and the others are cancelled (see the sketch below). |
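A minimal sketch of that competitive race with Promise.race — agentA, agentB, and task are placeholders, and truly cancelling the losers requires your agents to accept an AbortSignal:

```typescript
// Promise.race resolves with the fastest agent's result. The slower
// invocations keep running in the background unless your agents honor
// an AbortSignal you can trigger once a winner is found.
const winner = await Promise.race([
  agentA.invoke(task),
  agentB.invoke(task),
]);
```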
3. Best Practices Checklist
- Persistence: Use checkpointers for any task that runs longer than 30 seconds.
- Specialization: Break down large, monolithic agents into smaller, focused specialists.
- Cost Optimization: Use the right model for each task. Don’t use your most expensive model for everything.
- Error Handling: Implement retries with exponential backoff for any unreliable API calls (see the helper sketched after this checklist).
- Graceful Degradation: Have fallback models or strategies ready for when a primary service fails.
- Monitoring: Log all major state transitions, agent decisions, and errors.
- Security: Always validate and sanitize LLM outputs before they are used in sensitive operations.
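The retry item above is small enough to sketch inline — an illustrative helper, not a library API:

```typescript
// Retry a flaky async call with exponential backoff (500ms, 1s, 2s, ...).
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxAttempts - 1) break; // no sleep after final attempt
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}

// Usage: wrap any unreliable call, e.g. an agent or API invocation.
// const result = await withRetry(() => researchAgent.invoke(task));
```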