In this post, I want to show you how to build a reliable system for getting structured data from language models. We’ll focus on making sure the JSON responses are valid and fixing them when they’re not. The best part? We’ll do this using Ollama and the DeepSeek 8b model, which can run completely locally on a MacBook Pro or a PC with a decent GPU. Some benefits of this approach are:
- Language models can give us data in a structured format we can use in our applications
- We can automatically fix errors when the model gives us bad JSON
- The system is type-safe, which helps catch bugs early
- We can retry failed requests with helpful feedback
- Everything runs locally - perfect for development and debugging
If you haven’t worked with Zod before, it’s a TypeScript-first schema validation library that helps us define and enforce data shapes. Think of it like a strict security guard that checks if data matches exactly what we expect. We’ll use it to make sure our AI responses are in the correct format, and it will help us catch and fix any mismatches.
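If you've never used it, here's a minimal sketch of the idea (the schema and values are just illustrative):

import { z } from 'zod';

// Define the shape we expect
const UserSchema = z.object({
  name: z.string(),
  age: z.number().min(0)
});

UserSchema.parse({ name: 'Ada', age: 36 });   // returns the typed data
UserSchema.parse({ name: 'Ada', age: '36' }); // throws: expected number, received string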
To follow along, you’ll need:
- Node.js installed
- Basic TypeScript knowledge (we’ll use types to make our code more reliable)
- Ollama installed (see my Ollama setup post)
- The deepseek-r1:8b model pulled in Ollama (ollama pull deepseek-r1:8b)
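Before running any code, you can sanity-check that Ollama is reachable by listing your local models through its API. A quick sketch, assuming the default port 11434:

// Lists locally available models; fails fast if Ollama isn't running
const res = await fetch('http://localhost:11434/api/tags');
if (!res.ok) throw new Error(`Ollama not reachable: ${res.statusText}`);
const { models } = await res.json();
console.log(models.map((m: { name: string }) => m.name)); // should include 'deepseek-r1:8b'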
1. The Problem We’re Solving
Let’s say we’re building a content moderation system. We want to analyze user comments and decide if they’re safe to post. The language model needs to tell us several things:
- Is the content toxic?
- How likely is it to be spam?
- What category of content is it?
- Should we approve it, reject it, or review it?
Language models are great at understanding and analyzing text, but they can be inconsistent with their output format. We need to make sure we get data in a structure our application can use.
Here’s the format we want:
interface ModerationResult {
  toxic: boolean;
  spamLikelihood: number; // between 0 and 1
  contentCategory: string;
  recommendedAction: 'approve' | 'reject' | 'review';
  confidence: number;
  explanation: string;
}
But language models, including local ones like DeepSeek, don’t always give us perfect JSON. Here are some common problems:
// Missing quotes around strings
{
  toxic: false,
  spamLikelihood: 0.2,
  contentCategory: blog_comment, // Should be "blog_comment"
  recommendedAction: "approve",
  confidence: 0.95,
  explanation: "Looks like a regular comment"
}

// Wrong data types
{
  "toxic": "false", // Should be a boolean, not a string
  "spamLikelihood": "low", // Should be a number
  "recommendedAction": "APPROVE", // Wrong format
  "confidence": 95, // Should be between 0 and 1
  "explanation": "Looks like a regular comment"
}
2. Building Our Solution
Let’s build this step by step. First, we’ll define what valid data looks like using Zod:
import { z } from 'zod';

const ModerationSchema = z.object({
  toxic: z.boolean(),
  spamLikelihood: z.number()
    .min(0)
    .max(1)
    .describe('How likely is this spam (0-1)'),
  contentCategory: z.string(),
  recommendedAction: z.enum(['approve', 'reject', 'review']),
  confidence: z.number()
    .min(0)
    .max(1),
  explanation: z.string()
    .min(1)
    .max(500)
});
type ModerationResult = z.infer<typeof ModerationSchema>;
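To see the schema doing its job, try it against the "wrong data types" example from earlier. Here's a quick sketch using safeParse, which returns a result object instead of throwing:

const badResponse = {
  toxic: 'false',        // string instead of boolean
  spamLikelihood: 'low', // string instead of number
  recommendedAction: 'APPROVE',
  confidence: 95,
  explanation: 'Looks like a regular comment'
};

const check = ModerationSchema.safeParse(badResponse);
if (!check.success) {
  // Each issue names the failing field and what was expected vs received
  console.log(check.error.issues.map(i => `${i.path.join('.')}: ${i.message}`));
}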
Now let’s create our processor class that will:
- Talk to Ollama
- Get JSON from its responses
- Check if the JSON is valid
- Try again if something goes wrong
Here’s the complete implementation:
import { z } from 'zod';

/**
 * OllamaModerationProcessor handles content moderation using local LLM inference.
 * It processes text content and returns structured moderation decisions with built-in
 * error recovery and validation.
 */
class OllamaModerationProcessor {
  private maxRetries: number;
  private retryDelay: number;

  /**
   * Initialize the processor with retry settings
   * @param options Configuration for retry behavior
   */
  constructor(options: { maxRetries?: number; retryDelay?: number } = {}) {
    this.maxRetries = options.maxRetries ?? 3;
    this.retryDelay = options.retryDelay ?? 1000;
  }

  /**
   * Makes HTTP requests to the Ollama API
   * @param prompt The text prompt to send to the model
   * @returns Raw response string from the model
   */
  private async callOllama(prompt: string): Promise<string> {
    try {
      const response = await fetch('http://localhost:11434/api/generate', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          model: 'deepseek-r1:8b',
          prompt: prompt,
          stream: false
        })
      });
      if (!response.ok) {
        throw new Error(`Ollama API error: ${response.statusText}`);
      }
      const data = await response.json();
      return data.response;
    } catch (error) {
      throw new Error(`Failed to call Ollama: ${(error as Error).message}`);
    }
  }

  /**
   * Extracts valid JSON from the model's response text
   * Uses regex to find a JSON object even if surrounded by additional text
   * (such as DeepSeek's <think> blocks)
   * @param text Raw response from the model
   * @returns Parsed JSON object
   */
  private extractJSON(text: string): any {
    // Regular expression to find a JSON object, handling structures nested
    // up to three levels deep
    const match = text.match(/{(?:[^{}]|{(?:[^{}]|{[^{}]*})*})*}/);
    if (!match) {
      throw new Error('No JSON object found in response');
    }
    try {
      return JSON.parse(match[0]);
    } catch (error) {
      throw new Error(`Failed to parse JSON: ${(error as Error).message}`);
    }
  }

  /**
   * Utility function to pause execution
   * Used between retry attempts to avoid overwhelming the API
   */
  private async delay(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
  }

  /**
   * Main method to moderate content
   * Includes retry logic and error handling
   * @param content The text content to moderate
   * @returns Validated moderation result
   */
  async moderateContent(content: string): Promise<ModerationResult> {
    // Initial prompt template with example
    let currentPrompt = `
Analyze this content:
${content}
Give me a JSON object with:
- toxic: true/false for toxic content
- spamLikelihood: number from 0-1
- contentCategory: what kind of content this is
- recommendedAction: "approve", "reject", or "review"
- confidence: number from 0-1
- explanation: why you made this decision
Here's an example of the exact format I need:
{
  "toxic": false,
  "spamLikelihood": 0.1,
  "contentCategory": "blog_comment",
  "recommendedAction": "approve",
  "confidence": 0.95,
  "explanation": "This appears to be a legitimate comment discussing the topic"
}
Only give me valid JSON that matches this format.`;

    // Retry loop with a delay that grows linearly with each attempt
    for (let attempt = 1; attempt <= this.maxRetries; attempt++) {
      try {
        const response = await this.callOllama(currentPrompt);
        const jsonResponse = this.extractJSON(response);
        const validatedResponse = ModerationSchema.parse(jsonResponse);
        return validatedResponse;
      } catch (error) {
        if (attempt === this.maxRetries) {
          throw new Error(`Failed after ${this.maxRetries} tries: ${(error as Error).message}`);
        }
        // Update prompt with error information for next attempt
        currentPrompt = `
Your last response had an error: ${(error as Error).message}
Please fix it and give me valid JSON.
Original request:
${currentPrompt}`;
        await this.delay(this.retryDelay * attempt);
      }
    }
    // Unreachable, but satisfies TypeScript's return-path analysis
    throw new Error('Moderation failed unexpectedly');
  }
}
3. Real-World Examples
Let’s see how this works in practice. First, we’ll create our processor:
const processor = new OllamaModerationProcessor();
Example 1: Normal Comment
const result1 = await processor.moderateContent(
  "Great article! I learned a lot about Docker containers."
);
console.log('Example 1 result:', result1);
The output for example 1:
{
  "toxic": false,
  "spamLikelihood": 0.1,
  "contentCategory": "blog_comment",
  "recommendedAction": "approve",
  "confidence": 0.95,
  "explanation": "This is a genuine and positive comment contributing constructively to the discussion."
}
The validation passes on the first attempt: the JSON generated locally by DeepSeek 8b is valid as-is, so there's no need to ask the model to fix anything.
Example 2: Spam Comment
const result2 = await processor.moderateContent(
  "BUY NOW!!! Cheap watches r0lex at amazingdeals123.biz"
);
console.log('Example 2 result:', result2);
For example 2, the JSON output is also generated correctly on the first attempt:
{
  "toxic": false,
  "spamLikelihood": 0.9,
  "contentCategory": "promotion",
  "recommendedAction": "review",
  "confidence": 0.85,
  "explanation": "The content is promotional and appears to be a commercial advertisement. It uses urgency and uppercase letters typical of spam, but it's not inherently toxic."
}
Example 3: Error and Retry
Sometimes the model gives us invalid JSON. This is rare with good prompting, so to force a failure I removed the field list and the sample JSON from the prompt, then let the system feed the validation error back to the model on the next attempt. Here's how our system handles it:
📝 Starting content moderation...
Content to moderate: Great article! I learned a lot about Docker containers.
🔄 Attempt 1 of 3
🚀 Calling Ollama API...
Prompt length: 130 characters
Making API request to http://localhost:11434/api/generate
✅ API request successful
Response length: 1768 characters
Raw response from Ollama: <think>
Alright, so the user provided an analysis of their experience reading an article on Docker containers and wants only valid JSON in a specific format.
First, I need to understand what exactly they're asking for. They mentioned "valid JSON" matching a particular structure. Looking at their example response, it's structured with keys like "article_rating", "content_summary", etc.
I should extract the main points from their content: learning about Docker containers, finding the article helpful and well-written, recommending it to others, being satisfied with the information, and wanting to apply what was learned.
Next, I'll map these points into the JSON structure they provided. Making sure each key corresponds correctly and the values are accurate based on their analysis.
I should also ensure that the JSON syntax is correct, with proper use of quotes and commas, avoiding any trailing commas or syntax errors.
Finally, present this JSON response clearly, so it's easy for them to integrate into whatever system they're using.
</think>
{
  "article_rating": "5/5",
  "content_summary": {
    "topic": "Docker Containers",
    "key_points": [
      "Learned about the core concepts of Docker containers.",
      "Understood their usage in application development and deployment.",
      "Appreciated the efficiency and scalability benefits."
    ],
    "overall_impression": "Extremely informative and well-written article. Highly recommend to anyone interested in Docker technology."
  },
  "personal_reaction": {
    "engagement_level": "Highly engaged",
    "satisfaction": "Very satisfied with the information provided.",
    "action_plan": "Plan to implement Docker containers in future projects based on learned knowledge."
  }
}
🔍 Attempting to extract JSON from response...
Input text length: 1768 characters
Found JSON pattern, attempting to parse...
✅ Successfully parsed JSON
Extracted JSON: {
  "article_rating": "5/5",
  "content_summary": {
    "topic": "Docker Containers",
    "key_points": [
      "Learned about the core concepts of Docker containers.",
      "Understood their usage in application development and deployment.",
      "Appreciated the efficiency and scalability benefits."
    ],
    "overall_impression": "Extremely informative and well-written article. Highly recommend to anyone interested in Docker technology."
  },
  "personal_reaction": {
    "engagement_level": "Highly engaged",
    "satisfaction": "Very satisfied with the information provided.",
    "action_plan": "Plan to implement Docker containers in future projects based on learned knowledge."
  }
}
✨ Validating response against schema...
❌ Attempt 1 failed: [
  {
    "code": "invalid_type",
    "expected": "boolean",
    "received": "undefined",
    "path": [
      "toxic"
    ],
    "message": "Required"
  },
  {
    "code": "invalid_type",
    "expected": "number",
    "received": "undefined",
    "path": [
      "spamLikelihood"
    ],
    "message": "Required"
  },
  {
    "code": "invalid_type",
    "expected": "string",
    "received": "undefined",
    "path": [
      "contentCategory"
    ],
    "message": "Required"
  },
  {
    "expected": "'approve' | 'reject' | 'review'",
    "received": "undefined",
    "code": "invalid_type",
    "path": [
      "recommendedAction"
    ],
    "message": "Required"
  },
  {
    "code": "invalid_type",
    "expected": "number",
    "received": "undefined",
    "path": [
      "confidence"
    ],
    "message": "Required"
  },
  {
    "code": "invalid_type",
    "expected": "string",
    "received": "undefined",
    "path": [
      "explanation"
    ],
    "message": "Required"
  }
]
📝 Updating prompt with error information...
⏳ Delaying for 1000ms before retry...
🔄 Attempt 2 of 3
🚀 Calling Ollama API...
Prompt length: 1210 characters
Making API request to http://localhost:11434/api/generate
✅ API request successful
Response length: 1323 characters
Raw response from Ollama: <think>
Okay, so the user sent me some data, but the initial response had errors because of invalid types. My job is to figure out what's wrong and fix it.
Looking at the error messages, I see that several properties are missing or undefined. The first error says "toxic" is undefined; it expects a boolean. Then there's "spamLikelihood" expecting a number, "contentCategory" needing a string, "recommendedAction" should be one of 'approve', 'reject', or 'review', and both "confidence" and "explanation" are missing.
The user provided a content: "Great article! I learned a lot about Docker containers." So, I need to analyze this. Is it toxic? Probably not. Spam likelihood is low, maybe 0. Maybe content category is something like 'article' or 'educational'.
Recommended action should be 'approve' since the content seems positive. Confidence is how sure I am, so around 0.95. Explanation would briefly say why.
Putting it all together into the correct JSON structure. Let me make sure each field matches what's expected.
</think>
Here is the valid JSON response:
{
  "toxic": false,
  "spamLikelihood": 0,
  "contentCategory": "article",
  "recommendedAction": "approve",
  "confidence": 0.95,
  "explanation": "The content is a positive review of Docker containers, which is relevant and non-spam."
}
🔍 Attempting to extract JSON from response...
Input text length: 1323 characters
Found JSON pattern, attempting to parse...
✅ Successfully parsed JSON
Extracted JSON: {
  "toxic": false,
  "spamLikelihood": 0,
  "contentCategory": "article",
  "recommendedAction": "approve",
  "confidence": 0.95,
  "explanation": "The content is a positive review of Docker containers, which is relevant and non-spam."
}
✨ Validating response against schema...
✅ Validation successful
✅ Example 1 result: {
  toxic: false,
  spamLikelihood: 0,
  contentCategory: 'article',
  recommendedAction: 'approve',
  confidence: 0.95,
  explanation: 'The content is a positive review of Docker containers, which is relevant and non-spam.'
}
A Note on Troubleshooting
If you run into issues, first ensure you’re running the latest version of Ollama (ollama --version), as older versions might not support newer models like DeepSeek-8B. The most common problems are usually related to setup: make sure Ollama is running with ollama serve, and that you’ve successfully pulled the model with ollama pull deepseek-r1:8b. When the model first loads, responses might take 15-30 seconds, but subsequent calls are much faster. For JSON parsing issues, our code’s error handling and retry mechanism should handle most edge cases automatically, but you might need to adjust the retry count or delay if you’re getting inconsistent results.
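If you do need to tune the retry behavior, the constructor options cover it; the values below are just an example:

// A more patient configuration for a slower machine
const patientProcessor = new OllamaModerationProcessor({
  maxRetries: 5,
  retryDelay: 2000 // base delay in ms; the actual wait grows with each attempt
});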
Advanced Tips for Better Results
While our base implementation works well for most cases, here are some advanced patterns you might consider:
- Response Correction: You can add pre-validation cleanup steps to handle common issues like string booleans (“true” vs true) or percentages in the wrong format (95 vs 0.95). This makes your system more resilient to minor model output variations; see the sketch after this list.
- Smart Retries: Instead of just retrying with the same prompt, you can analyze what parts of the response were valid and specifically ask the model to fix the invalid parts. For example, if only the ‘confidence’ field is wrong, you can focus the retry prompt on just fixing that field.
- Context-Aware Rules: Consider adjusting your validation rules based on the input content. For instance, you might want stricter spam checking for content containing URLs, or you might accept lower confidence scores for very short inputs.
- Error Pattern Learning: Keep track of the most common validation errors you see. This can help you improve your base prompt or add specific pre-validation fixes for recurring issues.
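Here’s a minimal sketch of the response correction idea, using Zod’s z.preprocess to normalize common slips before the real validation runs (the specific fix-ups below are illustrative, not exhaustive):

// Wraps ModerationSchema with a cleanup pass for common model mistakes
const LenientModerationSchema = z.preprocess((raw) => {
  const data = raw as Record<string, unknown>;
  return {
    ...data,
    // "true"/"false" -> true/false
    toxic: data.toxic === 'true' ? true
         : data.toxic === 'false' ? false
         : data.toxic,
    // 95 -> 0.95 when the model answers in percent
    confidence: typeof data.confidence === 'number' && data.confidence > 1
      ? data.confidence / 100
      : data.confidence
  };
}, ModerationSchema);

Swapping this in for ModerationSchema inside moderateContent fixes the two most common type mismatches without spending a retry.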
The beauty of using Zod for validation is that you can start simple and gradually add these improvements as you learn what kinds of errors your specific use case encounters most often.
Learning From Errors
One of the most powerful aspects of this system is its ability to learn from failures. When the model provides invalid JSON, our retry mechanism doesn’t just try again blindly - it includes specific feedback about what went wrong. This creates a feedback loop where each retry attempt becomes more focused and effective.
For example, if the model consistently formats boolean values as strings (like “true” instead of true), we can update our prompt to explicitly warn against this pattern. The key is to collect and analyze these validation errors over time, helping us refine our prompts and improve the system’s reliability.
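A simple way to start collecting that signal is to tally validation issues by field and error code every time a parse fails. A rough sketch (where you store and report the counts is up to you):

// Counts like "toxic:invalid_type" -> 12, "confidence:too_big" -> 3
const errorCounts = new Map<string, number>();

function recordValidationError(error: z.ZodError): void {
  for (const issue of error.issues) {
    const key = `${issue.path.join('.')}:${issue.code}`;
    errorCounts.set(key, (errorCounts.get(key) ?? 0) + 1);
  }
}

The keys with the highest counts tell you exactly which warnings are worth adding to the base prompt.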
Building Trust in LLM Outputs
What we’ve built here goes beyond just getting JSON from a language model - it’s a pattern for making LLM outputs reliable enough for production use. By combining Zod’s strict validation with Ollama’s local inference and our retry mechanism, we’ve created a system that can recover from failures and learn from its mistakes.
Running this locally with DeepSeek 8b makes it perfect for development and testing. You get quick iteration cycles, complete privacy, and no API costs. While you could adapt this code to work with API-based models like Claude or GPT-4, having everything run locally makes development much more efficient.
The principles we’ve covered - strict validation, intelligent retries, and error feedback - can be applied to any scenario where you need structured data from LLMs. Whether you’re building a content moderation system, a data extraction pipeline, or any other LLM-powered tool, this pattern helps bridge the gap between the creative capabilities of language models and the strict requirements of production systems.