Extending LLM Capabilities with Custom Tools: Beyond the Knowledge Cutoff

Building on our previous explorations of prompt engineering and structured outputs with Zod, let’s take the next step in enhancing LLM capabilities. While RAG helps us extend an LLM’s knowledge, tools allow us to give it real agency - the ability to actually do things in response to user requests.

1. Why Tools Matter for Business

Before diving into technical details, let’s understand why tools are valuable:

  • Real-time Data: Your AI can access live data like prices, inventory, or market conditions.
  • System Integration: Connect AI with your existing business systems (CRM, ERP, etc.).
  • Automation: Perform actions automatically (send emails, create tickets, update records).
  • Cost Savings: Reduce manual work by letting AI handle routine tasks.
  • Accuracy: Get precise, current information instead of outdated training data.
  • Extensibility: Easy to add new capabilities as your needs grow without retraining the LLM.

1.1 Real-World Examples

  • Customer Service Integration: A bot that checks order status, delivery tracking, and inventory levels in real-time, providing accurate updates without human intervention
  • Calendar Management: AI assistant that not only checks calendar availability but also understands meeting room capacity, participant time zones, and company scheduling policies
  • Sales Support: Tools that combine inventory checks, pricing rules, customer history, and shipping calculations to provide accurate quotes instantly

1.2 How Tools Work

Think of tools like apps on your phone:

  1. Tool Definition: Like an app’s description in the store
  2. Parameters: What information the tool needs (like a form to fill)
  3. Execution: The actual work (like pressing “Send”)
  4. Response: What you get back

For example, a weather tool might:

  1. Define itself as “get_weather” with a description of checking weather conditions
  2. Require parameters like “location” and optional “units” (celsius/fahrenheit)
  3. Call a weather API with those parameters
  4. Return a formatted response with temperature, conditions, and timestamp

2. Beyond Static Knowledge

We’ve already seen how RAG can extend an LLM’s knowledge base with custom data. But what about when we need the LLM to:

  • Access real-time data that changes frequently
  • Perform calculations or data transformations
  • Interact with external APIs
  • Execute actions in response to user requests

This is where tool calling comes in - it’s the bridge between an LLM’s reasoning capabilities and actual functionality in your application.

3. Tool Calling Architecture

At its core, tool calling is simple: you define functions the LLM can use, and it decides when and how to use them. But building a robust tool system requires careful design. Let’s break it down:

import { z } from 'zod';

// Core tool interface
interface Tool {
  name: string;
  description: string;
  parameters: z.ZodObject<any>;
  execute: (params: any) => Promise<any>;
}

// Example weather tool implementation
const weatherTool: Tool = {
  name: 'get_weather',
  description: 'Get current weather for a location',
  parameters: z.object({
    location: z.string(),
    units: z.enum(['celsius', 'fahrenheit']).optional()
  }),
  execute: async (params) => {
    const response = await fetch(`https://api.weather.com/${params.location}`);
    if (!response.ok) {
      throw new Error(`Weather API returned ${response.status}`);
    }
    return response.json();
  }
};
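
To wire this interface up to a model, you need a small dispatch layer: the LLM responds with the name of the tool it wants to call plus JSON-encoded arguments, and your code looks up the tool, validates the arguments against its schema, and executes it. Here’s a minimal sketch; the exact shape of the tool-call payload varies by provider, so the { name, arguments } structure below is an assumption:

// Hypothetical dispatcher for a model's tool call
async function dispatchToolCall(
  tools: Tool[],
  call: { name: string; arguments: string }  // assumed provider payload shape
): Promise<any> {
  const tool = tools.find(t => t.name === call.name);
  if (!tool) throw new Error(`Unknown tool: ${call.name}`);

  // parse() throws a descriptive ZodError if the model produced bad arguments
  const params = tool.parameters.parse(JSON.parse(call.arguments));
  return tool.execute(params);
}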

3.1 Tool Response Processing

When tools interact with external systems, they often return data in different formats. For example, a weather API might return temperatures in different units, use varying date formats, or include extra fields we don’t need. We need to:

  1. Validate the data matches our expectations
  2. Transform it into a consistent format
  3. Handle missing or incorrect data gracefully

This is where our previous work with Zod becomes particularly valuable. By defining a strict response schema, we can:

  • Catch API changes early
  • Ensure consistent data structure throughout our application
  • Transform data automatically
  • Provide clear error messages when things go wrong

// Define expected response structure
const WeatherResponse = z.object({
  temperature: z.number(),
  condition: z.string(),
  location: z.string(),
  timestamp: z.date()
});

type Weather = z.infer<typeof WeatherResponse>;

class WeatherTool implements Tool {
  name = 'get_weather';
  description = 'Get current weather for a location';
  parameters = z.object({
    location: z.string(),
    units: z.enum(['celsius', 'fahrenheit']).optional()
  });

  // The API client is injected so it can be mocked in tests (see section 4.4)
  constructor(private fetchWeather: (location: string) => Promise<any>) {}

  async execute(params: any): Promise<Weather> {
    const data = await this.fetchWeather(params.location);

    // Validate and transform response
    return WeatherResponse.parse({
      temperature: data.temp,
      condition: data.weather_condition,
      location: params.location,
      timestamp: new Date(data.ts)
    });
  }
}
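
When a clean failure is preferable to an exception, Zod’s safeParse returns a result object instead of throwing. Here’s a sketch of a hypothetical helper that validates the raw API data and surfaces a clear error message:

// Hypothetical helper: validate raw API data without throwing from Zod directly
function toWeather(raw: any, location: string): Weather {
  const result = WeatherResponse.safeParse({
    temperature: raw.temp,
    condition: raw.weather_condition,
    location,
    timestamp: new Date(raw.ts)
  });

  if (!result.success) {
    // result.error.issues lists exactly which fields failed validation
    throw new Error(`Weather API returned unexpected data: ${result.error.message}`);
  }

  return result.data;
}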

3.2 Error Handling and Retries

Tools need robust error handling to be reliable in production:

class ResilientTool implements Tool {
  constructor(
    private baseTool: Tool,
    private maxRetries: number = 3,
    private backoff: number = 1000
  ) {}

  // Delegate the Tool interface members to the wrapped tool
  get name() { return this.baseTool.name; }
  get description() { return this.baseTool.description; }
  get parameters() { return this.baseTool.parameters; }

  async execute(params: any): Promise<any> {
    let lastError: unknown;

    for (let attempt = 0; attempt < this.maxRetries; attempt++) {
      try {
        return await this.baseTool.execute(params);
      } catch (error) {
        lastError = error;
        if (!this.isRetryable(error)) throw error;
        // Exponential backoff: 1s, 2s, 4s, ...
        await this.sleep(this.backoff * Math.pow(2, attempt));
      }
    }

    throw new Error(`Tool execution failed after ${this.maxRetries} attempts`, { cause: lastError });
  }

  private isRetryable(error: any): boolean {
    // Retry on rate limiting (429) and temporary unavailability (503)
    return error.status === 429 || error.status === 503;
  }

  private sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}

4. Tool Orchestration

When building complex applications, you often need tools to work together. Here’s a pattern for tool orchestration:

interface ToolStep {
  tool: string;                 // tool name doubles as the step ID
  params: Record<string, any>;
  dependencies?: string[];      // IDs of steps this one depends on
}

interface WorkflowDefinition {
  steps: ToolStep[];
}

class ToolOrchestrator {
  private tools: Map<string, Tool>;
  private results: Map<string, any>;

  constructor(tools: Tool[]) {
    this.tools = new Map(tools.map(t => [t.name, t]));
    this.results = new Map();
  }

  async executeWorkflow(workflow: WorkflowDefinition) {
    const executionOrder = this.topologicalSort(workflow.steps);

    for (const stepId of executionOrder) {
      const step = workflow.steps.find(s => s.tool === stepId)!;
      const tool = this.tools.get(step.tool)!;

      // Resolve any parameter dependencies
      const params = this.resolveParameters(step.params);

      const result = await tool.execute(params);
      this.results.set(stepId, result);
    }

    return Object.fromEntries(this.results);
  }

  // Order steps so that every step runs after the steps it depends on
  private topologicalSort(steps: ToolStep[]): string[] {
    const visited = new Set<string>();
    const order: string[] = [];

    const visit = (step: ToolStep) => {
      if (visited.has(step.tool)) return;
      visited.add(step.tool);
      for (const dep of step.dependencies ?? []) {
        const depStep = steps.find(s => s.tool === dep);
        if (depStep) visit(depStep);
      }
      order.push(step.tool);
    };

    steps.forEach(visit);
    return order;
  }

  private resolveParameters(params: Record<string, any>): Record<string, any> {
    // Replace any ${stepId} references with actual results
    return Object.fromEntries(
      Object.entries(params).map(([key, value]) => {
        if (typeof value === 'string' && value.startsWith('${') && value.endsWith('}')) {
          const stepId = value.slice(2, -1);
          return [key, this.results.get(stepId)];
        }
        return [key, value];
      })
    );
  }
}
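
To make the ${stepId} resolution concrete, here’s a hypothetical two-step workflow: the first step fetches the weather, and the second step (an assumed send_alert tool, not defined in this article) receives that result through a parameter reference:

const workflow: WorkflowDefinition = {
  steps: [
    { tool: 'get_weather', params: { location: 'London' } },
    {
      tool: 'send_alert',                     // hypothetical tool
      params: { message: '${get_weather}' },  // resolved to the first step's result
      dependencies: ['get_weather']
    }
  ]
};

// const results = await new ToolOrchestrator([weatherTool, alertTool]).executeWorkflow(workflow);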

4.1 Performance Optimization

Tools can become a performance bottleneck if not handled carefully. Here are some optimization patterns:

Caching

  • Response Caching: Cache tool responses for frequently requested data that doesn’t change often (see the sketch after this list)
  • Cache Invalidation: Set appropriate TTL (Time To Live) based on how frequently the data changes
  • Cache Levels: Consider both memory caching for fast access and persistent caching for longer-term storage
  • Cache Keys: Create cache keys that account for all relevant parameters to avoid incorrect cache hits
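
As a concrete example of response caching, here’s a minimal sketch that follows the same wrapper pattern as ResilientTool. The in-memory Map and the JSON.stringify cache key are simplifying assumptions; JSON.stringify is sensitive to key order, so a production version should normalize parameters first:

// Minimal caching wrapper with a TTL (simplified in-memory sketch)
class CachedTool implements Tool {
  private cache = new Map<string, { value: any; expires: number }>();

  constructor(private baseTool: Tool, private ttlMs: number = 60_000) {}

  get name() { return this.baseTool.name; }
  get description() { return this.baseTool.description; }
  get parameters() { return this.baseTool.parameters; }

  async execute(params: any): Promise<any> {
    // Key covers all parameters to avoid incorrect cache hits
    const key = JSON.stringify(params);
    const hit = this.cache.get(key);
    if (hit && hit.expires > Date.now()) return hit.value;

    const value = await this.baseTool.execute(params);
    this.cache.set(key, { value, expires: Date.now() + this.ttlMs });
    return value;
  }
}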

Batch Processing

  • Request Batching: Group similar requests together to reduce API calls
  • Smart Queuing: Wait a short time for similar requests to arrive before processing
  • Parallel Processing: Execute independent tool calls simultaneously (sketched after this list)
  • Rate Limiting: Respect API limits while maximizing throughput
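
For parallel processing with rate limits in mind, a simple worker pool runs independent tool calls concurrently while capping how many are in flight at once. A minimal sketch:

// Run independent tool calls in parallel with a concurrency cap
async function executeParallel(
  calls: { tool: Tool; params: any }[],
  maxConcurrent = 5
): Promise<any[]> {
  const results: any[] = new Array(calls.length);
  let next = 0;

  // Each worker repeatedly claims the next unclaimed call until none remain
  const workers = Array.from(
    { length: Math.min(maxConcurrent, calls.length) },
    async () => {
      while (next < calls.length) {
        const i = next++;
        results[i] = await calls[i].tool.execute(calls[i].params);
      }
    }
  );

  await Promise.all(workers);
  return results;
}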

4.2 Security Considerations

When implementing tools, security should be a top priority:

class SecureTool implements Tool {
  constructor(
    private baseTool: Tool,
    private validator: (params: any) => boolean,
    private sanitizer: (params: any) => any
  ) {}

  // Delegate the Tool interface members to the wrapped tool
  get name() { return this.baseTool.name; }
  get description() { return this.baseTool.description; }
  get parameters() { return this.baseTool.parameters; }

  async execute(params: any): Promise<any> {
    // Validate input
    if (!this.validator(params)) {
      throw new Error('Invalid tool parameters');
    }

    // Sanitize input
    const sanitizedParams = this.sanitizer(params);

    // Execute with sanitized params
    return this.baseTool.execute(sanitizedParams);
  }
}

// Usage example
const secureWeatherTool = new SecureTool(
  weatherTool,
  (params) => {
    // Validate location format
    return /^[a-zA-Z\s,]+$/.test(params.location);
  },
  (params) => ({
    // Sanitize location
    location: params.location.trim().toLowerCase(),
    units: params.units
  })
);

4.3 Monitoring and Debugging

Tools should be observable in production:

// Minimal metrics client interface; substitute your own implementation
interface MetricsClient {
  record(event: {
    tool: string;
    duration: number;
    status: 'success' | 'error';
    error?: string;
  }): void;
}

class MonitoredTool implements Tool {
  constructor(
    private baseTool: Tool,
    private metrics: MetricsClient
  ) {}

  // Delegate the Tool interface members to the wrapped tool
  get name() { return this.baseTool.name; }
  get description() { return this.baseTool.description; }
  get parameters() { return this.baseTool.parameters; }

  async execute(params: any): Promise<any> {
    const start = performance.now();

    try {
      const result = await this.baseTool.execute(params);

      this.metrics.record({
        tool: this.baseTool.name,
        duration: performance.now() - start,
        status: 'success'
      });

      return result;
    } catch (error: any) {
      this.metrics.record({
        tool: this.baseTool.name,
        duration: performance.now() - start,
        status: 'error',
        error: error.message
      });

      throw error;
    }
  }
}

4.4 Testing Tools

Tools need comprehensive testing to ensure reliability:

describe('WeatherTool', () => {
  let tool: WeatherTool;
  let mockApi: jest.Mock;

  beforeEach(() => {
    mockApi = jest.fn();
    tool = new WeatherTool(mockApi);
  });

  it('handles successful api response', async () => {
    mockApi.mockResolvedValue({
      temp: 20,
      weather_condition: 'sunny',
      ts: '2024-02-08T12:00:00Z'
    });

    const result = await tool.execute({ location: 'London' });
    
    expect(result).toEqual({
      temperature: 20,
      condition: 'sunny',
      location: 'London',
      timestamp: expect.any(Date)
    });
  });

  it('handles api errors', async () => {
    mockApi.mockRejectedValue(new Error('API unavailable'));
    
    await expect(tool.execute({ location: 'London' }))
      .rejects
      .toThrow('API unavailable');
  });
});

5. Best Practices

When implementing tools for your LLM system, keep these guidelines in mind:

Start Simple

  • Begin with basic tools that solve immediate needs
  • Add complexity only when needed
  • Test thoroughly before adding more features

Focus on Reliability

  • Implement proper error handling from the start
  • Use timeouts to prevent hanging operations (see the sketch after this list)
  • Add logging for debugging
  • Monitor tool usage and performance
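
For timeouts, here’s a minimal sketch using Promise.race against a timer. Note that this rejects the caller’s promise but doesn’t cancel the underlying request; true cancellation would require passing an AbortSignal through to the tool:

// Reject if a tool call takes longer than the given time budget
function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  const timeout = new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms)
  );
  return Promise.race([promise, timeout]);
}

// Usage: await withTimeout(weatherTool.execute({ location: 'London' }), 5000, 'get_weather');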

Security First

  • Validate all inputs before processing
  • Sanitize data before sending to external systems
  • Implement rate limiting
  • Use appropriate authentication
  • Keep sensitive data secure

Documentation

  • Document each tool’s purpose and limitations
  • Include example usage and expected responses
  • Document error cases and how to handle them
  • Keep documentation updated as tools evolve

6. Conclusion

Tools transform LLMs from knowledge repositories into systems that can take real actions. By following these patterns and best practices, you can build reliable, performant, and secure tool systems that integrate smoothly with your LLM applications.

Remember: start simple, focus on reliability and error handling, and gradually add sophistication as your needs grow. The patterns shown here will scale with your application’s complexity.

The natural next step is agents: systems that can autonomously use multiple tools to accomplish complex tasks. Stay tuned for our deep dive into agent architectures and implementation patterns.

If you haven’t already, check out our previous articles on prompt engineering techniques and structured output validation - they provide important foundations for working with LLM tools.
