Extending LLM Capabilities with Custom Tools: Beyond the Knowledge Cutoff

Building on our previous explorations of prompt engineering and structured outputs with Zod, let’s take the next step in enhancing LLM capabilities. While RAG helps us extend an LLM’s knowledge, tools allow us to give it real agency - the ability to actually do things in response to user requests.

1. Why Tools Matter for Business

Before diving into technical details, let’s understand why tools are valuable:

  • Real-time Data: Your AI can access live data like prices, inventory, or market conditions.
  • System Integration: Connect AI with your existing business systems (CRM, ERP, etc.).
  • Automation: Perform actions automatically (send emails, create tickets, update records).
  • Cost Savings: Reduce manual work by letting AI handle routine tasks.
  • Accuracy: Get precise, current information instead of outdated training data.
  • Extensibility: Easy to add new capabilities as your needs grow without retraining the LLM.

1.1 Real-World Examples

  • Customer Service Integration: A bot that checks order status, delivery tracking, and inventory levels in real-time, providing accurate updates without human intervention
  • Calendar Management: AI assistant that not only checks calendar availability but also understands meeting room capacity, participant time zones, and company scheduling policies
  • Sales Support: Tools that combine inventory checks, pricing rules, customer history, and shipping calculations to provide accurate quotes instantly

1.2 How Tools Work

Think of tools like apps on your phone:

  1. Tool Definition: Like an app’s description in the store
  2. Parameters: What information the tool needs (like a form to fill)
  3. Execution: The actual work (like pressing “Send”)
  4. Response: What you get back

For example, a weather tool might:

  1. Define itself as “get_weather” with a description of checking weather conditions
  2. Require parameters like “location” and optional “units” (celsius/fahrenheit)
  3. Call a weather API with those parameters
  4. Return a formatted response with temperature, conditions, and timestamp

2. Beyond Static Knowledge

We’ve already seen how RAG can extend an LLM’s knowledge base with custom data. But what about when we need the LLM to:

  • Access real-time data that changes frequently
  • Perform calculations or data transformations
  • Interact with external APIs
  • Execute actions in response to user requests

This is where tool calling comes in - it’s the bridge between an LLM’s reasoning capabilities and actual functionality in your application.

3. Tool Calling Architecture

At its core, tool calling is simple: you define functions the LLM can use, and it decides when and how to use them. But building a robust tool system requires careful design. Let’s break it down:

import { z } from 'zod';

// Core tool interface
interface Tool {
  name: string;
  description: string;
  parameters: z.ZodObject<any>;
  execute: (params: any) => Promise<any>;
}

// Example weather tool implementation
const weatherTool: Tool = {
  name: 'get_weather',
  description: 'Get current weather for a location',
  parameters: z.object({
    location: z.string(),
    units: z.enum(['celsius', 'fahrenheit']).optional()
  }),
  execute: async (params) => {
    const response = await fetch(`https://api.weather.com/${params.location}`);
    if (!response.ok) {
      throw new Error(`Weather API returned ${response.status}`);
    }
    return response.json();
  }
};
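
To wire this interface up to a model, you need a small dispatch layer: the LLM responds with the name of the tool it wants to call plus JSON-encoded arguments, and your code looks up the tool, validates the arguments against its schema, and executes it. Here’s a minimal sketch; the exact shape of the tool-call payload varies by provider, so the { name, arguments } structure below is an assumption:

// Hypothetical dispatcher for a model's tool call
async function dispatchToolCall(
  tools: Tool[],
  call: { name: string; arguments: string }  // assumed provider payload shape
): Promise<any> {
  const tool = tools.find(t => t.name === call.name);
  if (!tool) throw new Error(`Unknown tool: ${call.name}`);

  // parse() throws a descriptive ZodError if the model produced bad arguments
  const params = tool.parameters.parse(JSON.parse(call.arguments));
  return tool.execute(params);
}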

3.1 Tool Response Processing

When tools interact with external systems, they often return data in different formats. For example, a weather API might return temperatures in different units, use varying date formats, or include extra fields we don’t need. We need to:

  1. Validate the data matches our expectations
  2. Transform it into a consistent format
  3. Handle missing or incorrect data gracefully

This is where our previous work with Zod becomes particularly valuable. By defining a strict response schema, we can:

  • Catch API changes early
  • Ensure consistent data structure throughout our application
  • Transform data automatically
  • Provide clear error messages when things go wrong

// Define expected response structure
const WeatherResponse = z.object({
  temperature: z.number(),
  condition: z.string(),
  location: z.string(),
  timestamp: z.date()
});

type Weather = z.infer<typeof WeatherResponse>;

class WeatherTool implements Tool {
  name = 'get_weather';
  description = 'Get current weather for a location';
  parameters = z.object({
    location: z.string(),
    units: z.enum(['celsius', 'fahrenheit']).optional()
  });

  // The API client is injected so it can be mocked in tests (see section 4.4)
  constructor(private fetchWeather: (location: string) => Promise<any>) {}

  async execute(params: any): Promise<Weather> {
    const data = await this.fetchWeather(params.location);

    // Validate and transform response
    return WeatherResponse.parse({
      temperature: data.temp,
      condition: data.weather_condition,
      location: params.location,
      timestamp: new Date(data.ts)
    });
  }
}
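
When a clean failure is preferable to an exception, Zod’s safeParse returns a result object instead of throwing. Here’s a sketch of a hypothetical helper that validates the raw API data and surfaces a clear error message:

// Hypothetical helper: validate raw API data without throwing from Zod directly
function toWeather(raw: any, location: string): Weather {
  const result = WeatherResponse.safeParse({
    temperature: raw.temp,
    condition: raw.weather_condition,
    location,
    timestamp: new Date(raw.ts)
  });

  if (!result.success) {
    // result.error.issues lists exactly which fields failed validation
    throw new Error(`Weather API returned unexpected data: ${result.error.message}`);
  }

  return result.data;
}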

3.2 Error Handling and Retries

Tools need robust error handling to be reliable in production:

class ResilientTool implements Tool {
  constructor(
    private baseTool: Tool,
    private maxRetries: number = 3,
    private backoff: number = 1000
  ) {}

  // Delegate the Tool interface members to the wrapped tool
  get name() { return this.baseTool.name; }
  get description() { return this.baseTool.description; }
  get parameters() { return this.baseTool.parameters; }

  async execute(params: any): Promise<any> {
    let lastError: unknown;

    for (let attempt = 0; attempt < this.maxRetries; attempt++) {
      try {
        return await this.baseTool.execute(params);
      } catch (error) {
        lastError = error;
        if (!this.isRetryable(error)) throw error;
        // Exponential backoff: 1s, 2s, 4s, ...
        await this.sleep(this.backoff * Math.pow(2, attempt));
      }
    }

    throw new Error(`Tool execution failed after ${this.maxRetries} attempts`, { cause: lastError });
  }

  private isRetryable(error: any): boolean {
    // Retry on rate limiting (429) and temporary unavailability (503)
    return error.status === 429 || error.status === 503;
  }

  private sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}

4. Tool Orchestration

When building complex applications, you often need tools to work together. Here’s a pattern for tool orchestration:

interface ToolStep {
  tool: string;                 // tool name doubles as the step ID
  params: Record<string, any>;
  dependencies?: string[];      // IDs of steps this one depends on
}

interface WorkflowDefinition {
  steps: ToolStep[];
}

class ToolOrchestrator {
  private tools: Map<string, Tool>;
  private results: Map<string, any>;

  constructor(tools: Tool[]) {
    this.tools = new Map(tools.map(t => [t.name, t]));
    this.results = new Map();
  }

  async executeWorkflow(workflow: WorkflowDefinition) {
    const executionOrder = this.topologicalSort(workflow.steps);

    for (const stepId of executionOrder) {
      const step = workflow.steps.find(s => s.tool === stepId)!;
      const tool = this.tools.get(step.tool)!;

      // Resolve any parameter dependencies
      const params = this.resolveParameters(step.params);

      const result = await tool.execute(params);
      this.results.set(stepId, result);
    }

    return Object.fromEntries(this.results);
  }

  // Order steps so that every step runs after the steps it depends on
  private topologicalSort(steps: ToolStep[]): string[] {
    const visited = new Set<string>();
    const order: string[] = [];

    const visit = (step: ToolStep) => {
      if (visited.has(step.tool)) return;
      visited.add(step.tool);
      for (const dep of step.dependencies ?? []) {
        const depStep = steps.find(s => s.tool === dep);
        if (depStep) visit(depStep);
      }
      order.push(step.tool);
    };

    steps.forEach(visit);
    return order;
  }

  private resolveParameters(params: Record<string, any>): Record<string, any> {
    // Replace any ${stepId} references with actual results
    return Object.fromEntries(
      Object.entries(params).map(([key, value]) => {
        if (typeof value === 'string' && value.startsWith('${') && value.endsWith('}')) {
          const stepId = value.slice(2, -1);
          return [key, this.results.get(stepId)];
        }
        return [key, value];
      })
    );
  }
}
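
To make the ${stepId} resolution concrete, here’s a hypothetical two-step workflow: the first step fetches the weather, and the second step (an assumed send_alert tool, not defined in this article) receives that result through a parameter reference:

const workflow: WorkflowDefinition = {
  steps: [
    { tool: 'get_weather', params: { location: 'London' } },
    {
      tool: 'send_alert',                     // hypothetical tool
      params: { message: '${get_weather}' },  // resolved to the first step's result
      dependencies: ['get_weather']
    }
  ]
};

// const results = await new ToolOrchestrator([weatherTool, alertTool]).executeWorkflow(workflow);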

4.1 Performance Optimization

Tools can become a performance bottleneck if not handled carefully. Here are some optimization patterns:

Caching

  • Response Caching: Cache tool responses for frequently requested data that doesn’t change often (see the sketch after this list)
  • Cache Invalidation: Set appropriate TTL (Time To Live) based on how frequently the data changes
  • Cache Levels: Consider both memory caching for fast access and persistent caching for longer-term storage
  • Cache Keys: Create cache keys that account for all relevant parameters to avoid incorrect cache hits
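
As a concrete example of response caching, here’s a minimal sketch that follows the same wrapper pattern as ResilientTool. The in-memory Map and the JSON.stringify cache key are simplifying assumptions; JSON.stringify is sensitive to key order, so a production version should normalize parameters first:

// Minimal caching wrapper with a TTL (simplified in-memory sketch)
class CachedTool implements Tool {
  private cache = new Map<string, { value: any; expires: number }>();

  constructor(private baseTool: Tool, private ttlMs: number = 60_000) {}

  get name() { return this.baseTool.name; }
  get description() { return this.baseTool.description; }
  get parameters() { return this.baseTool.parameters; }

  async execute(params: any): Promise<any> {
    // Key covers all parameters to avoid incorrect cache hits
    const key = JSON.stringify(params);
    const hit = this.cache.get(key);
    if (hit && hit.expires > Date.now()) return hit.value;

    const value = await this.baseTool.execute(params);
    this.cache.set(key, { value, expires: Date.now() + this.ttlMs });
    return value;
  }
}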

Batch Processing

  • Request Batching: Group similar requests together to reduce API calls
  • Smart Queuing: Wait a short time for similar requests to arrive before processing
  • Parallel Processing: Execute independent tool calls simultaneously (sketched after this list)
  • Rate Limiting: Respect API limits while maximizing throughput
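
For parallel processing with rate limits in mind, a simple worker pool runs independent tool calls concurrently while capping how many are in flight at once. A minimal sketch:

// Run independent tool calls in parallel with a concurrency cap
async function executeParallel(
  calls: { tool: Tool; params: any }[],
  maxConcurrent = 5
): Promise<any[]> {
  const results: any[] = new Array(calls.length);
  let next = 0;

  // Each worker repeatedly claims the next unclaimed call until none remain
  const workers = Array.from(
    { length: Math.min(maxConcurrent, calls.length) },
    async () => {
      while (next < calls.length) {
        const i = next++;
        results[i] = await calls[i].tool.execute(calls[i].params);
      }
    }
  );

  await Promise.all(workers);
  return results;
}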

4.2 Security Considerations

When implementing tools, security should be a top priority:

class SecureTool implements Tool {
  constructor(
    private baseTool: Tool,
    private validator: (params: any) => boolean,
    private sanitizer: (params: any) => any
  ) {}

  // Delegate the Tool interface members to the wrapped tool
  get name() { return this.baseTool.name; }
  get description() { return this.baseTool.description; }
  get parameters() { return this.baseTool.parameters; }

  async execute(params: any): Promise<any> {
    // Validate input
    if (!this.validator(params)) {
      throw new Error('Invalid tool parameters');
    }

    // Sanitize input
    const sanitizedParams = this.sanitizer(params);

    // Execute with sanitized params
    return this.baseTool.execute(sanitizedParams);
  }
}

// Usage example
const secureWeatherTool = new SecureTool(
  weatherTool,
  (params) => {
    // Validate location format
    return /^[a-zA-Z\s,]+$/.test(params.location);
  },
  (params) => ({
    // Sanitize location
    location: params.location.trim().toLowerCase(),
    units: params.units
  })
);

4.3 Monitoring and Debugging

Tools should be observable in production:

// Minimal metrics client interface; substitute your own implementation
interface MetricsClient {
  record(event: {
    tool: string;
    duration: number;
    status: 'success' | 'error';
    error?: string;
  }): void;
}

class MonitoredTool implements Tool {
  constructor(
    private baseTool: Tool,
    private metrics: MetricsClient
  ) {}

  // Delegate the Tool interface members to the wrapped tool
  get name() { return this.baseTool.name; }
  get description() { return this.baseTool.description; }
  get parameters() { return this.baseTool.parameters; }

  async execute(params: any): Promise<any> {
    const start = performance.now();

    try {
      const result = await this.baseTool.execute(params);

      this.metrics.record({
        tool: this.baseTool.name,
        duration: performance.now() - start,
        status: 'success'
      });

      return result;
    } catch (error: any) {
      this.metrics.record({
        tool: this.baseTool.name,
        duration: performance.now() - start,
        status: 'error',
        error: error.message
      });

      throw error;
    }
  }
}

4.4 Testing Tools

Tools need comprehensive testing to ensure reliability:

describe('WeatherTool', () => {
  let tool: WeatherTool;
  let mockApi: jest.Mock;

  beforeEach(() => {
    mockApi = jest.fn();
    tool = new WeatherTool(mockApi);
  });

  it('handles successful api response', async () => {
    mockApi.mockResolvedValue({
      temp: 20,
      weather_condition: 'sunny',
      ts: '2024-02-08T12:00:00Z'
    });

    const result = await tool.execute({ location: 'London' });
    
    expect(result).toEqual({
      temperature: 20,
      condition: 'sunny',
      location: 'London',
      timestamp: expect.any(Date)
    });
  });

  it('handles api errors', async () => {
    mockApi.mockRejectedValue(new Error('API unavailable'));
    
    await expect(tool.execute({ location: 'London' }))
      .rejects
      .toThrow('API unavailable');
  });
});

5. Best Practices

When implementing tools for your LLM system, keep these guidelines in mind:

Start Simple

  • Begin with basic tools that solve immediate needs
  • Add complexity only when needed
  • Test thoroughly before adding more features

Focus on Reliability

  • Implement proper error handling from the start
  • Use timeouts to prevent hanging operations (see the sketch after this list)
  • Add logging for debugging
  • Monitor tool usage and performance
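
For timeouts, here’s a minimal sketch using Promise.race against a timer. Note that this rejects the caller’s promise but doesn’t cancel the underlying request; true cancellation would require passing an AbortSignal through to the tool:

// Reject if a tool call takes longer than the given time budget
function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  const timeout = new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms)
  );
  return Promise.race([promise, timeout]);
}

// Usage: await withTimeout(weatherTool.execute({ location: 'London' }), 5000, 'get_weather');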

Security First

  • Validate all inputs before processing
  • Sanitize data before sending to external systems
  • Implement rate limiting
  • Use appropriate authentication
  • Keep sensitive data secure

Documentation

  • Document each tool’s purpose and limitations
  • Include example usage and expected responses
  • Document error cases and how to handle them
  • Keep documentation updated as tools evolve

6. Conclusion

Tools transform LLMs from knowledge repositories into systems that can take real actions. By following these patterns and best practices, you can build reliable, performant, and secure tool systems that integrate smoothly with your LLM applications.

Remember: start simple, focus on reliability and error handling, and gradually add sophistication as your needs grow. The patterns shown here will scale with your application’s complexity.

The natural next step is agents: systems that can autonomously use multiple tools to accomplish complex tasks. Stay tuned for our deep dive into agent architectures and implementation patterns.

If you haven’t already, check out our previous articles on prompt engineering techniques and structured output validation - they provide important foundations for working with LLM tools.
