1. Introduction
In Part 1, we covered implementing background jobs using PostgreSQL with pg-boss. While pg-boss provides a solid foundation for background processing, certain use cases demand more sophisticated queue management and processing capabilities. This guide explores advanced background processing patterns using Bull, Redis, and hybrid architectures.
As applications scale, they often encounter scenarios where PostgreSQL-based job queues become a bottleneck. Common challenges include handling high-volume job processing, managing complex job dependencies, and providing real-time progress updates. Bull, backed by Redis, addresses these limitations with features specifically designed for sophisticated job processing requirements.
1.1 Real-World Scenarios
An e-commerce platform processing thousands of concurrent checkouts during peak sales periods requires:
- Immediate job ingestion without database write constraints
- Multiple staged operations (payment verification, inventory updates, notifications)
- Real-time status updates for user feedback
- Guaranteed job completion even under heavy load
A content platform handling video processing needs:
- Multi-stage processing pipelines (transcoding, thumbnail generation)
- Progress tracking for long-running jobs
- Automatic retries with backoff strategies
- Resource-intensive background operations
These scenarios highlight where Bull’s architecture provides significant advantages:
- Redis-based queue management enables rapid job ingestion and processing
- Built-in support for job dependencies and workflows
- Real-time event system for progress tracking
- Advanced scheduling and prioritization capabilities
- Robust error handling and retry mechanisms
2. Moving Beyond PostgreSQL-Based Queues
If you’re currently using a simple polling solution or pg-boss, it’s worth understanding what a Redis-backed queue changes architecturally before committing to a migration.
Redis fundamentally changes the background processing architecture by providing an in-memory data store optimized for queue operations. This enables Bull to offer features that are challenging to implement efficiently with PostgreSQL:
- Sub-millisecond job operations
- Real-time job status updates
- Efficient handling of job priorities and dependencies
- Automatic cleanup of completed jobs
- Built-in rate limiting and concurrency control
2.1 When to Adopt Bull and Redis
The decision to migrate from pg-boss to Bull typically comes from specific technical requirements rather than general scaling needs:
High-Frequency Job Processing When your application needs to handle thousands of jobs per minute with minimal latency, Bull’s Redis backend provides significant performance advantages. A typical example is processing user interactions in real-time, where job ingestion speed directly impacts user experience.
Complex Job Orchestration For workflows requiring multiple dependent stages, Bull offers native support for job dependencies and workflow management. Video processing pipelines exemplify this, where multiple processing stages must execute in sequence with proper error handling at each step.
Real-Time Progress Tracking Applications requiring live feedback benefit from Bull’s event-driven architecture. For instance, large file uploads with multiple processing stages can provide accurate progress updates to users, something that’s more complex to implement with PostgreSQL-based queues.
3. Implementing Bull with Redis
When building production-grade background job systems, we need a robust architecture that can handle real-world complexities. Bull, backed by Redis, provides the foundation for this architecture through its high-performance job queue system.
3.1 Core Architecture
The system consists of three main components working together:
- Queue Manager: A singleton service that handles queue creation, configuration, and lifecycle. It ensures consistent queue settings across your application and manages Redis connections efficiently.
- Worker Processes: Dedicated processes that execute jobs from the queues. These can be scaled independently and handle job processing concurrency.
- Job Processors: Modular components containing the business logic for different job types, such as email sending, image processing, or data exports.
Redis acts as the backbone for this architecture, providing:
- Fast job storage and retrieval
- Reliable job state management
- Real-time communication between components
- Atomic operations for job handling
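Conceptually, Bull keeps each queue's jobs in Redis lists and sorted sets and moves job ids between them atomically (for example, from the wait list to the active list). A rough in-memory model of that state machine, purely for illustration and not Bull's actual implementation:

```typescript
// Illustrative only: models the wait -> active -> completed/failed moves
// that Bull performs atomically in Redis.
type JobState = 'wait' | 'active' | 'completed' | 'failed';

class MiniQueueModel {
  private lists: Record<JobState, string[]> = {
    wait: [], active: [], completed: [], failed: []
  };

  // Like LPUSH onto the wait list
  add(jobId: string): void {
    this.lists.wait.unshift(jobId);
  }

  // Like RPOPLPUSH wait -> active: take the oldest waiting job
  take(): string | undefined {
    const jobId = this.lists.wait.pop();
    if (jobId !== undefined) this.lists.active.push(jobId);
    return jobId;
  }

  // Move from active to a terminal list
  finish(jobId: string, ok: boolean): void {
    this.lists.active = this.lists.active.filter(id => id !== jobId);
    this.lists[ok ? 'completed' : 'failed'].push(jobId);
  }

  counts(): Record<JobState, number> {
    return {
      wait: this.lists.wait.length,
      active: this.lists.active.length,
      completed: this.lists.completed.length,
      failed: this.lists.failed.length
    };
  }
}
```

Because the real moves are single atomic Redis operations, a job can never be lost between states even if a worker crashes mid-transition.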
3.2 Queue Manager Implementation
Let’s implement a robust Queue Manager that handles queue creation and configuration:
```typescript
// lib/bull/config.ts
import Bull from 'bull';

export interface RedisConfig {
  host: string;
  port: number;
  password?: string;
  tls?: boolean;
}

export class QueueManager {
  private static instance: QueueManager;
  private queues: Map<string, Bull.Queue>;
  private redisConfig: RedisConfig;

  private constructor(config: RedisConfig) {
    this.queues = new Map();
    this.redisConfig = config;
  }

  static getInstance(config: RedisConfig): QueueManager {
    if (!QueueManager.instance) {
      QueueManager.instance = new QueueManager(config);
    }
    return QueueManager.instance;
  }

  getQueue(name: string): Bull.Queue {
    if (!this.queues.has(name)) {
      const queue = new Bull(name, {
        redis: this.redisConfig,
        defaultJobOptions: {
          removeOnComplete: false, // Keep job history
          removeOnFail: false,     // Keep failed jobs
          attempts: 3,             // Retry failed jobs
          backoff: {
            type: 'exponential',   // Exponential backoff
            delay: 2000            // Start with 2 seconds
          }
        }
      });
      this.queues.set(name, queue);
    }
    return this.queues.get(name)!;
  }

  async closeAll(): Promise<void> {
    await Promise.all(
      Array.from(this.queues.values()).map(queue => queue.close())
    );
  }
}
```
This Queue Manager implements the singleton pattern to ensure consistent queue configuration across your application. It provides sensible defaults for job options while allowing flexibility for different queue types.
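With the exponential backoff configured above (`delay: 2000`), each retry waits roughly twice as long as the previous one. A small helper makes the resulting schedule visible; this is an approximation for illustration, since Bull computes the exact delay internally and the rounding can differ between versions:

```typescript
// Approximate retry schedule for Bull's 'exponential' backoff type.
// Illustrative only; exact values may differ slightly between Bull versions.
function retryDelays(attempts: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt < attempts; attempt++) {
    // Each retry waits roughly twice as long as the previous one
    delays.push(baseDelayMs * 2 ** (attempt - 1));
  }
  return delays;
}
```

With `attempts: 3` and `delay: 2000` as configured above, a failing job retries after about 2 seconds and again after about 4 seconds before landing in the failed set.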
3.3 Job Processors
Each type of job gets its own processor, keeping the code organized and maintainable.
Here’s an example of an email processor:
```typescript
// workers/processors/email.ts
import { Job } from 'bull';
import { sendEmail } from '../../lib/email';

export async function processEmailJob(job: Job) {
  const { to, template, data } = job.data;

  // Report progress (useful for long-running jobs)
  await job.progress(10);

  try {
    await sendEmail(to, template, data);
    await job.progress(100);

    // Return value is stored in job.returnvalue
    return { sent: true, timestamp: new Date() };
  } catch (error) {
    // Re-throw so Bull handles retries based on our config
    throw error;
  }
}
```
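`job.progress()` accepts any number from 0 to 100, so for multi-stage jobs it is convenient to derive the percentage from the stage index. A tiny helper for that (hypothetical, not part of Bull's API):

```typescript
// Hypothetical helper: map "stage k of n complete" to a 0-100 progress value
function stageProgress(completedStages: number, totalStages: number): number {
  if (totalStages <= 0) throw new Error('totalStages must be positive');
  // Clamp so out-of-range stage counts still yield a valid percentage
  const clamped = Math.min(Math.max(completedStages, 0), totalStages);
  return Math.round((clamped / totalStages) * 100);
}
```

A four-stage video pipeline, for example, would report 25, 50, 75, and 100 by calling `await job.progress(stageProgress(k, 4))` after each stage.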
3.4 Worker Process Setup
The worker process brings everything together, connecting queues with their processors:
```typescript
// workers/index.ts
import { QueueManager } from '../lib/bull/config';
import { processEmailJob } from './processors/email';
import { processImageJob } from './processors/image';

const queueManager = QueueManager.getInstance({
  host: process.env.REDIS_HOST!,
  port: parseInt(process.env.REDIS_PORT!, 10),
  password: process.env.REDIS_PASSWORD
});

// Initialize queues with their processors
const queues = [
  {
    name: 'email',
    processor: processEmailJob,
    concurrency: 5, // Process 5 emails at once. Set based on:
                    // - Redis connection limits
                    // - Email service rate limits
                    // - Available system memory
    cleanupOnComplete: true // Remove jobs once they finish
  },
  {
    name: 'image',
    processor: processImageJob,
    concurrency: 2, // CPU intensive, lower concurrency
    cleanupOnComplete: false // Keep image processing jobs for audit
  }
];

// Set up each queue with its processor and completion handler
queues.forEach(({ name, processor, concurrency, cleanupOnComplete }) => {
  const queue = queueManager.getQueue(name);

  // Set up the processor
  queue.process(concurrency, processor);

  // Set up completion handler if cleanup is enabled
  if (cleanupOnComplete) {
    queue.on('completed', async (job) => {
      try {
        await job.remove();
        console.log(`Cleaned up completed job ${job.id} from queue ${name}`);
      } catch (error) {
        console.error(`Error cleaning up job ${job.id} from queue ${name}:`, error);
      }
    });
  }

  // Add error monitoring
  queue.on('failed', (job, error) => {
    console.error(`Job ${job.id} in queue ${name} failed:`, error);
  });

  queue.on('stalled', (jobId) => {
    console.warn(`Job ${jobId} in queue ${name} is stalled`);
  });
});

// Graceful shutdown handler
async function shutdown() {
  console.log('Shutting down workers...');

  // Pause all queues (local pause: this worker stops taking new jobs)
  await Promise.all(
    queues.map(({ name }) =>
      queueManager.getQueue(name).pause(true)
    )
  );

  // Wait for active jobs; swallow individual failures so shutdown proceeds
  await Promise.all(
    queues.map(async ({ name }) => {
      const queue = queueManager.getQueue(name);
      const active = await queue.getActive();
      return Promise.all(
        active.map(job => job.finished().catch(() => undefined))
      );
    })
  );

  // Close Redis connections
  await queueManager.closeAll();
  process.exit(0);
}

process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);
```
This setup provides:
- Independent scaling of different job types
- Graceful shutdown handling
- Progress tracking for long-running jobs
- Automatic retry handling
- Proper resource management through concurrency control
With this implementation in place, you can now add jobs to your queues from anywhere in your application using the Queue Manager. The workers will process these jobs based on their priority and configuration, with built-in error handling and retry logic.
4. Advanced Processing Patterns
Production applications often require more than simple queuing and processing. They need to handle complex workflows, manage system resources, and integrate with external services efficiently. Let’s explore two essential patterns for building robust job processing systems: job dependencies and rate limiting.
4.1 Job Dependencies and Workflows
Real-world business processes rarely consist of single, isolated tasks. Instead, they often involve multiple steps that must execute in a specific sequence, with some steps depending on the successful completion of others. Classic Bull has no first-class dependency API (that arrived with its successor, BullMQ, and its flows); the idiomatic approach is to chain stages, with each processor enqueueing the next stage once it succeeds.
Common scenarios where job dependencies are crucial:
- Order processing with payment verification, inventory updates, and notifications
- User registration flows with email verification and profile setup
- Content publishing with processing, validation, and distribution steps
Here’s how to implement a multi-stage order processing workflow by chaining named jobs, where each processor enqueues its successor on success (`verifyPayment` and `updateInventory` stand in for your domain logic):

```typescript
// lib/bull/workflows.ts
import { QueueManager } from './config';
// redisConfig: your exported Redis settings

const orderQueue = QueueManager.getInstance(redisConfig)
  .getQueue('order-processing');

// Kick off the workflow with payment verification
export async function startOrderProcessingWorkflow(orderId: string) {
  await orderQueue.add('verify-payment', { orderId });
}

// Step 1: verify payment, then enqueue the inventory update
orderQueue.process('verify-payment', async (job) => {
  const { orderId } = job.data;
  await verifyPayment(orderId); // your payment logic
  await orderQueue.add('update-inventory', { orderId });
});

// Step 2: update inventory, then fan out independent follow-ups in parallel
orderQueue.process('update-inventory', async (job) => {
  const { orderId } = job.data;
  await updateInventory(orderId); // your inventory logic
  await Promise.all([
    orderQueue.add('notify-shipping', { orderId }),
    orderQueue.add('send-confirmation', { orderId })
  ]);
});
```

If a stage throws, Bull’s retry configuration applies to that stage alone, and later stages are never enqueued.
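Independent of the queue library, the ordering constraint itself is just a small dependency graph. A minimal topological-sort sketch (pure TypeScript; the stage names mirror the order example) makes a valid execution order explicit:

```typescript
// Illustrative: compute a valid execution order for dependent job stages.
// deps maps each stage name to the stages it depends on.
function executionOrder(deps: Record<string, string[]>): string[] {
  const order: string[] = [];
  const visiting = new Set<string>();
  const done = new Set<string>();

  function visit(stage: string): void {
    if (done.has(stage)) return;
    if (visiting.has(stage)) throw new Error(`dependency cycle at ${stage}`);
    visiting.add(stage);
    // Schedule all prerequisites before this stage
    for (const dep of deps[stage] ?? []) visit(dep);
    visiting.delete(stage);
    done.add(stage);
    order.push(stage);
  }

  Object.keys(deps).forEach(visit);
  return order;
}
```

Running this over the order-processing stages confirms that payment verification always precedes the inventory update, which in turn precedes both notifications.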
4.2 Rate Limiting and Prioritization
When your application interacts with external services or performs resource-intensive operations, you need to control the rate of job processing. This helps you:
- Stay within API rate limits
- Prevent system resource exhaustion
- Ensure fair resource distribution
- Prioritize critical operations
Bull provides powerful rate limiting and prioritization features. Here’s how to implement a rate-limited queue that also supports job priorities:
```typescript
// lib/bull/rate-limiting.ts
import Bull from 'bull';
import { RedisConfig } from './config';
// redisConfig: your exported Redis settings

export class RateLimitedQueue {
  private queue: Bull.Queue;

  constructor(queueName: string) {
    // The limiter must be supplied when the queue is created;
    // it is not a property you can set afterwards
    this.queue = new Bull(queueName, {
      redis: redisConfig,
      limiter: {
        max: 100,        // max jobs per interval
        duration: 60000, // interval in ms (1 minute)
        bounceBack: true // keep rate-limited jobs waiting rather than delayed
      }
    });
  }

  async addJob(data: any, priority: number = 2) {
    return this.queue.add(data, {
      priority, // 1 (highest) to 3 (lowest)
      attempts: 3,
      backoff: {
        type: 'exponential',
        delay: 1000
      }
    });
  }
}
```
Use this rate-limited queue for scenarios like:
- API integrations with rate limits (e.g., sending emails, SMS)
- Resource-intensive operations (e.g., image/video processing)
- Operations that need to be throttled for system stability
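Under the hood, a limiter like `max: 100` per `duration: 60000` is a counting window: jobs inside the window run, and the rest wait for it to reset. A simplified fixed-window model of that logic (illustrative; Bull's limiter lives in Redis and is shared across workers):

```typescript
// Illustrative fixed-window rate limiter: allows `max` jobs per `durationMs`.
// This only models the counting logic, not Bull's Redis-backed implementation.
class WindowLimiter {
  private count = 0;
  private windowStart = 0;

  constructor(private max: number, private durationMs: number) {}

  tryAcquire(nowMs: number): boolean {
    if (nowMs - this.windowStart >= this.durationMs) {
      this.windowStart = nowMs; // new window: reset the counter
      this.count = 0;
    }
    if (this.count < this.max) {
      this.count++;
      return true;
    }
    return false; // over the limit: the job stays queued until the window resets
  }
}
```

The key property is that over-limit jobs are not dropped; they simply remain queued and are admitted once a new window begins.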
For example, you might use different priorities for different types of emails:
```typescript
const emailQueue = new RateLimitedQueue('email');

// High priority: Password reset emails
await emailQueue.addJob({
  type: 'password-reset',
  userId: 'user123'
}, 1);

// Medium priority: Welcome emails
await emailQueue.addJob({
  type: 'welcome',
  userId: 'user456'
}, 2);

// Low priority: Marketing emails
await emailQueue.addJob({
  type: 'marketing',
  userId: 'user789'
}, 3);
```
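Bull treats lower numbers as higher priority and preserves FIFO order within a priority level. A sketch of that dequeue rule (illustrative, not Bull's internals):

```typescript
// Illustrative: Bull picks the lowest priority number first,
// falling back to insertion (FIFO) order within the same priority.
interface QueuedJob {
  type: string;
  priority: number; // 1 = highest
  seq: number;      // insertion order
}

function nextJob(jobs: QueuedJob[]): QueuedJob | undefined {
  return [...jobs].sort(
    (a, b) => a.priority - b.priority || a.seq - b.seq
  )[0];
}
```

With the three emails above enqueued in any order, the password reset is always processed first, even though the marketing email arrived earlier.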
5. Deployment with Dokku
Deploying Bull workers requires careful consideration of process management and scaling. Let’s build on our existing Dokku deployment setup.
5.1 Process Configuration
A robust background job system requires three distinct processes:
```
# Procfile
web: npm start
worker: node dist/workers/index.js
scheduler: node dist/workers/scheduler.js
```
Each process serves a specific purpose:
- web: Your Next.js application handling HTTP requests
- worker: Dedicated job processors handling queued tasks
- scheduler: Manages recurring jobs (similar to cron)
5.2 Redis Setup
First, install and configure Redis using Dokku’s plugin system:
# Install Redis plugin
dokku plugin:install https://github.com/dokku/dokku-redis.git
# Create Redis service with persistence
dokku redis:create background-jobs-redis \
--custom-env REDIS_MAXMEMORY=256mb \
--custom-env REDIS_MAXMEMORY_POLICY=allkeys-lru
# Enable persistence
dokku redis:persist background-jobs-redis
# Link Redis to your app
dokku redis:link background-jobs-redis my-nextjs-app
5.3 Worker Process Configuration
Scale your worker processes based on your needs:
```bash
# Scale to 3 worker processes
dokku ps:scale my-nextjs-app worker=3

# Keep scheduler as singleton to prevent duplicate jobs
dokku ps:scale my-nextjs-app scheduler=1
```
5.4 Environment Variables
Set up your environment variables for the background processing system:
```bash
# REDIS_URL is set automatically by redis:link; add Bull-specific settings:
dokku config:set my-nextjs-app \
  BULL_PREFIX=my-nextjs-app \
  BULL_DEFAULT_JOB_ATTEMPTS=3 \
  BULL_DEFAULT_JOB_REMOVAL_DELAY=3600
```
5.5 Monitoring Configuration
Enable log aggregation for your worker processes:
```bash
# Enable worker process logging
dokku logs:vector-add my-nextjs-app worker

# View worker logs
dokku logs my-nextjs-app -p worker
```
5.6 High Availability Setup
For production deployments requiring high availability:
- Configure Redis Sentinel:

```bash
dokku redis:configure-sentinel background-jobs-redis
```

- Enable automatic failover:

```bash
dokku redis:configure-sentinel background-jobs-redis \
  --sentinel-failover-timeout 5000
```

- For larger deployments, consider Redis Cluster:

```bash
# Create a Redis cluster
dokku redis:create background-jobs-redis --cluster

# Configure cluster settings
dokku redis:configure-cluster background-jobs-redis \
  --nodes 3 \
  --replicas 1
```
ℹ️ Note: Bull has no dedicated cluster flag; when using Redis Cluster, supply cluster-aware connections through the `createClient` option instead:

```typescript
import Bull from 'bull';
import IORedis from 'ioredis';

// redisClusterNodes: [{ host, port }, ...] for your cluster
const queue = new Bull(name, {
  createClient: () => new IORedis.Cluster(redisClusterNodes)
});
```
5.7 Deployment Health Checks
Create a health check endpoint for your worker processes:
```typescript
// pages/api/worker-health.ts
import type { NextApiRequest, NextApiResponse } from 'next';
import { QueueManager } from '../../lib/bull/config';
// redisConfig: your exported Redis settings

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  const queueManager = QueueManager.getInstance(redisConfig);
  const queues = ['email', 'image']; // Add your queue names

  try {
    const status = await Promise.all(
      queues.map(async (name) => {
        const queue = queueManager.getQueue(name);
        const [active, waiting] = await Promise.all([
          queue.getActiveCount(),
          queue.getWaitingCount()
        ]);
        return { name, active, waiting };
      })
    );
    res.status(200).json({ status: 'healthy', queues: status });
  } catch (error) {
    res.status(500).json({
      status: 'unhealthy',
      error: error instanceof Error ? error.message : String(error)
    });
  }
}
```
ℹ️ Note: In production, secure this endpoint by using authentication middleware, restricting to internal IPs, or moving it to a separate internal admin API.
Add this to your CHECKS file:

```
WAIT=10
ATTEMPTS=6
/api/worker-health {"status":"healthy"}
```
6. Production Monitoring
Real-world background job systems need robust monitoring and error handling.
6.1 Error Handling
Implement comprehensive error tracking:
```typescript
// lib/bull/error-handling.ts
import Bull, { Job } from 'bull';
import * as Sentry from '@sentry/node';
import { notifyTeam } from '../alerts'; // your alerting helper

export function setupQueueErrorHandling(queue: Bull.Queue) {
  // Queue-level errors (e.g. lost Redis connections)
  queue.on('error', error => {
    console.error('Queue error:', error);
    Sentry.captureException(error);
  });

  // Failed jobs
  queue.on('failed', async (job: Job, error: Error) => {
    console.error(`Job ${job.id} failed:`, error);

    // Only alert on final failure, once retries are exhausted
    if (job.attemptsMade >= (job.opts.attempts ?? 1)) {
      await notifyTeam({
        queue: queue.name,
        jobId: job.id,
        error: error.message,
        data: job.data
      });

      Sentry.withScope(scope => {
        scope.setContext('job', {
          id: job.id,
          queue: queue.name,
          attempts: job.attemptsMade,
          data: job.data
        });
        Sentry.captureException(error);
      });
    }
  });

  // Stalled jobs (processing took too long or the worker died)
  queue.on('stalled', (jobId: string) => {
    console.warn(`Job ${jobId} stalled`);
  });
}
```
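The "alert only on final failure" check is worth isolating, because the off-by-one is easy to get wrong: by the time the `failed` handler runs, `attemptsMade` already counts the attempt that just failed. A pure version of that predicate (hypothetical helper, not part of Bull):

```typescript
// Hypothetical helper: alert only once the job has exhausted its retries.
// attemptsMade includes the attempt that just failed, so the job is out of
// retries exactly when attemptsMade reaches the configured attempts.
function isFinalFailure(attemptsMade: number, configuredAttempts: number): boolean {
  return attemptsMade >= configuredAttempts;
}
```

With `attempts: 3`, the first two failures are retried silently and only the third triggers an alert.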
6.2 Monitoring Tools
While Bull Board provides a UI, command-line tools can be more reliable for production monitoring:
```typescript
// scripts/queue-stats.ts
import { QueueManager } from '../lib/bull/config';
// redisConfig: your exported Redis settings

async function getQueueStats() {
  const queue = QueueManager.getInstance(redisConfig).getQueue('my-queue');

  const [waiting, active, completed, failed] = await Promise.all([
    queue.getWaitingCount(),
    queue.getActiveCount(),
    queue.getCompletedCount(),
    queue.getFailedCount()
  ]);

  console.table({ waiting, active, completed, failed });
}

getQueueStats().then(() => process.exit(0));
```
7. Conclusion
Building a production-grade background job system requires careful consideration of:
- Job processing patterns and workflows
- Deployment and scaling strategies
- Error handling and monitoring
- Resource management and rate limiting
By following these patterns and practices, you can build a reliable background processing system that scales with your application’s needs. Remember to:
- Start simple and add complexity as needed
- Monitor job queues and worker health
- Implement proper error handling and retries
- Consider resource constraints and rate limits
- Plan for graceful deployments and updates
The combination of Bull, Redis, and proper deployment practices provides a solid foundation for handling complex background processing needs in production environments.