Introduction to the Worker Pool Design Pattern in Go
A man of great wisdom recently told me, “The only incorrect amount of concurrency is unbounded concurrency.”
Concurrency in Go can be challenging to understand, especially if you’re coming from a background in a language where concurrency works differently or isn’t an option. Here is a description of the Worker Pool design pattern in Go, which you can use to fan-out a particular kind of task to multiple concurrent workers.
The Worker Pool Pattern is a popular design pattern in Go for managing concurrency. The pattern efficiently distributes tasks among a fixed number of goroutines, which can help manage resource utilisation and improve performance.
Tip
The only incorrect amount of concurrency is unbounded concurrency
Understanding the Worker Pool Pattern
The worker pool pattern involves creating a specific number of workers (goroutines), each responsible for executing tasks. These workers listen on a common channel that dispatches tasks to them. As tasks arrive, any available worker can pick up a task and execute it. Once a worker completes a task, it returns to listening for more tasks. This model helps in managing a large number of tasks without overloading the system with goroutines, thus optimising resource usage.
Worker Pool Example
func worker(tasks <-chan int, results chan<- int) {
	// Keep receiving tasks from the queue until it is closed, then the loop finishes
	for task := range tasks {
		// Do something useful here; foo is a placeholder for the real work
		results <- foo(task)
	}
	// After exiting the loop, the worker goroutine just dies at the end of this function
}

func runWorkers() {
	// Set the number of workers to an appropriate level by experimentation
	workers := 100
	// In this example we have 1000 tasks
	taskCount := 1000

	// Make buffered channels for tasks and their results
	tasks := make(chan int, taskCount)
	results := make(chan int, taskCount)

	// Spin up the workers
	for w := 1; w <= workers; w++ {
		go worker(tasks, results)
	}

	// Send jobs to the queue for workers to handle them
	for j := 1; j <= taskCount; j++ {
		tasks <- 42
	}

	// Closing the 'tasks' queue stops the workers once they've finished reading everything: see worker() above
	close(tasks)

	// Receive exactly taskCount values from the 'results' channel and use them as they come in
	for a := 1; a <= taskCount; a++ {
		res := <-results
		fmt.Printf("Hello this is a result %d\n", res)
	}
}
Key Components of the Worker Pool in Go
Task Channel: This is a channel through which tasks are sent to the workers. It’s usually buffered to hold multiple tasks waiting to be processed.
Worker Function: Each worker runs a specific function, typically in an infinite loop, listening for tasks on the task channel. When a task is received, the function processes the task and then returns to listening.
Dispatcher: A dispatcher function is responsible for feeding tasks into the task channel. It ensures that tasks are distributed among the available workers.
Results Channel: Optionally, if task results need to be communicated back, a results channel can be used where workers send back the results of processed tasks.
How Many Workers Do I Need?
The Worker Pool Pattern offers you controlled concurrency, so you can vary the number of workers. If their input channel is fed by an I/O bottleneck upstream, increase the number until further increases stop improving performance. If they're doing something computationally expensive, increase your number of workers until CPU becomes the bottleneck and your resources are being well utilised. If a downstream process is writing to a disk or something else slow, increase the number until the output channel is making good use of its buffer. Basically, experiment with that number until things stop getting faster.
Best Libraries for Working with Worker Pools in Go
Here are some popular libraries to explore further:
Ants - Automated management and recycling of very large numbers of goroutines, with an extensive API, periodic purging of goroutines, and efficient memory use.
Tunny - A comparatively lightweight library for spawning and managing a goroutine pool
Workerpool - A simple Worker Pool implementation that limits the concurrency of task execution, without blocking submitting tasks.
Conclusion
The worker pool design pattern is a powerful tool in Go for managing concurrency, particularly when dealing with a high volume of tasks. It optimises resource usage and improves application performance by distributing tasks evenly among a pool of workers. By understanding and implementing this pattern, developers can build more efficient and robust concurrent applications.