Randomized Load Balancing, Caching, and Big-O Math

No ratings

Presented at SREConAsiaAustralia 2018 by

Randomized load balancing is a common strategy to distribute requests across a server farm. When M requests are randomly assigned to N servers, every server is on average responsible for M/N requests. But in practice the distribution is not uniform: how many requests does the busiest server receive, relative to the average? This is the "peak to average load ratio". It is an important quantity that describes how much capacity is “wasted” when a system is provisioned for peak load. The closer this ratio is to one, the more uniform the utilization of servers is.This talk gives a quick overview over two theorems from the following papers: "Balls into Bins" and "Small Cache, Big Effect."Applied to the "requests to servers" scenario, the first theorem gives a closed expression for the amount of requests hitting the busiest server (with high likelihood). One somewhat surprising conclusion is that the peak to average ratio gets worse if number of requests and number of servers grow proportionally. The second theorem analyzes how effective small, fast caches can be, and how effective they remain as a system scales in size.