
Tech Glossary
Load Shedding
Load Shedding refers to a strategy in distributed systems and networking where non-essential or lower-priority requests are deliberately dropped or deferred when the system is overloaded. The goal is to preserve the system's stability and performance by reducing the amount of work it must process, so that critical operations and high-priority requests continue to be handled even under heavy traffic or stress.
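As a minimal illustration (the names, capacity value, and exception type below are hypothetical, not from any particular framework), the simplest form of load shedding is to reject work outright once the number of in-flight requests reaches a fixed cap:

```python
import threading

MAX_IN_FLIGHT = 100  # assumed capacity limit; tune to the service

_in_flight = 0
_lock = threading.Lock()

class Overloaded(Exception):
    """Raised when a request is shed instead of processed."""

def handle_request(process):
    """Run process() unless the server is already at capacity."""
    global _in_flight
    with _lock:
        if _in_flight >= MAX_IN_FLIGHT:
            # Shed the request immediately; the caller can retry later
            # (an HTTP service would typically return 503 here).
            raise Overloaded("server at capacity, request shed")
        _in_flight += 1
    try:
        return process()
    finally:
        with _lock:
            _in_flight -= 1
```

Rejecting early and cheaply is the point: a shed request costs almost nothing, whereas accepting it would slow down every request already in flight.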
Load shedding is commonly used in web services, cloud infrastructure, and distributed systems that need to maintain high availability and responsiveness. It can be implemented through various mechanisms, such as:
Rate Limiting: Capping how many requests the system accepts within a time window, e.g., at most N requests per second; excess requests are rejected or deferred (see the token-bucket sketch after this list).
Backpressure: The system signals to upstream senders that it is overwhelmed and cannot accept additional requests, prompting them to slow their sending rate.
Graceful Degradation: The system reduces service quality (e.g., serving cached data instead of computing fresh results) so that it keeps providing some level of service rather than failing completely.
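A common way to implement the rate-limiting mechanism above is a token bucket. The sketch below is a minimal single-process version; the rate and burst values are illustrative assumptions:

```python
import time

class TokenBucket:
    """Admits up to `rate` requests/second with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum stored tokens (burst size)
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Return True if the request may proceed, False if it is shed."""
        now = time.monotonic()
        # Refill tokens in proportion to elapsed time, up to capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Usage: admit at most 50 requests/second, with bursts of up to 10.
bucket = TokenBucket(rate=50, capacity=10)
if not bucket.allow():
    pass  # shed the request, e.g., respond with HTTP 429
```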
Load shedding is especially important in real-time systems and microservices architectures, where performance degradation in one service can cascade to others if left unchecked. For instance, if a shared service such as a database or message queue receives more requests than it can handle, load shedding ensures that the most critical requests are still processed while less important ones are dropped or delayed (a priority-based sketch follows).
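One way to realize "drop the less important work first" is a bounded admission queue that evicts the lowest-priority pending request when it overflows. The sketch below assumes lower numbers mean more important, an assumption made for the example rather than a standard:

```python
import heapq

class SheddingQueue:
    """Bounded priority queue that sheds the least important item on overflow.

    Lower `priority` numbers are more important (0 = most critical).
    """

    def __init__(self, max_size: int):
        self.max_size = max_size
        self._heap = []  # min-heap of (priority, seq, request)
        self._seq = 0    # insertion counter; keeps FIFO order within a priority

    def submit(self, priority: int, request):
        """Enqueue `request`; returns a shed request if capacity was exceeded."""
        heapq.heappush(self._heap, (priority, self._seq, request))
        self._seq += 1
        if len(self._heap) > self.max_size:
            # Over capacity: remove the least important (highest-numbered,
            # most recently queued) entry. O(n), acceptable for a sketch.
            worst = max(self._heap)
            self._heap.remove(worst)
            heapq.heapify(self._heap)
            return worst[2]
        return None

    def take(self):
        """Dequeue the most critical pending request, or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

A dispatcher would enqueue with submit() and answer any returned (shed) request with a rejection, while workers drain the queue with take(); which requests count as critical (e.g., payments over analytics) is a policy decision the example leaves open.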
In summary, load shedding is a technique used to maintain system performance and prevent crashes by dropping non-critical requests during periods of high load. It helps ensure that high-priority requests are processed, improving system resilience under stress.