Service levels provide a structured way to define and measure system reliability. They help align engineering and business expectations using shared, understandable terms.
SLI — Service Level Indicator
A quantitative measurement of some aspect of system performance. Examples: request latency (95th percentile), error rate, availability percentage, throughput. An SLI answers: "How is the system performing right now?"
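As a rough illustration of how the example SLIs above could be derived from raw request data, here is a minimal sketch. The `Request` record, the `compute_slis` helper, and the sample data are all hypothetical, and the nearest-rank percentile is only one of several common percentile definitions.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    latency_ms: float
    ok: bool  # True if the request succeeded (hypothetical success flag)


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def compute_slis(requests):
    """Compute three example SLIs over a non-empty batch of requests."""
    total = len(requests)
    failures = sum(1 for r in requests if not r.ok)
    return {
        "p95_latency_ms": percentile([r.latency_ms for r in requests], 95),
        "error_rate": failures / total,
        "availability_pct": 100 * (total - failures) / total,
    }


# Made-up sample: 20 requests with latencies 10..200 ms, the slowest one failing.
sample = [Request(latency_ms=10 * i, ok=(i != 20)) for i in range(1, 21)]
slis = compute_slis(sample)
print(slis)
```

In practice these values would normally come from a metrics system rather than an in-memory list, but the arithmetic is the same: each SLI is a single number computed from observed behavior.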
SLO — Service Level Objective
A target value or acceptable range for a given SLI, measured over a stated time window. Examples: 99.9% availability per month, 95% of requests under 200 ms, error rate below 0.1%. An SLO answers: "What level of reliability are we committing to achieve?" SLOs help teams prioritize reliability work and balance stability with feature delivery.
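To make the example objectives concrete, the sketch below checks measured SLI values against the three targets listed above and works out how much downtime a 99.9% monthly availability target leaves room for. The `Slo` class, the SLI names, the measured values, and the 30-day month are assumptions made for illustration only.

```python
import operator
from dataclasses import dataclass
from typing import Callable


@dataclass
class Slo:
    sli_name: str
    target: float
    compare: Callable[[float, float], bool]  # called as compare(measured, target)

    def is_met(self, measured: float) -> bool:
        return self.compare(measured, self.target)


# The three example objectives from the text, expressed as data.
slos = [
    Slo("availability_pct", 99.9, operator.ge),          # at least 99.9% available
    Slo("pct_requests_under_200ms", 95.0, operator.ge),  # at least 95% of requests fast
    Slo("error_rate_pct", 0.1, operator.lt),             # error rate below 0.1%
]


def evaluate(slos, measured):
    """Return a mapping of SLI name to whether its objective is met."""
    return {s.sli_name: s.is_met(measured[s.sli_name]) for s in slos}


def allowed_downtime_minutes(availability_pct, days=30):
    """Downtime an availability target permits over a window of `days` days."""
    return (100 - availability_pct) / 100 * days * 24 * 60


# Made-up measurements for one month.
measured = {
    "availability_pct": 99.95,
    "pct_requests_under_200ms": 93.0,
    "error_rate_pct": 0.05,
}
report = evaluate(slos, measured)
print(report)
print(allowed_downtime_minutes(99.9))  # roughly 43.2 minutes in a 30-day month
```

With these made-up numbers the availability and error-rate objectives are met while the latency objective is missed, which is the kind of signal a team can use when deciding whether to prioritize reliability work over new features.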
SLA — Service Level Agreement
A formal business contract based on SLOs. It defines the promised level of service and the consequences if targets are not met (e.g., financial penalties or service credits). An SLA answers: "What are the business consequences if reliability targets are not met?"
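The consequence side of an SLA can often be expressed as a simple lookup from measured service level to remedy. The sketch below assumes a purely hypothetical service-credit schedule; the tiers and percentages are invented for illustration and do not reflect any real provider's contract.

```python
# Hypothetical schedule: (minimum monthly availability %, credit as % of monthly fee).
# Tiers are listed from the highest availability floor to the lowest.
CREDIT_TIERS = [
    (99.9, 0),   # target met: no credit owed
    (99.0, 10),
    (95.0, 25),
]
FALLBACK_CREDIT_PCT = 50  # applies below the lowest tier (also hypothetical)


def service_credit_pct(availability_pct):
    """Return the credit owed for a month with the given measured availability."""
    for floor, credit in CREDIT_TIERS:
        if availability_pct >= floor:
            return credit
    return FALLBACK_CREDIT_PCT


for measured in (99.95, 99.5, 97.0, 90.0):
    print(measured, service_credit_pct(measured))
```

Keeping the schedule as data rather than branching logic makes it easier to compare against the contract text when the terms change.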
Why Service Levels Matter
Service levels translate technical performance into business language. They provide a common framework for discussing reliability, expectations, and accountability across engineering, operations, and leadership. They define system health in measurable, objective terms rather than opinions.