Service Levels

Service Levels: SLI, SLO, SLA

Service levels provide a structured way to define and measure system reliability. They help align engineering and business expectations using shared, understandable terms.

SLI — Service Level Indicator

A quantitative measurement of some aspect of system performance. Examples: request latency (95th percentile), error rate, availability percentage, throughput. An SLI answers: "How is the system performing right now?"
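As a rough illustration of how the example SLIs above could be derived from raw request data, here is a minimal Python sketch. The `Request` record, the `compute_slis` helper, the nearest-rank percentile method, and the sample data are all assumptions made for this example; a real system would normally read these values from its monitoring stack.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of one served request."""
    latency_ms: float
    ok: bool  # True if the request succeeded


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def compute_slis(requests):
    """Compute three example SLIs over one measurement window."""
    total = len(requests)
    errors = sum(1 for r in requests if not r.ok)
    return {
        "p95_latency_ms": percentile([r.latency_ms for r in requests], 95),
        "error_rate": errors / total,
        "availability": (total - errors) / total,
    }


# Made-up sample window: 20 requests with latencies 100..290 ms and one failure.
sample = [Request(latency_ms=100 + 10 * i, ok=(i != 7)) for i in range(20)]
slis = compute_slis(sample)
print(slis)
```

Here availability is measured as the fraction of successful requests; time-based availability (uptime divided by total time) is an equally common choice.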

SLO — Service Level Objective

Defines the target value or acceptable range for a given SLI, usually over a fixed time window. Examples: 99.9% availability per month, 95% of requests under 200 ms, error rate below 0.1%. An SLO answers: "What level of reliability are we committing to achieve?" SLOs help teams prioritize reliability work and balance stability with feature delivery.
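To make the arithmetic behind such targets concrete, the sketch below checks a measured SLI against an SLO and converts an availability target into the downtime it allows (often called the error budget). The `SLO` class, the `allowed_downtime_minutes` helper, and the 30-day window are assumptions chosen for illustration, not a standard API.

```python
from dataclasses import dataclass


@dataclass
class SLO:
    """Hypothetical SLO: a target threshold for one SLI."""
    name: str
    target: float
    higher_is_better: bool  # True for availability, False for error rate or latency

    def is_met(self, sli_value: float) -> bool:
        # This sketch treats a value exactly at the target as meeting the SLO.
        if self.higher_is_better:
            return sli_value >= self.target
        return sli_value <= self.target


def allowed_downtime_minutes(availability_target: float, window_days: int = 30) -> float:
    """Minutes of downtime permitted by an availability target over the window."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (1 - availability_target)


availability_slo = SLO("availability", target=0.999, higher_is_better=True)
error_rate_slo = SLO("error_rate", target=0.001, higher_is_better=False)

availability_ok = availability_slo.is_met(0.9995)  # measured 99.95% against a 99.9% target
error_rate_ok = error_rate_slo.is_met(0.002)       # measured 0.2% against a 0.1% target
budget = allowed_downtime_minutes(0.999)           # roughly 43.2 minutes in a 30-day month
print(availability_ok, error_rate_ok, round(budget, 1))
```

A 99.9% monthly target therefore leaves about 43 minutes of downtime, which is the margin a team can spend on risky changes before reliability work has to take priority.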

SLA — Service Level Agreement

A formal business contract between a service provider and its customers, based on SLOs. It defines the promised level of service and the consequences if targets are not met (e.g., financial penalties or service credits). An SLA answers: "What are the business consequences if reliability targets are not met?"
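The consequence side of an SLA can be pictured as a lookup from measured availability to a service credit. The tiers and percentages below are invented purely for illustration and do not come from any real contract; actual SLAs define their own schedules and measurement rules.

```python
# Hypothetical credit schedule: (minimum availability, credit as % of the monthly fee).
# Tiers are checked from best to worst; all numbers are illustrative only.
CREDIT_TIERS = [
    (0.999, 0),   # SLA met: no credit owed
    (0.99, 10),
    (0.95, 25),
]
MAX_CREDIT_PERCENT = 50  # applies when availability falls below the lowest tier


def service_credit_percent(measured_availability: float) -> int:
    """Return the credit owed for one billing period under the schedule above."""
    for minimum, credit in CREDIT_TIERS:
        if measured_availability >= minimum:
            return credit
    return MAX_CREDIT_PERCENT


for availability in (0.9995, 0.995, 0.97, 0.90):
    print(availability, service_credit_percent(availability))
```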

Why Service Levels Matter

Service levels translate technical performance into business language. They provide a common framework for discussing reliability, expectations, and accountability across engineering, operations, and leadership. They define system health in measurable, objective terms rather than opinions.
