Tracks individual requests across multiple services — In distributed systems, a single user request may travel through many services. Tracing follows that request end-to-end across service boundaries.
Each processing step generates a "span" — Every unit of work a service performs on the request is recorded as a span carrying its start time, duration, status, and contextual attributes, plus the ID of its parent span. Together, these spans form a complete trace of the request lifecycle.
Helps visualize latency and bottlenecks — Tracing makes it possible to see where time is spent, identify slow dependencies, detect failures, and understand service interactions.
Common tools include Zipkin, Jaeger, and Grafana Tempo — These systems collect and store traces, allowing engineers to analyze distributed request flows.
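Downside: requires coordinated instrumentation — Every service must propagate the trace context (the trace ID and the ID of the calling span) on each outgoing request. If one component fails to pass the context, the trace becomes incomplete: downstream spans are either missing or recorded under a new, unrelated trace.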
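
The points above can be made concrete with a small sketch of spans and context propagation. It is not the API of Zipkin, Jaeger, or Grafana Tempo: the names Span, start_span, and inject are hypothetical, and the services are simulated as plain function calls. The only borrowed detail is the header layout, which follows the W3C Trace Context "traceparent" format (version, trace ID, parent span ID, flags). The last simulated call shows what happens when a service fails to forward the header.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """Hypothetical span record; real tracers store more attributes."""
    name: str
    trace_id: str             # shared by every span in one trace
    span_id: str              # unique to this span
    parent_id: Optional[str]  # None marks the root span
    start: float = field(default_factory=time.monotonic)
    end: Optional[float] = None

    def finish(self) -> None:
        self.end = time.monotonic()

    @property
    def duration(self) -> float:
        end = self.end if self.end is not None else time.monotonic()
        return end - self.start


def start_span(name: str, headers: dict) -> Span:
    """Continue the caller's trace if a traceparent header exists, else start a new trace."""
    parent = headers.get("traceparent")
    if parent:
        _version, trace_id, parent_id, _flags = parent.split("-")
    else:
        trace_id, parent_id = secrets.token_hex(16), None
    return Span(name, trace_id, secrets.token_hex(8), parent_id)


def inject(span: Span) -> dict:
    """Build the headers a service attaches to its outgoing request."""
    return {"traceparent": f"00-{span.trace_id}-{span.span_id}-01"}


# Simulated request path: gateway -> orders -> billing
gateway = start_span("gateway", {})
orders = start_span("orders", inject(gateway))
billing_ok = start_span("billing", inject(orders))
billing_lost = start_span("billing", {})  # assumption: orders forgot to forward the header

for span in (billing_ok, billing_lost, orders, gateway):
    span.finish()

for span in (gateway, orders, billing_ok, billing_lost):
    print(span.name, span.trace_id[:8], span.parent_id, f"{span.duration:.6f}s")
```

In this sketch, gateway, orders, and billing_ok share one trace ID and are linked through their parent IDs, so a backend could assemble them into a single trace. billing_lost receives a fresh trace ID and no parent, which is how a dropped context shows up in practice: an orphaned trace that cannot be joined to the original request.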