# Scrape Configuration
Prometheus collects metrics by scraping HTTP endpoints exposed by targets. Understanding scrape configuration is a core part of the PCA exam.
## How Scraping Works
- Prometheus sends an HTTP GET request to the target's metrics endpoint (default `/metrics`)
- The target responds with metrics in the Prometheus exposition format
- Prometheus parses and stores the metrics with a timestamp
- This repeats at the configured `scrape_interval`
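For orientation, a scrape response body in the text exposition format looks roughly like the following (the metric names and values here are illustrative, not from any specific exporter):

```text
# HELP http_requests_total Total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="400"} 3
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 12.47
```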
## Basic Scrape Configuration
The `scrape_configs` section in `prometheus.yml` defines which targets to scrape:
```yaml
global:
  scrape_interval: 15s      # Default interval for all jobs
  evaluation_interval: 15s  # How often to evaluate rules

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
```
## Key Parameters
| Parameter | Default | Description |
|---|---|---|
| `scrape_interval` | `1m` (global) | How often to scrape targets |
| `scrape_timeout` | `10s` | Timeout for each scrape request |
| `metrics_path` | `/metrics` | HTTP path to scrape |
| `scheme` | `http` | HTTP or HTTPS |
| `honor_labels` | `false` | If true, keep labels from the target |
| `honor_timestamps` | `true` | Use timestamps from the target |
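As a sketch of how these per-job settings fit together (the job name, metrics path, and target below are hypothetical):

```yaml
scrape_configs:
  - job_name: "custom-app"                 # hypothetical job
    scrape_interval: 30s                   # overrides the global default
    scrape_timeout: 10s                    # must not exceed scrape_interval
    metrics_path: "/custom/metrics"        # assumed non-default path
    scheme: https
    honor_labels: true                     # keep the target's labels on conflict
    static_configs:
      - targets: ["app.example.com:8443"]  # hypothetical target
```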
## Static Configuration
The simplest way to define targets:
```yaml
scrape_configs:
  - job_name: "node-exporters"
    scrape_interval: 30s
    static_configs:
      - targets:
          - "node1:9100"
          - "node2:9100"
          - "node3:9100"
        labels:
          env: "production"
          dc: "us-east-1"
```
## Service Discovery
For dynamic environments, Prometheus supports many service discovery mechanisms:
### File-Based Service Discovery
Watch a JSON or YAML file for target changes:
```yaml
scrape_configs:
  - job_name: "file-sd"
    file_sd_configs:
      - files:
          - "targets/*.json"
        refresh_interval: 5m
```
Target file format (`targets/app.json`):
```json
[
  {
    "targets": ["app1:8080", "app2:8080"],
    "labels": {
      "env": "production",
      "app": "myservice"
    }
  }
]
```
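The same target group can also be written as YAML (a hypothetical `targets/app.yml` is shown below); Prometheus watches the listed files and applies changes without a restart:

```yaml
- targets:
    - "app1:8080"
    - "app2:8080"
  labels:
    env: "production"
    app: "myservice"
```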
### Kubernetes Service Discovery
Discover pods, services, nodes, and endpoints in Kubernetes:
```yaml
scrape_configs:
  - job_name: "kubernetes-pods"
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Only keep pods annotated with prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Rewrite the scrape address to use the port from the prometheus.io/port annotation
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
```
### Other SD Mechanisms
- `consul_sd_configs`: Consul service discovery
- `dns_sd_configs`: DNS-based discovery (SRV records)
- `ec2_sd_configs`: AWS EC2 instances
- `azure_sd_configs`: Azure VMs
- `gce_sd_configs`: Google Compute Engine
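As one example, a minimal DNS-based discovery sketch, assuming a hypothetical SRV record `_prometheus._tcp.example.com` that resolves to the scrape targets:

```yaml
scrape_configs:
  - job_name: "dns-sd"
    dns_sd_configs:
      - names:
          - "_prometheus._tcp.example.com"  # hypothetical SRV record
        type: SRV                           # default record type
        refresh_interval: 30s
```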
## Relabeling
Relabeling allows you to modify labels before scraping (`relabel_configs`) or before storage (`metric_relabel_configs`).
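The distinction matters: `relabel_configs` shapes the target list and scrape address, while `metric_relabel_configs` rewrites or drops series after the scrape but before storage. A sketch combining both (the target and the metric being dropped are assumptions for illustration):

```yaml
scrape_configs:
  - job_name: "node-exporters"
    static_configs:
      - targets: ["node1:9100"]   # assumed target
    relabel_configs:
      # Before the scrape: attach a static label to every target
      - target_label: team
        replacement: "platform"
    metric_relabel_configs:
      # Before storage: drop an assumed noisy metric by name
      - source_labels: [__name__]
        regex: "node_scrape_collector_duration_seconds"
        action: drop
```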
### Common Relabel Actions
| Action | Description |
|---|---|
| `keep` | Keep targets matching regex |
| `drop` | Drop targets matching regex |
| `replace` | Set target label to replacement |
| `labelmap` | Map matching labels to new names |
| `labeldrop` | Remove matching labels |
| `labelkeep` | Keep only matching labels |
### Example: Filter by Annotation
```yaml
relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: "true"
```
### Example: Replace Port
```yaml
relabel_configs:
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
```
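### Example: Map Pod Labels

A related pattern, shown here as a sketch, uses `labelmap` to copy Kubernetes pod labels onto the scraped series (for example, `__meta_kubernetes_pod_label_app` becomes `app`):

```yaml
relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
```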
## Key Exam Tips
- **scrape_interval**: The global default is 1 minute. Job-level settings override the global default.
- **honor_labels**: When `false` (default), Prometheus renames conflicting labels from the target by adding an `exported_` prefix.
- **Up metric**: Prometheus automatically adds an `up{job="...", instance="..."}` metric (1 = healthy, 0 = failed scrape); see the query after this list.
- **Meta labels**: Service discovery provides `__meta_*` labels that are available during relabeling but are not stored.
- **`__address__`**: The special label that determines the target's host and port.
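For example, a quick PromQL check for targets whose most recent scrape failed (the job name is hypothetical):

```promql
up{job="node-exporters"} == 0
```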