Crack SDE

Most of the content is generated by AI, then reviewed, edited, and revised by humans.

How Prometheus scraping works, and how to find the target node and get the metrics

Posted on 04/14/2025 (updated 04/23/2025) by user

Below is an overview of how Prometheus scraping works, how Prometheus discovers (“finds”) targets in Kubernetes or other environments, and how it retrieves metrics from those targets.


1. Prometheus Scraping Fundamentals

  1. Pull-Based Model
    • Prometheus uses a pull model: it periodically sends HTTP requests (scrapes) to endpoints (targets) that expose metrics in a plaintext or OpenMetrics format.
    • By default, metrics are served at a path like http://<host>:<port>/metrics.
  2. Prometheus Configuration (prometheus.yml)
    • Prometheus’ behavior is controlled by a YAML config file (often named prometheus.yml).
    • This config includes one or more scrape_configs sections. Each scrape_config defines how Prometheus discovers targets and where it scrapes them from.

Example snippet of a scrape_config:

scrape_configs:
  - job_name: 'example-service'
    kubernetes_sd_configs:
      - role: service
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
  • job_name: The name of the scrape job.
  • kubernetes_sd_configs: Uses Kubernetes service discovery to find services.
  • relabel_configs: Filters or transforms discovered targets to the correct address/port/path for scraping.
  3. Scrape Interval
    • Each job has a scrape_interval (Prometheus’ built-in default is 1m; many setups lower it to 15 seconds). Prometheus queries each discovered target at that interval.
  4. No “Metrics Files”
    • Prometheus doesn’t fetch “metrics files” in the sense of logs on disk. It sends HTTP GET requests to the target’s /metrics (or another path) endpoint, which returns the metrics in text format (the Prometheus exposition format).
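To make the exposition format concrete, here is a minimal Python sketch (standard library only; the sample metrics below are made up) that parses a few lines of it into (name, labels, value) tuples. A real scraper uses a proper parser that also handles escapes and commas inside label values; this naive split does not.

```python
import re

# A made-up /metrics payload in the Prometheus exposition format.
payload = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 1000
node_cpu_seconds_total{cpu="0",mode="user"} 250
"""

SAMPLE_RE = re.compile(r'^(\w+)(?:\{(.*)\})?\s+([0-9.eE+-]+)$')

def parse(text):
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        if not line or line.startswith('#'):  # skip HELP/TYPE comments
            continue
        m = SAMPLE_RE.match(line)
        if m:
            name, raw_labels, value = m.groups()
            # Naive label split: breaks on commas inside label values.
            labels = dict(
                pair.split('=', 1) for pair in (raw_labels or '').split(',') if pair
            )
            labels = {k: v.strip('"') for k, v in labels.items()}
            samples.append((name, labels, float(value)))
    return samples

for name, labels, value in parse(payload):
    print(name, labels, value)
```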

2. How Prometheus Finds Targets

A. Static Configuration (Non-Kubernetes)

For basic setups (e.g., dev or PoC), you can hardcode targets:

scrape_configs:
  - job_name: 'static_example'
    static_configs:
      - targets: ['192.168.1.10:9100', '192.168.1.11:9100']

Prometheus will scrape each of those targets on the specified port and path.

B. Service Discovery (Kubernetes, EC2, Consul, etc.)

  1. Kubernetes Service Discovery
    • In a Kubernetes cluster, Prometheus can use the Kubernetes API to dynamically discover pods/services/endpoints.
    • Common approaches:
      • role: service: Discover services.
      • role: pod: Discover pods directly.
      • Service Monitors / Pod Monitors if using the Prometheus Operator.
  2. Annotations in Kubernetes
    • A common pattern is to annotate Services or Pods:
      • prometheus.io/scrape: "true"
      • prometheus.io/port: "8080"
      • prometheus.io/path: "/metrics"
    • Prometheus’ relabel_configs can filter in only those targets that have prometheus.io/scrape set to "true".
  3. Other Service Discovery
    • Prometheus also supports EC2, Azure, GCE, Consul, etc.
    • Each discovery mechanism has its own config block (ec2_sd_configs:, consul_sd_configs:, etc.).
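To see what the annotation-based filtering from the snippet above does, here is a hedged Python sketch (not Prometheus code; the discovered targets below are invented) of a `keep` action followed by a port-rewriting `replace` action:

```python
# Hypothetical discovered targets, with metadata keys mirroring
# Prometheus' __meta_kubernetes_* labels.
discovered = [
    {"__address__": "10.0.0.5:8080",
     "__meta_kubernetes_service_annotation_prometheus_io_scrape": "true",
     "__meta_kubernetes_service_annotation_prometheus_io_port": "9090"},
    {"__address__": "10.0.0.6:8080",
     "__meta_kubernetes_service_annotation_prometheus_io_scrape": "false"},
]

def relabel(targets):
    """Keep only annotated targets, then rewrite the scrape port."""
    kept = []
    for t in targets:
        # action: keep -- drop targets whose scrape annotation is not "true"
        if t.get("__meta_kubernetes_service_annotation_prometheus_io_scrape") != "true":
            continue
        # action: replace -- swap the discovered port for the annotated one
        port = t.get("__meta_kubernetes_service_annotation_prometheus_io_port")
        if port:
            host = t["__address__"].rsplit(":", 1)[0]
            t = {**t, "__address__": f"{host}:{port}"}
        kept.append(t)
    return kept

print(relabel(discovered))
```

Only the first target survives, and its address becomes 10.0.0.5:9090.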

3. Scraping Flow in Kubernetes

  1. Prometheus Queries the K8s API
    • Using the credentials provided (often via in-cluster config if you run Prometheus inside K8s), Prometheus queries the Kubernetes API to list Pods, Services, or Endpoints.
  2. Relabeling
    • The discovered targets have metadata (like labels, annotations).
    • Via relabel_configs, Prometheus transforms or filters this metadata to determine the final scrape endpoint (i.e., IP:port and path).
  3. HTTP GET to /metrics
    • On each scrape interval, Prometheus sends an HTTP GET request to each valid target.
    • The target returns a plaintext payload with one line per metric sample (e.g., node_cpu_seconds_total{cpu="0"} 1000).
  4. Prometheus Ingests & Stores
    • Prometheus parses the returned data and stores the time series in its internal TSDB (time-series database).
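The scrape step itself is just an HTTP GET. As a self-contained sketch (using only the Python standard library; the target and its metrics are simulated, not a real exporter), this serves a fake /metrics endpoint on localhost and "scrapes" it once:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# A fake exporter payload, served at /metrics.
METRICS = b'node_cpu_seconds_total{cpu="0"} 1000\n'

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/metrics":
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.end_headers()
            self.wfile.write(METRICS)
        else:
            self.send_error(404)

    def log_message(self, *args):  # keep the demo quiet
        pass

def scrape(url):
    """One 'scrape': an HTTP GET returning the exposition-format body."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode()

# Serve the fake target on an ephemeral localhost port, scrape it once.
server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
body = scrape(f"http://127.0.0.1:{server.server_address[1]}/metrics")
server.shutdown()
print(body)
```

Prometheus does the same thing on every scrape interval, then parses the body and writes the samples to its TSDB.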

4. Verifying Which Targets Are Scraped

  1. Prometheus Web UI
    • Access the Prometheus web UI (e.g., http://<prometheus-host>:9090).
    • Go to Status -> Targets.
    • You’ll see a list of all targets Prometheus is currently scraping, their job name, last scrape time, and scrape status.
  2. Debugging Discovery
    • In the web UI, go to Status -> Service Discovery.
    • This shows you the raw data returned by the service discovery mechanism (like the Kubernetes API) before relabeling.
    • You can see which pods/services are being discovered and how they are labeled.

5. How to “Find the Target Node and Get the Metrics”

  1. Kubernetes (Node)
    • If you want node metrics, you often run Node Exporter as a DaemonSet.
      • This exporter runs on each node (so each node is a target).
      • The node exporter typically exposes metrics on port 9100 at the /metrics path.
    • Alternatively, you can scrape kubelet’s cAdvisor endpoint to get container-level metrics.
  2. Pods and Services
    • If your microservice is instrumented with a Prometheus client library and you expose /metrics, Prometheus can discover and scrape that endpoint.
    • The underlying node is “found” automatically via the K8s service discovery logic (the node IP or Pod IP).
    • You can see in the Prometheus “Targets” page exactly which IP and port it’s scraping.
  3. Raw “Metrics File”
    • Technically, you can fetch the raw metrics text from any target by doing curl http://<target-ip>:<port>/metrics.
    • This is not stored as a file on the node by default. It’s generated dynamically when you hit the /metrics endpoint.
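As a hedged sketch of how a node-exporter DaemonSet might be scraped (assuming node-exporter listens on host port 9100, its default), a scrape_config using role: node could look like:

```yaml
scrape_configs:
  - job_name: 'node-exporter'
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      # role: node discovers the kubelet address (<node-ip>:<kubelet-port>);
      # rewrite it to the node-exporter host port instead.
      - source_labels: [__address__]
        regex: (.+):\d+
        replacement: $1:9100
        target_label: __address__
```

After this rewrite, each cluster node shows up on the Targets page as <node-ip>:9100.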

6. Example (Kubernetes Service Annotation)

Let’s say you have a Service manifest like:

apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: default
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
spec:
  selector:
    app: my-app
  ports:
    - port: 8080
      targetPort: 8080
  • Prometheus sees the annotation prometheus.io/scrape: "true"
  • It scrapes each pod behind that Service on port 8080 at the /metrics path.
  • You can verify this in the Prometheus web UI under Status -> Targets (look for my-app).
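For completeness, the pods behind that Service need the matching app: my-app label. A hedged sketch of the corresponding Deployment (the container name and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app                     # matched by the Service selector above
    spec:
      containers:
        - name: my-app                  # placeholder name
          image: example/my-app:latest  # placeholder image
          ports:
            - containerPort: 8080       # the port the Service targets
```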

7. Key Takeaways

  1. Prometheus “Scrape Configs” define how targets are found and how often they are scraped.
  2. Kubernetes Service Discovery (and optional annotations) automates target discovery in a cluster.
  3. Prometheus Doesn’t Pull Metrics Files from the file system; it makes HTTP GET requests to the /metrics endpoint each scrape interval.
  4. Check the Prometheus UI under Status -> Targets or Status -> Service Discovery to see what endpoints are being scraped and how they’re labeled.

By setting up your scrape_configs properly in prometheus.yml (or by using ServiceMonitor/PodMonitor objects with the Prometheus Operator), Prometheus will automatically find the node, pod, and service endpoints in Kubernetes and scrape the metrics they expose.
