Prometheus 2.3: Remote Read Revamp and Massive Query Speedups

Introduction

Prometheus 2.3.0, released on May 15, 2018, is a feature-rich release aimed at large Kubernetes clusters and advanced PromQL workflows. New subquery syntax, a rewritten remote read path, and smarter staleness detection help teams diagnose microservice issues faster.


Release Highlights

Subqueries Arrive

  • PromQL now supports nested range evaluations (metric[5m:1m]), unlocking native SLA burn-rate and SLO error budget calculations.
  • Enables gap-free dashboards without an explosion of recording rules.
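
To make the [range:resolution] notation concrete, here is a minimal Python sketch of the timestamps at which a subquery would evaluate its inner expression. It assumes steps are aligned to multiples of the resolution and that both ends of the window are inclusive; the function name is hypothetical, and this is an illustration rather than the engine's actual code.

```python
def subquery_timestamps(eval_ts, range_s, step_s):
    """Timestamps (in seconds) at which the inner expression is evaluated.

    Assumption: steps are aligned to multiples of step_s, and both ends
    of the window [eval_ts - range_s, eval_ts] are treated as inclusive.
    """
    if range_s <= 0 or step_s <= 0:
        raise ValueError("range and step must be positive")
    window_start = eval_ts - range_s
    # Round the window start up to the next multiple of the step.
    first = -(-window_start // step_s) * step_s
    return list(range(first, eval_ts + 1, step_s))


# metric[5m:1m] evaluated at t=1000s yields five inner evaluations.
print(subquery_timestamps(1000, 300, 60))  # [720, 780, 840, 900, 960]
```

Each returned timestamp corresponds to one evaluation of the inner expression, so a finer resolution multiplies the work done per query.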

Remote Read Re-Architecture (Experimental)

  • Adds streaming chunk retrieval to reduce latency when federating long time ranges.
  • Batches requests so that Thanos, Cortex, and InfluxDB integrations scale to larger reads.
  • Includes max_samples_per_query safeguards for runaway federated queries.
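
The combination of streaming and a sample cap can be sketched in a few lines of Python. This is not the remote read wire protocol: the names stream_batches and SampleLimitExceeded are hypothetical, and the max_samples argument only mirrors the idea behind the safeguard listed above.

```python
from typing import Iterable, Iterator, List, Tuple

Sample = Tuple[int, float]  # (timestamp in ms, value)


class SampleLimitExceeded(Exception):
    """Raised when a read would return more samples than allowed."""


def stream_batches(samples: Iterable[Sample], batch_size: int,
                   max_samples: int) -> Iterator[List[Sample]]:
    """Yield samples in fixed-size batches, aborting once max_samples is passed."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    seen = 0
    batch: List[Sample] = []
    for sample in samples:
        seen += 1
        if seen > max_samples:
            raise SampleLimitExceeded(f"more than {max_samples} samples requested")
        batch.append(sample)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly short, batch


# Ten illustrative samples, streamed four at a time.
data = [(ts, float(ts)) for ts in range(0, 10_000, 1_000)]
for chunk in stream_batches(data, batch_size=4, max_samples=100):
    print(len(chunk))  # 4, 4, 2
```

Because the generator yields as it goes, the consumer can start processing the first batch before the rest has been read, and a runaway query fails early instead of exhausting memory.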

Smarter Staleness Tracking

  • Series are marked stale automatically once fresh samples stop arriving; no manual workarounds or fallbacks are needed.
  • Alerting rules gain consistency during rollouts or exporter restarts.
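
The effect of staleness tracking on an instant lookup can be illustrated with a small Python model. It is a sketch under stated assumptions: STALE is a hypothetical sentinel standing in for a staleness marker, the lookback window is assumed to be five minutes, and instant_value is not a Prometheus API.

```python
STALE = object()   # hypothetical stand-in for a staleness marker
LOOKBACK_S = 300   # assumed five-minute lookback window


def instant_value(samples, eval_ts, lookback_s=LOOKBACK_S):
    """Return the value visible at eval_ts, or None if the series is absent or stale.

    `samples` is a list of (timestamp_seconds, value) pairs sorted by timestamp.
    """
    latest = None
    for ts, value in samples:
        if ts > eval_ts:
            break
        latest = (ts, value)
    if latest is None:
        return None
    ts, value = latest
    # An explicit marker hides the series immediately; otherwise the
    # last sample stays visible until the lookback window expires.
    if value is STALE or eval_ts - ts > lookback_s:
        return None
    return value


with_marker = [(0, 1.0), (15, 2.0), (30, STALE)]
without_marker = [(0, 1.0), (15, 2.0)]
print(instant_value(with_marker, 40))      # None
print(instant_value(without_marker, 40))   # 2.0
```

The second call shows why markers matter for alerting: without one, a vanished series keeps reporting its last value until the lookback window runs out.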

Kubernetes Impact

  • Horizontal Pod Autoscalers leveraging custom metrics see lower query latency.
  • kube-state-metrics dashboards render faster with reduced TSDB contention.
  • Works hand-in-hand with CoreDNS 1.2 and Calico 3.0 exporters via consistent metric naming.

Upgrade Path

  1. Snapshot your TSDB (POST to /api/v1/admin/tsdb/snapshot; requires --web.enable-admin-api).
  2. Roll out new Prometheus containers with --storage.tsdb.retention=15d (or whatever your workload requires).
  3. Update Grafana dashboards to use subqueries for trailing latency percentiles.
  4. Enable remote read proxying in staging before flipping production traffic.
  5. Watch /metrics for prometheus_engine_query_duration_seconds to validate the expected drop.
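
For the last step, a small script can pull the query-duration series out of the /metrics output so that before and after values are easy to compare. The sketch below is a deliberately simplified parser (it assumes label values contain no spaces, commas, or escaped quotes), the helper name metric_samples is hypothetical, and the sample values are illustrative, not measurements.

```python
def metric_samples(exposition_text, metric_name):
    """Collect (labels, value) pairs for one metric from text-format output.

    Simplification: assumes label values contain no spaces, commas,
    or escaped quotes.
    """
    results = []
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        if len(parts) < 2:
            continue
        series, value = parts[0], parts[1]
        name, brace, label_part = series.partition("{")
        if name != metric_name:
            continue
        labels = {}
        if brace:
            for pair in label_part.rstrip("}").split(","):
                if pair:
                    key, _, raw = pair.partition("=")
                    labels[key] = raw.strip('"')
        results.append((labels, float(value)))
    return results


# Illustrative text only; real values come from your own server.
SAMPLE = """\
# TYPE prometheus_engine_query_duration_seconds summary
prometheus_engine_query_duration_seconds{slice="inner_eval",quantile="0.9"} 0.012
prometheus_engine_query_duration_seconds{slice="inner_eval",quantile="0.99"} 0.048
prometheus_engine_query_duration_seconds_count{slice="inner_eval"} 120
"""

for labels, value in metric_samples(SAMPLE, "prometheus_engine_query_duration_seconds"):
    print(labels["quantile"], value)
```

Running the same script against output captured before and after the rollout gives a direct comparison of the reported quantiles.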

Example: SLO Burn Rate Subquery

sum(rate(http_requests_total{code=~"5.."}[5m]))
/
sum(rate(http_requests_total[5m]))

Wrap the ratio in a subquery to evaluate it at 5-minute steps over the trailing 6 hours:

(sum(rate(http_requests_total{code=~"5.."}[5m])) /
 sum(rate(http_requests_total[5m])))
[6h:5m]
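
To connect the error ratio above to a burn rate, the arithmetic can be sketched in Python. The 99.9% SLO target and the 1% error ratio are hypothetical numbers chosen for illustration, and both helper names are assumptions rather than part of any Prometheus tooling.

```python
def as_subquery(expr, window, step):
    """Wrap a PromQL expression string in subquery notation."""
    return f"({expr})[{window}:{step}]"


def burn_rate(error_ratio, slo_target):
    """How many times faster than allowed the error budget is being spent."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1")
    budget = 1.0 - slo_target  # allowed error ratio
    return error_ratio / budget


ratio = ('sum(rate(http_requests_total{code=~"5.."}[5m])) / '
         'sum(rate(http_requests_total[5m]))')
print(as_subquery(ratio, "6h", "5m"))

# Hypothetical numbers: a 1% error ratio against a 99.9% SLO
# consumes the budget roughly ten times faster than allowed.
print(burn_rate(0.01, 0.999))
```

A burn rate of 1 means the budget would last exactly the SLO period; values well above 1 are the ones worth alerting on.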

Summary

Aspect          Details
Release Date    May 15, 2018
Key Gains       Subqueries, faster remote read, staleness intelligence
Why it Matters  Gives Kubernetes operators richer SLO tooling and scalable multi-cluster observability

Prometheus 2.3 cements the project as the de facto metrics backbone for Kubernetes. With subqueries and efficient federation, platform teams can implement SRE-grade error budgets without paying a performance tax.