GoLang Prometheus Monitoring System

2023-11-08 golang

This post uses the client library provided by Prometheus to add metrics to a Go program and expose them over HTTP.

Metric Types

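Of the four metric types, Counter and Gauge are easy to understand, while Summary and Histogram are slightly more complex. In GoLang, the Summary type is implemented with beorn7/perks. For request latency, a handful of slow requests can easily pull the mean up, which is why quantiles are usually more informative than an average.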
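To make the point about the mean concrete, the following minimal sketch compares the mean with nearest-rank percentiles on a small sample. The latency values and the helper names `mean`, `percentile`, and `sampleLatencies` are made up for illustration; a real Summary uses a streaming estimator rather than sorting every observation.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// mean returns the arithmetic average of xs (xs must be non-empty).
func mean(xs []float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// percentile returns the nearest-rank p-quantile (0 < p <= 1) of xs.
// It sorts a copy, so the caller's slice is left untouched.
func percentile(xs []float64, p float64) float64 {
	sorted := append([]float64(nil), xs...)
	sort.Float64s(sorted)
	idx := int(math.Ceil(p*float64(len(sorted)))) - 1
	if idx < 0 {
		idx = 0
	}
	if idx >= len(sorted) {
		idx = len(sorted) - 1
	}
	return sorted[idx]
}

// sampleLatencies returns hypothetical request latencies in milliseconds:
// eight fast requests and two very slow ones.
func sampleLatencies() []float64 {
	return []float64{10, 12, 11, 13, 12, 11, 10, 12, 900, 1200}
}

func main() {
	xs := sampleLatencies()
	fmt.Printf("mean=%.1fms p50=%.0fms p90=%.0fms\n",
		mean(xs), percentile(xs, 0.5), percentile(xs, 0.9))
}
```

With this sample the mean is about 219 ms even though eight of the ten requests finished within 13 ms, while the median stays at 12 ms.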

Example

package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	temp := prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "home_temperature_celsius",
		Help: "The current temperature in degrees Celsius.",
	})
	prometheus.MustRegister(temp)
	temp.Set(39)

	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}

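The metrics are then available at http://localhost:8080/metrics. Besides home_temperature_celsius, the output contains the default go_ and promhttp_ metric families (the default registry also registers process_ metrics on platforms that support them).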
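As a rough way to check which metric families an endpoint exposes, the text output can be filtered by name prefix. The sketch below is stdlib-only; `linesWithPrefix` is a hypothetical helper, and `sampleScrape` is a hand-written, heavily abbreviated stand-in for a real /metrics response (the values are illustrative, not captured output).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sampleScrape is an abbreviated, hand-written example of the text
// exposition format; a real response contains many more series.
const sampleScrape = `# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 7
# HELP home_temperature_celsius The current temperature in degrees Celsius.
# TYPE home_temperature_celsius gauge
home_temperature_celsius 39
# HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
# TYPE promhttp_metric_handler_requests_in_flight gauge
promhttp_metric_handler_requests_in_flight 1
`

// linesWithPrefix returns the sample lines whose metric name starts with
// prefix, skipping blank lines and "# HELP" / "# TYPE" comment lines.
func linesWithPrefix(body, prefix string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if strings.HasPrefix(line, prefix) {
			out = append(out, line)
		}
	}
	return out
}

func main() {
	for _, prefix := range []string{"go_", "promhttp_", "home_"} {
		fmt.Println(prefix, linesWithPrefix(sampleScrape, prefix))
	}
}
```

Against a running server, the body would instead come from an HTTP GET of the /metrics URL, read fully into a string before filtering.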

Registry

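The example above uses the global default Registry. A custom Registry can be created as well; it implements the Gatherer interface, through which the metrics are exposed. The following program is roughly equivalent to the previous one.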

package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/collectors"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	registry := prometheus.NewRegistry()
	registry.MustRegister(
		collectors.NewProcessCollector(collectors.ProcessCollectorOpts{}),
		collectors.NewGoCollector(),
	)

	temp := prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "home_temperature_celsius",
		Help: "The current temperature in degrees Celsius.",
	})
	registry.MustRegister(temp)
	temp.Set(39)

	http.Handle("/metrics", promhttp.HandlerFor(registry, promhttp.HandlerOpts{Registry: registry}))
	http.ListenAndServe(":8080", nil)
}

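Here NewProcessCollector() provides the process_ metrics and NewGoCollector() provides the go_ metrics. Setting Registry: registry in HandlerOpts registers the handler's own promhttp_ metric (promhttp_metric_handler_errors_total) with that registry; if the field is left unset, the metric is not reported. The promhttp_metric_handler_requests_total and promhttp_metric_handler_requests_in_flight metrics seen with promhttp.Handler() come from promhttp.InstrumentMetricHandler(), which can wrap the handler if they are wanted.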

Labels

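To attach labels, use the NewXXXVec() family of constructors, where XXX is one of the Gauge, Counter, Summary, or Histogram types mentioned above. These constructors take an additional string slice parameter that lists the label names.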

package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	cnt := prometheus.NewCounterVec(prometheus.CounterOpts{
		Name: "http_request_total",
		Help: "Total number of scrapes by HTTP status code.",
	}, []string{"code"})

	// Initialize the most likely HTTP status codes.
	cnt.WithLabelValues("200")
	cnt.WithLabelValues("500")
	cnt.WithLabelValues("503")
	prometheus.MustRegister(cnt)

	cnt.WithLabelValues("200").Inc()
	cnt.With(prometheus.Labels{"code": "200"}).Inc()

	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}

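Of the two lookups, With() is slightly less efficient than WithLabelValues() because a Labels map has to be created and processed on every call. In exchange, it is not sensitive to the order of the label values.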

Metric Types

gauge := prometheus.NewGauge(prometheus.GaugeOpts{
    Name: "gauge",
    Help: "Simple gauge metric.",
})
gauge.Set(0)
gauge.Inc()
gauge.Dec()
gauge.Add(23)
gauge.Sub(42)

counter := prometheus.NewCounter(prometheus.CounterOpts{
    Name: "counter",
    Help: "Simple counter metric.",
})
counter.Inc()
counter.Add(23)

histogram := prometheus.NewHistogram(prometheus.HistogramOpts{
    Name:    "histogram",
    Help:    "Simple histogram metric.",
    Buckets: []float64{0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10},
    //Buckets: prometheus.DefBuckets,
})
histogram.Observe(0.42)
timer := prometheus.NewTimer(histogram)
// ...
timer.ObserveDuration()

requestDurations := prometheus.NewSummary(prometheus.SummaryOpts{
    Name: "http_request_duration_seconds",
    Help: "A summary of the HTTP request durations in seconds.",
    Objectives: map[float64]float64{
        0.5:  0.05,  // 50th percentile, maximum absolute error 0.05.
        0.9:  0.01,  // 90th percentile, maximum absolute error 0.01.
        0.99: 0.001, // 99th percentile, maximum absolute error 0.001.
    },
})
requestDurations.Observe(0.42)

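Note that Histogram quantiles are computed on the server side (by the Prometheus server at query time, from the exported bucket counts), whereas Summary quantiles are computed on the client side, inside the instrumented process.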
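To illustrate what "computed on the server side" means for a Histogram, here is a simplified sketch of the linear interpolation idea behind PromQL's histogram_quantile(). It is not the server's actual implementation and ignores several edge cases; the type `bucket`, the functions `estimateQuantile` and `sampleBuckets`, and the bucket counts are all hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// bucket is one cumulative histogram bucket: count is the number of
// observations less than or equal to upperBound (the "le" label).
type bucket struct {
	upperBound float64
	count      float64
}

// estimateQuantile estimates the q-quantile (0 < q <= 1) from cumulative
// buckets sorted by upperBound, the last of which must be +Inf.
// It assumes observations are spread evenly inside each bucket.
func estimateQuantile(q float64, buckets []bucket) float64 {
	if len(buckets) < 2 || q <= 0 || q > 1 {
		return math.NaN()
	}
	last := len(buckets) - 1
	total := buckets[last].count
	if total == 0 {
		return math.NaN()
	}
	rank := q * total
	// Find the first finite bucket whose cumulative count reaches the rank.
	i := sort.Search(last, func(i int) bool { return buckets[i].count >= rank })
	if i == last {
		// The rank falls into the +Inf bucket: fall back to the
		// highest finite upper bound.
		return buckets[last-1].upperBound
	}
	lower, below := 0.0, 0.0 // the first bucket is assumed to start at 0
	if i > 0 {
		lower, below = buckets[i-1].upperBound, buckets[i-1].count
	}
	inBucket := buckets[i].count - below
	return lower + (buckets[i].upperBound-lower)*(rank-below)/inBucket
}

// sampleBuckets returns made-up cumulative counts for 100 observations.
func sampleBuckets() []bucket {
	return []bucket{
		{0.1, 10}, {0.5, 60}, {1, 90}, {math.Inf(1), 100},
	}
}

func main() {
	b := sampleBuckets()
	fmt.Println("p50:", estimateQuantile(0.5, b))
	fmt.Println("p99:", estimateQuantile(0.99, b))
}
```

With these counts the median lands in the (0.1, 0.5] bucket and is interpolated to roughly 0.42, while the 99th percentile falls into the +Inf bucket, so the sketch returns 1, the highest finite bound. This shows why the choice of bucket boundaries limits the accuracy of Histogram quantiles, whereas a Summary fixes its target quantiles and error bounds in the client.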