Prometheus + Grafana
Prometheus
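Installing Prometheus on Ubuntu is straightforward:
apt update
apt install prometheus
systemctl enable prometheus
systemctl enable prometheus-node-exporter
When installed through apt, prometheus and prometheus-node-exporter ship with a working default configuration, so nothing needs to be changed.
Make sure both services are running:
systemctl status prometheus
systemctl status prometheus-node-exporter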
While we are at it, let's monitor MongoDB as well. Install prometheus-mongodb-exporter:
apt install prometheus-mongodb-exporter
systemctl enable prometheus-mongodb-exporter
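Because MongoDB has password authentication enabled, pay attention to the privileges of the MongoDB user the exporter connects as; see the mongodb_exporter GitHub page.
Then edit ARGS in /etc/default/prometheus-mongodb-exporter as follows:
# ARGS='-mongodb.uri="mongodb://localhost:27017"'
ARGS='-mongodb.uri="mongodb://xxx::27017"'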
The MongoDB URI format is:
mongodb://[username:password@]host1[:port1][,...hostN[:portN]][/[database][?options]]
If the username or password contains any of the four characters @ : / % they must be percent-encoded.
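As a minimal sketch of the percent-encoding step, the standard-library `urllib.parse.quote` can escape the credentials before they go into the URI. The helper name `build_mongodb_uri` and the sample password are hypothetical, chosen only for illustration.

```python
from urllib.parse import quote


def build_mongodb_uri(username, password, host="localhost", port=27017):
    """Return a MongoDB URI with percent-encoded credentials."""
    # safe="" makes quote() escape "/" as well, which it leaves alone by default.
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"mongodb://{user}:{pwd}@{host}:{port}"


# Hypothetical credentials containing all four special characters.
print(build_mongodb_uri("mongodb_exporter", "p@ss:w/rd%1"))
```

The printed URI is what would go inside the quotes of `-mongodb.uri` in ARGS.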
If you created the user incorrectly and need to remove it, use db.getSiblingDB("admin").dropUser("mongodb_exporter")
Then restart the service:
systemctl restart prometheus-mongodb-exporter
Grafana
Installation:
sudo add-apt-repository "deb https://packages.grafana.com/oss/deb stable main"
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
apt update
apt install grafana
Configuration:
The configuration file is /etc/grafana/grafana.ini. Pay attention to the following settings:
...
[server]
domain = www.xxxx.com
enforce_domain = true
root_url = %(protocol)s://%(domain)s/grafana
...
[security]
admin_password = xxxx
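The `root_url` value uses `%(name)s` placeholders that are filled in from other keys in the same section. Python's `configparser` happens to use the same placeholder syntax, so it can serve as a rough illustration of how the value expands. Grafana does its own parsing, so this is only an analogy. The `protocol = http` line is an assumption here (it is Grafana's default when the key is not set).

```python
import configparser

# A trimmed copy of the [server] section; protocol is assumed to be the default.
INI_TEXT = """
[server]
protocol = http
domain = www.xxxx.com
enforce_domain = true
root_url = %(protocol)s://%(domain)s/grafana
"""

cfg = configparser.ConfigParser()  # BasicInterpolation expands %(name)s
cfg.read_string(INI_TEXT)

root_url = cfg["server"]["root_url"]
print(root_url)
```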
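Then open www.xxxx.com/grafana and log in with the username admin and the admin_password set above.
Next, follow the instructions at https://github.com/percona/grafana-dashboards to configure Prometheus as the data source and import the dashboards. Importing the following is usually enough:
(Note: pmm-singlestat-panel in the dashboard JSON may need to be replaced with singlestat.)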
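One way to do that replacement is sketched below: walk the parsed dashboard JSON and swap every string value equal to `pmm-singlestat-panel`. The dashboard fragment is a tiny hypothetical example, not taken from the Percona repository, and `swap_panel_type` is an illustrative name.

```python
import json

OLD, NEW = "pmm-singlestat-panel", "singlestat"


def swap_panel_type(node):
    """Return a copy of node with every value equal to OLD replaced by NEW."""
    if isinstance(node, dict):
        return {key: swap_panel_type(value) for key, value in node.items()}
    if isinstance(node, list):
        return [swap_panel_type(item) for item in node]
    return NEW if node == OLD else node


# Hypothetical two-panel dashboard fragment.
dashboard = json.loads(
    '{"panels": [{"type": "pmm-singlestat-panel", "title": "Uptime"},'
    ' {"type": "graph", "title": "CPU"}]}'
)
fixed = swap_panel_type(dashboard)
print(json.dumps(fixed, indent=2))
```

For a real dashboard file, load it with `json.load`, pass the result through `swap_panel_type`, and write it back with `json.dump` before importing it into Grafana.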
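Logging in as a user with the Viewer role
Once the dashboards above are set up, you should stop logging in as admin.
In the settings, "invite" a user, enter your own email address, and set a password through the link sent to that address. You can then log in to Grafana with your email.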
Note:
To monitor two servers with Prometheus, the configuration file /etc/prometheus/prometheus.yml looks like this:
# Sample config for Prometheus.

global:
  scrape_interval: 15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # By default, evaluate rules every 15 seconds.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'example'

# Load and evaluate rules in this file every 'evaluation_interval' seconds.
rule_files:
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s
    scrape_timeout: 5s

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']

  - job_name: "web-server"
    # If prometheus-node-exporter is installed, grab stats about the local
    # machine by default.
    static_configs:
      - targets: ['localhost:9100']

  - job_name: "worker-node1"
    static_configs:
      - targets: ['192.168.0.5:9100']
This configuration works as is; just install and enable prometheus-node-exporter on the other machine (192.168.0.5).
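However, if you change only a job_name (without changing the IP), for example renaming web-server to node, the singlestat panels in Grafana stop rendering and show "Only queries that return single...".
This happens because singlestat can display only one result, while the query now returns two: the series recorded under the old job label are still stored alongside the ones recorded under the new label. The fix is to delete the old series: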
First stop the prometheus service and run it manually with the --web.enable-admin-api flag, then delete the series like this:
curl -X POST -g 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={instance="localhost:9100"}'
Reference: Prometheus: Delete Time Series Metrics
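The matcher in the curl command selects every series of that instance, including those recorded under the new job name. A narrower variant, sketched below under the assumption that the old job was called web-server as in the example above, adds the job label so only the stale series are selected. The helper names are illustrative; the endpoint is the same delete_series admin API, which must be enabled with --web.enable-admin-api.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def delete_series_url(matcher, base="http://localhost:9090"):
    """Build the admin-API URL that deletes all series selected by matcher."""
    query = urlencode({"match[]": matcher})  # percent-encodes braces and quotes
    return f"{base}/api/v1/admin/tsdb/delete_series?{query}"


def delete_series(matcher, base="http://localhost:9090"):
    """Send the delete request; Prometheus replies 204 on success."""
    request = Request(delete_series_url(matcher, base), method="POST")
    with urlopen(request) as response:
        return response.status


# Select only the series that still carry the old job label.
stale = '{job="web-server",instance="localhost:9100"}'
print(delete_series_url(stale))
```

Calling `delete_series(stale)` against a running Prometheus would perform the deletion; the sketch only prints the URL so it can be inspected first.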
Note 2: The query involved in the situation above may look like this (you can see it in Grafana):