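The earlier article "Monitor a Spring Boot Project with Prometheus in 10 Minutes" covered how to monitor the default metrics that Spring Boot exposes. This article shows how to define custom business metrics, monitor and alert on them with Prometheus, and display them in Grafana.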
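We simulate an accounting system with two main functions, deposit and withdrawal, and define five business metrics for it: deposit count, deposit amount, withdrawal count, withdrawal amount, and balance.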
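These five business metrics use three Prometheus metric types: Counter for the deposit and withdrawal counts, Summary for the deposit and withdrawal amounts (DistributionSummary in Micrometer), and Gauge for the balance.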
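To make the difference between the three metric types concrete, here is a minimal hand-rolled model in plain Java. It is not the Micrometer API; the class names `SimpleCounter`, `SimpleSummary`, and `SimpleGauge` are made up for illustration and only mimic the behaviour the article relies on.

```java
import java.util.concurrent.atomic.DoubleAdder;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Illustrative model of the three metric types; NOT the Micrometer API. */
final class MetricTypesSketch {

    /** Counter: a value that only ever goes up, e.g. the number of deposits. */
    static final class SimpleCounter {
        private final LongAdder count = new LongAdder();
        void increment() { count.increment(); }
        long value() { return count.sum(); }
    }

    /** Summary: records how many observations were made and their total. */
    static final class SimpleSummary {
        private final LongAdder count = new LongAdder();
        private final DoubleAdder sum = new DoubleAdder();
        void record(double amount) { count.increment(); sum.add(amount); }
        long count() { return count.sum(); }
        double sum() { return sum.sum(); }
    }

    /** Gauge: samples the current value on demand; it may rise or fall. */
    static final class SimpleGauge {
        private final Supplier<Number> source;
        SimpleGauge(Supplier<Number> source) { this.source = source; }
        double value() { return source.get().doubleValue(); }
    }

    public static void main(String[] args) {
        SimpleCounter deposits = new SimpleCounter();
        SimpleSummary depositAmount = new SimpleSummary();
        double[] balance = {1000.0};
        SimpleGauge balanceGauge = new SimpleGauge(() -> balance[0]);

        // One deposit of 250 touches all three metrics.
        deposits.increment();
        depositAmount.record(250.0);
        balance[0] += 250.0;

        System.out.println(deposits.value() + " " + depositAmount.sum() + " " + balanceGauge.value());
    }
}
```

The point of the split is that counts and amounts only accumulate, so Prometheus can compute rates and increases from them, while the balance is a point-in-time reading that has to be sampled at scrape time.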
Finally, we display these metrics in Grafana and send an alert notification when the balance drops below 500.
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
# Endpoints to expose
management.endpoints.web.exposure.include=*
# Application name, shown in Prometheus as the "application" tag
management.metrics.tags.application=${spring.application.name}
# Tomcat metrics must be enabled explicitly
server.tomcat.mbeanregistry.enabled=true
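import java.math.BigDecimal;
import javax.annotation.PostConstruct; // jakarta.annotation.PostConstruct on Spring Boot 3+

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.DistributionSummary;
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
@Slf4j
public class AccountServiceImpl implements IAccountService {

    @Autowired
    private MeterRegistry registry;

    // Number of deposits
    private Counter depositCounter;
    // Number of withdrawals
    private Counter withdrawCounter;
    // Deposit amounts
    private DistributionSummary depositAmountSummary;
    // Withdrawal amounts
    private DistributionSummary withdrawAmountSummary;
    // Balance, kept in memory for this demo; volatile so the scrape thread sees updates
    private volatile BigDecimal balance = new BigDecimal(1000);

    @PostConstruct
    private void init() {
        depositCounter = registry.counter("deposit_counter", "currency", "btc");
        withdrawCounter = registry.counter("withdraw_counter", "currency", "btc");
        depositAmountSummary = registry.summary("deposit_amount", "currency", "btc");
        withdrawAmountSummary = registry.summary("withdraw_amount", "currency", "btc");
        // The gauge samples the current balance every time Prometheus scrapes
        Gauge.builder("balanceGauge", () -> balance)
                .tags("currency", "btc")
                .description("Balance")
                .register(registry);
    }

    // Deposit
    @Override
    public void depositOrder(BigDecimal amount) {
        log.info("depositOrder amount:{}", amount);
        try {
            // Increase the balance
            balance = balance.add(amount);
            // Record the deposit count
            depositCounter.increment();
            // Record the deposit amount
            depositAmountSummary.record(amount.doubleValue());
        } catch (Exception e) {
            log.error("depositOrder error", e);
        } finally {
            log.info("depositOrder result:{}", amount);
        }
    }

    // Withdrawal
    @Override
    public void withdrawOrder(BigDecimal amount) {
        log.info("withdrawOrder amount:{}", amount);
        try {
            if (balance.subtract(amount).compareTo(BigDecimal.ZERO) < 0) {
                throw new Exception("Insufficient balance, withdrawal failed");
            }
            // Decrease the balance
            balance = balance.subtract(amount);
            // Record the withdrawal count
            withdrawCounter.increment();
            // Record the withdrawal amount
            withdrawAmountSummary.record(amount.doubleValue());
        } catch (Exception e) {
            log.error("withdrawOrder error", e);
        } finally {
            log.info("withdrawOrder result:{}", amount);
        }
    }
}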
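The service above updates `balance` with a plain read-then-write, which is fine for a demo but can lose updates if two requests arrive at once. One possible way to make the check-and-subtract step atomic is sketched below using only the JDK; the class name `AtomicBalance` is hypothetical and not part of the original project.

```java
import java.math.BigDecimal;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical thread-safe balance holder; a sketch, not the article's code. */
final class AtomicBalance {
    private final AtomicReference<BigDecimal> balance;

    AtomicBalance(BigDecimal initial) {
        this.balance = new AtomicReference<>(initial);
    }

    BigDecimal get() {
        return balance.get();
    }

    /** Adds the amount atomically and returns the new balance. */
    BigDecimal deposit(BigDecimal amount) {
        return balance.updateAndGet(current -> current.add(amount));
    }

    /** Returns true if the amount was withdrawn, false if funds were insufficient. */
    boolean withdraw(BigDecimal amount) {
        while (true) {
            BigDecimal current = balance.get();
            BigDecimal next = current.subtract(amount);
            if (next.signum() < 0) {
                return false; // would go negative: leave the balance untouched
            }
            if (balance.compareAndSet(current, next)) {
                return true;  // nobody changed the balance in between
            }
            // Another thread won the race: retry with the fresh value.
        }
    }

    public static void main(String[] args) {
        AtomicBalance account = new AtomicBalance(new BigDecimal("1000"));
        System.out.println(account.withdraw(new BigDecimal("600")) + " " + account.get());
    }
}
```

A gauge could then read from `account::get`, and the counters would only be incremented when `withdraw` returns true.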
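import java.math.BigDecimal;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping(ControllerConstants.PATH_PREFIX + "/account")
public class AccountController {

    @Autowired
    IAccountService accountService;

    /**
     * Deposit
     */
    @RequestMapping(value = "/deposit", method = RequestMethod.GET)
    public void deposit(@RequestParam("amount") BigDecimal amount) {
        accountService.depositOrder(amount);
    }

    /**
     * Withdrawal
     */
    @RequestMapping(value = "/withdraw", method = RequestMethod.GET)
    public void withdraw(@RequestParam("amount") BigDecimal amount) {
        accountService.withdrawOrder(amount);
    }
}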
## Deposit count
deposit_counter_total
## Total deposit amount
deposit_amount_sum
## Withdrawal count
withdraw_counter_total
## Total withdrawal amount
withdraw_amount_sum
## Balance
balanceGauge
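To check these series without a Prometheus server, you can read the text that the `/actuator/prometheus` endpoint returns. The sketch below parses one sample line of that text format with plain Java. The class `ExpositionLine` and the sample line are assumptions for illustration; the parser is deliberately simplified (no unescaping of label values, no `+Inf` handling, timestamps ignored).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Simplified parser for one sample line of the Prometheus text format. */
final class ExpositionLine {
    // metric name, optional {labels}, then the sample value
    private static final Pattern SAMPLE =
            Pattern.compile("^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\\{(.*)\\})?\\s+(\\S+)");
    // one key="value" pair inside the braces
    private static final Pattern LABEL =
            Pattern.compile("([a-zA-Z_][a-zA-Z0-9_]*)=\"((?:[^\"\\\\]|\\\\.)*)\"");

    final String name;
    final Map<String, String> labels;
    final double value;

    private ExpositionLine(String name, Map<String, String> labels, double value) {
        this.name = name;
        this.labels = labels;
        this.value = value;
    }

    static ExpositionLine parse(String line) {
        Matcher sample = SAMPLE.matcher(line.trim());
        if (!sample.find()) {
            throw new IllegalArgumentException("Not a sample line: " + line);
        }
        Map<String, String> labels = new LinkedHashMap<>();
        if (sample.group(2) != null) {
            Matcher label = LABEL.matcher(sample.group(2));
            while (label.find()) {
                labels.put(label.group(1), label.group(2));
            }
        }
        return new ExpositionLine(sample.group(1), labels, Double.parseDouble(sample.group(3)));
    }

    public static void main(String[] args) {
        // Assumed sample line; the real output depends on your application name and traffic.
        ExpositionLine parsed = parse("deposit_counter_total{application=\"backend\",currency=\"btc\",} 3.0");
        System.out.println(parsed.name + " " + parsed.labels + " " + parsed.value);
    }
}
```

Note the `_total` and `_sum` suffixes: they are added by the Prometheus registry to the names registered in the service (`deposit_counter`, `deposit_amount`), which is why the queries use different names than the Java code.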
Configure a scrape target for the business system in prometheus.yml, pulling metrics every 5s. Because the Prometheus server runs in Docker, it reaches the host machine through host.docker.internal.
  # Business system monitoring
  - job_name: 'SpringBoot'
    # Override the global default scrape interval for this job
    scrape_interval: 5s
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets: ['host.docker.internal:8080']
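Next, configure the alert rule. The container was started with the host directory /data/prometheus mapped to the container's /prometheus directory, so create a rules folder under /data/prometheus/ on the host and add the alert file business-alert.rules there. The rule below fires when the balance is below 500.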
groups:
- name: businessAlert
  rules:
  - alert: balanceAlert
    expr: balanceGauge{application="backend"} < 500
    for: 20s
    labels:
      severity: page
      team: g2park
    annotations:
      summary: "{{ $labels.currency }} balance is insufficient"
      description: "{{ $labels.currency }} balance : {{ $value }}"
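The `for: 20s` clause means the expression has to stay true across evaluations for 20 seconds before the alert moves from pending to firing. The following plain-Java sketch models that state machine under simplifying assumptions (a single series, evaluation times supplied by the caller); `ForClauseSketch` is a made-up name, not Prometheus code.

```java
import java.time.Duration;
import java.time.Instant;

/** Simplified model of an alert rule with a "for" duration. */
final class ForClauseSketch {
    enum State { INACTIVE, PENDING, FIRING }

    private final double threshold;
    private final Duration holdFor;
    private Instant activeSince; // when the condition first became true, or null

    ForClauseSketch(double threshold, Duration holdFor) {
        this.threshold = threshold;
        this.holdFor = holdFor;
    }

    /** Evaluates the rule "balance < threshold" at the given time. */
    State evaluate(Instant now, double balance) {
        if (!(balance < threshold)) {
            activeSince = null;          // condition broken: start over
            return State.INACTIVE;
        }
        if (activeSince == null) {
            activeSince = now;           // first evaluation where it is true
        }
        boolean heldLongEnough = Duration.between(activeSince, now).compareTo(holdFor) >= 0;
        return heldLongEnough ? State.FIRING : State.PENDING;
    }

    public static void main(String[] args) {
        ForClauseSketch rule = new ForClauseSketch(500, Duration.ofSeconds(20));
        Instant t0 = Instant.parse("2024-01-01T00:00:00Z");
        System.out.println(rule.evaluate(t0, 400));
        System.out.println(rule.evaluate(t0.plusSeconds(20), 400));
    }
}
```

Because the comparison is strictly `<`, a balance of exactly 500 never activates the rule; the balance has to drop below 500.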
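Start Prometheus and verify. The Targets page shows that the scrape target is active.
Querying the deposit count shows that the metric is being collected. On the Alerts page, the business alert rule is loaded.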
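In the /data/prometheus/alertmanager directory, add the notification template notify-template.tmpl. This directory is mapped to /etc/alertmanager in the Alertmanager container. The template has two parts, firing alerts and resolved alerts. 2006-01-02 15:04:05 is the reference time layout of the Go language and is a fixed value. Adding 28800e9 (nanoseconds, i.e. 8 hours) converts the time to UTC+8, which is Beijing time.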
{{ define "test.html" }}
{{- if gt (len .Alerts.Firing) 0 -}}
{{ range .Alerts.Firing }}
<h1 align="left" style="color:red;">Alert</h1>
<pre>
Severity: {{ .Labels.severity }} <br>
Alert name: {{ .Labels.alertname }} <br>
Instance: {{ .Labels.instance }} <br>
Summary: {{ .Annotations.summary }} <br>
Details: {{ .Annotations.description }} <br>
Started at: {{ (.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}<br>
</pre>
{{ end }}
{{ end }}
{{- if gt (len .Alerts.Resolved) 0 -}}
{{ range .Alerts.Resolved }}
<h1 align="left" style="color:green;">Resolved</h1>
<pre>
Alert name: {{ .Labels.alertname }}<br>
Severity: {{ .Labels.severity }}<br>
Instance: {{ .Labels.instance }}<br>
Summary: {{ .Annotations.summary }}<br>
Details: {{ .Annotations.description }}<br>
Started at: {{ (.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}<br>
Resolved at: {{ (.EndsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}<br>
</pre>
{{- end }}
{{- end }}
{{- end }}
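The `28800e9` offset in the template is easy to misread, so here is the same arithmetic in Java as a small sketch. It assumes the alert timestamp is in UTC; `AlertTimeSketch` is an illustrative name, and the Java pattern `yyyy-MM-dd HH:mm:ss` plays the role of Go's `2006-01-02 15:04:05` layout.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Mirrors the template's "(.StartsAt.Add 28800e9).Format ..." in Java. */
final class AlertTimeSketch {
    // 28800e9 nanoseconds = 28,800 seconds = 8 hours
    static final Duration OFFSET = Duration.ofNanos(28_800_000_000_000L);

    // Formats in UTC on purpose: the 8-hour shift is applied by hand, as in the template.
    private static final DateTimeFormatter LAYOUT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    static String format(Instant startsAtUtc) {
        return LAYOUT.format(startsAtUtc.plus(OFFSET));
    }

    public static void main(String[] args) {
        System.out.println(OFFSET.toHours());
        System.out.println(format(Instant.parse("2024-01-01T20:30:00Z")));
    }
}
```

For another time zone, only the offset changes, for example 32400e9 for UTC+9.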
Change alertmanager.yml to the following, replacing the accounts with your own.
global:
  smtp_smarthost: smtp.qq.com:465
  smtp_from: 9238223@qq.com
  smtp_auth_username: 9238223@qq.com
  smtp_auth_identity: 9238223@qq.com
  smtp_auth_password: 123
  smtp_require_tls: false
templates:
  # Add the template; the entry is the path inside the container
  - '/etc/alertmanager/notify-template.tmpl'
route:
  group_by: ['alertname']
  receiver: 'default-receiver'
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
receivers:
  - name: default-receiver
    email_configs:
      - to: abc123@foxmail.com
        html: '{{ template "test.html" . }}'
        send_resolved: true
        headers: { Subject: "System monitoring alert{{ if gt (len .Alerts.Resolved) 0 }} resolved{{ end }}" }
global: global Alertmanager settings; here, the SMTP server and account used to send mail.
route: the routing rules for alerts.
receivers: the recipients of alert notifications.
Start Alertmanager and verify
docker start alertmanager
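Open the Status page in the Alertmanager UI. If the page loads and shows the configuration, Alertmanager started successfully.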
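Alert verification: the default balance is 1000. Call the withdrawal endpoint backend/account/withdraw to bring the balance below 500 and trigger the alert. A balance of exactly 500 does not match, because the rule uses a strict `< 500` comparison.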
After about 20s (the rule's `for` duration), Prometheus fires the alert and pushes it to Alertmanager.
Alertmanager then waits for the configured group_wait of 30s before sending the notification.
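Recovery verification: call the deposit endpoint backend/account/deposit to bring the balance back to 500 or more. After about 6m you receive the resolved notification. If that is too long, adjust the group_wait and group_interval values in alertmanager.yml.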
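Start Grafana, click to add a panel, and create three charts: the balance trend, the share of withdrawal versus deposit amounts, and the trend of withdrawal and deposit counts, as follows.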
Balance trend, panel type Stat
sum(balanceGauge{application="backend"})
Share of withdrawal versus deposit amounts, panel type Pie chart
withdraw_amount_sum{application="backend"}
deposit_amount_sum{application="backend"}
Trend of withdrawal and deposit counts, panel type Time series
increase(deposit_counter_total{application="backend"}[5m])
increase(withdraw_counter_total{application="backend"}[5m])
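As a rough intuition for what `increase(...[5m])` computes, the sketch below sums the growth of a counter across the samples in a window and treats any drop as a counter reset (for example after an application restart). This is a simplification: the real PromQL function also extrapolates to the window boundaries, so its results are often not whole numbers. `IncreaseSketch` is an illustrative name.

```java
/** Simplified counter increase over a window of samples (no extrapolation). */
final class IncreaseSketch {

    static double increase(double[] samples) {
        double total = 0.0;
        for (int i = 1; i < samples.length; i++) {
            double previous = samples[i - 1];
            double current = samples[i];
            // A drop means the counter restarted from zero, so the whole
            // current value counts as growth since the reset.
            total += (current >= previous) ? (current - previous) : current;
        }
        return total;
    }

    public static void main(String[] args) {
        // Counter samples scraped inside one window, with a restart after the third sample.
        System.out.println(increase(new double[] {3, 5, 5, 1, 4}));
    }
}
```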
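This article showed how to define custom business metrics in Spring Boot and how to monitor and alert on them; hopefully it is useful to you. Note that the example is deliberately simplified to keep it easy to follow. In real use, metrics can be backed by a database or a cache. For the balance alert, for instance, the gauge can simply call the balance query interface.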
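As one possible shape for a gauge backed by a database or cache, the sketch below wraps a balance lookup in a `Supplier<Number>` with a short time-to-live, so that frequent scrapes do not hit the backing store every time. Everything here is an assumption for illustration: `CachedBalanceSupplier` is a made-up class, and the `lookup` argument stands in for whatever query your system provides.

```java
import java.math.BigDecimal;
import java.time.Duration;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical gauge source that caches a balance lookup for a short time. */
final class CachedBalanceSupplier implements Supplier<Number> {
    private final Supplier<BigDecimal> lookup; // stand-in for a DB or cache query
    private final long ttlNanos;
    private final LongSupplier nanoClock;      // injectable for tests; System::nanoTime in production
    private BigDecimal cached;
    private long loadedAt;

    CachedBalanceSupplier(Supplier<BigDecimal> lookup, Duration ttl, LongSupplier nanoClock) {
        this.lookup = lookup;
        this.ttlNanos = ttl.toNanos();
        this.nanoClock = nanoClock;
    }

    @Override
    public synchronized Number get() {
        long now = nanoClock.getAsLong();
        if (cached == null || now - loadedAt >= ttlNanos) {
            cached = lookup.get(); // refresh only when the cached value is stale
            loadedAt = now;
        }
        return cached;
    }

    public static void main(String[] args) {
        Supplier<Number> source = new CachedBalanceSupplier(
                () -> new BigDecimal("1000"), Duration.ofSeconds(5), System::nanoTime);
        System.out.println(source.get());
    }
}
```

Such a supplier could be passed where the service currently passes `() -> balance` to `Gauge.builder`, since that builder accepts a `Supplier<Number>`.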