Monitoring a TLS Kubernetes Cluster with Prometheus
Once a Kubernetes cluster is deployed, we need to collect logs from and monitor the Pods running in it. With N servers under Kubernetes management, Pods are created and destroyed automatically, which makes it hard to get a timely view of the state and resource consumption of every Pod and server. It feels like driving a sports car down the highway with no dashboard: unnerving.
In past work I have used monitoring tools such as Nagios, Cacti, and Zabbix, but none of them is a good fit for a Kubernetes cluster. We therefore introduce a new monitoring tool: Prometheus.
About Prometheus
Prometheus is a monitoring system open-sourced by SoundCloud. Its design draws on Google's internal monitoring systems, which makes it a natural fit for Kubernetes, another project with Google roots. Prometheus integrates data collection, storage, and alerting into a single, complete solution. For large cluster environments it introduced a pull-based collection model, a multi-dimensional data model, and service discovery.
Compared with traditional monitoring tools, Prometheus can use service discovery to learn which scrape endpoints the cluster already exposes and then actively pull all the metrics. With this architecture we only need to deploy a single Prometheus instance into the Kubernetes cluster: it queries the apiserver for cluster state and then scrapes Pod metrics from every kubelet that exposes Prometheus metrics. If we also want host-level metrics, we run the companion node-exporter on every server via a DaemonSet, and Prometheus picks up the new data automatically.
This dynamic-discovery architecture suits a Kubernetes environment, where neither servers nor workloads are fixed, and it greatly reduces the operational burden.
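As a rough illustration of the pull model: any process can expose metrics in Prometheus's plain-text exposition format, and Prometheus scrapes that text over HTTP. The sketch below (metric and label names are illustrative, not taken from this deployment) renders one sample the way an exporter would serve it at `/metrics`:

```python
def render_metric(name, labels, value):
    """Render one sample in the Prometheus text exposition format,
    e.g. metric_name{label="x"} 42 -- the line format Prometheus
    pulls from each target's /metrics endpoint."""
    label_str = ",".join('%s="%s"' % (k, v) for k, v in sorted(labels.items()))
    return "%s{%s} %s" % (name, label_str, value)

# One free-memory sample for a hypothetical node-exporter target
line = render_metric("node_memory_MemFree", {"instance": "192.168.1.198:9100"}, 1048576)
print(line)
```

Real exporters add `# HELP` and `# TYPE` comment lines as well; this sketch only shows the sample lines themselves.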
Prometheus website: https://prometheus.io/
Prometheus downloads: https://prometheus.io/download/
Prometheus documentation: https://prometheus.io/docs/introduction/overview/
Environment
- Prometheus v2.2.0
- node-exporter v0.15.2
- Kubernetes v1.8.2
- CentOS 7.4
| Role | IP | Notes |
|---|---|---|
| k8s master | 192.168.1.195 | k8s master |
| k8s node | 192.168.1.198 | k8s node, Prometheus, node-exporter |
| k8s node | 192.168.1.199 | k8s node, Prometheus, node-exporter |
Deploying node-exporter
node-exporter collects hardware- and OS-level metrics from the underlying servers.
The official description:
Prometheus exporter for hardware and OS metrics exposed by *NIX kernels, written in Go with pluggable metric collectors.
The WMI exporter is recommended for Windows users.
node-exporter on GitHub: https://github.com/prometheus/node_exporter
To collect metrics from every node, we deploy the Pods as a DaemonSet.
node-exporter.yaml:

```yaml
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: kube-ops
  labels:
    k8s-app: node-exporter
spec:
  template:
    metadata:
      labels:
        k8s-app: node-exporter
    spec:
      containers:
      - image: prom/node-exporter:latest
        name: node-exporter
        ports:
        - containerPort: 9100
          hostPort: 9100
          protocol: TCP
          name: http
        volumeMounts:
        - name: time
          mountPath: /etc/localtime
          readOnly: true
      volumes:
      - name: time
        hostPath:
          path: /etc/localtime
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: node-exporter
  name: node-exporter
  namespace: kube-ops
spec:
  ports:
  - name: http
    port: 9100
    targetPort: 9100
    protocol: TCP
  selector:
    k8s-app: node-exporter
```
```shell
[root@localhost prometheus]# kubectl create namespace kube-ops
[root@localhost prometheus]# kubectl apply -f node-exporter.yaml
```
```shell
[root@localhost prometheus]# kubectl get pod -o wide -n kube-ops
NAME                  READY     STATUS    RESTARTS   AGE       IP            NODE
node-exporter-8d66t   1/1       Running   0          1h        172.30.41.5   192.168.1.199
node-exporter-xn5ss   1/1       Running   0          1h        172.30.57.6   192.168.1.198
[root@localhost prometheus]#
```
```shell
[root@localhost prometheus]# kubectl get svc -o wide -n kube-ops
NAME            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE       SELECTOR
node-exporter   ClusterIP   172.16.152.14   <none>        9100/TCP   1h        k8s-app=node-exporter
[root@localhost prometheus]#
```
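To sanity-check that node-exporter is serving data, you can `curl http://<node-ip>:9100/metrics` and inspect the returned text. A minimal, illustrative parser for one line of that exposition format (not a full implementation; it ignores HELP/TYPE comments and escaped label values):

```python
import re

def parse_sample(line):
    """Parse one Prometheus exposition line, e.g.
    node_load1{instance="x"} 0.5, into (name, labels-dict, value)."""
    m = re.match(r'([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$', line)
    name, label_body, value = m.group(1), m.group(2) or "", float(m.group(3))
    labels = dict(re.findall(r'(\w+)="([^"]*)"', label_body))
    return name, labels, value

# Sample line of the kind node-exporter emits (value is illustrative)
name, labels, value = parse_sample('node_filesystem_free{device="rootfs"} 1.23e+10')
```

This is handy when scripting quick checks against the 9100 endpoints before wiring Prometheus up to them.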
Deploying the Service Account
Kubernetes enables RBAC from v1.8.0 onward, so we must first grant Prometheus the necessary permissions via RBAC; otherwise its requests to the Kubernetes API server will be rejected and it will be unable to connect.
Reference: https://kubernetes.io/docs/admin/authorization/rbac/
prometheus-service-account.yml:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: kube-ops
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus
  namespace: kube-ops
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus
  namespace: kube-ops
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: kube-ops
```
```shell
[root@localhost prometheus]# kubectl apply -f prometheus-service-account.yml
```

```shell
[root@localhost prometheus]# kubectl get ServiceAccount -n kube-ops
NAME         SECRETS   AGE
default      1         1d
prometheus   1         55m
[root@localhost prometheus]#
```
Deploying the Alertmanager configuration
We use a ConfigMap to hold the Alertmanager configuration.
Reference: https://prometheus.io/docs/prometheus/latest/configuration/configuration/
prometheus-alertmanager-config.yml:

```yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: alertmanager
  namespace: kube-ops
data:
  config.yml: |-
    global:
      smtp_smarthost: 'smtp.exmail.qq.com:465'
      smtp_from: 'user1@example.com'
      smtp_auth_username: 'user1@example.com'
      smtp_auth_password: 'password'
      smtp_require_tls: false
      resolve_timeout: 5m
    templates:
    - '/etc/alertmanager/*.tmpl'
    route:
      receiver: email
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 10d
      group_by: [alertname]
      routes:
      - receiver: email
        group_wait: 10s
        match:
          team: node
    receivers:
    - name: email
      email_configs:
      - send_resolved: true
        to: 'user2@example.com,user3@example.com'
```
- `repeat_interval` sets how long to wait before re-sending a notification for an alert that is still firing.
- `to: 'user2@example.com,user3@example.com'` lists multiple recipients; separate each address with a comma.
```shell
[root@localhost prometheus]# kubectl apply -f prometheus-alertmanager-config.yml
```

```shell
[root@localhost prometheus]# kubectl get ConfigMap -n kube-ops
```
- Update the `global.smtp_*` settings and the email addresses under `receivers` (`email_configs`) to match your own mail account.
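The routing in the config above can be pictured as a simple label match: an alert's labels are compared against each child route's `match` clause, falling back to the top-level receiver. A toy model of that dispatch (a deliberate simplification, not Alertmanager's actual tree-walking implementation):

```python
def pick_receiver(alert_labels, routes, default_receiver):
    """Return the receiver for an alert: the first child route whose
    `match` labels are all present on the alert wins, else the root
    route's receiver applies."""
    for route in routes:
        if all(alert_labels.get(k) == v for k, v in route["match"].items()):
            return route["receiver"]
    return default_receiver

# Mirrors the config: one child route matching team=node -> email
routes = [{"receiver": "email", "match": {"team": "node"}}]
r = pick_receiver({"alertname": "NodeCPUUsage", "team": "node"}, routes, "email")
```

In this configuration both the child route and the root use the `email` receiver, so every alert ends up there; the child route only changes `group_wait` for `team=node` alerts.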
Deploying the Prometheus configuration
prometheus-config.yaml:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: kube-ops
data:
  prometheus.yml: |
    global:
      scrape_interval: 30s
      scrape_timeout: 30s
    alerting:
      alertmanagers:
      - static_configs:
        - targets: ["192.168.1.198:9093"]
    rule_files:
    - "rules.yml"
    scrape_configs:
    - job_name: 'prometheus'
      static_configs:
      - targets: ['localhost:9090']
    - job_name: 'kubernetes-apiservers'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
    - job_name: 'kubernetes-nodes'
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: 192.168.1.195:6443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics
    - job_name: 'kubernetes-cadvisor'
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: 192.168.1.195:6443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    - job_name: 'kubernetes-node-exporter'
      scheme: http
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - source_labels: [__meta_kubernetes_role]
        action: replace
        target_label: kubernetes_role
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__
  rules.yml: |
    groups:
    - name: rule
      rules:
      - alert: NodeFilesystemUsage
        expr: (node_filesystem_size{device="rootfs"} - node_filesystem_free{device="rootfs"}) / node_filesystem_size{device="rootfs"} * 100 > 80
        for: 2m
        labels:
          team: node
        annotations:
          summary: "{{ $labels.instance }}: High Filesystem usage detected"
          description: "{{ $labels.instance }}: Filesystem usage is above 80% (current value is: {{ $value }})"
      - alert: NodeMemoryUsage
        expr: (node_memory_MemTotal - (node_memory_MemFree + node_memory_Buffers + node_memory_Cached)) / node_memory_MemTotal * 100 > 80
        for: 2m
        labels:
          team: node
        annotations:
          summary: "{{ $labels.instance }}: High Memory usage detected"
          description: "{{ $labels.instance }}: Memory usage is above 80% (current value is: {{ $value }})"
      - alert: NodeCPUUsage
        expr: (100 - (avg by (instance) (irate(node_cpu{job="kubernetes-node-exporter",mode="idle"}[5m])) * 100)) > 80
        for: 2m
        labels:
          team: node
        annotations:
          summary: "{{ $labels.instance }}: High CPU usage detected"
          description: "{{ $labels.instance }}: CPU usage is above 80% (current value is: {{ $value }})"
```
- In the `kubernetes-node-exporter` job, the target port is rewritten to 9100, the port node-exporter exposes (`targetPort: 9100` in node-exporter.yaml above); fill this in according to your own setup.
- `kubernetes.default.svc:443` is the Kubernetes API address; if your cluster was installed without the default DNS, change it manually.
- A Prometheus Alertmanager target has been added: update the IP in `alerting.alertmanagers.static_configs.targets`. Here Prometheus and Alertmanager run as two Docker containers, so the IP is that of the host running Alertmanager.
- Alerting rules have been added via `rule_files`, with three rules in `rules.yml`: node filesystem usage, node memory usage, and node CPU usage. When usage exceeds 80%, an alert labeled `team=node` is triggered and routed to the receiver configured for that label in Alertmanager; with the configuration above, that matches the `email` receiver.
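The `__address__` rewrite in the node-exporter job is an anchored regex substitution: Prometheus matches the relabel `regex` against the whole source value and substitutes the capture group into `replacement`. Its effect can be reproduced in Python (purely illustrative; Prometheus uses RE2 full-match semantics internally):

```python
import re

def relabel_address(address, regex=r'(.*):10250', replacement=r'\1:9100'):
    """Mimic the relabel rule: if the whole address matches the regex
    (kubelet port 10250), rewrite the port to node-exporter's 9100;
    otherwise leave the address unchanged."""
    m = re.fullmatch(regex, address)
    return m.expand(replacement) if m else address

print(relabel_address("192.168.1.198:10250"))  # kubelet address -> node-exporter target
print(relabel_address("192.168.1.198:9100"))   # no :10250 suffix, left unchanged
```

Testing a relabel rule this way before reloading Prometheus saves a round of trial and error against live targets.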
```shell
[root@localhost prometheus]# kubectl apply -f prometheus-config.yaml
```

```shell
[root@localhost prometheus]# kubectl get ConfigMap -n kube-ops
```
Deploying Prometheus
We deploy Prometheus itself as a Deployment.
Reference: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
Create a node label:

```shell
[root@localhost ~]# kubectl label node 192.168.1.198 "appNodes=pro-00-monitor"
node "192.168.1.198" labeled
[root@localhost ~]# kubectl get node -a -l "appNodes=pro-00-monitor"
NAME            STATUS    ROLES     AGE       VERSION
192.168.1.198   Ready     <none>    24m       v1.9.2
[root@localhost ~]#
[root@localhost ~]# mkdir -p /data/monitor    # create the data mount directory
```
prometheus-deploy.yaml:

```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    k8s-app: prometheus
  name: prometheus
  namespace: kube-ops
spec:
  replicas: 1
  template:
    metadata:
      labels:
        k8s-app: prometheus
    spec:
      nodeSelector:
        appNodes: pro-00-monitor
      securityContext:
        runAsUser: 0
      serviceAccountName: prometheus
      containers:
      - image: prom/prometheus:v2.2.0
        name: prometheus
        command:
        - "/bin/prometheus"
        args:
        - "--config.file=/etc/prometheus/prometheus.yml"
        - "--storage.tsdb.path=/prometheus"
        - "--storage.tsdb.retention=15d"
        ports:
        - containerPort: 9090
          hostPort: 9090
          protocol: TCP
          name: http
        volumeMounts:
        - mountPath: "/prometheus"
          name: data
          subPath: prometheus/data
        - mountPath: "/etc/prometheus"
          name: config-volume
        - mountPath: "/etc/localtime"
          name: time
          readOnly: true
        resources:
          requests:
            cpu: 1
            memory: 1Gi
          limits:
            cpu: 1
            memory: 2Gi
      - image: prom/alertmanager:v0.14.0
        name: alertmanager
        args:
        - "--config.file=/etc/alertmanager/config.yml"
        - "--storage.path=/alertmanager"
        ports:
        - containerPort: 9093
          hostPort: 9093
          protocol: TCP
          name: http
        volumeMounts:
        - name: alertmanager-config-volume
          mountPath: /etc/alertmanager
        resources:
          requests:
            memory: 500Mi
          limits:
            memory: 1024Mi
      volumes:
      - name: data
        hostPath:
          path: "/data/monitor"
      - name: time
        hostPath:
          path: "/etc/localtime"
      - configMap:
          name: prometheus-config
        name: config-volume
      - name: alertmanager-config-volume
        configMap:
          name: alertmanager
```
```shell
[root@localhost prometheus]# kubectl apply -f prometheus-deploy.yaml
```

```shell
[root@localhost prometheus]# kubectl get pod -o wide -n kube-ops
prometheus-fc7685cc7-rwlc7   1/1       Running   0          34s       172.30.57.7   192.168.1.198
[root@localhost prometheus]#
```

```shell
[root@localhost ~]# netstat -tunlp |egrep '9090|9093'
tcp6       0      0 :::9090      :::*      LISTEN      4023/docker-proxy
tcp6       0      0 :::9093      :::*      LISTEN      3983/docker-proxy
[root@localhost ~]#
```
Accessing Prometheus
Once Prometheus is up, open the Prometheus dashboard at http://ip:9090/graph and click Status -> Targets.
You should see that Prometheus has successfully reached the Kubernetes API server and is collecting metrics.
Accessing Alertmanager
Once Alertmanager is up, open its dashboard at http://ip:9093 .
The Alertmanager endpoint is also visible in the Prometheus dashboard, at the bottom of Status -> Runtime & Build Information.
Alerting rules
Once the rules defined in Prometheus take effect, they appear under Status -> Rules.
Clicking a rule's `expr` jumps straight to the Prometheus graph page with that query loaded; when writing alerting rules, it helps to test the expression in Prometheus first.
The Alerts page in Prometheus shows the state of each triggered rule.
Currently three hosts have triggered the rules.
An alert goes through three possible states during its lifecycle:

- inactive: the alert is neither pending nor firing.
- pending: the expression has been true for less than the configured threshold duration. Prometheus waits (roughly three minutes here), aggregating matching alert entries before sending them to Alertmanager in one batch.
- firing: the expression has been true for longer than the threshold duration. Prometheus sends the alert to Alertmanager; this is the terminal state.
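The pending-to-firing transition driven by the `for: 2m` clause can be sketched as a small state machine (a simplification of Prometheus's actual rule-evaluation loop, which re-evaluates on each interval):

```python
def alert_state(breach_times, now, for_duration=120):
    """Return the alert state given the timestamps (seconds) at which
    the rule expression has been continuously true. `for_duration`
    mirrors the rule's `for: 2m` clause (120 seconds)."""
    if not breach_times:
        return "inactive"
    # The streak started at the first breach timestamp; the alert only
    # fires once the expression has held for the full `for` window.
    if now - breach_times[0] >= for_duration:
        return "firing"   # held long enough: sent to Alertmanager
    return "pending"      # active, but still inside the `for` window

s1 = alert_state([], now=300)                 # never breached
s2 = alert_state([250, 280, 300], now=300)    # true for only 50s
s3 = alert_state([100, 160, 300], now=300)    # true for 200s
```

This is why a brief CPU spike above 80% never produces an email: it resolves while still pending.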
Once a rule fires, the alert also appears on the Alertmanager dashboard.
Finally, a screenshot of the alert email we successfully received from Alertmanager.
Querying metrics
Prometheus exposes an HTTP API for data queries, and the same query language can be used for complex query tasks.
Click Graph.
To query each Pod's CPU usage, enter: `sum by (pod_name)( rate(container_cpu_usage_seconds_total{image!="", pod_name!=""}[1m] ) )`
More on querying:
https://prometheus.io/docs/prometheus/latest/querying/basics/
https://prometheus.io/docs/prometheus/latest/querying/api/
https://prometheus.io/docs/prometheus/latest/querying/examples/
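A query like the one above can also be issued over HTTP at `/api/v1/query`; the JSON that comes back nests samples under `data.result`. The sketch below parses a canned instant-vector response (the payload values are illustrative, but the envelope follows the documented API format):

```python
import json

# Illustrative body, shaped like GET /api/v1/query?query=... output
payload = json.loads("""
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"pod_name": "node-exporter-8d66t"},
       "value": [1523456789.0, "0.042"]}
    ]
  }
}
""")

def vector_to_dict(payload, label):
    """Map each series' chosen label to its float sample value.
    Note the API returns sample values as strings."""
    assert payload["status"] == "success"
    return {s["metric"][label]: float(s["value"][1])
            for s in payload["data"]["result"]}

usage = vector_to_dict(payload, "pod_name")
```

In a live cluster you would fetch the payload with `urllib.request.urlopen("http://ip:9090/api/v1/query?query=...")` instead of the canned string.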
Q&A
Question:
Prometheus fails to start with a "permission denied" error on its data directory.
Answer:
Add the following under spec.spec. in prometheus-deploy.yaml:

```yaml
securityContext:
  runAsUser: 0
```

See the full prometheus-deploy.yaml above for the complete configuration.
Reference: https://github.com/prometheus/prometheus/issues/2939
Question:

```
level=error ts=2018-04-23T13:08:34.417214948Z caller=notify.go:303 component=dispatcher msg="Error on notify" err="dial tcp 14.18.245.164:25: getsockopt: connection timed out"
level=error ts=2018-04-23T13:08:34.417316796Z caller=dispatch.go:266 component=dispatcher msg="Notify for alerts failed" num_alerts=3 err="dial tcp 14.18.245.164:25: getsockopt: connection timed out"
```
Answer:
If you see the error above, first check that the Docker container has network access, then verify connectivity to the SMTP server:

```shell
telnet smtp.qq.com 25
```

During testing we found that Tencent mail (both personal QQ mail and Exmail) only supports SMTP over SSL on port 465 and does not support TLS. The `smtp_require_tls` option defaults to true, so it must be set to `smtp_require_tls: false` here. SMTP settings for other mail providers need to be tested individually.
Monitoring an HTTP Kubernetes Cluster with Prometheus
Environment:
- Prometheus v1.0.1
- node-exporter v0.15.2
- Kubernetes v1.8.2

The servers are the same as above.
```shell
# Get the kube-apiserver address and port
```

```shell
[root@localhost prometheus]# kubectl apply -f node-exporter.yaml
```

```shell
[root@localhost prometheus]# kubectl get pod -n kube-ops -o wide
```

```shell
[root@localhost prometheus]# kubectl get svc -n kube-ops -o wide
```

```shell
[root@localhost prometheus]# kubectl get ConfigMap -n kube-ops -o wide
```
Everything else is the same as above; the configuration files for monitoring an HTTP Kubernetes cluster with Prometheus are in the attachments.
Note: in testing, only Prometheus v1.0.1 worked for this setup; the other versions we tried reported errors.
References:
https://blog.qikqiak.com/post/kubernetes-monitor-prometheus-grafana/
https://blog.qikqiak.com/post/update-prometheus-2-in-kubernetes/
https://github.com/cnych/k8s-repo/tree/master/prometheus
https://blog.qikqiak.com/post/alertmanager-of-prometheus-in-practice/
https://blog.csdn.net/qq_21398167/article/details/76008594?locationnum=10&fps=1
https://segmentfault.com/a/1190000008695463
https://prometheus.io/docs/alerting/overview/
Attachments:
Prometheus监控TLS K8S配置文件.zip
Prometheus监控K8S HTTP配置文件.zip
This work is licensed under the Creative Commons Attribution 2.5 China Mainland License. Reposting is welcome, but please credit Jack Wang Blog and keep the reposted article intact. All copyright-related rights reserved.
This article originally appeared on "Jack Wang Blog": http://www.yfshare.vip/2018/03/14/Prometheus%E7%9B%91%E6%8E%A7TLS-Kubernetes%E9%9B%86%E7%BE%A4/