[英]Syncing Prometheus and Kubewatch with Kubernetes Cluster
我想使用某些 API 从过去发生的事件中提取数据,以获取某些 Python 字典中 Kubernetes 集群中发生的所有事件。 我在互联网上发现可以将 Kube-watch 的所有数据存储在 Prometheus 上然后访问它。 我无法弄清楚如何设置它并在 python 中查看所有过去的 pod 事件。 也欢迎访问过去事件的任何替代解决方案。 谢谢!
我将描述一个不复杂的解决方案,我认为它可以满足您的所有要求。 有诸如Eventrouter 之类的工具可以获取 Kubernetes 事件并将它们推送到用户指定的接收器。 但是,正如您所提到的,您只需要 Pods 事件,因此我建议采用稍微不同的方法。
简而言之,您可以在 Pod 中运行kubectl get events --watch
命令,并使用Loki等日志聚合系统收集该命令的输出。
下面,我将提供详细的分步说明。
要仅显示 Pod 事件,您可以使用:
$ kubectl get events --watch --field-selector involvedObject.kind=Pod
我们想在 Pod 中运行这个命令。 出于安全原因,我创建了一个单独的events-collector
ServiceAccount,并分配了view
角色,我们的 Pod 将在此 ServiceAccount 下运行。
注意:我创建了一个Deployment而不是单个 Pod。
$ cat all-in-one.yml
apiVersion: v1
kind: ServiceAccount
metadata:
name: events-collector
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: events-collector-binding
subjects:
- kind: ServiceAccount
name: events-collector
namespace: default
roleRef:
kind: ClusterRole
name: view
apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: events-collector
name: events-collector
spec:
selector:
matchLabels:
app: events-collector
template:
metadata:
labels:
app: events-collector
spec:
serviceAccountName: events-collector
containers:
- image: bitnami/kubectl
name: test
command: ["kubectl"]
args: ["get","events", "--watch", "--field-selector", "involvedObject.kind=Pod"]
应用上述清单后, event-collector
已创建并按预期收集 Pod 事件:
$ kubectl apply -f all-in-one.yml
serviceaccount/events-collector created
clusterrolebinding.rbac.authorization.k8s.io/events-collector-binding created
deployment.apps/events-collector created
$ kubectl get deploy,pod | grep events-collector
deployment.apps/events-collector 1/1 1 1 14s
pod/events-collector-d98d6c5c-xrltj 1/1 Running 0 14s
$ kubectl logs -f events-collector-d98d6c5c-xrltj
LAST SEEN TYPE REASON OBJECT MESSAGE
77s Normal Scheduled pod/app-1-5d9ccdb595-m9d5n Successfully assigned default/app-1-5d9ccdb595-m9d5n to gke-cluster-2-default-pool-8505743b-brmx
76s Normal Pulling pod/app-1-5d9ccdb595-m9d5n Pulling image "nginx"
71s Normal Pulled pod/app-1-5d9ccdb595-m9d5n Successfully pulled image "nginx" in 4.727842954s
70s Normal Created pod/app-1-5d9ccdb595-m9d5n Created container nginx
70s Normal Started pod/app-1-5d9ccdb595-m9d5n Started container nginx
73s Normal Scheduled pod/app-2-7747dcb588-h8j4q Successfully assigned default/app-2-7747dcb588-h8j4q to gke-cluster-2-default-pool-8505743b-p7qt
72s Normal Pulling pod/app-2-7747dcb588-h8j4q Pulling image "nginx"
67s Normal Pulled pod/app-2-7747dcb588-h8j4q Successfully pulled image "nginx" in 4.476795932s
66s Normal Created pod/app-2-7747dcb588-h8j4q Created container nginx
66s Normal Started pod/app-2-7747dcb588-h8j4q Started container nginx
您可以安装Loki来存储日志和处理查询。 Loki 就像 Prometheus,但对于日志 :)。 安装 Loki 的最简单方法是使用grafana/loki-stack Helm 图表:
$ helm repo add grafana https://grafana.github.io/helm-charts
"grafana" has been added to your repositories
$ helm repo update
...
Update Complete. ⎈Happy Helming!⎈
$ helm upgrade --install loki grafana/loki-stack
$ kubectl get pods | grep loki
loki-0 1/1 Running 0 76s
loki-promtail-hm8kn 1/1 Running 0 76s
loki-promtail-nkv4p 1/1 Running 0 76s
loki-promtail-qfrcr 1/1 Running 0 76s
您可以使用LogCLI工具对 Loki 服务器运行 LogQL 查询。 有关安装和使用此工具的详细信息可以在LogCLI 文档中找到。 我将演示如何在 Linux 上安装它:
$ wget https://github.com/grafana/loki/releases/download/v2.2.1/logcli-linux-amd64.zip
$ unzip logcli-linux-amd64.zip
Archive: logcli-linux-amd64.zip
inflating: logcli-linux-amd64
$ mv logcli-linux-amd64 logcli
$ sudo cp logcli /bin/
$ whereis logcli
logcli: /bin/logcli
要从 Kubernetes 集群外部查询 Loki 服务器,您可能需要使用Ingress资源公开它:
$ cat ingress.yml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
annotations:
kubernetes.io/ingress.class: nginx
nginx.ingress.kubernetes.io/rewrite-target: /
name: loki-ingress
spec:
rules:
- http:
paths:
- backend:
serviceName: loki
servicePort: 3100
path: /
$ kubectl apply -f ingress.yml
ingress.networking.k8s.io/loki-ingress created
$ kubectl get ing
NAME CLASS HOSTS ADDRESS PORTS AGE
loki-ingress <none> * <PUBLIC_IP> 80 19s
最后,我创建了一个简单的 Python 脚本,我们可以用它来查询 Loki 服务器:
注意:我们需要按照文档中的描述设置LOKI_ADDR
环境变量。 您需要将<PUBLIC_IP>
替换为您的 Ingress IP。
$ cat query_loki.py
#!/usr/bin/env python3
import os
os.environ['LOKI_ADDR'] = "http://<PUBLIC_IP>"
os.system("logcli query '{app=\"events-collector\"}'")
$ ./query_loki.py
...
2021-07-02T10:33:01Z {} 2021-07-02T10:33:01.626763464Z stdout F 0s Normal Pulling pod/backend-app-5d99cf4b-c9km4 Pulling image "nginx"
2021-07-02T10:33:00Z {} 2021-07-02T10:33:00.836755152Z stdout F 0s Normal Scheduled pod/backend-app-5d99cf4b-c9km4 Successfully assigned default/backend-app-5d99cf4b-c9km4 to gke-cluster-1-default-pool-328bd2b1-288w
2021-07-02T10:33:00Z {} 2021-07-02T10:33:00.649954267Z stdout F 0s Normal Started pod/web-app-6fcf9bb7b8-jbrr9 Started container nginx2021-07-02T10:33:00Z {} 2021-07-02T10:33:00.54819851Z stdout F 0s Normal Created pod/web-app-6fcf9bb7b8-jbrr9 Created container nginx
2021-07-02T10:32:59Z {} 2021-07-02T10:32:59.414571562Z stdout F 0s Normal Pulled pod/web-app-6fcf9bb7b8-jbrr9 Successfully pulled image "nginx" in 4.228468876s
...
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.