
I/O monitoring on Kubernetes / CoreOS nodes

I have a Kubernetes cluster, provisioned with kops, running on CoreOS workers. From time to time I see significant load spikes that correlate with I/O spikes reported in Prometheus by the node_disk_io_time_ms metric. The thing is, I can't seem to use any metric to pinpoint where this I/O workload actually originates. Metrics like container_fs_* seem useless, as I always get zero values for actual containers and only get data for the whole node.

Any hints on how to approach locating what is to blame for the I/O load on a Kubernetes cluster / CoreOS node are very welcome.

If you are using the nginx ingress controller, you can configure it with

enable-vts-status: "true"

This will give you a bunch of Prometheus metrics for each pod that has an ingress. The metric names start with nginx_upstream_.
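
A minimal sketch of where that setting goes, assuming the controller reads its options from a ConfigMap named nginx-configuration in the ingress-nginx namespace (the actual name and namespace are placeholders; they depend on what the controller's --configmap flag points at in your deployment):

apiVersion: v1
kind: ConfigMap
metadata:
  # placeholder name/namespace; use whatever ConfigMap your nginx
  # ingress controller is configured to read
  name: nginx-configuration
  namespace: ingress-nginx
data:
  # enables the VTS module, which exposes per-upstream traffic metrics
  # on the controller's Prometheus /metrics endpoint
  enable-vts-status: "true"

After the controller picks up the change, the nginx_upstream_* series should show up on its metrics endpoint, one set per upstream (i.e. per backing pod).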

In case it is a cronjob creating the spikes, install the node-exporter DaemonSet and check the container_fs_* metrics.
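
For what it's worth, here is a minimal Prometheus recording-rule sketch for narrowing down per-pod disk writes, assuming the container_fs_* series are actually being scraped (in most clusters they come from cAdvisor via the kubelet rather than from node-exporter itself); the group name and the 5m window are arbitrary placeholders:

groups:
  - name: io-debug                      # placeholder group name
    rules:
      # bytes written per second, summed per pod over the last 5 minutes;
      # the label is "pod" or "pod_name" depending on the Kubernetes version
      - record: pod:container_fs_writes_bytes:rate5m
        expr: sum by (namespace, pod_name) (rate(container_fs_writes_bytes_total[5m]))

Querying topk(5, pod:container_fs_writes_bytes:rate5m) (or the equivalent built on container_fs_reads_bytes_total) then points at the heaviest I/O offenders, provided the underlying series are non-zero.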
