
I/O monitoring on Kubernetes / CoreOS nodes

I have a Kubernetes cluster, provisioned with kops, running on CoreOS workers. From time to time I see significant load spikes that correlate with I/O spikes reported in Prometheus by the node_disk_io_time_ms metric. The thing is, I can't seem to use any metric to pinpoint where this I/O workload actually originates. Metrics like container_fs_* seem useless, as I always get zero values for actual containers and only get data for the whole node.

Any hints on how to approach locating what is to blame for the I/O load on a Kubernetes cluster / CoreOS node are very welcome.

If you are using the nginx ingress controller, you can configure it with

enable-vts-status: "true"

This will give you a bunch of Prometheus metrics for each pod that has an ingress. The metric names start with nginx_upstream_.
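
A minimal sketch of where that setting goes, assuming the controller reads its options from a ConfigMap named nginx-configuration in the ingress-nginx namespace (the actual name and namespace are placeholders; they depend on what the controller's --configmap flag points at in your deployment):

apiVersion: v1
kind: ConfigMap
metadata:
  # placeholder name/namespace; use whatever ConfigMap your nginx
  # ingress controller is configured to read
  name: nginx-configuration
  namespace: ingress-nginx
data:
  # enables the VTS module, which exposes per-upstream traffic metrics
  # on the controller's Prometheus /metrics endpoint
  enable-vts-status: "true"

After the controller picks up the change, the nginx_upstream_* series should show up on its metrics endpoint, one set per upstream (i.e. per backing pod).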

In case it is a cronjob creating the spikes, install the node-exporter DaemonSet and check the container_fs_* metrics.
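
For what it's worth, here is a minimal Prometheus recording-rule sketch for narrowing down per-pod disk writes, assuming the container_fs_* series are actually being scraped (in most clusters they come from cAdvisor via the kubelet rather than from node-exporter itself); the group name and the 5m window are arbitrary placeholders:

groups:
  - name: io-debug                      # placeholder group name
    rules:
      # bytes written per second, summed per pod over the last 5 minutes;
      # the label is "pod" or "pod_name" depending on the Kubernetes version
      - record: pod:container_fs_writes_bytes:rate5m
        expr: sum by (namespace, pod_name) (rate(container_fs_writes_bytes_total[5m]))

Querying topk(5, pod:container_fs_writes_bytes:rate5m) (or the equivalent built on container_fs_reads_bytes_total) then points at the heaviest I/O offenders, provided the underlying series are non-zero.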
