Best way to re/use redis connections for prometheus django exporter
I am getting an error:
redis.exceptions.ConnectionError: Error 24 connecting to redis-service:6379. Too many open files.
...
OSError: [Errno 24] Too many open files
I know this can be fixed by increasing the ulimit, but I don't think that's the issue here, and this is a service running in a container. The application starts up correctly, works for 48 hours, and then I get the above error, which implies that the connections are growing over time.
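To confirm it really is descriptor growth rather than a low limit, you can watch the process's open file descriptors from inside the container. A minimal stdlib sketch (Linux-only, since it reads /proc):

```python
import os
import resource

def open_fd_count() -> int:
    """Number of file descriptors this process currently has open (Linux)."""
    return len(os.listdir('/proc/self/fd'))

def fd_soft_limit() -> int:
    """The soft RLIMIT_NOFILE -- the limit that 'Errno 24' is hitting."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft

print(open_fd_count(), '/', fd_soft_limit())
```

Sampling these two numbers every few minutes makes a leak obvious long before the error actually fires.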
What my application is basically doing:
- background_task (run using celery) -> collects data from postgres and sets it in redis
- prometheus hits the app at '/metrics', which is a django view -> collects data from redis and serves it using the django prometheus exporter
The code looks something like this:
views.py

from prometheus_client.core import GaugeMetricFamily, REGISTRY
from my_awesome_app.taskbroker.celery import app


class SomeMetricCollector:

    def get_sample_metrics(self):
        with app.connection_or_acquire() as conn:
            client = conn.channel().client
            result = client.get('some_metric_key')
        return {'some_metric_key': result}

    def collect(self):
        sample_metrics = self.get_sample_metrics()
        for key, value in sample_metrics.items():
            yield GaugeMetricFamily(key, 'This is a custom metric', value=value)


REGISTRY.register(SomeMetricCollector())
tasks.py

# This is my boilerplate taskbroker app
from my_awesome_app.taskbroker.celery import app
# How it's collecting data from postgres is trivial to this issue.
from my_awesome_app.utility_app.utility import some_value_calculated_from_query


@app.task()
def app_metrics_sync_periodic():
    with app.connection_or_acquire() as conn:
        client = conn.channel().client
        client.set('some_metric_key', some_value_calculated_from_query(), ex=21600)
    return True
I don't think the background data collection in tasks.py is what's causing the Redis connections to grow; it's the Django view '/metrics' in views.py that is causing it.
Can you please tell me what I am doing wrong here, and whether there is a better way to read from Redis from a Django view? The Prometheus instance scrapes the Django application every 5s.
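For intuition: this kind of leak grows linearly with the number of scrapes, not exponentially. A back-of-the-envelope calculation (the descriptor limit and the leak rate are assumptions, not figures from the question):

```python
SCRAPE_INTERVAL_S = 5       # from the question: Prometheus scrapes every 5s
FD_LIMIT = 1024             # assumed: a common default for `ulimit -n`
LEAKED_FDS_PER_SCRAPE = 1   # hypothetical worst-case leak rate

scrapes_per_hour = 3600 // SCRAPE_INTERVAL_S                        # 720
hours_to_exhaustion = FD_LIMIT / (scrapes_per_hour * LEAKED_FDS_PER_SCRAPE)
print(round(hours_to_exhaustion, 1))  # → 1.4
```

Hitting the limit only after 48 hours therefore suggests that only a fraction of scrapes leak a connection, which is consistent with new connections being created only when the pool has to grow under concurrent requests.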
This answer is based on my own use case and research.
The issue here, as I see it, is that each request to /metrics starts a new thread in which views.py creates new connections in the Celery broker's connection pool.
This can be handled by letting Django manage its own Redis connection pool through the cache backend and letting Celery manage its own Redis connection pool, so that they don't use each other's connection pools from their respective threads.
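The borrow/return discipline that makes pooling safe can be illustrated with a tiny stdlib-only pool (purely illustrative; redis-py and django-redis implement the real thing):

```python
import contextlib
import queue

class TinyPool:
    """Toy connection pool: borrow on enter, return on exit."""

    def __init__(self, factory, size):
        self._q = queue.Queue()
        for _ in range(size):
            self._q.put(factory())

    @contextlib.contextmanager
    def connection(self):
        conn = self._q.get()      # borrow; blocks if the pool is exhausted
        try:
            yield conn
        finally:
            self._q.put(conn)     # always returned, even on error

pool = TinyPool(factory=object, size=2)
with pool.connection() as a, pool.connection() as b:
    in_use = 2 - pool._q.qsize()  # both connections borrowed here
print(in_use)  # → 2
```

Because every borrow is paired with a return, concurrent /metrics requests reuse the same sockets instead of opening new ones.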
config.py

# CACHES
# ------------------------------------------------------------------------------
# For more details on options for your cache backend please refer to
# https://docs.djangoproject.com/en/3.1/ref/settings/#backend
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://localhost:6379/0",
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
        },
    }
}
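If you also want a hard ceiling on how many sockets this cache may open, django-redis accepts a CONNECTION_POOL_KWARGS option that is passed through to the redis-py connection pool. A sketch (the max_connections value is an assumption to tune for your deployment, not part of the original config):

```python
# Same config as above, with an optional pool cap so a future leak surfaces
# as a pool error instead of exhausting file descriptors.
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://localhost:6379/0",
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
            # Hypothetical cap; size it to your worker/thread count.
            "CONNECTION_POOL_KWARGS": {"max_connections": 50},
        },
    }
}
```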
views.py

from prometheus_client.core import GaugeMetricFamily, REGISTRY
# *: Replacing the celery app with the Django cache backend
from django.core.cache import cache


class SomeMetricCollector:

    def get_sample_metrics(self):
        # *: This is how you get the new client, which is still context managed.
        with cache.client.get_client() as client:
            result = client.get('some_metric_key')
        # redis returns bytes; GaugeMetricFamily expects a number
        return {'some_metric_key': float(result)}

    def collect(self):
        sample_metrics = self.get_sample_metrics()
        for key, value in sample_metrics.items():
            yield GaugeMetricFamily(key, 'This is a custom metric', value=value)


REGISTRY.register(SomeMetricCollector())
This will ensure that Django maintains its own Redis connection pool and does not spin up new connections unnecessarily.
tasks.py

# This is my boilerplate taskbroker app
from my_awesome_app.taskbroker.celery import app
# How it's collecting data from postgres is trivial to this issue.
from my_awesome_app.utility_app.utility import some_value_calculated_from_query


@app.task()
def app_metrics_sync_periodic():
    with app.connection_or_acquire() as conn:
        # *: This forces celery to always look into the existing connection pool for a connection.
        client = conn.default_channel.client
        client.set('some_metric_key', some_value_calculated_from_query(), ex=21600)
    return True
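How this task is scheduled isn't shown in the question; if it runs under Celery beat, a schedule comfortably shorter than the key's 21600s TTL keeps the value from ever expiring between refreshes. A hedged sketch (the task's module path is assumed; this dict would be assigned to app.conf.beat_schedule):

```python
# Assumed module path; adjust to wherever app_metrics_sync_periodic lives.
beat_schedule = {
    "app-metrics-sync": {
        "task": "my_awesome_app.tasks.app_metrics_sync_periodic",
        "schedule": 3600.0,  # hourly; well inside the 6h (21600s) key TTL
    },
}
```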
The easiest way to manually verify whether the connections are growing, as /metrics is hit on the web app, is:

$ redis-cli
127.0.0.1:6379> CLIENT LIST
...
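Each line of CLIENT LIST output is one connection, so counting lines across successive samples is enough to spot growth. A stdlib helper over a captured dump (the sample lines below are fabricated for illustration):

```python
SAMPLE_CLIENT_LIST = """\
id=3 addr=10.0.0.5:53178 name= db=0 cmd=get
id=4 addr=10.0.0.5:53180 name= db=0 cmd=set
id=5 addr=10.0.0.7:41022 name= db=0 cmd=client
"""

def count_clients(client_list_output: str) -> int:
    """One non-empty line per connected client."""
    return sum(1 for line in client_list_output.splitlines() if line.strip())

print(count_clients(SAMPLE_CLIENT_LIST))  # → 3
```

If this count climbs steadily while the scrape interval and workload stay constant, you are looking at a connection leak.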
For reference, the Celery worker for this application is run as:

$ celery -A my_awesome_app.taskbroker worker --concurrency=20 -l ERROR -E