Send Python web app metrics to InfluxDB

At the moment I have the following monitoring system configured:

web_app (via python statsd client) -> statsd -> ...
    ... -> carbon-relay-ng -> carbon-cache -> whisper

I use Grafana on top of Graphite as the graphing tool.

Because query performance is too poor, I've decided to replace this stack with just an InfluxDB + Grafana bundle. So my question is: how can I send app metrics to InfluxDB? I'd like to keep the bundle simple, so I would like to skip statsd if possible. Should I replace the python statsd client with influxdb-python and use the telegraf UDP service as an aggregation layer in front of InfluxDB, or just send metrics directly to the InfluxDB instance?

I would send the data to telegraf using the line protocol.

I've used influxdb-python a lot to submit stats directly to InfluxDB. Sending the results locally to telegraf may well be quicker, depending on how quickly and reliably your InfluxDB installation responds: a direct write will block your application if there are delays.
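One way to keep a slow direct write from blocking the application is to hand metrics to a background thread. The following is only a sketch of that idea, not something influxdb-python provides: BackgroundSender, submit, flush and max_pending are hypothetical names, and send stands for whatever callable performs the real write (for example a wrapper around write_points).

```python
import queue
import threading


class BackgroundSender:
    """Hand metrics to a worker thread so the caller never waits on I/O."""

    def __init__(self, send, max_pending=1000):
        self._send = send  # callable that performs the real (possibly slow) write
        self._queue = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, metric):
        """Queue a metric; drop it and return False if the queue is full."""
        try:
            self._queue.put_nowait(metric)
            return True
        except queue.Full:
            return False

    def _run(self):
        while True:
            metric = self._queue.get()
            try:
                self._send(metric)
            except Exception:
                pass  # a failed write must not kill the worker thread
            finally:
                self._queue.task_done()

    def flush(self):
        """Block until everything queued so far has been passed to send."""
        self._queue.join()
```

The trade-off is that metrics are dropped when the queue fills up or a write fails, which is usually acceptable for monitoring data but should be a conscious choice.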

The line protocol seems easier to me to use than the other options, and telegraf can accept the line protocol directly. A potential downside is that anything you send that way would end up in the database allocated to telegraf stats. Going directly to InfluxDB, you can choose which database your data will end up in, although it means bypassing the python module if you'd like to use the line protocol format.
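As a minimal sketch of the local telegraf route: assuming telegraf has a UDP listener that parses line protocol (for example a socket_listener input bound to udp://:8094; the address and port are assumptions, not part of the setup described above), a record can be sent with nothing but the standard library. send_line is a hypothetical helper name.

```python
import socket

# Assumed address of a local telegraf UDP listener; adjust to your config.
TELEGRAF_ADDR = ("127.0.0.1", 8094)


def send_line(line, addr=TELEGRAF_ADDR, sock=None):
    """Send one line-protocol record over UDP and return the bytes sent.

    UDP is fire-and-forget: the call does not wait for telegraf to answer,
    so a slow or absent listener does not stall the application (the
    metric is simply dropped).
    """
    payload = line.encode("utf-8") + b"\n"
    own_socket = sock is None
    if own_socket:
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        return sock.sendto(payload, addr)
    finally:
        if own_socket:
            sock.close()
```

A call would then look like send_line("cpu_load_short,host=server01,region=us-west value=0.64"); when the timestamp is omitted, the receiving side assigns one.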

To use influxdb-python and send to InfluxDB directly, you have a choice of the JSON format or a subclass of SeriesHelper.

JSON

Creating the JSON structure that write_points / write uses is really awkward and clunky. The module only converts it into line format anyway.

Compare the JSON:

json_body = [
    {
        "measurement": "cpu_load_short",
        "tags": {
            "host": "server01",
            "region": "us-west"
        },
        "time": "2009-11-10T23:00:00Z",
        "fields": {
            "value": 0.64
        }
    }
]

to the line format:

# measurement,tag1=tag1value,tag2=tag2value column1=... 
cpu_load_short,host=server01,region=us-west value=0.64 1465290833288375000

I know which I think is easier to produce (and I know the timestamps don't match; I'm just using examples). The line format can be POSTed straight to InfluxDB using the requests library, or sent via UDP if that listener has been configured.
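To illustrate how little is needed to produce the line format and POST it, here is a rough sketch. It targets the InfluxDB 1.x /write endpoint, and the standard-library urllib stands in for requests so the sketch has no third-party dependency. The names to_line and post_lines, the URL http://localhost:8086 and the database name mydb are assumptions for illustration, and the escaping rules of the line protocol are deliberately left out.

```python
import urllib.parse
import urllib.request


def to_line(measurement, tags, fields, timestamp_ns=None):
    """Build one record: measurement,tag=value,... field=value,... [timestamp].

    Simplified: assumes names and tag values need no escaping (no spaces,
    commas or equals signs) and that every field value is a float.
    """
    tag_part = "".join(",{}={}".format(k, v) for k, v in sorted(tags.items()))
    field_part = ",".join(
        "{}={}".format(k, float(v)) for k, v in sorted(fields.items())
    )
    line = "{}{} {}".format(measurement, tag_part, field_part)
    if timestamp_ns is not None:
        line += " {}".format(int(timestamp_ns))
    return line


def post_lines(lines, base_url="http://localhost:8086", db="mydb"):
    """POST records to the InfluxDB 1.x /write endpoint (assumed URL and db)."""
    url = "{}/write?{}".format(base_url, urllib.parse.urlencode({"db": db}))
    body = "\n".join(lines).encode("utf-8")
    request = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status  # InfluxDB 1.x answers 204 on success
```

With the values from the example above, to_line("cpu_load_short", {"host": "server01", "region": "us-west"}, {"value": 0.64}, 1465290833288375000) reproduces the sample line exactly.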

SeriesHelper

The module also has a way to accept just the values and tags: SeriesHelper, which can be awkward to set up but is easy to use.

The example they give is:

from influxdb import InfluxDBClient, SeriesHelper

myclient = InfluxDBClient(host, port, user, password, dbname)

class MySeriesHelper(SeriesHelper):
    # Meta class stores time series helper configuration.
    class Meta:
        client = myclient
        series_name = 'events.stats.{server_name}'
        fields = ['some_stat', 'other_stat']
        tags = ['server_name']
        bulk_size = 5
        autocommit = True


MySeriesHelper(server_name='us.east-1', some_stat=159, other_stat=10)
MySeriesHelper(server_name='us.east-1', some_stat=158, other_stat=20)

As you can see from the calls to MySeriesHelper, life is easy once it's set up, but the client configuration needs to be set up either in the global scope (which is bad for a module) or in the class definition. That isn't good for getting configuration from a config file or service discovery, so you end up doing things like this in your config parsing functions:

# Read host, port, user, password, dbname from config file, then:
MySeriesHelper.Meta.client = InfluxDBClient(host, port, user, password, dbname)
# Now it is safe to call MySeriesHelper

I've not had reliability issues with influxdb-python, and most of the time we use SeriesHelper classes. It's not the most complex of things, but the idea behind metrics is not that one person with the knowledge adds them all, but that adding them is part of the way of life of everyone writing code at every part of the chain. From that perspective, ease of use is key to getting people to adopt a tool.
