Using Elastic/Kibana to search for patterns in frequency of log entries?

I need to take millions of entries from a log (spanning the past couple of years) and, using the timestamp field, determine which periods show the least activity, with days of the week as the grouping criteria. The goal is to show that, for example, Wednesdays between 02:00 and 04:00 have historically shown the lowest level of activity. So I'm imagining a graph with time periods on the X-axis (00:00 - 00:14, 00:15 - 00:29, 00:30 - 00:44, or similar) and some measure of log activity on the Y-axis. It would show 7 lines, one for each day of the week, which would make it trivial to see from the graph which period is quietest.

I've not personally used Kibana before, but from what I know about it, it seems likely to be the best tool for this kind of task.

Is there a feature or plugin that already has this ability, or will I need to develop a custom solution?

In the end, I gave up on Kibana/Elastic. There's probably a way of doing it, but instead I just used MySQL:

-- Count launches per 5-minute time-of-day bucket for one weekday.
SELECT
    t.bucket,
    COALESCE(SUM(m.total), 0) AS total
FROM
    tmp_time_bucket t
        LEFT JOIN
    (SELECT
        -- Round each timestamp down to a 300-second boundary and keep only the time of day.
        DATE_FORMAT(FROM_UNIXTIME(FLOOR(UNIX_TIMESTAMP(launchtime) / 300) * 300), '%H:%i:00') AS bucket,
            COUNT(launchtime) AS total
    FROM
        launchjobs
    WHERE
        launchtime <> '0000-00-00 00:00:00'
            AND DAYNAME(launchtime) = 'Wednesday'
    GROUP BY bucket) m ON t.bucket = m.bucket
-- The LEFT JOIN keeps buckets with no launches, so they show up with a total of 0.
GROUP BY t.bucket
ORDER BY t.bucket ASC;

...where tmp_time_bucket is a table with a single VARCHAR(8) column named bucket, which contains all 288 five-minute time buckets in a 24-hour period ("00:00:00", "00:05:00", ... "23:50:00", "23:55:00").
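For anyone who wants to see the bucketing on its own, here is a minimal Python sketch of the same idea. It is not what I actually ran; the function names bucket_labels and bucket_for are made up for illustration. It generates the 288 labels that tmp_time_bucket holds and rounds a timestamp down to its bucket, working on the time of day directly instead of going through a Unix timestamp.

```python
from datetime import datetime
from typing import List

BUCKET_SECONDS = 300  # 5 minutes, the same width as the / 300 * 300 rounding in the query


def _format(seconds_since_midnight: int) -> str:
    """Format a number of seconds since midnight as HH:MM:SS."""
    hours, rest = divmod(seconds_since_midnight, 3600)
    minutes, seconds = divmod(rest, 60)
    return "%02d:%02d:%02d" % (hours, minutes, seconds)


def bucket_labels(bucket_seconds: int = BUCKET_SECONDS) -> List[str]:
    """Return every time-of-day bucket label in a 24-hour period, in order."""
    return [_format(s) for s in range(0, 24 * 60 * 60, bucket_seconds)]


def bucket_for(ts: datetime, bucket_seconds: int = BUCKET_SECONDS) -> str:
    """Round a timestamp down to the start of its bucket and return the label."""
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    return _format(seconds - seconds % bucket_seconds)
```

The labels could be inserted into a table like tmp_time_bucket, or used directly as dictionary keys so that empty buckets still get a zero.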

I ran this 7 times, once for each day of the week, and exported each result set to CSV. Then I imported the data into http://plot.ly and made the graph that I needed, which (if you're interested) can be seen here: https://plot.ly/~theplankmeister/7/?share_key=FZERWAphDIQsa1swGtixb7
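If the timestamps can be pulled out of the database or log into a script, the seven runs could probably be collapsed into a single pass. The following is only a sketch under that assumption, not what I did: weekday_bucket_table and write_csv are hypothetical names, and the input is assumed to be an iterable of Python datetime objects. It produces one row per 5-minute bucket with one count column per weekday, which is the shape a plotting tool needs for 7 lines.

```python
import csv
from collections import Counter
from datetime import datetime
from typing import Iterable, List, TextIO

BUCKET_SECONDS = 300  # 5-minute buckets, as in the SQL query
# datetime.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def _label(seconds_since_midnight: int) -> str:
    """Format a number of seconds since midnight as HH:MM:SS."""
    hours, rest = divmod(seconds_since_midnight, 3600)
    minutes, seconds = divmod(rest, 60)
    return "%02d:%02d:%02d" % (hours, minutes, seconds)


def time_bucket(ts: datetime) -> str:
    """Round a timestamp down to the start of its time-of-day bucket."""
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    return _label(seconds - seconds % BUCKET_SECONDS)


def weekday_bucket_table(timestamps: Iterable[datetime]) -> List[list]:
    """Return rows of [bucket, Monday count, ..., Sunday count].

    Every bucket appears, so quiet periods show up as zeros instead of gaps.
    """
    counts = Counter((ts.weekday(), time_bucket(ts)) for ts in timestamps)
    rows = []
    for start in range(0, 24 * 60 * 60, BUCKET_SECONDS):
        bucket = _label(start)
        rows.append([bucket] + [counts[(day, bucket)] for day in range(7)])
    return rows


def write_csv(rows: List[list], out: TextIO) -> None:
    """Write the table with a header row, ready to import into a plotting tool."""
    writer = csv.writer(out)
    writer.writerow(["bucket"] + WEEKDAYS)
    writer.writerows(rows)
```

With millions of rows it may still be more practical to aggregate in the database first and only pivot the per-bucket totals in a script.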

Looking at the graph, I can easily see that the answer I was looking for is Thursday 22:45 until Friday 00:55.
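I read that window off the graph by eye. Because it crosses midnight, a programmatic check would have to treat the week as one continuous run of buckets. Here is a rough sketch of how that might look, assuming the counts are laid out as a flat list of 7 x 288 values starting at Monday 00:00; quietest_window and that layout are my own assumptions for illustration, and the window length has to be chosen up front.

```python
from typing import List, Tuple

BUCKET_SECONDS = 300
BUCKETS_PER_DAY = 24 * 60 * 60 // BUCKET_SECONDS  # 288
BUCKETS_PER_WEEK = 7 * BUCKETS_PER_DAY
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def label(index: int) -> str:
    """Turn a position in the week (0 = Monday 00:00) into 'Weekday HH:MM'."""
    day, slot = divmod(index % BUCKETS_PER_WEEK, BUCKETS_PER_DAY)
    seconds = slot * BUCKET_SECONDS
    return "%s %02d:%02d" % (WEEKDAYS[day], seconds // 3600, seconds % 3600 // 60)


def quietest_window(week_counts: List[int], window: int) -> Tuple[str, str, int]:
    """Return (start, end, total) for the run of `window` buckets with the lowest total.

    The window may wrap from Sunday night into Monday morning. The end label
    is exclusive, and the earliest window wins when totals are tied.
    """
    n = len(week_counts)
    if n != BUCKETS_PER_WEEK:
        raise ValueError("expected %d counts, got %d" % (BUCKETS_PER_WEEK, n))
    if not 1 <= window <= n:
        raise ValueError("window must be between 1 and %d buckets" % n)

    total = sum(week_counts[:window])
    best_total, best_start = total, 0
    for start in range(1, n):
        # Slide one bucket forward: add the new last bucket, drop the old first one.
        total += week_counts[(start + window - 1) % n] - week_counts[start - 1]
        if total < best_total:
            best_total, best_start = total, start
    return label(best_start), label(best_start + window), best_total
```

A window of 26 buckets corresponds to the 2 hours and 10 minutes between 22:45 and 00:55.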

Hope this helps someone in the future!
