
Log Analysis in Python

For our internal monitoring process, I want to find out how many exceptions occurred on a particular day. We want to extract this information from the log file of our application (a Pylons project).

I want to do this in Python itself. I am aware that I can write a script that processes the log offline to count the exceptions (and possibly collect other information related to each exception).

Is there already a library I can use for log file analysis in Python, or what is the best way to do this?

I just had a similar situation and found the logtools Python package for the job. I used it to analyze a Tomcat6/Solr log file.

Copy the log from the server and install logtools in a virtualenv:

mkdir /tmp/logwtf
cd /tmp/logwtf
scp server:/var/log/tomcat6/catalina.2012-02-03.log ./catalina.log
virtualenv --system-site-packages --distribute .
. bin/activate
pip install -e 'git+https://github.com/adamhadani/logtools.git#egg=logtools'

Summarize search request traffic:

qps -r'^(.*?) org\.apache\.solr\.core\.SolrCore execute' \
    -F '%b %d, %Y %I:%M:%S %p' \
    -W900 \
    --ignore \
    <catalina.log

All server activity between 1:10 and 1:20 PM:

qps -r'^(.*? 1:1.:.. PM) ' \
    -F '%b %d, %Y %I:%M:%S %p' \
    -W15 \
    --ignore \
    <catalina.log
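If you would rather stay in plain Python, the first qps command above can be approximated with the standard library. This is only a sketch, not how logtools itself is implemented: the function name count_per_window and the sample lines are made up for illustration, and it uses fixed windows aligned to the clock, which may not match the numbers qps reports.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Same pattern and timestamp format as the first qps command above.
LINE_RE = re.compile(r'^(.*?) org\.apache\.solr\.core\.SolrCore execute')
TIME_FORMAT = '%b %d, %Y %I:%M:%S %p'
EPOCH = datetime(1970, 1, 1)


def count_per_window(lines, window_seconds=900):
    """Count matching lines per fixed window (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not match, similar to --ignore
        try:
            stamp = datetime.strptime(match.group(1), TIME_FORMAT)
        except ValueError:
            continue  # the captured prefix was not a timestamp
        # Round the timestamp down to the start of its window.
        offset = int((stamp - EPOCH).total_seconds()) % window_seconds
        counts[stamp - timedelta(seconds=offset)] += 1
    return counts


# Made-up sample lines in the assumed Tomcat log layout.
sample = [
    "Feb 3, 2012 1:10:05 PM org.apache.solr.core.SolrCore execute",
    "INFO: [] webapp=/solr path=/select",
    "Feb 3, 2012 1:15:00 PM org.apache.solr.core.SolrCore execute",
]
for window_start, hits in sorted(count_per_window(sample).items()):
    print(window_start, hits, hits / 900.0)
```

To run it on the real file, pass an open file object instead of the sample list, for example count_per_window(open('catalina.log')). Note that %b and %p depend on the locale, so this assumes an English locale.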

logtools includes additional scripts for filtering bots, tagging log lines by country, log parsing, merging, joining, sampling and filtering, aggregation and plotting, URL parsing, summary statistics and computing percentiles. See the package's GitHub page for more information.

Some additional info, like a sample log, would be nice. Generally speaking, you can always use the powerful re library, which works with regular expressions.

Regular Expressions

re Library

So yeah, for general problems re is always a good possibility...

If you post a sample log, I can see whether I find anything that fits your problem better.
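In the meantime, here is a rough sketch of the re approach for counting exceptions per day. Since no sample log was posted, the log layout is an assumption: every record is taken to start with a date like "2012-02-03 13:10:05", and an exception is taken to be any Python traceback header. The function name count_exceptions_per_day and the sample lines are made up, so adjust DATE_RE to whatever your Pylons logging formatter actually writes.

```python
import re
from collections import Counter

# Assumption: each log record starts with "YYYY-MM-DD HH:MM:SS".
DATE_RE = re.compile(r'^(\d{4}-\d{2}-\d{2}) \d{2}:\d{2}:\d{2}')
# Python tracebacks start with this header line.
TRACEBACK_MARKER = 'Traceback (most recent call last):'


def count_exceptions_per_day(lines):
    """Count traceback headers, grouped by the date of the latest record."""
    counts = Counter()
    current_day = None
    for line in lines:
        match = DATE_RE.match(line)
        if match:
            current_day = match.group(1)  # remember the most recent date
        if TRACEBACK_MARKER in line and current_day is not None:
            counts[current_day] += 1
    return counts


# Made-up sample in the assumed layout.
sample = [
    "2012-02-03 13:10:05,123 ERROR [myapp] request failed",
    "Traceback (most recent call last):",
    '  File "app.py", line 1, in <module>',
    "ValueError: bad input",
    "2012-02-04 09:00:00,000 ERROR [myapp] request failed",
    "Traceback (most recent call last):",
    "KeyError: 'x'",
]
for day, total in sorted(count_exceptions_per_day(sample).items()):
    print(day, total)
```

For a real log, pass the open file instead of the sample list. Chained exceptions print more than one traceback header, so they would be counted more than once with this approach.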
