
Analyze logs with Python

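I have a CSV file containing logs. I need to analyze it and extract the necessary information. The problem is that it contains many tables, each with its own header row. The tables have no names and are separated from one another by blank lines. Say I need to select all values from the %idle column where CPU = all.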

Structure:

09:20:06,CPU,%usr,%nice,%sys,%iowait,%steal,%irq,%soft,%guest,%idle
09:21:06,all,4.98,0.00,5.10,0.00,0.00,0.00,0.06,0.00,89.86
09:21:06,0,12.88,0.00,5.62,0.03,0.00,0.02,1.27,0.00,80.18

12:08:06,CPU,%usr,%nice,%sys,%iowait,%steal,%irq,%soft,%guest,%idle
12:09:06,all,5.48,0.00,5.24,0.00,0.00,0.00,0.12,0.00,89.15
12:09:06,0,18.57,0.00,5.35,0.02,0.00,0.00,3.00,0.00,73.06

09:20:06,runq-sz,plist-sz,ldavg-1,ldavg-5,ldavg-15
09:21:06,3,1444,2.01,2.12,2.15
09:22:06,4,1444,2.15,2.14,2.15

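A fairly crude solution is to use a plain file reader on the original CSV. Read everything up to the next blank line as a single CSV, then parse the text you just read in memory.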

Each time you "see" a blank line, you know that what follows is a brand-new CSV, so you can repeat the process above for it.
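The splitting step described above can be sketched with itertools.groupby, which clusters consecutive lines by whether they are empty. This is only an illustration, not the answerer's code: the helper name split_tables is made up here, and the sample text is copied from the question.

```python
from itertools import groupby


def split_tables(lines):
    """Yield each blank-line-separated table as a list of non-empty lines."""
    stripped = (line.strip() for line in lines)
    # groupby clusters consecutive lines by whether they contain text;
    # only the runs of non-empty lines are tables.
    for has_text, group in groupby(stripped, key=bool):
        if has_text:
            yield list(group)


sample = """09:20:06,CPU,%usr,%nice,%sys,%iowait,%steal,%irq,%soft,%guest,%idle
09:21:06,all,4.98,0.00,5.10,0.00,0.00,0.00,0.06,0.00,89.86
09:21:06,0,12.88,0.00,5.62,0.03,0.00,0.02,1.27,0.00,80.18

09:20:06,runq-sz,plist-sz,ldavg-1,ldavg-5,ldavg-15
09:21:06,3,1444,2.01,2.12,2.15
09:22:06,4,1444,2.15,2.14,2.15
"""

tables = list(split_tables(sample.splitlines()))
print(len(tables))  # 2
```

Because split_tables accepts any iterable of lines, an open file object could be passed instead of a list, so the whole log would not have to be held in memory at once.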

For example, you would have a string containing:

09:20:06,CPU,%usr,%nice,%sys,%iowait,%steal,%irq,%soft,%guest,%idle
09:21:06,all,4.98,0.00,5.10,0.00,0.00,0.00,0.06,0.00,89.86
09:21:06,0,12.88,0.00,5.62,0.03,0.00,0.02,1.27,0.00,80.18

Then parse it in memory. Once you reach the blank line, you know you need a new string containing:

12:08:06,CPU,%usr,%nice,%sys,%iowait,%steal,%irq,%soft,%guest,%idle
12:09:06,all,5.48,0.00,5.24,0.00,0.00,0.00,0.12,0.00,89.15
12:09:06,0,18.57,0.00,5.35,0.02,0.00,0.00,3.00,0.00,73.06

And so on. You can process as many tables as you need this way.
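Parsing one such string in memory might look like the sketch below, which feeds the text to the standard csv module through io.StringIO. The function name parse_block and the column label "time" are assumptions made for illustration; the label is needed because the first header cell in these logs holds a timestamp rather than a column name.

```python
import csv
import io

block = """09:20:06,CPU,%usr,%nice,%sys,%iowait,%steal,%irq,%soft,%guest,%idle
09:21:06,all,4.98,0.00,5.10,0.00,0.00,0.00,0.06,0.00,89.86
09:21:06,0,12.88,0.00,5.62,0.03,0.00,0.02,1.27,0.00,80.18
"""


def parse_block(text):
    """Parse one table (header line plus data lines) into a list of dicts."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    header[0] = "time"  # hypothetical label: this cell holds a timestamp
    return [dict(zip(header, row)) for row in reader]


rows = parse_block(block)
print(rows[0]["CPU"], rows[0]["%idle"])  # all 89.86
```

All values stay strings here; convert them with float() where numbers are needed.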

You can use the following program to parse this CSV.

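result = {}
with open("log.csv", "r") as f:
    # Tables are separated by blank lines.
    for table in f.read().split("\n\n"):
        rows = table.split("\n")
        # Skip the first column, which holds the timestamp.
        header = rows[0].split(",")[1:]
        for row in rows[1:]:
            # Pair each column name with the value in the same position.
            for name, value in zip(header, row.split(",")[1:]):
                result.setdefault(name, []).append(value)
print(result["%idle"])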

Output (the values of %idle):

['89.86', '80.18', '89.15', '73.06']

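This assumes that column names and row values appear in the same order, and that tables of different kinds do not share a column name; tables with the same header are merged into one list per column. Note that the result above holds %idle for every CPU row, not only the rows where CPU = all.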
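Since the question asks only for %idle where CPU = all, one possible variation filters on the CPU column while reading. This is a sketch under assumptions, not part of the original answer: the function name idle_for_all is made up, the log text is passed in as a string (the sample from the question) instead of being read from log.csv, and tables without both a CPU and a %idle column are simply skipped.

```python
import csv
import io

LOG = """\
09:20:06,CPU,%usr,%nice,%sys,%iowait,%steal,%irq,%soft,%guest,%idle
09:21:06,all,4.98,0.00,5.10,0.00,0.00,0.00,0.06,0.00,89.86
09:21:06,0,12.88,0.00,5.62,0.03,0.00,0.02,1.27,0.00,80.18

12:08:06,CPU,%usr,%nice,%sys,%iowait,%steal,%irq,%soft,%guest,%idle
12:09:06,all,5.48,0.00,5.24,0.00,0.00,0.00,0.12,0.00,89.15
12:09:06,0,18.57,0.00,5.35,0.02,0.00,0.00,3.00,0.00,73.06

09:20:06,runq-sz,plist-sz,ldavg-1,ldavg-5,ldavg-15
09:21:06,3,1444,2.01,2.12,2.15
09:22:06,4,1444,2.15,2.14,2.15
"""


def idle_for_all(text):
    """Return %idle as floats for every row whose CPU column is 'all'."""
    values = []
    for block in text.split("\n\n"):
        rows = list(csv.reader(io.StringIO(block.strip())))
        # Skip empty blocks and tables that lack the columns we need.
        if not rows or "CPU" not in rows[0] or "%idle" not in rows[0]:
            continue
        header = rows[0]
        cpu, idle = header.index("CPU"), header.index("%idle")
        for row in rows[1:]:
            if row[cpu] == "all":
                values.append(float(row[idle]))
    return values


print(idle_for_all(LOG))  # [89.86, 89.15]
```

Looking columns up by name rather than by position means the filter keeps working if another table type puts %idle in a different column.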

