How to plot a time series from grouped logs with pandas?

Question

I'm trying to analyze a log file using Pandas. I want to plot three lines showing the per-second count of the levels "ERROR", "INFO", and "WARN", with x = date (seconds) and y = count.

After importing my log file, my data frame looks like this:

df_logs

I floor the date to the second:

df_logs['date'] = df_logs['date'].dt.floor('S')

Then I group by date and message level:

ds_grouped = df_logs.groupby(['date','level'])['level'].count()

From here, I'm completely stuck:

type(ds_grouped)
> pandas.core.frame.DataFrame

I guess the correct seaborn plot is:

sns.lineplot(x='date', 
             y='count',
             hue='level', 
             data=ds_grouped)

How do I plot the grouped data frame?

Answer 1

Here is a way to create the plot, IIUC. First, create some test data:

# create test data
import numpy as np
import pandas as pd

n = 10_000
np.random.seed(123)
timestamps = pd.date_range(start='2020-08-27 09:00:00', 
                           periods=60*60*4, freq='1s')
level = ['info', 'info', 'info', 'warn','warn', 'error']

df = pd.DataFrame(
    {'timestamp': np.random.choice(timestamps, n), 
     'level': np.random.choice(level, n),})
print(df.head())

            timestamp  level
0 2020-08-27 09:59:42   info
1 2020-08-27 12:14:06   warn
2 2020-08-27 09:22:26   info
3 2020-08-27 12:24:12  error
4 2020-08-27 10:26:58   info

Second, count the messages per level in 5-minute intervals. You can change the frequency in pd.Grouper below:

t = (df.assign(counter = 1)
     .set_index('timestamp')
     .groupby([pd.Grouper(freq='5min'), 'level']).sum()
     .squeeze()
     .unstack())
print(t.head())

level                error  info  warn
timestamp                             
2020-08-27 09:00:00     35   123    66
2020-08-27 09:05:00     32    91    73
2020-08-27 09:10:00     41   113    64
2020-08-27 09:15:00     32   110    66
2020-08-27 09:20:00     35   107    61
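For the per-second counts asked about in the question, the same reshaping can be sketched without the helper counter column. This is only an illustration: the small df_logs built here is a hypothetical stand-in for the asker's data, assuming it has a datetime column 'date' and a string column 'level'.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's df_logs (columns 'date' and 'level').
rng = np.random.default_rng(0)
stamps = pd.date_range("2020-08-27 09:00:00", periods=10, freq="250ms")
df_logs = pd.DataFrame({
    "date": stamps,
    "level": rng.choice(["ERROR", "INFO", "WARN"], size=len(stamps)),
})

# Floor to whole seconds, count rows per (second, level), then pivot the
# levels into columns. Combinations that never occur become 0 instead of NaN.
per_second = (
    df_logs.assign(date=df_logs["date"].dt.floor("s"))
    .groupby(["date", "level"])
    .size()
    .unstack(fill_value=0)
)
print(per_second)
```

The result has one row per second and one column per level, which is the wide layout that DataFrame.plot() draws as one line per column.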

Third, create the plot with t.plot().
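A self-contained sketch of the plotting step might look like the following. The random data, the axis labels, and the variable names counts and long_counts are assumptions made for illustration, and the Agg backend is selected only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import numpy as np
import pandas as pd

# Illustrative random log data: one hour of second-resolution timestamps.
rng = np.random.default_rng(123)
timestamps = pd.date_range("2020-08-27 09:00:00", periods=60 * 60, freq="1s")
df = pd.DataFrame({
    "timestamp": timestamps[rng.integers(0, len(timestamps), 2_000)],
    "level": rng.choice(["info", "warn", "error"], 2_000),
})

# Count messages per 5-minute bin and level (a Series with a two-level index).
counts = df.groupby([pd.Grouper(key="timestamp", freq="5min"), "level"]).size()

# Wide form: one column per level, so DataFrame.plot draws one line per level.
t = counts.unstack(fill_value=0)
ax = t.plot(figsize=(8, 4))
ax.set_xlabel("time")
ax.set_ylabel("messages per 5 minutes")

# Long form: one row per (timestamp, level) with an explicit 'count' column.
long_counts = counts.reset_index(name="count")
print(long_counts.head())
```

In an interactive session, matplotlib.pyplot.show() or ax.figure.savefig(...) makes the figure visible. The long-form frame is the shape seaborn expects, so the call from the question should work as sns.lineplot(data=long_counts, x='timestamp', y='count', hue='level').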

Answer 1: accepted, 2020-08-27 17:03:46
