
Filter a given column (count) based on a timestamp column in a Pandas DataFrame

  1. I have a Pandas DataFrame that looks like the one below:

[image: enter image description here]

  • I want output or visualization plots that show:
  • During which hour, how many jobs have failed or completed (count)

First filter by boolean indexing, keeping only the rows whose Status is Failed, and then use crosstab with DataFrame.plot.bar:

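# Assumes `import pandas as pd` and an existing DataFrame `df`
df1 = df[df['Status'].eq('Failed')]         # keep only the failed rows
out = pd.crosstab(df1['Hour'], df1['Job'])  # failed count per Hour (rows) and Job (columns)

out.plot.bar()                              # grouped bar chart of the counts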
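
The snippet above counts only the failed rows, split by job. If the goal is to see both failed and completed counts for each hour, one possible variation is to cross-tabulate Hour against Status directly. This is a minimal sketch, not the answerer's code: the column names Hour, Job and Status follow the snippet above, while the sample rows and the 'Completed' label are assumptions, since the original DataFrame is only shown as an image.

```python
import pandas as pd

# Hypothetical sample data; the real DataFrame from the question is not available.
df = pd.DataFrame({
    "Hour":   [0, 0, 1, 1, 1, 2],
    "Job":    ["A", "B", "A", "B", "C", "A"],
    "Status": ["Failed", "Completed", "Failed", "Failed", "Completed", "Completed"],
})

# Rows are hours, columns are statuses, cells are the number of matching rows.
# Combinations that never occur are filled with 0.
counts = pd.crosstab(df["Hour"], df["Status"])
print(counts)

# Plotting requires matplotlib; uncomment to draw one bar per status for each hour.
# counts.plot.bar()
```

With this layout no filtering step is needed, because every status becomes its own column.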

Alternatively, compute the counts explicitly for every status and hour:

import pandas as pd

df = pd.read_csv('./data.csv')

# split the frame into one sub-frame per distinct status
status = set(df['Status'])
dfStatus = {s: df[df['Status'] == s] for s in status}

# collect the distinct hours
hours = set(df['Hour'])
dfStatusPerHour = {}

# for each status, count the rows that fall into each hour
for s in status:
    dfStatusPerHour[s] = {h: dfStatus[s][dfStatus[s]['Hour'] == h].shape[0] for h in hours}

# print one line per status: {hour: count}
for s in status:
    print(f"{s} : {dfStatusPerHour[s]}")
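
The explicit loops above can probably be expressed more compactly with groupby. The sketch below is one way to do it, under the same assumptions as before: the column names Hour and Status come from the code above, and the sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for './data.csv'.
df = pd.DataFrame({
    "Hour":   [0, 0, 1, 1, 1, 2],
    "Status": ["Failed", "Completed", "Failed", "Failed", "Completed", "Completed"],
})

# Count rows for each (Status, Hour) pair, then move the hours into columns.
# Pairs that never occur get a count of 0.
per_status = df.groupby(["Status", "Hour"]).size().unstack(fill_value=0)

# Convert to the same nested shape as dfStatusPerHour: {status: {hour: count}}.
result = per_status.to_dict(orient="index")

for status, by_hour in result.items():
    print(f"{status} : {by_hour}")
```

This scans the data once instead of filtering the frame again for every status and hour, which may matter for large files.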

Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please cite this site's URL or the original source.
