How to add zero values to a grouping for subsequent normal subtraction Python Pandas

I have a CSV file from which I read the data and then group it. The data looks something like this:

Time      Operation  Count
10:00:00  Up         40
10:00:00  Down       24
10:00:01  Up         4
10:00:01  Down       54
10:00:02  Down       22
10:00:03  Up         12
10:00:03  Down       11

To do this, I use:

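import pandas as pd

# Ex_Csv holds the path to the CSV file (defined elsewhere)
df = pd.read_csv(Ex_Csv, usecols=['Time','Count','Operation'], parse_dates=[0])
# Truncate timestamps to whole seconds and keep only the time of day
df['Time'] = df['Time'].dt.floor('s').dt.time
# Sum the counts per operation and second
df2 = df.groupby(['Operation', 'Time']).sum()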

Then I do the subtraction:

out = df2.loc['Up']-df2.loc['Down']

I expected that if a value is missing, for example 'Up' at 10:00:02, it would be treated as 0, so that I would get 0 - 22 and this result:

Time      Count
10:00:00  16
10:00:01  -50
10:00:02  -22
10:00:03  1

But I get this instead:

Time      Count
10:00:00  16
10:00:01  -50
10:00:02
10:00:03  1

Is it possible to somehow treat the value of 'Up' or 'Down' as zero when it is missing?

Try pivoting your dataframe, then fill the null values with 0, then compute the diff:

# pivot takes keyword-only arguments in pandas 2.0 and later
out = (df.pivot(index='Time', columns='Operation', values='Count')
         .fillna(0).diff(1, axis=1)['Up']
         .rename('Count').reset_index())
print(out)

# Output
       Time  Count
0  10:00:00   16.0
1  10:00:01  -50.0
2  10:00:02  -22.0
3  10:00:03    1.0

Before the diff, your dataframe looks like:

>>> df.pivot(index='Time', columns='Operation', values='Count').fillna(0)
Operation  Down    Up
Time                 
10:00:00   24.0  40.0
10:00:01   54.0   4.0
10:00:02   22.0   0.0
10:00:03   11.0  12.0
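
If you would rather keep the grouped frame from the question, the arithmetic methods such as `DataFrame.sub` accept a `fill_value` argument that stands in for the side that is missing after alignment. The following is only a minimal sketch: the sample rows are typed in by hand as strings rather than read from the asker's file, and the names `plain` and `out` are illustrative.

```python
import pandas as pd

# Sample data copied from the question; the 'Up' row for 10:00:02 is missing.
df = pd.DataFrame({
    'Time': ['10:00:00', '10:00:00', '10:00:01', '10:00:01',
             '10:00:02', '10:00:03', '10:00:03'],
    'Operation': ['Up', 'Down', 'Up', 'Down', 'Down', 'Up', 'Down'],
    'Count': [40, 24, 4, 54, 22, 12, 11],
})

df2 = df.groupby(['Operation', 'Time']).sum()

# Plain subtraction aligns on Time and yields NaN where one side is absent.
plain = df2.loc['Up'] - df2.loc['Down']

# sub(..., fill_value=0) substitutes 0 for the absent side before subtracting.
out = df2.loc['Up'].sub(df2.loc['Down'], fill_value=0)
print(out)
```

This explains the empty row in the question: the two slices are aligned on `Time` before subtracting, and a label present on only one side produces NaN unless a fill value is supplied.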

Safe way (does not rely on the order of the columns):

out = df.pivot(index='Time', columns='Operation', values='Count').fillna(0)
# Subtract the columns by name instead of relying on diff
out = pd.Series(out['Up']-out['Down'], index=out.index, name='Count').reset_index()
print(out)

# Output
       Time  Count
0  10:00:00   16.0
1  10:00:01  -50.0
2  10:00:02  -22.0
3  10:00:03    1.0

A self-contained version using pivot:

import pandas as pd
from io import StringIO

data = StringIO("""Time;Operation;Count
10:00:00;Up;40
10:00:00;Down;24
10:00:01;Up;4
10:00:01;Down;54
10:00:02;Down;22
10:00:03;Up;12
10:00:03;Down;11
""")

df = pd.read_csv(data, sep=';')
# Convert the Time strings to datetime.time objects
df['Time'] = pd.to_datetime(df['Time'], format='%H:%M:%S').dt.time
df.groupby(['Operation', 'Time']).sum()

df2 = pd.pivot(df, index='Time', columns='Operation', values='Count').fillna(0).astype(int)
df2.Up - df2.Down

Output:

Time
10:00:00    16
10:00:01   -50
10:00:02   -22
10:00:03     1
dtype: int64
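
One caveat worth noting: `pivot` raises a `ValueError` when the same (Time, Operation) pair occurs more than once, which can happen once timestamps are floored to whole seconds. A possible workaround, sketched here under the assumption that duplicate rows should be summed as in the question's `groupby`, is to aggregate first and then `unstack` with `fill_value=0`. The data below is hypothetical: it matches the question's sample except that the 10:00:00 'Up' count is split into two rows.

```python
import pandas as pd

# Hypothetical data: like the question's, but with two 'Up' rows at 10:00:00
# (25 + 15), so the (Time, Operation) pairs are not unique.
df = pd.DataFrame({
    'Time': ['10:00:00', '10:00:00', '10:00:00', '10:00:01', '10:00:01',
             '10:00:02', '10:00:03', '10:00:03'],
    'Operation': ['Up', 'Up', 'Down', 'Up', 'Down', 'Down', 'Up', 'Down'],
    'Count': [25, 15, 24, 4, 54, 22, 12, 11],
})

# Sum per (Time, Operation), then move Operation into the columns;
# combinations that never occurred become 0 instead of NaN.
wide = df.groupby(['Time', 'Operation'])['Count'].sum().unstack(fill_value=0)

# Subtract by column name so the result does not depend on column order.
out = (wide['Up'] - wide['Down']).rename('Count').reset_index()
print(out)
```

Because the fill happens inside `unstack`, the counts stay integers and no separate `fillna(0).astype(int)` step is needed.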
