How to add zero values to a grouping for subsequent subtraction in Python pandas
I have a CSV file from which I read the data and do the grouping. It looks something like this:
Time | Operation | Count
---|---|---
10:00:00 | Up | 40
10:00:00 | Down | 24
10:00:01 | Up | 4
10:00:01 | Down | 54
10:00:02 | Down | 22
10:00:03 | Up | 12
10:00:03 | Down | 11
To do this, I use:
df = pd.read_csv(Ex_Csv, usecols=['Time','Count','Operation'], parse_dates=[0])
df['Time'] = df['Time'].dt.floor('S', 0).dt.time
df2 = df.groupby(['Operation', 'Time']).sum()
Then I do the subtraction:
out = df2.loc['Up']-df2.loc['Down']
I expected that if a value is missing, for example 'Up' at 10:00:02, it would be treated as 0, so I would get 0 - 22 and this result:
Time | Count
---|---
10:00:00 | 16
10:00:01 | -50
10:00:02 | -22
10:00:03 | 1
But I get this instead:
Time | Count
---|---
10:00:00 | 16
10:00:01 | -50
10:00:02 |
10:00:03 | 1
Is it possible to somehow treat the value of 'Up' or 'Down' as zero when it is missing?
Try pivoting your dataframe, filling the null values with 0, and then computing the diff:
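# Keyword arguments are used because pandas 2.0+ no longer accepts positional ones in pivot
out = (df.pivot(index='Time', columns='Operation', values='Count').fillna(0).diff(1, axis=1)['Up']
         .rename('Count').reset_index())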
print(out)
# Output
Time Count
0 10:00:00 16.0
1 10:00:01 -50.0
2 10:00:02 -22.0
3 10:00:03 1.0
Before the diff, your dataframe looks like:
>>> df.pivot(index='Time', columns='Operation', values='Count').fillna(0)
Operation Down Up
Time
10:00:00 24.0 40.0
10:00:01 54.0 4.0
10:00:02 22.0 0.0
10:00:03 11.0 12.0
Safe way:
out = df.pivot(index='Time', columns='Operation', values='Count').fillna(0)
out = pd.Series(out['Up']-out['Down'], index=out.index, name='Count').reset_index()
print(out)
# Output
Time Count
0 10:00:00 16.0
1 10:00:01 -50.0
2 10:00:02 -22.0
3 10:00:03 1.0
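One caveat worth noting: `pivot` raises a `ValueError` when the same Time/Operation pair occurs more than once, which seems likely once timestamps are floored to whole seconds. A minimal sketch, assuming that is the situation, uses `pivot_table` with `aggfunc='sum'` and `fill_value=0` to aggregate and zero-fill in one step. The sample rows below are made up for illustration: the two 'Up' rows at 10:00:00 add up to the 40 shown in the question.

```python
import pandas as pd

# Hypothetical sample: 'Up' appears twice at 10:00:00 (30 + 10 = 40),
# and 'Up' is missing entirely at 10:00:02.
df = pd.DataFrame({
    "Time": ["10:00:00", "10:00:00", "10:00:00", "10:00:01",
             "10:00:01", "10:00:02", "10:00:03", "10:00:03"],
    "Operation": ["Up", "Up", "Down", "Up", "Down", "Down", "Up", "Down"],
    "Count": [30, 10, 24, 4, 54, 22, 12, 11],
})

# Sum duplicate Time/Operation pairs and put 0 where a pair is absent.
wide = df.pivot_table(index="Time", columns="Operation", values="Count",
                      aggfunc="sum", fill_value=0)

# Up minus Down per timestamp, returned as a two-column dataframe.
out = (wide["Up"] - wide["Down"]).rename("Count").reset_index()
print(out)
```

Subtracting the columns by name, rather than relying on `diff`, keeps the result independent of the column order.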
import pandas as pd
from io import StringIO
data = StringIO("""Time;Operation;Count
10:00:00;Up;40
10:00:00;Down;24
10:00:01;Up;4
10:00:01;Down;54
10:00:02;Down;22
10:00:03;Up;12
10:00:03;Down;11
""")
df = pd.read_csv(data, sep=';')
df['Time'] = pd.to_datetime(df['Time']).dt.time  # assign to the dataframe column, not to the pd module
df.groupby(['Operation', 'Time']).sum()
df2 = pd.pivot(df, index='Time', columns='Operation', values='Count').fillna(0).astype(int)
df2.Up - df2.Down
Output:
Time
10:00:00 16
10:00:01 -50
10:00:02 -22
10:00:03 1
dtype: int64
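If you would rather keep the original `groupby` approach, one possible alternative is `Series.sub` with `fill_value=0`, which treats a timestamp that is missing on one side as 0 during the subtraction. A sketch using the question's sample data, rebuilt inline here as an assumption about the CSV contents:

```python
import pandas as pd

# The question's sample rows; 'Up' has no entry at 10:00:02.
df = pd.DataFrame({
    "Time": ["10:00:00", "10:00:00", "10:00:01", "10:00:01",
             "10:00:02", "10:00:03", "10:00:03"],
    "Operation": ["Up", "Down", "Up", "Down", "Down", "Up", "Down"],
    "Count": [40, 24, 4, 54, 22, 12, 11],
})

# Same grouping as in the question, reduced to the Count column.
grouped = df.groupby(["Operation", "Time"])["Count"].sum()

# sub() aligns both sides on Time and substitutes 0 for a value
# that exists on only one side before subtracting.
out = grouped.loc["Up"].sub(grouped.loc["Down"], fill_value=0)
print(out)
```

The result will typically come back as float because of the alignment step; append `.astype(int)` if integer counts are needed.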