I have a dataframe with cumulative statistical data; a new row is added every day. I want to go through one column so that, for each row (starting from the last), the value in the row above is subtracted from it. The result should go into a new column. This is what my dataframe looks like; the values in the 'diff' column are my desired outcome:
time In diff
0 2017-06-26 7.086
1 2017-06-27 8.086 1
2 2017-06-28 10.200 2.114
This is what I came up with:
for x in df['In']:
df['diff'] = df.iloc[-1] - df.iloc[-2]
But that's not it. How do I start the loop from the last row, and how do I make the iloc indexing more dynamic? Can someone help? Thank you!
You can use Series.diff:
df['diff'] = df['In'].diff()
print (df)
time In diff
0 2017-06-26 7.086 NaN
1 2017-06-27 8.086 1.000
2 2017-06-28 10.200 2.114
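For anyone who wants to run this end to end, here is a minimal self-contained sketch. The dataframe is rebuilt by hand from the three rows in the question (the construction itself is my assumption about how the data is stored); only the `diff()` line comes from the answer above.

```python
import pandas as pd

# Rebuild the sample data from the question (assumed layout).
df = pd.DataFrame({
    "time": pd.to_datetime(["2017-06-26", "2017-06-27", "2017-06-28"]),
    "In": [7.086, 8.086, 10.200],
})

# diff() subtracts the previous row from each row; no explicit loop is needed.
# The first row has no predecessor, so its result is NaN.
df["diff"] = df["In"].diff()
print(df)
```

If the NaN in the first row is unwanted, it can be replaced afterwards, for example with `df["diff"].fillna(0)`, though whether 0 is a sensible value there depends on what the column means.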
Use pd.Series.diff:
df.assign(Diff=df.In.diff())
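One thing worth noting about `assign`: it returns a new dataframe rather than modifying `df` in place, so the result has to be kept. A small sketch, again with the sample data rebuilt by hand (the variable name `out` is just for illustration):

```python
import pandas as pd

# Sample data from the question (assumed layout).
df = pd.DataFrame({
    "time": ["2017-06-26", "2017-06-27", "2017-06-28"],
    "In": [7.086, 8.086, 10.200],
})

# assign() returns a copy with the extra column; df itself is left unchanged.
out = df.assign(Diff=df["In"].diff())

print(out)
print("Diff" in df.columns)  # the original frame has no 'Diff' column
```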
This can be done using shift():
df
In time
0 7.086 2017-06-26
1 8.086 2017-06-27
2 10.200 2017-06-28
df.sort_values('time', inplace=True)
df['diff'] = df['In'] - df['In'].shift(1)
df
In time diff
0 7.086 2017-06-26 NaN
1 8.086 2017-06-27 1.000
2 10.200 2017-06-28 2.114
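The sort step matters if the rows are not already in date order, because `shift(1)` only looks at the row physically above. Here is a rough sketch with deliberately shuffled rows (the shuffled input and the `reset_index` call are my additions, not part of the answer above):

```python
import pandas as pd

# Same values as in the question, but in a scrambled order (assumed input).
df = pd.DataFrame({
    "In": [10.200, 7.086, 8.086],
    "time": pd.to_datetime(["2017-06-28", "2017-06-26", "2017-06-27"]),
})

# Put the rows in chronological order and renumber the index.
df = df.sort_values("time").reset_index(drop=True)

# shift(1) moves every value down one row, so this is "current minus previous".
df["diff"] = df["In"] - df["In"].shift(1)
print(df)
```

For a plain row-to-row difference like this, `df["In"] - df["In"].shift(1)` and `df["In"].diff()` should give the same column; `shift` is mainly handy when you need the previous value for something other than a subtraction.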
Here is all you need to do.
df['diff'] = df.In - df.In.shift(1)
# In [16]: df
# Out[16]:
# time In diff
# 0 2017-06-26 7.086 NaN
# 1 2017-06-27 8.086 1.000
# 2 2017-06-28 10.200 2.114