
How to add a calculated column in a pandas dataframe?

I am new to Python/pandas, so I'm struggling a bit here. I have a dataframe with air quality data from 2016 to 2020. For each measured value, I want to calculate the annual rate of change relative to the value on the same day and month of the previous year.

These are the first lines of the dataframe.

         Date Country       City Specie count   min   max median variance
0  2020-02-23      CR  San José   pm25    20  13.0  53.0   25.0  1232.00
1  2020-04-04      CR  San José   pm25    23  17.0  57.0   38.0  1302.57
2  2020-04-24      CR  San José   pm25    23  30.0  80.0   59.0  1966.13
3  2020-01-14      CR  San José   pm25    24  13.0  34.0   21.0   379.55
4  2020-02-07      CR  San José   pm25    23  57.0  95.0   72.0   838.97

Does anybody have an idea as to how I can proceed?

Thank you

You can compute this with the pandas.DataFrame.pct_change method, which returns the fractional change between each row and the row before it.
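As a rough illustration of what pct_change computes, here is a minimal sketch on a tiny series. The values and the year labels are made up for the illustration; the point is only that the result matches current / previous - 1.

```python
import pandas as pd

# Hypothetical yearly minimum values, oldest first.
s = pd.Series([26.0, 27.0, 28.0], index=[2016, 2017, 2018])

# pct_change compares each element with the one directly before it.
change = s.pct_change()

# The same quantity computed by hand: current / previous - 1.
manual = s / s.shift(1) - 1

print(change)
print(manual)
```

The first element has no predecessor, so its change is NaN. Because the comparison is purely positional, the rows should be sorted oldest first if the goal is the change relative to the previous year.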

data='''
Date Country City Specie count min max median variance
0 2020-02-23 CR SanJos pm25 20 13.0 53.0 25.0 1232.00
1 2020-04-04 CR SanJos pm25 23 17.0 57.0 38.0 1302.57
2 2020-04-24 CR SanJos pm25 23 30.0 80.0 59.0 1966.13
3 2020-01-14 CR SanJos pm25 24 13.0 34.0 21.0 379.55
4 2020-02-07 CR SanJos pm25 23 57.0 95.0 72.0 838.97
5 2019-04-24 CR SanJos pm25 23 29.0 80.0 59.0 1966.13
6 2018-04-24 CR SanJos pm25 23 28.0 80.0 59.0 1966.13
7 2017-04-24 CR SanJos pm25 23 27.0 80.0 59.0 1966.13
8 2016-04-24 CR SanJos pm25 23 26.0 80.0 59.0 1966.13
'''
import io

import pandas as pd

# The header has one field fewer than the data rows, so pandas uses the
# leading row number as the index.
df = pd.read_csv(io.StringIO(data), sep=' ')
df['Date'] = pd.to_datetime(df['Date'])

# Keep the 24 April rows and sort them oldest first, so that pct_change
# compares each year with the year before it.
df1 = df[(df['Date'].dt.month == 4) & (df['Date'].dt.day == 24)].sort_values('Date')

df1
        Date Country    City Specie  count   min   max  median  variance
8 2016-04-24      CR  SanJos   pm25     23  26.0  80.0    59.0   1966.13
7 2017-04-24      CR  SanJos   pm25     23  27.0  80.0    59.0   1966.13
6 2018-04-24      CR  SanJos   pm25     23  28.0  80.0    59.0   1966.13
5 2019-04-24      CR  SanJos   pm25     23  29.0  80.0    59.0   1966.13
2 2020-04-24      CR  SanJos   pm25     23  30.0  80.0    59.0   1966.13

df1['min'].pct_change()
8         NaN
7    0.038462
6    0.037037
5    0.035714
2    0.034483
Name: min, dtype: float64
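To get the change as a calculated column for every date at once, rather than for a single day, one option is to group the rows by location, species, month and day and apply pct_change within each group. The following is only a sketch under some assumptions: the helper columns month and day, the _yoy suffix and the list value_cols are names chosen for this illustration, and the small comma-separated sample reuses rows from the data above.

```python
import io

import pandas as pd

# Small sample in the question's layout (comma-separated so that
# "San José" can keep its space).
data = '''Date,Country,City,Specie,min,max,median
2020-04-24,CR,San José,pm25,30.0,80.0,59.0
2019-04-24,CR,San José,pm25,29.0,80.0,59.0
2020-02-07,CR,San José,pm25,57.0,95.0,72.0
2018-04-24,CR,San José,pm25,28.0,80.0,59.0
'''
df = pd.read_csv(io.StringIO(data), parse_dates=['Date'])

# Sort oldest first so each row is compared with an earlier year.
df = df.sort_values('Date').reset_index(drop=True)

# Helper columns (hypothetical names) identifying the calendar day.
df['month'] = df['Date'].dt.month
df['day'] = df['Date'].dt.day

keys = ['Country', 'City', 'Specie', 'month', 'day']
value_cols = ['min', 'max', 'median']

# Change of each value relative to the previous row in the same group.
changes = df.groupby(keys)[value_cols].pct_change().add_suffix('_yoy')
result = df.drop(columns=['month', 'day']).join(changes)

print(result[['Date', 'min', 'min_yoy']])
```

Rows without an earlier year for the same day get NaN. Note that this compares each row with the most recent earlier row in its group, so if a year is missing for a given day, the comparison spans more than one year; whether that is acceptable depends on the data.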

