I have the following dataframe:
    2012   2013   2014   2015   2016   2017   2018                Kategorie
0   5.31   5.27   5.61   4.34   4.54   5.02   7.07  Gewinn pro Aktie in EUR
1  13.39  14.70  12.45  16.29  15.67  14.17  10.08                      KGV
2 -21.21  -0.75   6.45 -22.63  -7.75   9.76  47.52           Gewinnwachstum
3 -17.78   2.27  -0.55   3.39   1.48   0.34    NaN                      PEG
Now I am selecting only the KGV row with:
df[df["Kategorie"] == "KGV"]
Which outputs:
    2012  2013   2014   2015   2016   2017   2018 Kategorie
1  13.39  14.7  12.45  16.29  15.67  14.17  10.08       KGV
How do I calculate the mean() of the last five years (2016, 2015, 2014, 2013 and 2012 in this example)?
I tried
df[df["Kategorie"] == "KGV"]["2016":"2012"].mean()
but this throws a TypeError. Why can I not slice the columns here?
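A slice inside plain [] is applied to the rows, not the columns, so the string bounds are checked against the integer row index, which is where the TypeError comes from. loc supports that type of slicing on the columns (from left to right):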
df.loc[df["Kategorie"] == "KGV", "2012":"2016"].mean(axis=1)
Out:
1 14.5
dtype: float64
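To reproduce that result, here is a minimal sketch that rebuilds the frame from the question. The construction code is my assumption (the question only shows the printed frame); the values are copied from the post.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (values copied from the post).
df = pd.DataFrame(
    {
        "2012": [5.31, 13.39, -21.21, -17.78],
        "2013": [5.27, 14.70, -0.75, 2.27],
        "2014": [5.61, 12.45, 6.45, -0.55],
        "2015": [4.34, 16.29, -22.63, 3.39],
        "2016": [4.54, 15.67, -7.75, 1.48],
        "2017": [5.02, 14.17, 9.76, 0.34],
        "2018": [7.07, 10.08, 47.52, np.nan],
        "Kategorie": ["Gewinn pro Aktie in EUR", "KGV", "Gewinnwachstum", "PEG"],
    }
)

# A slice inside plain [] is applied to the rows, not the columns.
first_two_rows = df[0:2]

# .loc takes (rows, columns); the label slice includes both endpoints.
kgv_mean = df.loc[df["Kategorie"] == "KGV", "2012":"2016"].mean(axis=1)
print(kgv_mean)
```

The boolean mask picks the row and the label slice picks the columns in one step, which also avoids chaining two indexing operations.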
Note that this does not necessarily mean 2012, 2013, 2014, 2015 and 2016. The column labels are strings, so the slice selects every column positioned between df['2012'] and df['2016'], both included. If there were a column named foo in between, it would be selected as well.
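A small sketch of that caveat, using a hypothetical frame where a column named foo sits between the year columns (the frame and the value 100.0 are made up for illustration):

```python
import pandas as pd

# Hypothetical frame: a non-year column "foo" sits between the year columns.
df = pd.DataFrame(
    {
        "2012": [13.39],
        "2013": [14.70],
        "foo": [100.0],
        "2014": [12.45],
        "Kategorie": ["KGV"],
    }
)

# The label slice takes everything positioned between the two labels,
# so "foo" is included.
sliced = df.loc[:, "2012":"2014"]
print(list(sliced.columns))

# Listing the wanted labels explicitly avoids picking up stray columns.
years = [str(y) for y in range(2012, 2015)]
explicit = df.loc[:, years]
print(list(explicit.columns))
```

If the column layout is not under your control, passing an explicit list of labels is probably the safer choice.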
Not sure why the last five years are 2012-2016 (they seem to be the first five years). Notwithstanding, to find the mean for 2012-2016 for 'KGV', you can use
df[df['Kategorie'] == 'KGV'][[c for c in df.columns if c != 'Kategorie' and 2012 <= int(c) <= 2016]].mean(axis=1)
I used filter and iloc:
row = df[df.Kategorie == 'KGV']
row.filter(regex=r'\d{4}').sort_index(axis=1).iloc[:, -5:].mean(axis=1)
1 13.732
dtype: float64
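For a variant that picks the last five years by name rather than by regex, here is a sketch under the assumption that every year column is a four-digit string. The frame construction and the names year_cols and last_five are mine, not from the question.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (values copied from the post).
df = pd.DataFrame(
    {
        "2012": [5.31, 13.39, -21.21, -17.78],
        "2013": [5.27, 14.70, -0.75, 2.27],
        "2014": [5.61, 12.45, 6.45, -0.55],
        "2015": [4.34, 16.29, -22.63, 3.39],
        "2016": [4.54, 15.67, -7.75, 1.48],
        "2017": [5.02, 14.17, 9.76, 0.34],
        "2018": [7.07, 10.08, 47.52, np.nan],
        "Kategorie": ["Gewinn pro Aktie in EUR", "KGV", "Gewinnwachstum", "PEG"],
    }
)

# Keep only columns whose name is exactly four digits, sorted as years.
year_cols = sorted(c for c in df.columns if c.isdigit() and len(c) == 4)
last_five = year_cols[-5:]

# Select the KGV row and the five most recent year columns, then average.
kgv_last_five_mean = df.loc[df["Kategorie"] == "KGV", last_five].mean(axis=1)
print(kgv_last_five_mean)
```

Here the last five years are 2014 to 2018, which matches the 13.732 shown above.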