How to get the average of DataFrame column values

                    A        B
DATE                 
2013-05-01        473077    71333
2013-05-02         35131    62441
2013-05-03           727    27381
2013-05-04           481     1206
2013-05-05           226     1733
2013-05-06           NaN     4064
2013-05-07           NaN    41151
2013-05-08           NaN     8144
2013-05-09           NaN       23
2013-05-10           NaN       10

Say I have the DataFrame above. What is the easiest way to get a Series with the same index that holds, for each row, the average of columns A and B? The average needs to ignore NaN values. The twist is that the solution needs to keep working when new columns are added to the DataFrame.

The closest I have come is

df.sum(axis=1) / len(df.columns)

However, this does not seem to ignore the NaN values.

(Note: I am still a bit new to the pandas library, so I'm guessing there's an obvious way to do this that I am simply not seeing.)

Simply using df.mean(axis=1) will Do The Right Thing(tm) with respect to NaNs, since pandas skips NaN values by default (skipna=True):

>>> df
                 A      B
DATE                     
2013-05-01  473077  71333
2013-05-02   35131  62441
2013-05-03     727  27381
2013-05-04     481   1206
2013-05-05     226   1733
2013-05-06     NaN   4064
2013-05-07     NaN  41151
2013-05-08     NaN   8144
2013-05-09     NaN     23
2013-05-10     NaN     10
>>> df.mean(axis=1)
DATE
2013-05-01    272205.0
2013-05-02     48786.0
2013-05-03     14054.0
2013-05-04       843.5
2013-05-05       979.5
2013-05-06      4064.0
2013-05-07     41151.0
2013-05-08      8144.0
2013-05-09        23.0
2013-05-10        10.0
dtype: float64
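
If you want to reproduce this, here is a minimal sketch. The way the frame is built (a DatetimeIndex from pd.date_range, NaN supplied via numpy) is my assumption about the original data, and the names row_mean, naive and manual are only for illustration. It also shows why the attempt in the question is off: sum(axis=1) does skip NaN, but len(df.columns) still counts every column, so rows with a missing value get divided by too large a number. Dividing by df.count(axis=1), the number of non-NaN cells per row, gives the same result as mean(axis=1).

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (index construction is an assumption).
df = pd.DataFrame(
    {
        "A": [473077, 35131, 727, 481, 226] + [np.nan] * 5,
        "B": [71333, 62441, 27381, 1206, 1733, 4064, 41151, 8144, 23, 10],
    },
    index=pd.date_range("2013-05-01", periods=10, name="DATE"),
)

# Row-wise mean; NaN cells are skipped because skipna defaults to True.
row_mean = df.mean(axis=1)

# The attempt from the question: the sum skips NaN, but the divisor
# is always the total number of columns.
naive = df.sum(axis=1) / len(df.columns)

# Same idea with the divisor fixed: count only the non-NaN cells per row.
manual = df.sum(axis=1) / df.count(axis=1)

print(row_mean)
print(naive)
```

For 2013-05-06 the naive version yields 2032.0 (4064 / 2), while mean(axis=1) yields 4064.0.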

You can use df[["A", "B"]].mean(axis=1) if there are other columns to ignore.
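
As a rough sketch of how this behaves when columns are added later: the small frame below is made up for illustration, and the "label" and "C" columns are hypothetical. Selecting columns explicitly pins the average to those columns, while mean(axis=1, numeric_only=True) picks up any numeric column that exists at the time of the call and ignores non-numeric ones.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one non-numeric column to ignore.
df = pd.DataFrame(
    {
        "A": [1.0, np.nan, 3.0],
        "B": [3.0, 4.0, np.nan],
        "label": ["x", "y", "z"],
    }
)

# Average only the columns you name.
subset_mean = df[["A", "B"]].mean(axis=1)

# Average every numeric column, whatever they happen to be.
numeric_mean = df.mean(axis=1, numeric_only=True)

# Add a new numeric column; the numeric_only version picks it up
# without any change to the call.
df["C"] = [5.0, 6.0, np.nan]
with_c = df.mean(axis=1, numeric_only=True)

print(subset_mean.tolist())
print(with_c.tolist())
```

Which of the two to use depends on whether a newly added column should automatically count toward the average or not.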
