
Add mean, median and standard deviation values as new array columns in Python

I tried to compute the mean, median and standard deviation for each index value (each date) of the following DataFrame and add them as new columns:

import pandas as pd
#create a dictionary "salesDict"
salesDict = {'Samsung Galaxy S10': [769.34, 834.23, 900.12, 1021.12],
'iPhone X': [983.11, 881.21, 1210.32, 1100.34],
'Google Pixel 4': [1021.18, 1321.12, 832.14, 992.15]}
#create a list called "dates"
dates=['01/01/2020', '01/02/2020', '01/03/2020', '01/04/2020']
#create a dataframe called "sales" with "dates" as index
sales=pd.DataFrame(salesDict,index=dates)
print(sales)
#create a Mean column that contains mean value for each date
sales["Mean"]=sales.mean(axis = 1)
#create a Median column that contains median value for each date
sales["Median"]=sales.median(axis = 1)
#create a Std column that contains the standard deviation for each date
sales["Std"]=sales.std(axis=1)
sales.drop(['Samsung Galaxy S10', 'iPhone X', 'Google Pixel 4'], axis=1, inplace=True)
print(sales)

and the result looked like this:

            Samsung Galaxy S10  iPhone X  Google Pixel 4
01/01/2020              769.34    983.11         1021.18
01/02/2020              834.23    881.21         1321.12
01/03/2020              900.12   1210.32          832.14
01/04/2020             1021.12   1100.34          992.15
                   Mean       Median         Std
01/01/2020   924.543333   953.826667   96.879804
01/02/2020  1012.186667   946.698333  192.155044
01/03/2020   980.860000   940.490000  143.694352
01/04/2020  1037.870000  1029.495000   39.779060

It turned out that only the mean is correct; the values in the other two columns are wrong. Could anybody guide me through this problem? I'm a complete beginner in Python. Much appreciated!

The problem arises because each statistic is computed over every column that exists in the DataFrame at that moment: the Median is affected by the freshly added Mean column, and the Std is affected by both the freshly added Mean and Median columns.
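As a small illustration of that effect, here is a sketch that uses only the first row of the question's data (01/01/2020). The variable names are made up for this example; the point is that taking the median after the mean has been appended gives the value shown in the question's Median column rather than the true median.

```python
import pandas as pd

# Prices for 01/01/2020, copied from the question's data
prices = pd.Series([769.34, 983.11, 1021.18])

mean = prices.mean()
true_median = prices.median()

# What sales.median(axis=1) effectively sees once the "Mean" column exists:
# the three prices plus the mean as a fourth value
with_mean = pd.concat([prices, pd.Series([mean])], ignore_index=True)
wrong_median = with_mean.median()

print(true_median)   # 983.11
print(wrong_median)  # roughly 953.8267, the value reported in the question
```

With four values the median is the average of the two middle ones, which is why the result no longer matches any of the actual prices.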
To avoid this, compute the mean, median and standard deviation from the product columns only, by selecting them explicitly: sales[["Samsung Galaxy S10", "iPhone X", "Google Pixel 4"]].
These changes should correct your results:

sales["Mean"]=sales[["Samsung Galaxy S10", "iPhone X", "Google Pixel 4"]].mean(axis = 1)
sales["Median"]=sales[["Samsung Galaxy S10", "iPhone X", "Google Pixel 4"]].median(axis = 1)
sales["Std"]=sales[["Samsung Galaxy S10", "iPhone X", "Google Pixel 4"]].std(axis = 1)
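An alternative sketch, assuming you only want the three statistics in the end (as the drop call in the question suggests): keep the product columns in one selection and build the statistics in a separate DataFrame, so no new column can leak into a later calculation. The names item_cols, items and stats are illustrative, not part of the original code.

```python
import pandas as pd

sales_dict = {
    'Samsung Galaxy S10': [769.34, 834.23, 900.12, 1021.12],
    'iPhone X': [983.11, 881.21, 1210.32, 1100.34],
    'Google Pixel 4': [1021.18, 1321.12, 832.14, 992.15],
}
dates = ['01/01/2020', '01/02/2020', '01/03/2020', '01/04/2020']
sales = pd.DataFrame(sales_dict, index=dates)

# Select the product columns once; every statistic is computed from this view
item_cols = list(sales_dict)
items = sales[item_cols]

# Build the result separately instead of adding columns and dropping others
stats = pd.DataFrame({
    "Mean": items.mean(axis=1),
    "Median": items.median(axis=1),
    "Std": items.std(axis=1),  # sample standard deviation (ddof=1), the pandas default
})
print(stats)
```

Because sales itself is never modified, the order in which the three statistics are computed no longer matters.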


 