Add mean, median and standard deviation values as new array columns in Python
I am trying to compute the mean, median and standard deviation for each index value in the following array and add them as new columns:
import pandas as pd
#create a dictionary "salesDict"
salesDict = {'Samsung Galaxy S10': [769.34, 834.23, 900.12, 1021.12],
'iPhone X': [983.11, 881.21, 1210.32, 1100.34],
'Google Pixel 4': [1021.18, 1321.12, 832.14, 992.15]}
#create a tuple called "dates"
dates=( '01/01/2020', '01/02/2020', '01/03/2020', '01/04/2020')
#create a dataframe called "sales" with "dates" as index
sales=pd.DataFrame(salesDict,index=dates)
print(sales)
#create a Mean column that contains mean value for each date
sales["Mean"]=sales.mean(axis = 1)
#create a Median column that contains median value for each date
sales["Median"]=sales.median(axis = 1)
#create a Std column that contains the standard deviation for each date
sales["Std"]=sales.std(axis=1)
sales.drop(['Samsung Galaxy S10', 'iPhone X', 'Google Pixel 4'], axis=1, inplace=True)
print(sales)
The result looks like this:
            Samsung Galaxy S10  iPhone X  Google Pixel 4
01/01/2020              769.34    983.11         1021.18
01/02/2020              834.23    881.21         1321.12
01/03/2020              900.12   1210.32          832.14
01/04/2020             1021.12   1100.34          992.15
                   Mean       Median         Std
01/01/2020   924.543333   953.826667   96.879804
01/02/2020  1012.186667   946.698333  192.155044
01/03/2020   980.860000   940.490000  143.694352
01/04/2020  1037.870000  1029.495000   39.779060
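Only the mean comes out correct; the values in the other two columns are wrong. Can anyone guide me on how to fix this? I am a complete beginner in Python. Many thanks!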
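The problem arises because each new column becomes part of the DataFrame as soon as it is assigned: the median is computed over the three prices plus the new Mean column, and the standard deviation is computed over the prices plus both Mean and Median.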
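As a quick check of this explanation, the sketch below reproduces the effect for the first date only (01/01/2020). The variable names row and with_mean are made up for this illustration and are not part of the original code.

```python
import pandas as pd

# Prices for 01/01/2020, taken from the question's data
row = pd.Series([769.34, 983.11, 1021.18])

# Appending the mean mimics what happens once sales["Mean"] exists
with_mean = pd.concat([row, pd.Series([row.mean()])], ignore_index=True)

print(row.median())        # median of the three prices only
print(with_mean.median())  # median once the Mean value is included
```

The second value should match the 953.826667 shown in the question's Median column, which suggests the Mean column was indeed included in the calculation.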
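To avoid this, compute the mean, median and standard deviation from the product columns only, by selecting them explicitly: sales[["Samsung Galaxy S10", "iPhone X", "Google Pixel 4"]].
These changes should correct your results: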
sales["Mean"]=sales[["Samsung Galaxy S10", "iPhone X", "Google Pixel 4"]].mean(axis = 1)
sales["Median"]=sales[["Samsung Galaxy S10", "iPhone X", "Google Pixel 4"]].median(axis = 1)
sales["Std"]=sales[["Samsung Galaxy S10", "iPhone X", "Google Pixel 4"]].std(axis = 1)