Groupby function in pandas dataframe of Python does not seem to work

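I have a table with various information (for example, energy supply and the share of renewable energy supply) on 15 countries. I need to build a dataframe at continent level that gives, for each continent, the number of countries and the mean, standard deviation, and sum of the populations of its countries. The dataframe is built from the data in that table. My problem is that after mapping the 15 countries to their continents, I cannot seem to aggregate the data at continent level. I have to use the predefined dictionary to solve this task. Could you help me? Please find my code below: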

def answer_eleven():

    import numpy as np
    import pandas as pd

    Top15 = answer_one()
    Top15['Country Name'] = Top15.index

    ContinentDict  = {'China':'Asia', 
                      'United States':'North America', 
                      'Japan':'Asia', 
                      'United Kingdom':'Europe', 
                      'Russian Federation':'Europe', 
                      'Canada':'North America', 
                      'Germany':'Europe', 
                      'India':'Asia',
                      'France':'Europe', 
                      'South Korea':'Asia', 
                      'Italy':'Europe', 
                      'Spain':'Europe', 
                      'Iran':'Asia',
                      'Australia':'Australia', 
                      'Brazil':'South America'}

    Top15['Continent'] = pd.Series(ContinentDict)
    #Top15['size'] = Top15['Country'].count()
    Top15['Population'] = (Top15['Energy Supply'] / Top15['Energy Supply per Capita'])
    #columns_to_keep = ['Continent', 'Population']
    #Top15 = Top15[columns_to_keep]
    #Top15 = Top15.set_index('Continent').groupby(level=0)['Population'].agg({'sum': np.sum})
    Top15.set_index(['Continent'], inplace = True)
    Top15['size'] = Top15.groupby(['Continent'])['Country Name'].count()
    Top15['sum'] = Top15.groupby(['Continent'])['Population'].sum()
    Top15['mean'] = Top15.groupby(['Continent'])['Population'].mean()
    Top15['std'] = Top15.groupby(['Continent'])['Population'].std()
    columns_to_keep = ['size', 'sum', 'mean', 'std']
    Top15 = Top15[columns_to_keep]
    #Top15['Continent Name'] = Top15.index
    #Top15.groupby(['Continent'], level = 0, sort = True)['size'].count()

    return Top15.iloc[:5]

answer_eleven()
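
A likely reason the code above does not give one row per continent: each groupby call already produces a per-continent result, but assigning it back as a column of Top15 aligns it on the 'Continent' index, so the value is repeated on every country row and the frame keeps one row per country. The sketch below shows this with made-up numbers; the three countries and their populations are assumptions for illustration, not the Top15 data.

```python
import pandas as pd

# Made-up populations, indexed by country name like Top15 in the question.
toy = pd.DataFrame(
    {"Population": [10.0, 20.0, 40.0]},
    index=["China", "Japan", "France"],
)
toy["Continent"] = pd.Series({"China": "Asia", "Japan": "Asia", "France": "Europe"})
toy = toy.set_index("Continent")

# The groupby result is already the per-continent summary (2 rows).
per_continent = toy.groupby("Continent")["Population"].sum()

# Assigning it back aligns on the index: still 3 rows, values repeated.
toy["sum"] = per_continent

print(per_continent)
print(toy)
```

Here per_continent has two rows (Asia, Europe), while toy still has three, so slicing toy with iloc returns country-level rows.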

I believe you need to group by the dictionary and aggregate with agg:

def answer_eleven():

    Top15 = answer_one()
    ContinentDict  = {'China':'Asia',
                      'United States':'North America',
                      'Japan':'Asia',
                      'United Kingdom':'Europe',
                      'Russian Federation':'Europe',
                      'Canada':'North America',
                      'Germany':'Europe',
                      'India':'Asia',
                      'France':'Europe',
                      'South Korea':'Asia',
                      'Italy':'Europe',
                      'Spain':'Europe',
                      'Iran':'Asia',
                      'Australia':'Australia',
                      'Brazil':'South America'}

    Top15['Population'] = (Top15['Energy Supply'] / Top15['Energy Supply per Capita'])
    Top15 = Top15.groupby(ContinentDict)['Population'].agg(['size','sum','mean','std'])
    return Top15

df = answer_eleven()
print (df)

                        sum          mean           std  size
Country Name                                                 
Asia           2.771785e+09  9.239284e+08  6.913019e+08     3
Australia      2.331602e+07  2.331602e+07           NaN     1
Europe         4.579297e+08  7.632161e+07  3.464767e+07     6
North America  3.528552e+08  1.764276e+08  1.996696e+08     2
South America  2.059153e+08  2.059153e+08           NaN     1
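
Since answer_one() is not shown here, the same pattern can be tried on a small stand-in frame. This is only a sketch: the frame toy, the mapping continent_of, and all population figures are made up for illustration. Passing a dict to groupby maps each index label (the country name) to its group, and agg with a list of function names returns one column per statistic.

```python
import pandas as pd

# Made-up figures, indexed by country name like Top15.
toy = pd.DataFrame(
    {"Population": [100.0, 300.0, 50.0, 70.0, 90.0]},
    index=["China", "India", "France", "Italy", "Brazil"],
)

# Hypothetical stand-in for ContinentDict.
continent_of = {
    "China": "Asia",
    "India": "Asia",
    "France": "Europe",
    "Italy": "Europe",
    "Brazil": "South America",
}

# Group index labels through the dict, then compute all four statistics at once.
summary = toy.groupby(continent_of)["Population"].agg(["size", "sum", "mean", "std"])
print(summary)
```

A continent with a single country gets NaN for std, because the sample standard deviation is undefined for one value; that matches the Australia and South America rows in the output above.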

