Python pandas groupby: by=list gives me an error?

Question

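I'm new to programming, so apologies if this is really simple.

I know we should be able to group by a list in pandas, and that the list and the DataFrame must have the same length, but somehow I can't get it to work?

I'm using the Titanic dataset from seaborn.
Here is the function that defines the age groups: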

def age_groups(x):
    array = []
    for i in x['age']:
        if(math.isnan(i)):
            array.append(9)
        if(i < 20):
            array.append(1)
        if(i < 40):
            array.append(2)
        if(i < 60):
            array.append(3)
        else:
            array.append(4)
    return array

groups = age_groups(titanic)
titanic.groupby(groups).mean()

I get the following error:

File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)

KeyError: 2

Thanks in advance.

Answer 1

One way to make this work is to store the grouping variable as a column of the DataFrame and pass that column's name to groupby:

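import seaborn as sns
import numpy as np

titanic = sns.load_dataset('titanic')

# Start from a copy of the age column, then overwrite it with group codes.
titanic['groups'] = titanic['age']
titanic.loc[np.isnan(titanic.age), 'groups'] = 9  # missing age
titanic.loc[titanic.age >= 60, 'groups'] = 4
# Apply the upper bounds from widest to narrowest so the narrower bands win.
titanic.loc[titanic.age < 60, 'groups'] = 3
titanic.loc[titanic.age < 40, 'groups'] = 2
titanic.loc[titanic.age < 20, 'groups'] = 1
# numeric_only=True skips the text and categorical columns, which
# otherwise raise a TypeError on pandas 2.x.
titanic.groupby('groups').mean(numeric_only=True)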


        survived    pclass        age  ...       fare  adult_male     alone
groups                                 ...                                 
1.0     0.481707  2.530488  11.979695  ...  31.794741    0.298780  0.329268
2.0     0.387597  2.304910  28.580103  ...  32.931200    0.658915  0.653747
3.0     0.394161  1.824818  47.354015  ...  41.481784    0.635036  0.569343
4.0     0.269231  1.538462  65.096154  ...  43.467950    0.846154  0.730769
9.0     0.293785  2.598870        NaN  ...  22.158567    0.700565  0.751412

[5 rows x 8 columns]
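
As a side note on the question's own code: a plain list of labels can be passed to groupby as long as it has exactly one entry per row. The sketch below suggests why the original function likely breaks that rule. It uses a small made-up DataFrame (`df`, with invented values) instead of the real Titanic data, and the names `age_groups_original` and `age_groups_fixed` are hypothetical. Because the original checks are independent `if` statements, a single age can append several labels, so the list ends up longer than the DataFrame and pandas appears to treat its elements as column labels instead, hence a KeyError on the first one.

```python
import math

import pandas as pd

# Hypothetical miniature stand-in for the Titanic data (values are invented).
df = pd.DataFrame({
    "age": [22.0, 38.0, 5.0, float("nan"), 65.0, 45.0],
    "fare": [7.25, 71.28, 21.08, 8.46, 10.50, 26.55],
})


def age_groups_original(frame):
    """The question's logic: independent `if` checks, so one age can add several labels."""
    labels = []
    for age in frame["age"]:
        if math.isnan(age):
            labels.append(9)
        if age < 20:
            labels.append(1)
        if age < 40:
            labels.append(2)
        if age < 60:
            labels.append(3)
        else:
            labels.append(4)
    return labels


def age_groups_fixed(frame):
    """Same thresholds, but the `elif` chain yields exactly one label per row."""
    labels = []
    for age in frame["age"]:
        if math.isnan(age):
            labels.append(9)
        elif age < 20:
            labels.append(1)
        elif age < 40:
            labels.append(2)
        elif age < 60:
            labels.append(3)
        else:
            labels.append(4)
    return labels


# The original version returns more labels than there are rows.
print(len(df), len(age_groups_original(df)), len(age_groups_fixed(df)))

# With a mismatched length, the list elements are looked up as column labels.
try:
    df.groupby(age_groups_original(df))["fare"].mean()
except KeyError as err:
    print("KeyError:", err)

# With one label per row, the list works directly as the grouping key.
mean_fare = df.groupby(age_groups_fixed(df))["fare"].mean()
print(mean_fare)
```

With the `elif` version, `titanic.groupby(age_groups_fixed(titanic))` should work without adding a column first, although storing the labels in a column as above keeps them available for later use.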

Answer 2

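There is a simpler way to get the age groups: use numpy.digitize, which returns an integer for the bin each value falls into, with 0 and len(bins) (here 5) marking underflow and overflow respectively. NaN values appear to land in the overflow bin (since they do not compare as smaller than any number).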

groups = np.digitize(titanic.age, [0, 20, 40, 60, titanic.age.max() + 1])
titanic.groupby(groups).age.mean()
# 1    11.979695
# 2    28.580103
# 3    47.354015
# 4    65.096154
# 5          NaN
# Name: age, dtype: float64
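
To make the bin numbering concrete, here is a minimal sketch on a handful of hand-picked ages instead of the real dataset. The upper edge of 81 assumes a maximum age of 80 as a stand-in for `titanic.age.max() + 1`, and the sample values are invented for illustration.

```python
import numpy as np

# Same edges as in the answer; 80 + 1 is an assumed stand-in for titanic.age.max() + 1.
bins = [0, 20, 40, 60, 80 + 1]

# One value below the first edge, one per bin, one above the last edge, and a NaN.
ages = np.array([-1.0, 5.0, 25.0, 45.0, 65.0, 100.0, np.nan])

codes = np.digitize(ages, bins)
print(codes.tolist())  # [0, 1, 2, 3, 4, 5, 5]
```

Code 0 is the underflow bin, codes 1 to 4 are the four age bands, and code 5 collects both the overflow value and the NaN, which matches the NaN row labelled 5 in the output above.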

2 answers

Answer 1
1 2019-03-14 16:17:44

Answer 2
1 Accepted 2019-03-14 16:22:30
