Simple groupby example from the "Python for Data Analysis" text fails

Question

I just started learning Python (mostly as an open-source replacement for MATLAB, using "ipython --pylab") and am working through the examples in the "Python for Data Analysis" text. On page 253, a simple example is shown using 'groupby' (passing a list of arrays). I repeated it exactly as in the text, but I get this error: "TypeError: 'Series' objects are mutable, thus they cannot be hashed"

import numpy as np
import pandas as pd
from pandas import DataFrame

df = DataFrame({'key1' : ['a','a','b','b','a'],
                'key2' : ['one','two','one','two','one'],
                'data1' : np.random.randn(5),
                'data2' : np.random.randn(5)})

grouped = df['data1'].groupby(df['key1'])
means = df['data1'].groupby(df['key1'],df['key2']).mean()

-----DETAILS OF TYPEERROR-------

TypeError                                 Traceback (most recent call last)
<ipython-input-7-0412f2897849> in <module>()
----> 1 means = df['data1'].groupby(df['key1'],df['key2']).mean()

/home/joeblow/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pandas/core/generic.pyc in groupby(self, by, axis, level, as_index, sort, group_keys, squeeze)
   2725 
   2726         from pandas.core.groupby import groupby
-> 2727         axis = self._get_axis_number(axis)
   2728         return groupby(self, by, axis=axis, level=level, as_index=as_index,
   2729                        sort=sort, group_keys=group_keys, squeeze=squeeze)

/home/joeblow/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pandas/core/generic.pyc in _get_axis_number(self, axis)
    283 
    284     def _get_axis_number(self, axis):
--> 285         axis = self._AXIS_ALIASES.get(axis, axis)
    286         if com.is_integer(axis):
    287             if axis in self._AXIS_NAMES:

/home/joeblow/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pandas/core/generic.pyc in __hash__(self)
    639     def __hash__(self):
    640         raise TypeError('{0!r} objects are mutable, thus they cannot be'
--> 641                         ' hashed'.format(self.__class__.__name__))
    642 
    643     def __iter__(self):

TypeError: 'Series' objects are mutable, thus they cannot be hashed

What simple thing am I missing here?

Answer 1

You didn't do it exactly as in the text. :^)

>>> means = df['data1'].groupby([df['key1'],df['key2']]).mean()
>>> means
key1  key2
a     one     1.127536
      two     1.220386
b     one     0.402765
      two    -0.058255
dtype: float64

If you're grouping by two arrays, you need to pass a list of the arrays. You instead passed two separate arguments, (df['key1'],df['key2']), which are interpreted as by and axis. The traceback shows pandas then looking the axis value up in a dict (self._AXIS_ALIASES.get(axis, axis)), which requires hashing it, and a Series cannot be hashed. That is where the TypeError comes from.
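As a minimal sketch of the point above (the seeded RandomState and the names hash_failed and means_by_name are my own additions for illustration, not from the book or the question), the following shows that a Series cannot be hashed and that wrapping both keys in one list gives a two-level grouping. It deliberately does not call the two-argument form, since what the second positional argument means can differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is repeatable (an assumption; the
# original question used unseeded np.random.randn).
rng = np.random.RandomState(0)
df = pd.DataFrame({
    'key1': ['a', 'a', 'b', 'b', 'a'],
    'key2': ['one', 'two', 'one', 'two', 'one'],
    'data1': rng.randn(5),
    'data2': rng.randn(5),
})

# A Series is not hashable, so it cannot be used as a dict key.
try:
    hash(df['key2'])
    hash_failed = False
except TypeError:
    hash_failed = True

# Both keys go inside a single list, so together they form `by`.
means = df['data1'].groupby([df['key1'], df['key2']]).mean()

# Equivalent grouping by column name on the DataFrame itself.
means_by_name = df.groupby(['key1', 'key2'])['data1'].mean()

print(means)
```

Grouping by column names on the DataFrame is usually the shorter spelling when the keys are columns of the same frame; passing a list of arrays is useful when the keys live outside it.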
