Pandas unhashable type: 'numpy.ndarray'

df_ppc.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 892 entries, 0 to 891
Data columns (total 4 columns):
Player     892 non-null object
Mean       892 non-null object
Team       892 non-null object
Position   892 non-null object

If I do this:

df = df_ppc.groupby(['Player'])['Mean'].max().sort_values(ascending=False)

It works.

But if I group by two columns like this:

df = df_ppc.groupby(['Player', 'Team'])['Mean'].max().sort_values(ascending=False)

It throws:

  File "pandas/_libs/hashtable_class_helper.pxi", line 1798, in pandas._libs.hashtable.PyObjectHashTable.factorize
  File "pandas/_libs/hashtable_class_helper.pxi", line 1718, in pandas._libs.hashtable.PyObjectHashTable._unique
TypeError: unhashable type: 'numpy.ndarray'

Why? How can I fix it?

Edit:

Sample of the table:

             Player    Mean      Team        Posicao
715  Richard Franco  0.2354      Avaí  Meio-Campista
12       Alan Costa  0.6543       CSA       Zagueiro
14      Alan Santos  0.0345  Botafogo  Meio-Campista

df_ppc is built like this:

position = df_players.groupby('Player')['position'].agg(pd.Series.mode)
team = df_players.groupby('Team')['time_nome'].agg(pd.Series.mode)
mean = df_players.groupby('atleta_nome').mean()['points']

df_ppc = pd.DataFrame([team, position, mean]).T

df_ppc.columns = ['Team','Position','Mean']   

df_ppc = df_ppc.reset_index() 

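When building df_ppc, keep only the first mode, because pd.Series.mode returns a Series rather than a single value. When a group has tied modes, the aggregated cell ends up holding a numpy.ndarray, which is unhashable, so grouping by that column fails: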

position = df_players.groupby('Player')['position'].agg(lambda x : x.mode().iloc[0])
team = df_players.groupby('Team')['time_nome'].agg(lambda x : x.mode().iloc[0])

For example, a Series with two equally frequent values has two modes:

pd.Series([1,1,2,2]).mode()
Out[24]: 
0    1
1    2
dtype: int64
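
To see the effect end to end, here is a minimal sketch on made-up data (the players, teams, and points below are hypothetical, not taken from the question). It assumes object-dtype string columns, as in the df_ppc.info() output above. Player "B" is tied between two teams, so agg(pd.Series.mode) leaves an array in that cell, while taking .iloc[0] leaves a plain string and the two-column groupby goes through:

```python
import numpy as np
import pandas as pd

# Hypothetical data: player "B" appears once for each of two teams (a tie).
df_players = pd.DataFrame({
    "Player": ["A", "A", "B", "B"],
    # object dtype on purpose, matching the question's info() output
    "Team": pd.Series(["X", "X", "Y", "Z"], dtype=object),
    "points": [1.0, 3.0, 2.0, 4.0],
})

# agg(pd.Series.mode) keeps every tied mode, so the cell for "B" holds an array.
team_all = df_players.groupby("Player")["Team"].agg(pd.Series.mode)
has_array = team_all.map(lambda v: isinstance(v, np.ndarray)).any()

# Taking only the first mode yields one hashable scalar per player.
team_first = df_players.groupby("Player")["Team"].agg(lambda x: x.mode().iloc[0])
mean = df_players.groupby("Player")["points"].mean()

df_ppc = pd.DataFrame({"Team": team_first, "Mean": mean}).reset_index()

# Grouping by both columns now works because every key is a scalar.
result = (
    df_ppc.groupby(["Player", "Team"])["Mean"]
    .max()
    .sort_values(ascending=False)
)
print(result)
```

Note that mode() returns its values sorted, so .iloc[0] picks the smallest of the tied values. If a different tie-break matters for your data, replace the lambda accordingly.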
