Pandas ValueError: can only convert an array of size 1 to a Python scalar

Question

With the following code:

#Bring in the 'player matches' dataframe
df_pm = sql('select * from PlayerMatchesDetail', c).drop('TableIndex', axis=1)
df_pm['GoalInv'] = df_pm['Goals']+df_pm['GoalAssists']

df_pm.head(3) # THIS PRINTS FINE (see below)

# We need to associate a match ID to each row here, so that we can groupby properly.    
def MatchIDLookup(gw, ht, at):
    '''
    Takes a gameweek, hometeam, and awayteam,
    and returns the matchID of the game
    '''
    return int(df_fixtures.loc[(df_fixtures['GameWeek']==gw)
                  &(((df_fixtures['HomeTeam']==ht)
                     &(df_fixtures['AwayTeam']==at))
                   |((df_fixtures['HomeTeam']==at)
                     &(df_fixtures['AwayTeam']==ht))),'MatchID'].item())

#Apply the function to insert the matchID
df_pm['MatchID'] = df_pm.apply(lambda x: MatchIDLookup(x['GameWeek'],
                                                       x['ForTeam'],
                                                       x['AgainstTeam']), axis=1)

#Create a multi-index
df_pm.set_index(['MatchID','Player'], inplace=True)

#We now create columns in the player match dataframe, describing their expected goals, assists, and goal involvement.

#Goals
df_pm['XG'] = df.groupby(['MatchID','Player']).sum()[['XG']]
#Assists
df_pm['XA'] = df.groupby(['MatchID','AssistedBy']).sum()[['XG']]

#Fill NAs with 0s
df_pm.fillna(0, inplace=True)

#Calculate goal Involvement
df_pm['XGI'] = df_pm['XG'] + df_pm['XA']

# Let's see how player gameweeks are distributed...
plt.figure(figsize=(10,3))
plt.hist(df_pm['XG'], label='XG', bins=30)

plt.xlim(0)
plt.ylim(0,1000)
plt.title('Distribution of player XG in each match')

plt.figure(figsize=(10,3))
plt.hist(df_pm['XA'], label='XGA', bins=30, color=color_list[1])

plt.xlim(0)
plt.ylim(0,1000)
plt.title('Distribution of player XA in each match')

plt.figure(figsize=(10,3))
plt.hist(df_pm['XGI'], label='XGI', bins=30, color=color_list[2])

plt.xlim(0)
plt.ylim(0,1000)
plt.title('Distribution of player XGI in each match');
plt.show()

I am getting the following traceback:

Traceback (most recent call last):
  File "expected_goals.py", line 365, in <module>
    x['AgainstTeam']), axis=1)
  File "/Users/me/anaconda2/envs/data_science/lib/python3.7/site-packages/pandas/core/frame.py", line 6878, in apply
    return op.get_result()
  File "/Users/me/anaconda2/envs/data_science/lib/python3.7/site-packages/pandas/core/apply.py", line 186, in get_result
    return self.apply_standard()
  File "/Users/me/anaconda2/envs/data_science/lib/python3.7/site-packages/pandas/core/apply.py", line 296, in apply_standard
    values, self.f, axis=self.axis, dummy=dummy, labels=labels
  File "pandas/_libs/reduction.pyx", line 620, in pandas._libs.reduction.compute_reduction
  File "pandas/_libs/reduction.pyx", line 128, in pandas._libs.reduction.Reducer.get_result
  File "expected_goals.py", line 365, in <lambda>
    x['AgainstTeam']), axis=1)
  File "expected_goals.py", line 360, in MatchIDLookup
    &(df_fixtures['AwayTeam']==ht))),'MatchID'].item())
  File "/Users/me/anaconda2/envs/data_science/lib/python3.7/site-packages/pandas/core/base.py", line 652, in item
    return self.values.item()
ValueError: can only convert an array of size 1 to a Python scalar

Notes:

df_fixtures prints fine:

                 MatchID  GameWeek       Date        HomeTeam                 AwayTeam
FixturesBasicID                                                                      
1                 46605         1 2019-08-09       Liverpool             Norwich City
2                 46606         1 2019-08-10     Bournemouth         Sheffield United
3                 46607         1 2019-08-10         Burnley              Southampton
4                 46608         1 2019-08-10  Crystal Palace                  Everton
5                 46609         1 2019-08-11  Leicester City  Wolverhampton Wanderers

And, before using MatchIDLookup(), df_pm.head(3) also prints fine:

                                Player  GameWeek  Minutes    ForTeam  ... CreatedCentre  CreatedLeft  CreatedRight  GoalInv
PlayerMatchesDetailID                                                 ...                                                  
1                              Alisson         1       90  Liverpool  ...             0            0             0        0
2                      Virgil van Dijk         1       90  Liverpool  ...             0            0             0        1
3                         Joseph Gomez         1       90  Liverpool  ...             0            0             0        0

How do I fix this?

Answer 1

Without trying it out, I believe the issue is the int() in the return statement of MatchIDLookup(). Pandas usually doesn't allow this. Instead, return the value without converting it to int, and then add the line below:

df_pm['MatchID'] = df_pm['MatchID'].astype(int)
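It may also be worth checking the lookup itself: the traceback in the question ends inside .item(), and Series.item() raises this exact ValueError whenever the selection holds anything other than one element. The sketch below reproduces that on a toy frame. The name `fixtures`, the helper `describe_item`, and the misspelled team name are assumptions made for illustration; only the two fixture rows come from the question.

```python
import pandas as pd

# Toy stand-in for df_fixtures: the first two rows of the table in the question.
fixtures = pd.DataFrame({
    "MatchID": [46605, 46606],
    "GameWeek": [1, 1],
    "HomeTeam": ["Liverpool", "Bournemouth"],
    "AwayTeam": ["Norwich City", "Sheffield United"],
})

def describe_item(selection):
    """Hypothetical helper: return the scalar from .item(), or the error text if it raises."""
    try:
        return selection.item()
    except ValueError as exc:
        return f"ValueError: {exc}"

# Exactly one matching row: .item() returns a plain Python scalar.
one = fixtures.loc[fixtures["HomeTeam"] == "Liverpool", "MatchID"]
# No matching row (for example, a team name spelled differently): .item() raises.
none = fixtures.loc[fixtures["HomeTeam"] == "Norwich", "MatchID"]
# Several matching rows: .item() raises as well.
many = fixtures.loc[fixtures["GameWeek"] == 1, "MatchID"]

for label, selection in [("one", one), ("none", none), ("many", many)]:
    print(label, len(selection), describe_item(selection))
```

If this is what happens in the real data, the rows of df_pm whose GameWeek, ForTeam, and AgainstTeam do not identify exactly one fixture are the ones to inspect.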

PS: Also, I would generally advise against converting IDs of any kind to integers, and would keep them as strings instead. The simple reason: if an ID starts with zero (0654 or 0012), converting it to an integer loses the 4-digit format.
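A small sketch of the leading-zero point, using made-up ID strings (the MatchID values shown in the question, such as 46605, have no leading zeros, so this is a general caution rather than a diagnosis):

```python
import pandas as pd

# Hypothetical IDs stored as strings; two of them start with a zero.
ids = pd.Series(["0654", "0012", "46605"])

# Converting to int silently drops the leading zeros.
as_int = ids.astype(int)

# Converting back to str does not bring them back.
round_trip = as_int.astype(str)

# Padding restores them only if the intended minimum width is known.
restored = round_trip.str.zfill(4)

print(as_int.tolist())
print(round_trip.tolist())
print(restored.tolist())
```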

EDIT:

def MatchIDLookup(gw, ht, at):

    res = df_fixtures.loc[(df_fixtures['GameWeek']==gw)
                  &(((df_fixtures['HomeTeam']==ht)
                     &(df_fixtures['AwayTeam']==at))
                   |((df_fixtures['HomeTeam']==at)
                     &(df_fixtures['AwayTeam']==ht))),'MatchID']

    return res.item() if len(res) > 0 else 'not found'
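Note that the guard above still raises if more than one fixture matches, and the 'not found' string would make a later astype(int) fail. As an alternative to the row-by-row apply, here is a rough sketch that uses a merge to attach MatchID and to list the rows that have no fixture. This is a swapped-in technique, not the code from the question: the frames `fixtures` and `player_matches` are toy stand-ins for df_fixtures and df_pm, and the players "Player B" and "Player C" and the misspelled team are invented for the example.

```python
import pandas as pd

# Toy stand-ins for df_fixtures and df_pm (values partly hypothetical).
fixtures = pd.DataFrame({
    "MatchID": [46605, 46606],
    "GameWeek": [1, 1],
    "HomeTeam": ["Liverpool", "Bournemouth"],
    "AwayTeam": ["Norwich City", "Sheffield United"],
})
player_matches = pd.DataFrame({
    "Player": ["Alisson", "Player B", "Player C"],
    "GameWeek": [1, 1, 1],
    "ForTeam": ["Liverpool", "Norwich City", "Norwich"],  # last name misspelled on purpose
    "AgainstTeam": ["Norwich City", "Liverpool", "Liverpool"],
})

# Build one lookup row per (GameWeek, ForTeam, AgainstTeam) in both orientations,
# so a player on either side of a fixture finds the same MatchID.
keys = ["GameWeek", "ForTeam", "AgainstTeam"]
home = fixtures.rename(columns={"HomeTeam": "ForTeam", "AwayTeam": "AgainstTeam"})
away = fixtures.rename(columns={"AwayTeam": "ForTeam", "HomeTeam": "AgainstTeam"})
lookup = pd.concat([home, away], ignore_index=True)[keys + ["MatchID"]]

# Left merge keeps every player row; validate raises if a key maps to several
# fixtures, and indicator adds a _merge column marking rows with no fixture.
merged = player_matches.merge(
    lookup, on=keys, how="left", validate="many_to_one", indicator=True
)

# Rows that would have produced an empty selection in MatchIDLookup.
unmatched = merged.loc[merged["_merge"] == "left_only", ["Player"] + keys]
print(unmatched)
```

The merge returns a fresh integer index, and MatchID comes back as a float column while any row is unmatched, so the astype(int) step only works after those rows have been fixed or dropped.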
