Python Pivot Table KeyError

I have a pivot table that I'm looking to get values from and add to a new column in a separate DataFrame. I'd like to add the 'ReleaseSpeed' value to the new DataFrame where 'PitcherID' matches in both the pivot table and the DataFrame, for the given 'PitchType'. However, I'm having trouble getting the code to reference the pitch type.

To get the pivot table I used: pitcher_avg = pitcher_avg.pivot_table(index = ['PitcherID'], columns = ['PitchType'], values = ['ReleaseSpeed'], aggfunc = np.mean, fill_value = 0).reset_index()

          PitcherID ReleaseSpeed             ...                                 
PitchType                     CB         CF  ...         SF         SI         SL
0             80027     0.000000  86.022476  ...   0.000000  86.953833  80.533818
1            113724     0.000000  85.923250  ...   0.000000  89.452660  77.514283
2            142254     0.000000   0.000000  ...   0.000000  93.813669  86.085831
3            145462    75.401915  86.017263  ...  83.681604   0.000000   0.000000
4            149319    83.115615  93.617160  ...   0.000000  95.678535   0.000000
..              ...          ...        ...  ...        ...        ...        ...
868          774828     0.000000   0.000000  ...   0.000000  92.273510  84.243239
869          775376    76.968184   0.000000  ...   0.000000   0.000000  87.667449
870          783719    76.871411   0.000000  ...  85.757180  90.571193  83.681105
871          796795    73.575867   0.000000  ...  83.693867   0.000000   0.000000
872          796926    59.545178   0.000000  ...   0.000000  79.432177  70.142013

I've tried

FB_same_id_conditions = [total_data['PitcherID'] == pitcher_avg['PitcherID'] & pitcher_avg['PitchType'] == 'FB']... but this returns KeyError: 'PitchType'

I've also tried

total_data['CBAvgVelo'] = np.where(pitcher_avg['PitchType'],= 'FB' & total_data['PitcherID'] == pitcher_avg['PitcherID'],"",pitcher_avg['ReleaseSpeed'])

The desired output is something like this:

          PitcherID    CBAvgVelo  CFAvgVelo  
0             80027     0.000000  86.022476  
1             80027     0.000000  86.022476  
2            145462    75.401915  86.017263  
3            145462    75.401915  86.017263 
4            145462    75.401915  86.017263  
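
A likely cause of the `KeyError: 'PitchType'` is that, after pivoting, `PitchType` is no longer a column. It becomes the name of the second column level, and each pitch type (CB, CF, ...) becomes its own column under `ReleaseSpeed`. The sketch below reproduces this on a small made-up sample; the `raw` frame and its values are assumptions for illustration, not the asker's data.

```python
import pandas as pd

# Hypothetical pitch-level sample standing in for the real data
raw = pd.DataFrame({
    "PitcherID": [80027, 80027, 145462, 145462],
    "PitchType": ["CF", "SI", "CB", "CF"],
    "ReleaseSpeed": [86.02, 86.95, 75.40, 86.01],
})

# Same call shape as in the question: list arguments produce two-level columns
pitcher_avg = raw.pivot_table(
    index=["PitcherID"],
    columns=["PitchType"],
    values=["ReleaseSpeed"],
    aggfunc="mean",
    fill_value=0,
).reset_index()

# 'PitchType' shows up as a column-level name, not as a column label
print(pitcher_avg.columns.names)
print(pitcher_avg.columns.tolist())

# Selecting it as a column therefore fails
try:
    pitcher_avg["PitchType"]
except KeyError as err:
    print("KeyError:", err)

# Individual averages are addressed with a (level-0, level-1) tuple instead
print(pitcher_avg[("ReleaseSpeed", "CF")])
```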

You can use pandas for this; it can easily convert one DataFrame into another. See [doc1][1], [doc2][2], and [doc3][3].

pivoted = df.pivot(index='PitcherID', columns='CBAvgVelo', values='CFAvgVelo')\
            .reset_index()
pivoted.columns.name=None
print(pivoted)
# PitcherID    CBAvgVelo  CFAvgVelo  
#0  80027     0.000000  86.022476  
#1  80027     0.000000  86.022476
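
As an alternative sketch that aims at the desired output in the question, one option is to build the per-pitcher averages with single-level columns, rename them to `<PitchType>AvgVelo`, and left-merge them onto the target frame by `PitcherID`. The sample values below are made up, and the sketch assumes `total_data` holds the pitch-level rows with `PitcherID`, `PitchType`, and `ReleaseSpeed` columns; adapt the source frame if the averages come from elsewhere.

```python
import pandas as pd

# Hypothetical pitch-level data; real values will differ
total_data = pd.DataFrame({
    "PitcherID": [80027, 80027, 145462, 145462, 145462],
    "PitchType": ["CF", "SI", "CB", "CF", "CF"],
    "ReleaseSpeed": [86.02, 86.95, 75.40, 86.00, 86.04],
})

# Scalar (non-list) arguments keep the resulting columns single-level
wide = total_data.pivot_table(
    index="PitcherID",
    columns="PitchType",
    values="ReleaseSpeed",
    aggfunc="mean",
    fill_value=0,
)

# Rename CB -> CBAvgVelo, CF -> CFAvgVelo, and so on
wide.columns = [f"{pitch}AvgVelo" for pitch in wide.columns]

# A left merge repeats each pitcher's averages on every row for that pitcher
result = total_data.merge(wide.reset_index(), on="PitcherID", how="left")
print(result[["PitcherID", "CBAvgVelo", "CFAvgVelo"]])
```

The merge does the matching on `PitcherID`, so no `np.where` condition across two frames of different lengths is needed.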

Hope this helps...

  [1]: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html
  [2]: https://pythoninoffice.com/get-values-rows-and-columns-in-pandas-dataframe/#:~:text=pandas%20get%20rows%20We%20can%20use.loc%20%5B%5D%20to,left%20blank%2C%20we%20can%20get%20the%20entire%20row.
  [3]: https://www.geeksforgeeks.org/python-pandas-dataframe/
