Adding multiple columns to a dataframe using df.apply and a lambda function
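I am trying to add multiple columns to an existing dataframe using df.apply and a lambda function. I can add the columns one at a time, but I cannot add them all at once. My code: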
    def get_player_stats(player_name):
        print(player_name)
        resp = requests.get(player_id_api + player_name)
        if resp.status_code != 200:
            # This means something went wrong.
            print('Error {}'.format(resp.status_code))
        result = resp.json()
        player_id = result['data'][0]['pid']
        resp_data = requests.get(player_data_api + str(player_id))
        if resp_data.status_code != 200:
            # This means something went wrong.
            print('Error {}'.format(resp_data.status_code))
        result_data = resp_data.json()
        check1 = len(result_data.get('data',None).get('batting',None))
        # print(check1)
        check2 = len(result_data.get('data',{}).get('batting',{}).get('ODIs',{}))
        # check2 = result_data.get(['data']['batting']['ODIs'],None)
        # print(check2)
        if check1 > 0 and check2 > 0:
            total_6s = result_data['data']['batting']['ODIs']['6s']
            total_4s = result_data['data']['batting']['ODIs']['4s']
            average = result_data['data']['batting']['ODIs']['Ave']
            total_innings = result_data['data']['batting']['ODIs']['Inns']
            total_catches = result_data['data']['batting']['ODIs']['Ct']
            total_stumps = result_data['data']['batting']['ODIs']['St']
            total_wickets = result_data['data']['bowling']['ODIs']['Wkts']
            print(average,total_innings,total_4s,total_6s,total_catches,total_stumps,total_wickets)
            return np.array([average,total_innings,total_4s,total_6s,total_catches,total_stumps,total_wickets])
        else:
            print('No data for player')
            return '','','','','','',''

    cols = ['Avg','tot_inns','tot_4s','tot_6s','tot_cts','tot_sts','tot_wkts']
    for col in cols:
        players_available[col] = ''
    players_available[cols] = players_available.apply(lambda x: get_player_stats(x['playerName']) , axis =1)
I tried adding the columns to the dataframe explicitly first, but I still get this error:
ValueError: Must have equal len keys and value when setting with an iterable
Can someone help me with this?
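This is tricky, because the behaviour of the apply method in pandas has changed from version to version.
In my version (0.25.3), and in other recent versions, it works if the function returns a pd.Series object.
In your code, you could try changing the return values in the function: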
    # in the branch where data is available
    return pd.Series([average,total_innings,total_4s,total_6s,
                      total_catches,total_stumps,total_wickets])

    # in the else branch (no data for the player)
    return pd.Series(['','','','','','',''])
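To make the idea concrete, here is a minimal, self-contained sketch of the same pattern. It is not your exact code: the API calls are replaced by a made-up lookup table (`FAKE_STATS`, hypothetical names and numbers), and only three of the seven columns are used. Giving each returned Series an `index` equal to the target column names is my own addition; it should keep the assignment working whether pandas matches the columns by label or by position.

```python
import pandas as pd

cols = ['Avg', 'tot_inns', 'tot_4s']

# Hypothetical stand-in for the two API requests; the values are invented.
FAKE_STATS = {'Player A': (45.2, 120, 300)}

def get_player_stats(player_name):
    stats = FAKE_STATS.get(player_name)
    if stats is None:
        # No data: return a Series of empty strings with the same labels.
        return pd.Series([''] * len(cols), index=cols)
    # Data found: one Series entry per new column.
    return pd.Series(stats, index=cols)

players_available = pd.DataFrame({'playerName': ['Player A', 'Player B']})

# With axis=1, a function returning a Series makes apply produce a DataFrame
# with one column per Series label.
stats = players_available.apply(
    lambda row: get_player_stats(row['playerName']), axis=1
)
players_available[cols] = stats
print(players_available)
```

If you would rather keep returning a plain tuple or list, `apply(..., axis=1, result_type='expand')` is another option in recent pandas versions; the resulting columns are then numbered 0, 1, 2, ... and need renaming before you assign them.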