Apply in pandas depending on row not working
I have a DataFrame of buyer and assistant names, as follows:
import pandas as pd

df = pd.DataFrame([
    {'buyer': 'Lebron James', 'assistant': 'Lebron James'},
    {'buyer': 'Jon Snow', 'assistant': 'Arya Stark'},
    {'buyer': 'Frodo Baggins', 'assistant': 'Sam Gamyi'}
])
I want to split each buyer's name into a first name and a surname, so the expected output would be:
first_name  surname
Lebron      James
Jon         Snow
Frodo       Baggins
To do this, I defined a function and tried to use apply():
def first_name(row):
    return df['buyer'][row].split()[0]

df['first_name'] = df.apply(first_name, axis = 1)
However, I get the following error:
Traceback (most recent call last):
File "<ipython-input-35-f3bcdf3bb991>", line 1, in <module>
df.apply(first_name, axis = 1)
File "/Users/javier.lopez/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 6487, in apply
return op.get_result()
File "/Users/javier.lopez/anaconda3/lib/python3.7/site-packages/pandas/core/apply.py", line 151, in get_result
return self.apply_standard()
File "/Users/javier.lopez/anaconda3/lib/python3.7/site-packages/pandas/core/apply.py", line 257, in apply_standard
self.apply_series_generator()
File "/Users/javier.lopez/anaconda3/lib/python3.7/site-packages/pandas/core/apply.py", line 286, in apply_series_generator
results[i] = self.f(v)
File "<ipython-input-32-410cb25f2482>", line 2, in first_name
return df['buyer'][row].split()[0]
File "/Users/javier.lopez/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py", line 5067, in __getattr__
return object.__getattribute__(self, name)
AttributeError: ("'Series' object has no attribute 'split'", 'occurred at index 0')
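My understanding was that apply with axis = 1 passes the row number as the argument, so I don't understand why this doesn't work. If I pass the row number manually, it works as expected: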
first_name(1)
To answer your question, you can use:
def first_name(x):
    # x is a single buyer string, because apply is called on the column
    return x.split()[0]

df['first'] = df.buyer.apply(first_name)
print(df)
assistant buyer first
0 Lebron James Lebron James Lebron
1 Arya Stark Jon Snow Jon
2 Sam Gamyi Frodo Baggins Frodo
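As for why the original attempt failed: with axis=1, apply passes each row to the function as a Series indexed by column name, not as an integer row number, so df['buyer'][row] does not return a single string. Here is a minimal sketch of a row-wise version, assuming the same df as in the question (first_name_from_row is just an illustrative name):

```python
import pandas as pd

df = pd.DataFrame([
    {'buyer': 'Lebron James', 'assistant': 'Lebron James'},
    {'buyer': 'Jon Snow', 'assistant': 'Arya Stark'},
    {'buyer': 'Frodo Baggins', 'assistant': 'Sam Gamyi'},
])

def first_name_from_row(row):
    # With axis=1, `row` is a pandas Series holding one row's values,
    # indexed by column name; it is not the integer row number.
    return row['buyer'].split()[0]

first_names = df.apply(first_name_from_row, axis=1)
print(first_names.tolist())  # ['Lebron', 'Jon', 'Frodo']
```

This works, but when only one column is involved, applying to that column (as above) is simpler.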
However, as @Sandeep pointed out, you should also consider the built-in pandas solution, Series.str.split(), and you can use df.assign() to assign the column directly:
df=df.assign(first=df.buyer.str.split().str[0])
assistant buyer first
0 Lebron James Lebron James Lebron
1 Arya Stark Jon Snow Jon
2 Sam Gamyi Frodo Baggins Frodo
Use Series.str.split:
df1 = df['buyer'].str.split(expand=True).rename(columns={0:'first_name',1:"surname"})
print(df1)
first_name surname
0 Lebron James
1 Jon Snow
2 Frodo Baggins
Or, to add the new columns to the original DataFrame:
df = df.join(df1)
print(df)
assistant buyer first_name surname
0 Lebron James Lebron James Lebron James
1 Arya Stark Jon Snow Jon Snow
2 Sam Gamyi Frodo Baggins Frodo Baggins
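One caveat: the rename above assumes every name has exactly two parts. If some buyers might have longer names, one possible approach is to limit the number of splits with n=1, so that everything after the first space stays in the surname. A rough sketch with a made-up three-part name (the names Series and the 'Mary Jane Watson' entry are hypothetical, not from the question):

```python
import pandas as pd

# Hypothetical data: the last entry has three parts.
names = pd.Series(['Lebron James', 'Jon Snow', 'Mary Jane Watson'], name='buyer')

# n=1 splits on the first run of whitespace only, so at most two columns result.
parts = names.str.split(n=1, expand=True)
parts.columns = ['first_name', 'surname']
print(parts)
```

Whether the middle part belongs with the surname or the first name depends on your data, so treat this as a starting point rather than a general rule.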