Extract first and last words from strings as a new column in pandas
I am struggling to create two new columns based on the string in another column.
What I have:
                              Profile
0                          Technician
1                    Service Engineer
2  Sales and Service Support Engineer
What I would like to have:
        First      Last
0  Technician       NaN
1     Service  Engineer
2       Sales  Engineer
My attempt was to use a solution like this:
new = tl['Profile'].str.split(' ')
tl['First'] = new[0]
tl['Last'] = new[1]
But this is correct only for First.
Let's try str.extract here:
df['Profile'].str.extract(r'^(?P<First>\S+).*?(?P<Last>\S+)?$')
        First      Last
0  Technician       NaN
1     Service  Engineer
2       Sales  Engineer
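Since the question asks for new columns rather than a separate frame, here is a minimal sketch of attaching the extracted groups back to the original data. The sample frame is rebuilt from the question's values, and the names `pattern`, `extracted` and `out` are only illustrative. The named groups in the regex become the column names, so `join` is enough.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to match the real frame).
df = pd.DataFrame({
    "Profile": [
        "Technician",
        "Service Engineer",
        "Sales and Service Support Engineer",
    ]
})

# Same pattern as above: the first run of non-space characters, then,
# optionally, the last run of non-space characters before the end.
pattern = r'^(?P<First>\S+).*?(?P<Last>\S+)?$'

# str.extract returns one column per named group; a group that did not
# participate in the match (one-word strings) comes back as NaN.
extracted = df["Profile"].str.extract(pattern)

# Attach the new columns to the original frame by index.
out = df.join(extracted)
print(out)
```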
Not many str methods will be as elegant as this, because of the additional need to handle strings that contain only one word.
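To illustrate the extra handling that one-word strings need, here is a sketch of a plain str.split version. It is not part of the original answer, and the names `words` and `result` are illustrative. Without the `where` mask, the last word of "Technician" would simply repeat the first word instead of being NaN.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to match the real frame).
df = pd.DataFrame({
    "Profile": [
        "Technician",
        "Service Engineer",
        "Sales and Service Support Engineer",
    ]
})

# Each row becomes a list of words.
words = df["Profile"].str.split()

result = pd.DataFrame({
    "First": words.str[0],
    # Keep the last word only where the row has more than one word;
    # everywhere else `where` fills in NaN.
    "Last": words.str[-1].where(words.str.len() > 1),
})
print(result)
```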
You can also use str.partition here.
u = df['Profile'].str.partition()
pd.DataFrame({'First': u[0], 'Last': u[2].str.split().str[-1]})
        First      Last
0  Technician       NaN
1     Service  Engineer
2       Sales  Engineer
Without regex, using loops.
For the last name:
k = []
for i in df_names_test['Name']:
    words = i.split(" ")
    # take the last word; for a one-word string this is the same as the first word
    k.append(words[-1])
df_names_test["Last"] = k
For the first name:
k = []
for i in df_names_test['Name']:
    # take the first word of each string
    k.append(i.split(" ")[0])
df_names_test["First"] = k
Using lambda functions. First name:
df_names_test['First']=df_names_test['Name'].apply(lambda x: x.split(" ")[0])
Last name:
df_names_test['Last']=df_names_test['Name'].apply(lambda x: x.split(" ")[-1])
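Note that the loop and lambda versions above return the same word for First and Last when the string has only one word, whereas the question expects NaN for Last in that case. A possible adjustment is sketched below. The helper `first_and_last` is hypothetical, and the contents of `df_names_test` are assumed here to be the question's sample values.

```python
import numpy as np
import pandas as pd


def first_and_last(text):
    """Return the first word and, if there is more than one word, the last."""
    parts = text.split()
    first = parts[0] if parts else np.nan
    # Only report a last word when it is distinct from the first position.
    last = parts[-1] if len(parts) > 1 else np.nan
    return pd.Series({"First": first, "Last": last})


# Assumed sample data; the original answer does not show this frame.
df_names_test = pd.DataFrame({
    "Name": [
        "Technician",
        "Service Engineer",
        "Sales and Service Support Engineer",
    ]
})

# Applying a function that returns a Series yields a DataFrame
# with one column per key.
result = df_names_test["Name"].apply(first_and_last)
print(result)
```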