Extract first and last words from strings as a new column in pandas

I am struggling to create two new columns based on the string in another column.

What I have:

     Profile
0    Technician
1    Service Engineer
2    Sales and Service Support Engineer

What I would like to have:

     First              Last
0    Technician         NaN
1    Service            Engineer
2    Sales              Engineer

My attempt was to use something like:

new = tl['Profile'].str.split(' ', expand=True)  # expand=True gives one column per word
tl['First'] = new[0]
tl['Last'] = new[1]

But this gives the correct result only for First.
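A minimal sketch of why only First comes out right, assuming the frame is called tl as in the question and that the split is done with expand=True (an assumption, so that each word lands in its own column). Columns are numbered from the left, so column 1 holds the second word, which is the last word only for two-word strings.

```python
import pandas as pd

tl = pd.DataFrame({'Profile': ['Technician',
                               'Service Engineer',
                               'Sales and Service Support Engineer']})

# expand=True spreads the words across integer-labelled columns 0, 1, 2, ...
new = tl['Profile'].str.split(' ', expand=True)

first = new[0]   # always the first word
second = new[1]  # the second word; missing for one-word strings
```

For the third row, second holds 'and' rather than 'Engineer', which is the mismatch described above.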

Let's try str.extract here:

df['Profile'].str.extract(r'^(?P<First>\S+).*?(?P<Last>\S+)?$')

        First      Last
0  Technician       NaN
1     Service  Engineer
2       Sales  Engineer

Not many str methods will be as elegant as this, because of the additional need to handle strings that contain only one word.
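As a rough end-to-end sketch of the str.extract approach, the snippet below rebuilds the sample frame from the question and joins the extracted columns back onto it. The names pattern and parts are only illustrative; the named groups in the regex become the column names.

```python
import pandas as pd

df = pd.DataFrame({'Profile': ['Technician',
                               'Service Engineer',
                               'Sales and Service Support Engineer']})

# First: leading run of non-space characters.
# Last: optional trailing run of non-space characters, absent for one-word strings.
pattern = r'^(?P<First>\S+).*?(?P<Last>\S+)?$'

# Named groups become the columns 'First' and 'Last'
parts = df['Profile'].str.extract(pattern)

# Attach the new columns to the original frame
df = df.join(parts)
```

Because the Last group is optional, a one-word string leaves it unmatched and pandas fills that cell with a missing value.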


You can also use str.partition here.

u = df['Profile'].str.partition()
pd.DataFrame({'First': u[0], 'Last': u[2].str.split().str[-1]})

        First      Last
0  Technician       NaN
1     Service  Engineer
2       Sales  Engineer
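To make the intermediate step visible, here is a small sketch of what str.partition returns for the sample data (the frame is rebuilt here so the snippet runs on its own; the variable names rest and last are illustrative). With the default separator, a single space, it yields three columns: the text before the first space, the separator itself, and everything after it.

```python
import pandas as pd

df = pd.DataFrame({'Profile': ['Technician',
                               'Service Engineer',
                               'Sales and Service Support Engineer']})

# Columns: 0 = text before the first space, 1 = the space, 2 = the remainder
u = df['Profile'].str.partition()

rest = u[2]                       # empty string when there is only one word
last = rest.str.split().str[-1]   # splitting '' gives [], so [-1] yields a missing value

result = pd.DataFrame({'First': u[0], 'Last': last})
```

The one-word case works out because indexing an empty list through the .str accessor returns a missing value instead of raising.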

Without regex, using loops

For the last name:

last_words = []
for name in df_names_test['Name']:
    # [-1] picks the last word; for a one-word string it is that same word, not NaN
    last_words.append(name.split(" ")[-1])

df_names_test["Last"] = last_words

For the first name:

first_words = []
for name in df_names_test['Name']:
    # [0] picks the first word
    first_words.append(name.split(" ")[0])

df_names_test["First"] = first_words

Using lambda functions. First name:

df_names_test['First']=df_names_test['Name'].apply(lambda x: x.split(" ")[0])

Last name:

df_names_test['Last']=df_names_test['Name'].apply(lambda x: x.split(" ")[-1])
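Note that x.split(" ")[-1] returns the only word again for a one-word string, whereas the question asks for NaN in that case. One possible adjustment is sketched below. The helper name last_word_or_nan is hypothetical, the sample data is borrowed from the question, and every value is assumed to be a non-empty string.

```python
import numpy as np
import pandas as pd

# Sample data borrowed from the question (assumed contents of the 'Name' column)
df_names_test = pd.DataFrame({'Name': ['Technician',
                                       'Service Engineer',
                                       'Sales and Service Support Engineer']})

def last_word_or_nan(text):
    """Return the last word, or NaN when the string has a single word."""
    words = text.split()
    return words[-1] if len(words) > 1 else np.nan

df_names_test['First'] = df_names_test['Name'].apply(lambda x: x.split()[0])
df_names_test['Last'] = df_names_test['Name'].apply(last_word_or_nan)
```

Calling split() without an argument also collapses repeated whitespace, which split(" ") does not.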
