
PANDAS - remove a part of a string based on the last occurrence of a character

I have a DataFrame:

df1 = pd.DataFrame({'Item': ["SYD_QANTAS AIRWAYS : LTD_DOC-Turn Cost :Sep", "SYD_QANTAS AIRWAYS LTD_DOC-Turn Cost :Jul", "SYD_QANTAS AIRWAYS LTD_DOC-Turn Cost :Aug"]})

I would like to remove everything from the last occurrence of the character ":" to the end of each string. The character may also appear earlier in the string, but only the part starting at its last occurrence should be removed, so the expected result would be:

0  SYD_QANTAS AIRWAYS : LTD_DOC-Turn Cost
1    SYD_QANTAS AIRWAYS LTD_DOC-Turn Cost
2    SYD_QANTAS AIRWAYS LTD_DOC-Turn Cost

How do I do that?

Split each string on ":" and join all of the pieces except the last one back together. You can try something like this:

df1['Item']=df1['Item'].apply(lambda x:':'.join(x.split(':')[:-1]))

The result is:

0  SYD_QANTAS AIRWAYS : LTD_DOC-Turn Cost
1    SYD_QANTAS AIRWAYS LTD_DOC-Turn Cost
2    SYD_QANTAS AIRWAYS LTD_DOC-Turn Cost
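As a possible alternative to the split-and-join lambda above, the sketch below uses the vectorized `Series.str.rsplit` with `n=1`, which splits only once, starting from the right. The names `trimmed`, `no_colon`, `split_join` and `rsplit_result` are illustrative and not from the original post, and the `rstrip` step is an optional assumption about the desired output. The second half shows an edge case worth knowing: for a string with no ":" at all, split-and-join returns an empty string, while `rsplit` leaves the string unchanged.

```python
import pandas as pd

df1 = pd.DataFrame({'Item': ["SYD_QANTAS AIRWAYS : LTD_DOC-Turn Cost :Sep",
                             "SYD_QANTAS AIRWAYS LTD_DOC-Turn Cost :Jul",
                             "SYD_QANTAS AIRWAYS LTD_DOC-Turn Cost :Aug"]})

# Split once from the right on ':' and keep the part before it.
trimmed = df1['Item'].str.rsplit(':', n=1).str[0]

# Optional (assumption): also drop the space left in front of the removed ':'.
trimmed = trimmed.str.rstrip()
print(trimmed)

# Edge case: a value that contains no ':' at all (hypothetical sample value).
no_colon = pd.Series(["NO COLON HERE"])

# Split-and-join drops the only piece, leaving an empty string.
split_join = no_colon.apply(lambda x: ':'.join(x.split(':')[:-1]))

# rsplit with n=1 returns the whole string as the first piece.
rsplit_result = no_colon.str.rsplit(':', n=1).str[0]
```

If every value is guaranteed to contain at least one ":", both approaches give the same result and the choice is mostly a matter of taste.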

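This scans each string in reverse until it reaches the first ":" (that is, the last ":" in the original string) and removes everything from that character to the end: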

import pandas as pd

df1 = pd.DataFrame({'Item': ["SYD_QANTAS AIRWAYS : LTD_DOC-Turn Cost :Sep", 
                             "SYD_QANTAS AIRWAYS LTD_DOC-Turn Cost :Jul", 
                             "SYD_QANTAS AIRWAYS LTD_DOC-Turn Cost :Aug"]})

# For each column, walk the rows and cut every string at its last ':'.
for i in df1:
    for k in range(len(df1[i])):
        s = df1.loc[k, i]
        # The first ':' in the reversed string is the last ':' in the original.
        rev = s[::-1]
        if ':' in rev:
            j_index = rev.index(':') + 1
            # Assign through .loc to avoid chained assignment (df1[i][k] = ...).
            df1.loc[k, i] = s[:-j_index]
print(df1)

outputs:

                                      Item
0  SYD_QANTAS AIRWAYS : LTD_DOC-Turn Cost
1    SYD_QANTAS AIRWAYS LTD_DOC-Turn Cost
2    SYD_QANTAS AIRWAYS LTD_DOC-Turn Cost
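The same idea of finding the last ":" can also be written without an explicit loop over a reversed string, since Python's built-in `str.rfind` searches from the right. The following is a minimal sketch; the helper name `cut_at_last` and its `sep` parameter are hypothetical and not part of the original answer.

```python
import pandas as pd


def cut_at_last(text, sep=':'):
    """Return text up to (not including) the last sep; unchanged if sep is absent."""
    pos = text.rfind(sep)  # index of the last occurrence, or -1 if not found
    return text if pos == -1 else text[:pos]


df1 = pd.DataFrame({'Item': ["SYD_QANTAS AIRWAYS : LTD_DOC-Turn Cost :Sep",
                             "SYD_QANTAS AIRWAYS LTD_DOC-Turn Cost :Jul",
                             "SYD_QANTAS AIRWAYS LTD_DOC-Turn Cost :Aug"]})

# Apply the helper to every value in the column.
df1['Item'] = df1['Item'].map(cut_at_last)
print(df1)
```

Like the loop above, this keeps the space that preceded the removed ":"; chaining `.str.rstrip()` would remove it if that is not wanted.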
