
Python/Pandas remove specific string from ending

I am trying to remove a trailing 'OF' from a column in a pandas DataFrame. I tried 'rstrip' and 'split', but rstrip also removes a lone trailing 'O' or 'F', and I only want to remove the exact suffix 'OF'. How can I do that? I am not sure why rstrip removes 'O' and 'F' when I have specifically passed 'OF'. Sorry if this question has been asked before; I couldn't find one. Thanks.

Sample Data:

import pandas as pd

l1 = [1,2,3,4]
l2 = ['UNIVERSITY OF CONN. OF','ONTARIO','UNIV. OF TORONTO','ALASKA DEPT.OF']
df = pd.DataFrame({'some_id':l1,'org':l2})
df

some_id org
1       UNIVERSITY OF CONN. OF
2       ONTARIO
3       UNIV. OF TORONTO
4       ALASKA DEPT.OF

Tried:

df.org.str.rstrip('OF')
# df.org.str.split('OF')[0] # Not what I am looking for

Results:

0    UNIVERSITY OF CONN. # works
1                  ONTARI # 'O' was removed
2         UNIV. OF TORONT # 'O' was removed
3            ALASKA DEPT. # works

Final output needed:

0    UNIVERSITY OF CONN. 
1                  ONTARIO
2         UNIV. OF TORONTO
3            ALASKA DEPT.

You can try this regex:

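# regex=True is required on pandas 2.0+, where str.replace defaults to literal matching
df['org'] = df['org'].str.replace(r'OF$', '', regex=True)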

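where $ indicates the end of the string. Note that

df.org.str.rstrip('(OF)')

does not work as an alternative: rstrip treats its argument as a set of characters rather than a suffix, so the parentheses only add '(' and ')' to that set, and 'ONTARIO' still becomes 'ONTARI'.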

Output of the str.replace version:

0    UNIVERSITY OF CONN. 
1                 ONTARIO
2        UNIV. OF TORONTO
3            ALASKA DEPT.
Name: org, dtype: object
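
As for why rstrip removes a lone 'O' or 'F': the argument to str.rstrip is a set of characters, and every trailing character in that set is stripped, in any order and any number of times. pandas' .str.rstrip applies the same rule to each element. The sketch below uses plain Python strings to show the difference; strip_suffix is a hypothetical helper written for this illustration, not part of pandas. On Python 3.9+ the built-in str.removesuffix does the same job, and pandas 1.4+ has Series.str.removesuffix.

```python
def strip_suffix(text, suffix):
    """Hypothetical helper: drop `suffix` only when `text` ends with it exactly."""
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text


orgs = ['UNIVERSITY OF CONN. OF', 'ONTARIO', 'UNIV. OF TORONTO', 'ALASKA DEPT.OF']

# rstrip('OF') strips any trailing run made of the characters 'O' and 'F'.
by_rstrip = [s.rstrip('OF') for s in orgs]

# Adding parentheses only adds '(' and ')' to the character set.
by_rstrip_parens = [s.rstrip('(OF)') for s in orgs]

# Suffix removal only touches strings that end with the exact text 'OF'.
by_suffix = [strip_suffix(s, 'OF') for s in orgs]

for original, stripped, cleaned in zip(orgs, by_rstrip, by_suffix):
    print(repr(original), repr(stripped), repr(cleaned))
```

Only the suffix-based version leaves 'ONTARIO' and 'UNIV. OF TORONTO' intact.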

str.extract

Capture everything up to, but not including, a single optional 'OF' at the end of the string. I added a few more rows as test cases.

# expand=False returns a Series for the single capture group
df['extract'] = df.org.str.extract(r'(.*?)(?=(?:OF$)|$)', expand=False)

#   some_id                     org               extract
#0        1  UNIVERSITY OF CONN. OF  UNIVERSITY OF CONN. 
#1        2                 ONTARIO               ONTARIO
#2        3        UNIV. OF TORONTO      UNIV. OF TORONTO
#3        4          ALASKA DEPT.OF          ALASKA DEPT.
#4        5            fooOFfooOFOF            fooOFfooOF
#5        6                      fF                    fF
#6        7                   Seven                 Seven
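
To see how the pattern behaves without a DataFrame, here is a minimal sketch using Python's re module on the same test strings. It assumes, as str.extract does, that the first match found by a search is the one used; extract_before_of is a hypothetical name chosen for this illustration.

```python
import re

# Lazy group (.*?) grows one character at a time until the lookahead succeeds:
# either 'OF' followed by the end of the string, or the end of the string itself.
pattern = re.compile(r'(.*?)(?=(?:OF$)|$)')


def extract_before_of(text):
    """Hypothetical helper: return the text before a single trailing 'OF'."""
    match = pattern.search(text)
    return match.group(1) if match else None


samples = [
    'UNIVERSITY OF CONN. OF',
    'ONTARIO',
    'UNIV. OF TORONTO',
    'ALASKA DEPT.OF',
    'fooOFfooOFOF',
    'fF',
    'Seven',
]

extracted = [extract_before_of(s) for s in samples]

for original, result in zip(samples, extracted):
    print(repr(original), '->', repr(result))
```

Because the lookahead is anchored with $, only one trailing 'OF' is dropped ('fooOFfooOFOF' keeps its inner 'OF's), and the match is case-sensitive, so 'fF' is left unchanged.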

