
How to strip a specific word from a string?

I need to strip a specific word from a string.

But I find that Python's strip method does not treat its argument as an ordered word. It just strips off any of the characters passed as the parameter, in any order.

For example:

>>> papa = "papa is a good man"
>>> app = "app is important"
>>> papa.lstrip('papa')
" is a good man"
>>> app.lstrip('papa')
" is important"

How can I strip a specified word with Python?

Use str.replace.

>>> papa.replace('papa', '')
' is a good man'
>>> app.replace('papa', '')
'app is important'
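One caveat worth keeping in mind: str.replace matches substrings, not words, so it also changes longer words that contain the target. A small sketch (the sample sentence is made up for illustration) showing this, along with the optional count argument that limits how many occurrences are replaced:

```python
# Hypothetical sample text, chosen so that "papa" also appears inside "papaya".
text = "papa likes papaya, and papa cooks"

# Every occurrence of the substring is removed, including the one in "papaya".
everywhere = text.replace("papa", "")
print(repr(everywhere))   # ' likes ya, and  cooks'

# The third argument caps the number of replacements (here: only the first).
first_only = text.replace("papa", "", 1)
print(repr(first_only))   # ' likes papaya, and papa cooks'
```

If only whole words should go, a word-boundary regular expression or a split/join approach is usually a better fit.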

Alternatively, use the re module and regular expressions. This also lets you handle the leading/trailing spaces around the word.

>>> import re
>>> papa = 'papa is a good man'
>>> app = 'app is important'
>>> papa3 = 'papa is a papa, and papa'
>>>
>>> patt = re.compile(r'(\s*)papa(\s*)')
>>> patt.sub('\\1mama\\2', papa)
'mama is a good man'
>>> patt.sub('\\1mama\\2', papa3)
'mama is a mama, and mama'
>>> patt.sub('', papa3)
'is a, and'
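The pattern above matches "papa" anywhere, including inside longer words. As a possible variation, here is a minimal sketch that uses \b word boundaries so only whole words are removed. The helper name remove_word is made up for this example, and it assumes the word starts and ends with a letter, digit, or underscore (otherwise \b will not behave as expected).

```python
import re

def remove_word(text, word):
    """Remove whole-word occurrences of `word`, then tidy the leftover spaces."""
    # re.escape guards against regex metacharacters in the word itself.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    without_word = pattern.sub("", text)
    # split() with no arguments drops leading, trailing, and repeated whitespace.
    return " ".join(without_word.split())

print(remove_word("papa is a papa, and papa", "papa"))  # is a , and
print(remove_word("papaya is not papa", "papa"))        # papaya is not
```

Note that punctuation next to a removed word stays behind, as the first output shows.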

The simplest way is to just replace it with an empty string.

s = s.replace('papa', '')

You can also use a regexp with re.sub:

article_title_str = re.sub(r'(\s?-?\|?\s?Times of India|\s?-?\|?\s?the Times of India|\s?-?\|?\s+?Gadgets No)', '',
                           article_title_str, flags=re.IGNORECASE)
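The same idea can be sketched in a more general form: build the alternation from a list of phrases instead of writing it out by hand. Everything below is an assumption for illustration only (the site names, the titles, and the helper strip_site_name are all made up), and it only strips the phrase when it appears at the end of the title.

```python
import re

# Hypothetical site names, not taken from the answer above.
site_names = ["Example Daily", "the Example Daily"]

# Try longer names first so "the Example Daily" wins over "Example Daily".
alternatives = "|".join(
    re.escape(name) for name in sorted(site_names, key=len, reverse=True)
)
# Optional separator ("-" or "|") with optional spaces, then the name at the end.
suffix_pattern = re.compile(
    r"\s*[-|]?\s*(?:" + alternatives + r")\s*$", flags=re.IGNORECASE
)

def strip_site_name(title):
    """Return the title without a trailing site name, if one is present."""
    return suffix_pattern.sub("", title)

print(strip_site_name("New phone launched - Example Daily"))      # New phone launched
print(strip_site_name("New phone launched | the example daily"))  # New phone launched
print(strip_site_name("No site name here"))                       # No site name here
```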

Provided you know the start and end index of each word you wish to replace in the string, and you only wish to replace that particular chunk, you could do it like this.

>>> s = "papa is papa is papa"
>>> s = s[:8]+s[8:13].replace("papa", "mama")+s[13:]
>>> print(s)
papa is mama is papa
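If the indices are not known in advance, they can be looked up rather than hard-coded. A minimal sketch using str.find to locate the second occurrence (the variable names are made up for this example):

```python
s = "papa is papa is papa"
word, replacement = "papa", "mama"

# Find the first occurrence, then search again starting just past it.
first = s.find(word)
second = s.find(word, first + len(word))

# find returns -1 when there is no match, so guard before slicing.
if second != -1:
    s = s[:second] + replacement + s[second + len(word):]

print(s)  # papa is mama is papa
```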

Alternatively, if you also wish to retain the original string, you could store it in a dictionary (named originals here, since bin would shadow the built-in function of the same name).

>>> originals = {}
>>> s = "papa is papa is papa"
>>> originals["0"] = s
>>> s = s[:8]+s[8:13].replace("papa", "mama")+s[13:]
>>> print(originals["0"])
papa is papa is papa
>>> print(s)
papa is mama is papa
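Since Python strings are immutable, slicing and replace always build a new string, so a dictionary is optional here. As a simpler sketch, binding the result to a second name keeps the original intact (the names original and modified are made up for this example):

```python
original = "papa is papa is papa"

# The right-hand side builds a brand-new string; `original` is never mutated.
modified = original[:8] + original[8:13].replace("papa", "mama") + original[13:]

print(original)  # papa is papa is papa
print(modified)  # papa is mama is papa
```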

A somewhat 'lazy' way to do this is to use startswith, which is easier to understand than regexps. Regexps might be faster, though; I haven't measured.

>>> papa = "papa is a good man"
>>> app = "app is important"
>>> strip_word = 'papa'
>>> papa[len(strip_word):] if papa.startswith(strip_word) else papa
' is a good man'
>>> app[len(strip_word):] if app.startswith(strip_word) else app
'app is important'
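The mirror image for a trailing word can be sketched with endswith. The helper name strip_trailing_word is made up for this example; note the guard against an empty word, because text[:-0] would return an empty string.

```python
def strip_trailing_word(text, word):
    """Return text without `word` at the end, or text unchanged if absent."""
    # `word and ...` skips the slice for an empty word, where text[:-0] == "".
    if word and text.endswith(word):
        return text[:-len(word)]
    return text

print(repr(strip_trailing_word("a good man is papa", "papa")))  # 'a good man is '
print(repr(strip_trailing_word("app is important", "papa")))    # 'app is important'
```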

If you want to remove the word only from the start of the string, you could do:

  string[string.startswith(prefix) and len(prefix):]

Here string is your string variable and prefix is the prefix you want to remove from it.

For example:

  >>> papa = "papa is a good man. papa is the best."  
  >>> prefix = 'papa'
  >>> papa[papa.startswith(prefix) and len(prefix):]
  ' is a good man. papa is the best.'
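Why the one-liner works may not be obvious, so here is a short sketch that spells out the intermediate values. It relies on two facts: `and` returns its second operand when the first is truthy, and bool is a subclass of int, so False behaves like 0 in a slice.

```python
papa = "papa is a good man. papa is the best."
app = "app is important"
prefix = "papa"

# Match: `True and 4` evaluates to 4, so the slice becomes papa[4:].
start_when_match = papa.startswith(prefix) and len(prefix)
# No match: `and` short-circuits to False, and app[False:] is the same as app[0:].
start_when_no_match = app.startswith(prefix) and len(prefix)

print(start_when_match)           # 4
print(start_when_no_match)        # False
print(app[start_when_no_match:])  # app is important
```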

Check it:

Use replace():

var.replace("word to replace", " ")

one = " papa is a good man"
two = " app is important"

one.replace(" papa ", " ")
output=> " is a good man"

two.replace(" app ", " ")
output=> " is important"

If we're talking about prefixes and suffixes and your version of Python is at least 3.9, then you can use the new str.removeprefix and str.removesuffix methods:

>>> 'TestHook'.removeprefix('Test')
'Hook'
>>> 'BaseTestCase'.removeprefix('Test')
'BaseTestCase'

>>> 'MiscTests'.removesuffix('Tests')
'Misc'
>>> 'TmpDirMixin'.removesuffix('Tests')
'TmpDirMixin'
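For code that may also run on interpreters older than 3.9, one possible fallback is sketched below. The helper name remove_prefix is made up for this example; it defers to the built-in method when available and otherwise slices manually.

```python
def remove_prefix(text, prefix):
    """Behave like str.removeprefix, including on Python versions before 3.9."""
    if hasattr(str, "removeprefix"):
        # Python 3.9+: use the built-in method directly.
        return text.removeprefix(prefix)
    # Older versions: slice the prefix off only when it is actually there.
    if prefix and text.startswith(prefix):
        return text[len(prefix):]
    return text

print(remove_prefix("TestHook", "Test"))      # Hook
print(remove_prefix("BaseTestCase", "Test"))  # BaseTestCase
```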

It is better to:

  1. Split the sentence into words

  2. Join the ones we are interested in, filtering with an if clause (you can pass in multiple words to strip)

    sentence = "papa is a good man"

    ' '.join(word for word in sentence.split() if word not in ['papa'])
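The two steps above can be wrapped in a small function that accepts several words at once. This is only a sketch: the name drop_words is made up, and it compares whitespace-separated tokens exactly, so a word with punctuation attached (such as "papa,") is not removed.

```python
def drop_words(sentence, unwanted):
    """Return the sentence without any token that exactly matches an unwanted word."""
    unwanted = set(unwanted)  # set membership checks are fast
    return " ".join(word for word in sentence.split() if word not in unwanted)

print(drop_words("papa is a good man", ["papa"]))               # is a good man
print(drop_words("papa is a very good man", ["papa", "very"]))  # is a good man
print(drop_words("papa, come here", ["papa"]))                  # papa, come here
```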
