How do I remove substrings that start and end with something?

Question

How can I remove every substring that starts and ends with a specific combination of characters from a string, for example:

' bla <span class=""latex""> ... This can be different1 ... </span> blub <span class=""latex""> ... This can be different2 ... </span> bleb'

The result I want:

'bla blub bleb'

I tried something like this:

string.replace('<span class=""latex"">' * '</span>', '')

But this does not work.

Is there a way to achieve this?

Answer 1

Read up on the re.sub function.

A simple example:

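import re

s = ' cvbcx cvbcx <span class=""latex""> ... This can be different ... </span>vcvbcxbvxc'
# Note: ".+" is greedy, so with several spans in one string it matches from
# the first opening tag to the last closing tag.
result = re.sub(r'<span class=""latex"">.+</span>', '<span class=""latex""></span>', s)
print(repr(result))
# ' cvbcx cvbcx <span class=""latex""></span>vcvbcxbvxc'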
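To illustrate how the quantifier changes the result when a string contains more than one span, here is a minimal sketch. The short input string is a hypothetical stand-in modeled on the one in the question, not data from the original post.

```python
import re

# Hypothetical input with two spans, modeled on the string in the question.
s = 'a <span class=""latex"">x</span> b <span class=""latex"">y</span> c'

# Greedy ".+" runs from the first opening tag to the LAST closing tag.
greedy = re.sub(r'<span class=""latex"">.+</span>', '', s)

# Lazy ".+?" stops at the first closing tag it can reach, so each span
# is matched separately.
lazy = re.sub(r'<span class=""latex"">.+?</span>', '', s)

print(repr(greedy))  # 'a  c'
print(repr(lazy))    # 'a  b  c'
```

With the greedy pattern the text between the two spans (" b ") is removed as well, which is usually not what is wanted here.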

Answer 2

This works:

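>>> import re
>>> s = ' bla <span class=""latex""> ... This can be different1 ... </span> blub <span class=""latex""> ... This can be different2 ... </span> bleb'
>>> x = re.sub(r"""<span class=""latex"">.+?</span>""", "", s)
>>> x
' bla  blub  bleb'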
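The output above still contains doubled spaces and a leading space, while the question asks for 'bla blub bleb'. One possible way to close that gap is to collapse whitespace after the substitution. The following is a sketch, not part of the original answer; it uses `.*?` instead of `.+?` so that empty spans are removed too, and adds `re.DOTALL` on the assumption that a span's content might contain newlines.

```python
import re

s = (' bla <span class=""latex""> ... This can be different1 ... </span>'
     ' blub <span class=""latex""> ... This can be different2 ... </span> bleb')

# re.DOTALL lets "." also match newlines, in case a span covers several lines.
pattern = re.compile(r'<span class=""latex"">.*?</span>', re.DOTALL)

stripped = pattern.sub('', s)

# split() with no arguments splits on any run of whitespace and drops empty
# pieces, so joining with a single space normalizes the spacing.
result = ' '.join(stripped.split())
print(result)  # bla blub bleb
```

Note that this also collapses whitespace elsewhere in the string, which may or may not be acceptable for a given input.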

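regex101

Edit: after the OP's clarification, I changed the answer to use a lazy quantifier instead of a capture group. This works, but it does not scale to more complex cases. If you run into those, the proper solution is to parse the string and extract what you need.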
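As a rough idea of what "parse the string" could look like, here is a sketch built on the standard library's `html.parser`. It assumes well-formed markup of the form `<span class="latex">` (single quotes around the attribute value, unlike the doubled quotes in the question's sample), and the class name `SpanStripper` and function `strip_latex_spans` are hypothetical names chosen for this illustration.

```python
from html.parser import HTMLParser


class SpanStripper(HTMLParser):
    """Collects text that is not inside a <span class="latex"> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # > 0 while inside a span that is being skipped
        self.parts = []  # text chunks found outside skipped spans

    def handle_starttag(self, tag, attrs):
        if tag != 'span':
            return
        classes = (dict(attrs).get('class') or '').split()
        # Start skipping at a latex span; count nested spans while skipping.
        if self.depth or 'latex' in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == 'span' and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if not self.depth:
            self.parts.append(data)


def strip_latex_spans(text):
    parser = SpanStripper()
    parser.feed(text)
    parser.close()  # flush any text still buffered by the parser
    return ' '.join(''.join(parser.parts).split())


html = (' bla <span class="latex"> ... x ... </span> blub '
        '<span class="latex"> <span>nested</span> y </span> bleb')
print(strip_latex_spans(html))  # bla blub bleb
```

Unlike the regular expression, this version copes with a span nested inside a latex span, at the cost of more code.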

Answer 3

If you want some parts and not others, you need to use groups.

import re

s = ' cvbcx cvbcx <span class=""latex""> ... This can be different ... </span>vcvbcxbvxc'
r = re.search( r'(<span class=""latex"">)(.+)(</span>)', s)

print(s)
# cvbcx cvbcx <span class=""latex""> ... This can be different ... </span>vcvbcxbvxc

# print(r)
# <re.Match object; span=(13, 73), match='<span class=""latex""> ... This can be different >

print(r.group(1), r.group(3))
# <span class=""latex""> </span>
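The same groups can also be used with `re.sub` and `re.finditer` when the string holds several spans. This is a sketch under the assumption that a lazy quantifier is acceptable; the short input string is hypothetical.

```python
import re

# Hypothetical input with two spans.
s = 'a <span class=""latex"">x</span> b <span class=""latex"">y</span> c'
pattern = re.compile(r'(<span class=""latex"">)(.+?)(</span>)')

# Keep groups 1 and 3 (the tags) and drop group 2 (the content).
emptied = pattern.sub(r'\1\3', s)
print(emptied)
# a <span class=""latex""></span> b <span class=""latex""></span> c

# Collect only the content of every span.
contents = [m.group(2) for m in pattern.finditer(s)]
print(contents)  # ['x', 'y']
```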

Answer 4

If you want to keep the data in between:

>>> import re
>>> x = '<span class=""latex""> ... This can be different ... </span>'
>>> d = re.sub(r'</?span( class="".*?"")?>', '', x)
>>> d
' ... This can be different ... '

If you want to keep the tags:

>>> import re
>>> x = '<span class=""latex""> ... This can be different ... </span>'
>>> new_data = 'abc 123 456'
>>> d = re.sub(r'">.*</', '">{}</'.format(new_data), x)
>>> d
'<span class=""latex"">abc 123 456</span>'
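One caveat when building the replacement with `str.format`: `re.sub` interprets backslash escapes in a replacement string, which can matter if the new content is LaTeX. A possible workaround, sketched here with a hypothetical `new_data` value, is to pass a function as the replacement, since its return value is inserted literally.

```python
import re

x = '<span class=""latex""> ... This can be different ... </span>'
new_data = r'\alpha + \beta'  # hypothetical replacement text with backslashes

pattern = re.compile(r'(<span class=""latex"">).*?(</span>)')

# The function receives each match and returns the literal replacement,
# so backslashes in new_data are not treated as escape sequences.
d = pattern.sub(lambda m: m.group(1) + new_data + m.group(2), x)
print(d)
# <span class=""latex"">\alpha + \beta</span>
```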

4 solutions

Solution 1
3 2019-10-18 18:11:13

Solution 2
3 accepted 2019-10-18 18:21:42

Solution 3
1 2019-10-18 18:22:39

Solution 4
1 2019-10-18 18:27:59
