How to remove text between two specific words in Python

Question

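I used the Beautiful Soup package to parse a URL and extract its text. I want to remove all of the text in the terms and conditions section, i.e. every word in the paragraph "Key Terms: ... T&Cs apply".

Here is what I tried: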

import re

# "text" is part of the text contained in the URL
text = """Welcome to Company Key.
Key Terms; Single bets only. Any returns from the free bet will be paid
back into your account minus the free bet stake. Free bets can only be
placed at maximum odds of 5.00 (4/1). Bonus will expire midnight, Tuesday
26th February 2019. Bonus T&Cs and General T&Cs apply.
"""
# to remove words between "Key" and "T&Cs"
rex = re.compile(r'Key\ (.*?)T&Cs.')
terms_and_cons = rex.findall(text)
text = re.sub("|".join(terms_and_cons), " ", text)
# I also tried: text = re.sub(terms_and_cons[0], " ", text)
print(text)

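Even though the list "terms_and_cons" is non-empty, the code above leaves the string "text" unchanged. How can I successfully remove the words between "Key" and "T&Cs"? Please help. I have been stuck on this supposedly simple piece of code for a long time, and it is getting very frustrating. Thank you.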

Answer 1

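Your regex is missing the re.DOTALL flag, which makes the dot match newline characters as well. By default "." matches any character except a newline, so the pattern cannot span a paragraph that is split over several lines.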
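To make the effect of the flag concrete, here is a minimal sketch on a shortened two-line string. The variable name `sample` and its contents are made up for illustration; only the pattern comes from this answer.

```python
import re

# Hypothetical, shortened stand-in for the scraped text.
sample = "Key Terms; first line\nsecond line. T&Cs apply."

# Without re.DOTALL the dot stops at the newline, so there is no match.
no_flag = re.search(r"Key\s(.*)T&Cs", sample)
print(no_flag)

# With re.DOTALL the dot also matches "\n", so the match spans both lines.
with_flag = re.search(r"Key\s(.*)T&Cs", sample, re.DOTALL)
print(repr(with_flag.group(1)))
```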

Method 1: Using re.sub

import re

text = """Welcome to Company Key.
Key Terms; Single bets only. Any returns from the free bet will be paid
back into your account minus the free bet stake. Free bets can only be
placed at maximum odds of 5.00 (4/1). Bonus will expire midnight, Tuesday
26th February 2019. Bonus T&Cs and General T&Cs apply.
"""

# re.DOTALL lets "." match newlines, so the match can span several lines.
# The greedy ".*" runs from the first "Key" followed by whitespace
# up to the last "T&Cs" in the text.
rex = re.compile(r"Key\s(.*)T&Cs", re.DOTALL)
text = rex.sub("Key T&Cs", text)
print(text)
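Method 1 leaves "Key T&Cs apply." in the output, because the two markers are written back as the replacement. If the whole paragraph should go, one option is a small helper along the following lines. This is only a sketch: the function name `remove_between` and its `keep_markers` parameter are hypothetical, and the sample string is a shortened stand-in for the scraped text.

```python
import re

def remove_between(text, start, end, keep_markers=True):
    """Remove the text between the first `start` and the last `end`."""
    # re.escape makes both markers literal, so characters such as "&"
    # or "." inside them are not treated as regex syntax.
    pattern = re.compile(re.escape(start) + r".*" + re.escape(end), re.DOTALL)
    replacement = f"{start} {end}" if keep_markers else ""
    # A function replacement avoids backslash processing in the result.
    return pattern.sub(lambda m: replacement, text, count=1)

sample = (
    "Welcome.\n"
    "Key Terms; Single bets only.\n"
    "Bonus T&Cs and General T&Cs apply.\n"
)

kept = remove_between(sample, "Key", "T&Cs")
dropped = remove_between(sample, "Key", "T&Cs apply.", keep_markers=False)
print(repr(kept))
print(repr(dropped))
```

Unlike the pattern in Method 1, this helper does not require whitespace after the start marker, so on the full text it would begin at the "Key." in the first line. Passing a longer start marker such as "Key Terms" avoids that.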

Method 2: Using groups

Match the text with a capturing group, then remove that group's text from the original text.

import re

text = """Welcome to Company Key.
Key Terms; Single bets only. Any returns from the free bet will be paid
back into your account minus the free bet stake. Free bets can only be
placed at maximum odds of 5.00 (4/1). Bonus will expire midnight, Tuesday
26th February 2019. Bonus T&Cs and General T&Cs apply.
"""

rex = re.compile(r"Key\s(.*)T&Cs", re.DOTALL)
matches = rex.search(text)
# group(1) holds the text captured between "Key" and the last "T&Cs".
# str.replace removes it literally, with no regex interpretation.
text = text.replace(matches.group(1), "")
print(text)
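Method 2 works with str.replace because the captured text is treated literally. A likely additional reason the attempt in the question left the string unchanged is that it passed the captured text back to re.sub as a pattern, where characters such as "(", ")" and "." have special meaning. The sketch below illustrates this on a shortened, made-up one-line string; re.escape is the standard way to turn captured text into a literal pattern.

```python
import re

# Hypothetical one-line stand-in for the scraped text.
text = "Key Terms; max odds of 5.00 (4/1). Bonus T&Cs apply."
captured = re.search(r"Key\s(.*)T&Cs", text, re.DOTALL).group(1)

# Used directly as a pattern, "(4/1)" becomes a group that matches "4/1"
# without the parentheses, so the pattern no longer matches the text.
unescaped = re.sub(captured, "", text)

# re.escape quotes the metacharacters, so the text is matched literally.
escaped = re.sub(re.escape(captured), "", text)

print(unescaped == text)
print(escaped)
```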

Solution 1
1 accepted 2019-08-02 08:03:54
