簡體   English   中英

如何在python中刪除兩個特定單詞之間的文本

[英]How to remove text between two specfic words in python

我使用漂亮的湯包解析了一個 url 以獲取其文本。 我想刪除條款和條件部分中的所有文本,即“關鍵條款:……適用條款和條件”段落中的所有字詞。

以下是我嘗試過的:

import re

#"text" is part of the text contained in the url
text="Welcome to Company Key.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
Key Terms; Single bets only. Any returns from the free bet will be paid 
back into your account minus the free bet stake. Free bets can only be 
placed at maximum odds of 5.00 (4/1). Bonus will expire midnight, Tuesday 
26th February 2019. Bonus T&Cs and General T&Cs apply.                                                                                                                                                                                                                                                    
"
rex=re.compile('Key\ (.*?)T&Cs.')"""to remove words between "Key" and 
"T&Cs" """
terms_and_cons=rex.findall(text)
text=re.sub("|".join(terms_and_cons)," ",text)
#I also tried: text=re.sub(terms_and_cons[0]," ",text)
print(text)

即使列表“terms_and_cons”非空,上面的內容也只是保持字符串“text”不變。 如何成功刪除“Key”和“T&Cs”之間的單詞? 請幫我。 我已經被這段所謂的簡單代碼困住了很長一段時間,它變得非常令人沮喪。 謝謝你。

您在正則表達式中缺少re.DOTALL標志,以將換行符與點匹配。

方法 1:使用 re.sub

import re

text="""Welcome to Company Key.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
Key Terms; Single bets only. Any returns from the free bet will be paid 
back into your account minus the free bet stake. Free bets can only be 
placed at maximum odds of 5.00 (4/1). Bonus will expire midnight, Tuesday 
26th February 2019. Bonus T&Cs and General T&Cs apply.                                                                                                                                                                                                                                                    
"""

rex = re.compile("Key\s(.*)T&Cs", re.DOTALL)
text = rex.sub("Key T&Cs", text)
print(text)

方法二:使用組

將文本與組匹配並從原始文本中刪除該組的文本。

import re

text="""Welcome to Company Key.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
Key Terms; Single bets only. Any returns from the free bet will be paid 
back into your account minus the free bet stake. Free bets can only be 
placed at maximum odds of 5.00 (4/1). Bonus will expire midnight, Tuesday 
26th February 2019. Bonus T&Cs and General T&Cs apply.                                                                                                                                                                                                                                                    
"""

rex = re.compile("Key\s(.*)T&Cs", re.DOTALL)
matches = re.search(rex, text)
text = text.replace(matches.group(1), "")
print(text)

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM