Find all duplicate words in text
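I am trying to find all the repeated words in a text, with each repeated word wrapped in a tuple and all the tuples saved in a list. It needs to handle cases where punctuation sits between the words, such as "so, so".
I tried using the pattern: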
/(\b\S+\b)\s+\b\1\b/
But it doesn't return what I'm looking for, and I can't save the result in the form I need.
Example of what I'm looking for:
the text = "i went to to a party, party at my uncle's house"
Output at the end of the function:
[(to ,to), (party, party)]
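Regular expressions are meant for finding specific patterns rather than words. You should do what @thshea suggested, or you can use the following code: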
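_answer_ = []
the_text = "i went to to a party, party at my uncle's house"
# Strip commas so "party," and "party" compare equal
the_text = the_text.replace(",", "")
words = the_text.split(" ")
words2 = list(set(words))
# Remove one occurrence of each distinct word; only the extra copies remain
for word in words2:
    if word in words:
        words.remove(word)
# Each leftover word is a duplicate (adjacent or not), so pair it with itself
for word2 in words:
    _answer_.append((word2, word2))
print(_answer_)  # [('to', 'to'), ('party', 'party')]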
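If only back-to-back repeats matter, a regex with a backreference may also work. The pattern in the question requires whitespace (\s+) between the two copies, so it cannot match "party, party", where a comma comes first. Below is a minimal sketch that allows any run of non-word characters between the copies. The function name find_adjacent_duplicates and the ignore_case parameter are made up for illustration and are not from the original post.

```python
import re

# Hypothetical helper; the name and signature are assumptions for this sketch.
def find_adjacent_duplicates(text, ignore_case=False):
    """Return (word, word) tuples for words repeated back to back."""
    flags = re.IGNORECASE if ignore_case else 0
    # \b(\w+)  : a whole word, captured as group 1
    # \W+      : one or more non-word characters (spaces, commas, ...)
    # (\1)\b   : the same word again, captured as group 2
    pattern = re.compile(r"\b(\w+)\W+(\1)\b", flags)
    # With two groups, findall returns a list of 2-tuples
    return pattern.findall(text)

the_text = "i went to to a party, party at my uncle's house"
print(find_adjacent_duplicates(the_text))  # [('to', 'to'), ('party', 'party')]
```

Note that \w does not match an apostrophe, so "uncle's" is seen as "uncle" and "s". Also, unlike the loop above, this only reports repeats that are next to each other, and matches do not overlap, so "to to to" yields a single pair.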