I'm trying to find all duplicated words in a text, put each duplicate in a tuple, and save all the tuples in a list. It needs to include cases with punctuation between the words, like "so, so".
I tried to use the pattern:
/(\b\S+\b)\s+\b\1\b/
but it doesn't return what I'm looking for, and I'm having trouble saving the results in the form I need.
Example of what I'm looking for:
the_text = "i went to to a party, party at my uncle's house"
Output at the end of the function:
[("to", "to"), ("party", "party")]
Regex is for finding specific patterns, not words. What you should do is what @thshea said, or you can use this code:
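    _answer_ = []
    the_text = "i went to to a party, party at my uncle's house"
    # Strip commas so that "party," and "party" compare equal
    the_text = the_text.replace(",", "")
    words = the_text.split(" ")
    # Remove the first occurrence of each distinct word;
    # whatever is left in `words` appeared more than once.
    # Note: this counts any repeated word, adjacent or not.
    for word in set(words):
        words.remove(word)
    for word2 in words:
        _answer_.append((word2, word2))
    print(_answer_)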
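If only back-to-back repeats should count (as in "to to" or "party, party"), a backreference pattern can also do the job. The pattern in the question fails on "party, party" because \s+ accepts only whitespace between the two words, so the comma blocks the match. Below is a minimal sketch, assuming Python's re module; the function name find_adjacent_duplicates is made up for illustration, and treating any run of non-word characters (\W+) as the separator is an assumption about which punctuation should be allowed.

```python
import re

# Hypothetical helper name, used here for illustration only.
def find_adjacent_duplicates(text):
    # \b(\w+)\b  captures one whole word
    # \W+        one or more non-word characters (spaces, commas, ...)
    # \1\b       the same word again, ending at a word boundary
    pattern = re.compile(r"\b(\w+)\b\W+\1\b")
    # Build one (word, word) tuple per adjacent repeat found
    return [(m.group(1), m.group(1)) for m in pattern.finditer(text)]

the_text = "i went to to a party, party at my uncle's house"
print(find_adjacent_duplicates(the_text))
```

Using finditer and building the tuples by hand avoids the surprise that re.findall returns only the captured group (a plain string per match) when the pattern has a single group. The match is case-sensitive as written, and a word repeated three times in a row yields one tuple, not two.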