I have a long block of text containing a passage that I want to remove based on a partial match (about 90% similar).
string = """Adam is a boy who lives in Michigan.
He loves to eat apples and oranges.
He also enjoys playing with his dog and cat.
Adam is a happy boy."""
substring = "He loves to apple oranges"
And I want to return
"Adam is a boy who lives in Michigan.
He also enjoys playing with his dog and cat.
Adam is a happy boy."
The words "eat" and "and" don't appear in the substring, but I want to remove the whole sentence "He loves to eat apples and oranges." I'm not really sure how to do this. Thanks!
You can use difflib.SequenceMatcher:
from difflib import SequenceMatcher
# Keep only the lines whose similarity to substring is below 0.6 (spaces are treated as junk).
'\n'.join(s for s in string.splitlines() if SequenceMatcher(' '.__eq__, s, substring).ratio() < 0.6)
This returns:
Adam is a boy who lives in Michigan.
He also enjoys playing with his dog and cat.
Adam is a happy boy.
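If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch: the name remove_similar_lines and its threshold parameter are my own, not part of difflib, and the 0.6 default is simply the cutoff used above. It also assumes the text to remove sits on its own line.

```python
from difflib import SequenceMatcher


def remove_similar_lines(text, pattern, threshold=0.6):
    """Return text without the lines whose similarity to pattern reaches threshold."""
    kept = []
    for line in text.splitlines():
        # isjunk=' '.__eq__ treats spaces as junk, as in the one-liner above.
        score = SequenceMatcher(' '.__eq__, line, pattern).ratio()
        if score < threshold:
            kept.append(line)
    return '\n'.join(kept)


text = """Adam is a boy who lives in Michigan.
He loves to eat apples and oranges.
He also enjoys playing with his dog and cat.
Adam is a happy boy."""

cleaned = remove_similar_lines(text, "He loves to apple oranges")
print(cleaned)
```

The threshold is worth tuning on your real data: ratio() returns a value between 0 and 1, so print the scores for a few lines before settling on a cutoff closer to the 90% you mentioned.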
string = string.replace(substring,'')
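This replaces the substring in the string with an empty string (""). Note that str.replace only removes exact matches, so it leaves the text unchanged when the substring differs from the sentence, as it does in the question.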