
Excluding phrases from text

Suppose I have a sentence like this:

text = 'Romeo and Juliet is a tragedy written by William Shakespeare early in his career about two young star-crossed lovers whose deaths ultimately reconcile their feuding families'

and a list with phrases:

phrases = ['Romeo and Juliet', 'William Shakespeare', 'career', 'lovers', 'deaths', 'feuding families']

Is it possible to exclude these phrases from the text to get:

result = ['is', 'a', 'tragedy', 'written', 'by', 'early', 'in', 'his', 'about', 'two', 'young', 'star-crossed', 'whose', 'ultimately', 'reconcile', 'their']

I have used filter before, but only with single words, not phrases.

You can replace each phrase with an empty string using str.replace, then call str.split with no arguments to split the resulting string on whitespace.

For example:

for phrase in phrases:
    text = text.replace(phrase, '')

result = text.split()

print(result)
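One caveat with plain str.replace is that it matches substrings, so a phrase like 'career' would also be cut out of 'careers'. If that matters, a regular expression with word boundaries is one possible alternative. The sketch below is only an illustration; the helper name remove_phrases is made up for this example, and it assumes every phrase starts and ends with a letter or digit so that \b behaves as expected.

```python
import re

def remove_phrases(text, phrases):
    # Hypothetical helper: removes whole-word occurrences of each phrase.
    # Longer phrases go first so they win over shorter phrases they contain.
    ordered = sorted(phrases, key=len, reverse=True)
    # re.escape protects any regex metacharacters inside a phrase.
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(p) for p in ordered) + r")\b")
    # Replace matches with nothing, then split on runs of whitespace.
    return pattern.sub("", text).split()

text = ('Romeo and Juliet is a tragedy written by William Shakespeare early in '
        'his career about two young star-crossed lovers whose deaths ultimately '
        'reconcile their feuding families')
phrases = ['Romeo and Juliet', 'William Shakespeare', 'career', 'lovers',
           'deaths', 'feuding families']

result = remove_phrases(text, phrases)
print(result)
```

For the sentence in the question this should give the same list as the loop with str.replace; the difference only shows up when a phrase also occurs inside a longer word.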

You can iterate over the phrases and use the string's replace method to remove each one from the text. After that, split the string on spaces and drop the empty strings left where phrases were removed, and you should have your desired output.

Welcome to Stackoverflow btw (;

for phrase in phrases:
    text = text.replace(phrase, '')

# Splitting on ' ' leaves an empty string wherever a phrase was removed; drop all of them
result = [word for word in text.split(' ') if word]
print(result)
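A side note on the splitting step: split(' ') and split() behave differently when the string contains leading, trailing, or doubled spaces, which is exactly what removing phrases leaves behind. The short example below uses a made-up sample string to show the difference.

```python
# Made-up sample with a leading space, a double space, and a trailing space
sample = " is a  tragedy "

by_space = sample.split(' ')   # splits at every single space
by_whitespace = sample.split() # splits on runs of whitespace, no empty strings

print(by_space)       # ['', 'is', 'a', '', 'tragedy', '']
print(by_whitespace)  # ['is', 'a', 'tragedy']
```

Note also that list.remove('') deletes only the first empty string it finds, so it is not enough on its own when several phrases have been removed.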

