Get sentences from a list of sentences with an exact word match: Python

Question

Let's say I have a list of sentences:

sent = ["Chocolate is loved by all.", 
        "Brazil is the biggest exporter of coffee.", 
        "Tokyo is the capital of Japan.",
        "chocolate is made from cocoa."]

I want to return all sentences that contain the exact full word "chocolate", i.e. ["Chocolate is loved by all.", "chocolate is made from cocoa."]. A sentence that does not contain the word "chocolate" shouldn't be returned. A sentence that only contains a longer word such as "chocolateyyy" should not be returned either.

How can I do this in Python?

Answer 1

This makes sure that the search word is a full whitespace-separated word, rather than part of a longer word like 'chocolateyyy'. It is also case-insensitive, so 'Chocolate' matches 'chocolate' despite the different capitalisation.

sent = ["Chocolate is loved by all.", "Brazil is the biggest exporter of coffee.",
        "Tokyo is the capital of Japan.","chocolate is made from cocoa.", "Chocolateyyy"]

search = "chocolate"

print([i for i in sent if search in i.lower().split()])

Here's a more expanded version for clarity with an explanation:

result = []
for i in sent:              # go through each string in sent
    lower = i.lower()       # make the string all lowercase
    words = lower.split()   # split on whitespace (the default for split())
    if search in words:     # if the search word is an entire element of the list
        result.append(i)    # add the original sentence to the results
print(result)
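
One caveat: splitting on whitespace leaves punctuation attached to the words, so a sentence such as "I love chocolate." produces the token "chocolate." and would not be matched. Below is a minimal sketch of one possible workaround, assuming it is acceptable to strip leading and trailing punctuation from each token. The helper name contains_word and the extra test sentences are hypothetical, added only for illustration.

```python
import string

def contains_word(sentence, word):
    """Return True if `word` appears in `sentence` as a whole token, ignoring case."""
    # Strip leading/trailing punctuation from every whitespace-separated token
    tokens = [t.strip(string.punctuation) for t in sentence.lower().split()]
    return word.lower() in tokens

sent = ["Chocolate is loved by all.",
        "I love chocolate.",
        "Chocolateyyy"]

result = [s for s in sent if contains_word(s, "chocolate")]
print(result)
```

This keeps the simple split-based idea but no longer depends on where in the sentence the word happens to fall.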

I hope this helps :)

Answer 2

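You can use a case-insensitive substring check (note that this also matches longer words such as "chocolateyyy"):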

filtered_sent = [i for i in sent if 'chocolate' in i.lower()]

Output

['Chocolate is loved by all.', 'chocolate is made from cocoa.']
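
Keep in mind that `in` applied to a string is a substring test, so this approach would also return a sentence containing "Chocolateyyy". Here is a small sketch contrasting it with a whole-token test. The third list entry is an assumption, added purely to show the difference.

```python
sent = ["Chocolate is loved by all.",
        "Tokyo is the capital of Japan.",
        "Chocolateyyy is not a word."]

# Substring test: matches "chocolate" anywhere inside the string
substring_hits = [s for s in sent if "chocolate" in s.lower()]

# Token test: matches only whole whitespace-separated words
token_hits = [s for s in sent if "chocolate" in s.lower().split()]

print(substring_hits)
print(token_hits)
```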

Answer 3

As covered in this question, you want some of the functions in the re library. In particular:

\b Matches the empty string, but only at the beginning or end of a word.

You can therefore search for "chocolate" using re.search(r'\bchocolate\b', your_sentence, re.IGNORECASE).

The rest of the solution is just to iterate through your list of sentences and return a sublist that matches your target string.
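
That iteration could be sketched roughly as follows. This is a minimal example that assumes the sentence list from the question, plus two extra entries ("Chocolateyyy" and a sentence ending in "chocolate.") added here only to show the word boundary at work. The variable names are illustrative.

```python
import re

sent = ["Chocolate is loved by all.",
        "Brazil is the biggest exporter of coffee.",
        "Tokyo is the capital of Japan.",
        "chocolate is made from cocoa.",
        "Chocolateyyy",
        "I love chocolate."]

matches = []
for s in sent:
    # \b requires a word boundary on both sides of "chocolate"
    if re.search(r'\bchocolate\b', s, re.IGNORECASE):
        matches.append(s)

print(matches)
```

Because \b treats punctuation as a boundary, the sentence ending in "chocolate." is matched while "Chocolateyyy" is not.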

Answer 4

You can use the regular expression library in Python:

import re

sent = ["Chocolate is loved by all.", 
        "Brazil is the biggest exporter of coffee.", 
        "Tokyo is the capital of Japan.",
        "chocolate is made from cocoa."]
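match_string = "chocolate"
# \b anchors the match at word boundaries; re.escape keeps match_string literal
pattern = re.compile(r"\b" + re.escape(match_string) + r"\b", re.IGNORECASE)
matched_sent = [s for s in sent if pattern.search(s)]
print(matched_sent)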

4 answers

solution1
5 2018-09-28 11:16:02

solution2
3 2018-09-28 11:11:09

solution3
2 2018-09-28 11:18:16

solution4
1 ACCEPTED 2018-09-28 11:47:54
