How to find specific words in a text and count them using Python?

Question

I want to check if certain words appear in an input text and if so, how many times.

These are my inputs:

List of words: keywords = ["apple", "banana", "orange", "lemon"]
Text to scan: text = "This apple is very tasty but the banana is not delicious at all."

Now I want to count how many times a word from the keywords list appears in the input text.

So the output should look something like this for this example:

I found 2 words.

This is what I got so far, but it's outputting 0 instead of 2 in that case.

text = "This apple is very tasty but the banana is not delicious at all."

keywords = ["apple", "banana", "orange", "lemon"]

def dictionary_score(text):
    wordcount=0
    for line in text:
        line = line.strip()
        line = line.lower()
        words = line.split(" ")
        for word in words:
            if keywords in words:
                wordcount += 1
print(f"I found {wordcount} words")

What is going wrong with the counting?

Answer 1

The problem lies with if keywords in words:. It checks whether the keywords list as a whole is an element of your words list, which is never the case.

You probably wanted to check whether each word is in the keywords list:

if word in keywords:
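
To make the difference between the two conditions concrete, here is a minimal sketch. The keywords list is taken from the question; the short words list is made up for illustration.

```python
keywords = ["apple", "banana", "orange", "lemon"]
words = ["this", "apple", "is", "tasty"]  # illustrative sample, not from the question

# `keywords in words` asks whether the whole list is a single element of `words`
list_in_list = keywords in words

# `word in keywords` asks whether one word is an element of `keywords`
matches = [word for word in words if word in keywords]

print(list_in_list)  # False
print(matches)       # ['apple']
```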

Answer 2

text is a string, so for line in text iterates over the characters of the string. Replace it with for line in text.splitlines():

The condition should be if word in keywords: instead of if keywords in words:

text = "This apple is very tasty but the banana is not delicious at all."
keywords = ["apple", "banana", "orange", "lemon"]

def dictionary_score(text):
    wordcount = 0
    for line in text.splitlines():
        line = line.strip()
        line = line.lower()
        words = line.split(" ")
        for word in words:
            if word in keywords:
                wordcount += 1
    print(f"I found {wordcount} words")

dictionary_score(text)

Output: I found 2 words
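
As a quick illustration of the first point, this small sketch (using a shortened version of the text from the question) shows what each kind of loop actually sees:

```python
text = "This apple is very tasty"  # shortened sample text

# Iterating over a string yields single characters
first_chars = list(text)[:4]

# splitlines() yields whole lines; this text has only one
lines = text.splitlines()

print(first_chars)  # ['T', 'h', 'i', 's']
print(lines)        # ['This apple is very tasty']
```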

Answer 3

Your code has several errors:

text = "This apple is very tasty but the banana is not delicious at all."
keywords = ["apple", "banana", "orange", "lemon"]

def dictionary_score(text):
    wordcount=0
    for line in text: #Iterates over each character of the string, not over lines
        line = line.strip()
        line = line.lower()
        words = line.split(" ") #The list holds only that one character (or an empty string for a space)
        for word in words: #You are iterating over a one-element list
            if keywords in words: #Checks whether the whole keywords list is an element of words, which is never true
                wordcount += 1
print(f"I found {wordcount} words")

for line in text: iterates over each character of the string; each character is then stripped, lowercased and split.
if keywords in words: checks whether the whole keywords list is an element of the words list. That is never true, because words only ever holds a single character.

Here is the fixed code:

text = "This apple is very tasty but the banana is not delicious at all."
keywords = ["apple", "banana", "orange", "lemon"]

def dictionary_score(text):
    wordcount = 0
    words = text.strip().lower().split(" ") #split the string, after stripping and lowering it
    for word in words: # Iterate over the words
        if word in keywords: # If the word is in the keywords list increment the counter
            wordcount += 1
    print(f"I found {wordcount} words") 

dictionary_score(text)

output: I found 2 words
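
One caveat with split(" "): a keyword directly followed by punctuation, such as "banana.", will not match. A possible variation, sketched here under the assumption that words consist only of letters and apostrophes, tokenises with a regular expression instead. The function name count_keywords and the sample sentence are made up for illustration.

```python
import re

def count_keywords(text, keywords):
    """Count tokens in `text` that appear in `keywords`, ignoring case and punctuation."""
    keyword_set = set(keywords)  # set membership tests are fast
    # Assumption: a "word" is a run of letters/apostrophes; adjust the pattern as needed
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(1 for token in tokens if token in keyword_set)

keywords = ["apple", "banana", "orange", "lemon"]
sample = "I like banana. Apple, too!"  # illustrative text with punctuation

print(count_keywords(sample, keywords))  # 2
```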

Answer 4

Use Counter from collections:

from collections import Counter
text = "This apple is very tasty but the banana is not delicious at all."
dict_words = Counter(text.split(" "))
dict_words.get("apple", 0) # Get the word count for apple
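
To get from a single lookup to the total the question asks for, one option is to sum the counts over the keywords list. This is only a sketch; lowercasing the text first and the names counts, per_keyword and total are additions that are not part of the snippet above.

```python
from collections import Counter

text = "This apple is very tasty but the banana is not delicious at all."
keywords = ["apple", "banana", "orange", "lemon"]

# Count every whitespace-separated token once, ignoring case
counts = Counter(text.lower().split())

# A Counter returns 0 for missing keys, so absent keywords are handled for free
per_keyword = {keyword: counts[keyword] for keyword in keywords}
total = sum(per_keyword.values())

print(per_keyword)               # {'apple': 1, 'banana': 1, 'orange': 0, 'lemon': 0}
print(f"I found {total} words")  # I found 2 words
```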

4 answers

solution1
1 2020-08-31 07:03:30

solution2
1 ACCEPTED 2020-08-31 07:06:20

solution3
1 2020-08-31 07:09:44

solution4
0 2020-08-31 07:01:33
