
How to find specific words in a text and count them using Python?

I want to check if certain words appear in an input text and if so, how many times.

These are my inputs:

  • List of words: keywords = ["apple", "banana", "orange", "lemon"]
  • Text to scan: text = "This apple is very tasty but the banana is not delicious at all."

Now I want to count how many times a word from the keywords list appears in the input text.

So the output should look something like this for this example:

I found 2 words.

This is what I have so far, but it outputs 0 instead of 2 in this case.

text = "This apple is very tasty but the banana is not delicious at all."

keywords = ["apple", "banana", "orange", "lemon"]

def dictionary_score(text):
    wordcount=0
    for line in text:
        line = line.strip()
        line = line.lower()
        words = line.split(" ")
        for word in words:
            if keywords in words:
                wordcount += 1
print(f"I found {wordcount} words") 

Why is the count wrong?

The problem lies with if keywords in words:, which checks whether the entire keywords list is itself an element of your words list.

You probably wanted to check whether each word is in the keywords list:

if word in keywords:
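To make the difference between the two conditions concrete, here is a minimal sketch. The words list is a made-up sample, not the asker's full input.

```python
keywords = ["apple", "banana", "orange", "lemon"]
words = ["this", "apple", "is", "tasty"]  # hypothetical sample of split words

# A list is "in" another list only if it appears there as a single element.
list_in_list = keywords in words       # no element of words equals the whole list
nested_match = keywords in [keywords]  # here the list itself is an element

# Checking each word against the keyword list is what the counter needs.
matches = [word for word in words if word in keywords]

print(list_in_list, nested_match, matches)
```

The first check can never succeed for a list of plain strings, which is why the counter in the question is never incremented.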
  1. text is a string, so for line in text iterates over the characters of the string. It can be replaced by for line in text.splitlines():

  2. The condition should be if word in keywords: instead of if keywords in words:

     text = "This apple is very tasty but the banana is not delicious at all."
     keywords = ["apple", "banana", "orange", "lemon"]

     def dictionary_score(text):
         wordcount = 0
         for line in text.splitlines():
             print(line)
             line = line.strip()
             line = line.lower()
             words = line.split(" ")
             for word in words:
                 if word in keywords:
                     wordcount += 1
         print(f"I found {wordcount} words")

     dictionary_score(text)

Output: I found 2 words

Your code has several errors:

text = "This apple is very tasty but the banana is not delicious at all."
keywords = ["apple", "banana", "orange", "lemon"]

def dictionary_score(text):
    wordcount=0
    for line in text:  # Iterates over each character of the string, not over lines
        line = line.strip()
        line = line.lower()
        words = line.split(" ")  # A single character cannot be split further, so words holds just one string
        for word in words:  # Iterates over that one-element list
            if keywords in words:  # Checks whether the whole keywords list is an element of words, which is never true
                wordcount += 1
print(f"I found {wordcount} words") 
  • for line in text: iterates over each character of the string; each character is then stripped, lowercased and split on its own.

  • if keywords in words: checks whether the keywords list is itself an element of the words list. That is never true, because words only ever holds a single character (or an empty string).
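To see what the original loop actually processes, here is a small sketch on a shortened, made-up input. It only illustrates how iterating over a string behaves.

```python
text = "This apple"  # shortened sample input

# Iterating over a string yields one character at a time, not lines or words.
chars = [ch for ch in text]

# Each character, once stripped, lowercased and split, gives a one-element list.
pieces = [ch.strip().lower().split(" ") for ch in text[:5]]

print(chars[:5])
print(pieces)
```

None of these one-element lists can ever contain a full keyword, let alone the keywords list itself.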

Here is the fixed code:

text = "This apple is very tasty but the banana is not delicious at all."
keywords = ["apple", "banana", "orange", "lemon"]

def dictionary_score(text):
    wordcount = 0
    words = text.strip().lower().split(" ") #split the string, after stripping and lowering it
    for word in words: # Iterate over the words
        if word in keywords: # If the word is in the keywords list increment the counter
            wordcount += 1
    print(f"I found {wordcount} words") 

dictionary_score(text)

output: I found 2 words
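One caveat with splitting on spaces: a keyword directly followed by punctuation, such as "banana.", would not match. Below is a minimal sketch of one way around this, assuming a word can be treated as a run of ASCII letters. The function name count_keywords and the second sample sentence are hypothetical, not part of the original code.

```python
import re

keywords = ["apple", "banana", "orange", "lemon"]

def count_keywords(text, keywords):
    """Count how many words in text are keywords, ignoring case and punctuation."""
    # Assumption: a "word" is a run of ASCII letters; adjust the pattern for other alphabets.
    words = re.findall(r"[a-z]+", text.lower())
    keyword_set = set(keywords)  # set lookup avoids rescanning the list for every word
    return sum(1 for word in words if word in keyword_set)

text = "This apple is very tasty but the banana is not delicious at all."
print(f"I found {count_keywords(text, keywords)} words")
print(count_keywords("I like apple, lemon and banana.", keywords))
```

Returning the count instead of printing it inside the function also makes the result easier to reuse.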

You can also use Counter from collections:

from collections import Counter
text = "This apple is very tasty but the banana is not delicious at all."
dict_words = Counter(text.split(" "))
dict_words.get("apple", 0)  # Get the word count for apple
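Building on the Counter idea, the sketch below totals the counts for every keyword in the list. The names per_keyword and total are illustrative. Like the code above, it splits on whitespace only, so a keyword with attached punctuation would not be counted.

```python
from collections import Counter

text = "This apple is very tasty but the banana is not delicious at all."
keywords = ["apple", "banana", "orange", "lemon"]

# Count every lowercase, whitespace-separated token once.
counts = Counter(text.lower().split())

# Counter returns 0 for missing keys, so absent keywords need no special handling.
per_keyword = {kw: counts[kw] for kw in keywords}
total = sum(per_keyword.values())

print(per_keyword)
print(f"I found {total} words")
```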


 