
Counting occurrences of items in an array in Python

The purpose of this program is to read in a file, split its text into individual tokens, and place these tokens into an array. The program then removes all punctuation and changes all letters to lowercase. It should then count how many times each command-line argument occurs in the array and print the result. My program successfully creates an array of depunctuated, lowercase tokens. My problem now is how to loop through the array and count the occurrences of a particular word, and how I should call these functions in the main function. My depunctuate function works as written.

This is my program:

import sys
from scanner import *

def main():
    print("the name of the program is",sys.argv[0])
    for i in range(1,len(sys.argv),1):
        print("   argument",i,"is", sys.argv[i])
    tokens = readTokens("text.txt")
    cleanTokens = depunctuateTokens(tokens)
    words = [token.lower() for token in cleanTokens]
    count = find(words)
    print(words)
    print(count)
def readTokens(s):
    arr=[]
    s=Scanner("text.txt")
    token=s.readtoken()
    while (token != ""):
        arr.append(token)
        token=s.readtoken()
    s.close()
    return arr

def depunctuateTokens(arr):
    result=[]
    for i in range(0,len(arr),1):
        string=arr[i]
        cleaned=""
        punctuation="""!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~"""
        for i in range(0,len(string),1):
            if string[i] not in punctuation:
                cleaned += string[i]
        result.append(cleaned)
    return result

def find(tokens,words):
    return occurences(tokens,words)>0

def occurences(tokens,words):
    count = 0
    for i in range(0,len(words),1):
        if (words[i] == tokens):
            count += 1
        return count

main()

Use list.count.

>>> l = [1,2,3,4,5,6,7,44,4,4,4,4]
>>> print(l.count(4))
5
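
The same method works on a list of strings, which is closer to the token array in the question. A minimal sketch, assuming a small hand-written word list (the sample data and variable names are hypothetical, not from the question):

```python
# Hypothetical sample data standing in for the cleaned, lowercased tokens.
words = ["the", "cat", "saw", "the", "dog", "and", "the", "bird"]

# list.count returns how many elements compare equal to its argument.
the_count = words.count("the")

# An item that is not in the list simply counts as 0; no exception is raised.
missing_count = words.count("fish")

print(the_count)
print(missing_count)
```

Because an absent word yields 0 rather than an error, no membership check is needed before calling it.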

Your existing function isn't too far off:

def occurences(tokens,words):
    count = 0
    for i in range(0,len(words),1):
        if (words[i] == tokens):
            count += 1
        return count

The first problem is that you've indented the return count inside the for loop. That means it will return each time through the loop, which means it will only ever process the first word. So, it will return 1 if the first word matches, 0 otherwise. Just unindent that return and that problem goes away.
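
To see the effect of the indentation, here is a small side-by-side sketch. It assumes the first parameter is a single word rather than a list (renamed to token here for illustration), and the function names occurences_buggy and occurences_fixed are hypothetical:

```python
def occurences_buggy(token, words):
    count = 0
    for i in range(0, len(words), 1):
        if words[i] == token:
            count += 1
        return count  # inside the loop: exits after looking at the first element


def occurences_fixed(token, words):
    count = 0
    for i in range(0, len(words), 1):
        if words[i] == token:
            count += 1
    return count  # after the loop: every element has been examined


sample = ["a", "b", "a", "a"]
buggy_result = occurences_buggy("a", sample)
fixed_result = occurences_fixed("a", sample)
print(buggy_result, fixed_result)
```

Note that the buggy version also returns None for an empty list, because the loop body (and so the return) never runs.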


The second problem is that, judging by the names of the parameters, you're expecting both tokens and words to be lists of strings. So, a single word words[i] is never going to match a whole list of tokens. Maybe you wanted to test whether that word matches any of the tokens in the list, instead of whether it matches the list? In that case, you'd write:

if words[i] in tokens:

Finally, while your find function seems to call occurences properly (well, you spelled occurrences wrong, but you did so consistently, so that's OK), you don't actually call find properly, so you'll never get here. Your call looks like this:

count = find(words)

… but your definition looks like this:

def find(tokens,words):

You have to pass something to that tokens parameter. I'm not sure what to pass—but you're the one who designed and wrote this code; what did you write the function for?
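
One plausible reading, given the question's stated goal of counting each command-line argument, is to pass one argument at a time as the first parameter. The following is only a sketch under that assumption: the report helper, the token parameter name, and the sample word list are hypothetical, and lowercasing the arguments is my addition so they can match the lowercased words.

```python
import sys


def occurences(token, words):
    # Count how many entries of words equal the single string token.
    count = 0
    for word in words:
        if word == token:
            count += 1
    return count


def find(token, words):
    # True if token occurs at least once in words.
    return occurences(token, words) > 0


def report(targets, words):
    # Build (target, count, found) for each target, lowercased to match words.
    return [(t.lower(), occurences(t.lower(), words), find(t.lower(), words))
            for t in targets]


if __name__ == "__main__":
    words = ["the", "cat", "saw", "the", "dog"]  # stand-in for the real tokens
    for target, count, found in report(sys.argv[1:], words):
        print(target, count, found)
```

Both calls now supply the two arguments that the definitions expect, which is the mismatch described above.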


I suspect that what you're really looking for is a count for each token. In that case, with your design, both find and occurences should take a single token, not a list of tokens, as an argument. So you don't want the in expression above; you just want to rename the parameter. You also have no use for find; just call occurences directly, and call it in a loop, like this:

for word in words:
    count = occurences(word, words)
    print('{}: {}'.format(word, count))
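
One thing to be aware of: the loop above prints a line for every element, so a word that appears three times is reported three times. If one line per distinct word is wanted, collections.Counter from the standard library is an alternative. This is a different technique from the one in the answer, shown as a sketch with hypothetical sample data:

```python
from collections import Counter

# Hypothetical sample data standing in for the cleaned, lowercased tokens.
words = ["the", "cat", "saw", "the", "dog", "and", "the", "bird"]

# Counter tallies every distinct element in a single pass over the list.
counts = Counter(words)

for word, count in counts.items():
    print('{}: {}'.format(word, count))
```

Looking up a word that never occurred, such as counts["fish"], gives 0 instead of raising KeyError.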

And, just as your other two functions were reproducing functions already built in (str.translate and lower), this one is too: list.count. If you were supposed to write it yourself for learning purposes, that's fine, but if that's not part of the assignment, just use the built-in function.
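
For reference, the three built-ins mentioned here can be combined roughly as follows. This is a sketch assuming Python 3 (where str.maketrans is available) and ASCII punctuation only; the clean_tokens helper and the sample tokens are hypothetical:

```python
import string


def clean_tokens(tokens):
    # Translation table that deletes every ASCII punctuation character.
    table = str.maketrans("", "", string.punctuation)
    # Strip punctuation with str.translate, then lowercase with str.lower.
    return [token.translate(table).lower() for token in tokens]


raw = ["Hello,", "world!", "HELLO", "(hello)", "don't"]
words = clean_tokens(raw)

# list.count replaces the hand-written counting loop.
hello_count = words.count("hello")

print(words)
print(hello_count)
```

string.punctuation contains the same characters as the punctuation string in the question, so the cleaning behaviour should match the hand-written loop.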
