
python list.count always returns 0

I have a lengthy Python list and would like to count the number of occurrences of a single character. For example, how many total times does 'o' occur? I want N=4.

lexicon = ['yuo', 'want', 'to', 'sioo', 'D6', 'bUk', 'lUk'], etc.

list.count() is the obvious solution. However, it consistently returns 0. It doesn't matter which character I look for. I have double checked my file - the characters I am searching for are definitely there. I happen to be calculating count() in a for loop:

for i in range(100): 
    # random sample 500 words 
    sample = list(set(random.sample(lexicon, 500)))
    C1 = ['k']
    total = sum(len(i) for i in sample) # total words
    sample_count_C1 = sample.count(C1) / total

But it returns 0 outside of the for loop, over the list 'lexicon' as well. I don't want a list of overall counts so I don't think Counter will work.

Ideas?

If we take your list (the shortened version you supplied):

lexicon = ['yu', 'want', 'to', 'si', 'D6', 'bUk', 'lUk']

then we can get the count using sum() and a generator expression:

count = sum(s.count(c) for s in lexicon)

So if c were, say, 'k', this would give 2, as there are two occurrences of k.


This works inside or outside a for loop, so you should be able to incorporate it into your wider code.


With your latest edit, I can confirm that this produces a count of 4 for 'o' in your modified list.
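To see why the original call returned 0, a minimal sketch using the list from the question may help. list.count compares each whole element with its argument, so it only finds entries that are exactly equal to it, whereas str.count looks inside a single string. The variable names below are illustrative, not from the original post.

```python
lexicon = ['yuo', 'want', 'to', 'sioo', 'D6', 'bUk', 'lUk']

# list.count compares whole elements: no element is exactly 'o'
whole_element_matches = lexicon.count('o')

# a list argument such as ['o'] is compared to each string and never equals one
list_argument_matches = lexicon.count(['o'])

# str.count looks inside each word; sum() adds up the per-word counts
char_total = sum(word.count('o') for word in lexicon)

print(whole_element_matches, list_argument_matches, char_total)  # prints: 0 0 4
```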

If I understand your question correctly, you would like to count the number of occurrences of each character across all the words in the list. This is known as a frequency distribution.

Here is a simple implementation using Counter:

from collections import Counter
lexicon = ['yu', 'want', 'to', 'si', 'D6', 'bUk', 'lUk']
chars = [char for word in lexicon for char in word]
freq_dist = Counter(chars)
Counter({'t': 2, 'U': 2, 'k': 2, 'a': 1, 'u': 1, 'l': 1, 'i': 1, 'y': 1, 'D': 1, '6': 1, 'b': 1, 's': 1, 'w': 1, 'n': 1, 'o': 1})

Using freq_dist, you can look up the number of occurrences of a character.

freq_dist.get('a')
1

# get() method returns None if character is not in dict
freq_dist.get('4')
None
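One detail worth noting, shown here as a small sketch on the list from the question: a Counter can also be indexed directly, and indexing a missing key gives 0 rather than None, which is convenient when the result feeds into arithmetic. Joining the words with ''.join is just one way to build the character stream.

```python
from collections import Counter

lexicon = ['yuo', 'want', 'to', 'sioo', 'D6', 'bUk', 'lUk']

# Counter accepts any iterable; joining the words yields one stream of characters
freq_dist = Counter(''.join(lexicon))

o_count = freq_dist['o']             # 4
missing_by_index = freq_dist['4']    # 0: Counter returns 0 for missing keys
missing_by_get = freq_dist.get('4')  # None: behaves like dict.get
```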

It returns zero because sample.count(C1) only counts list elements that are equal to C1 as a whole. It does not look inside strings such as 'bUk' or 'lUk'. If you want the frequency of a character, count it within each word and sum the results, like this:

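import random

for i in range(100):
    # random sample of 500 words, with duplicates removed
    sample = list(set(random.sample(lexicon, 500)))
    C1 = 'k'  # a string, not a list: str.count() requires a str argument
    total = sum(len(word) for word in sample)  # total characters in the sample
    sample_count = sum(word.count(C1) for word in sample)
    sample_count_C1 = sample_count / total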
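For a version that runs on the short list from the question, the loop body can be wrapped in a function. This is only a sketch under assumptions: the name char_share is hypothetical, and the sample size is capped at len(words) because random.sample raises ValueError when asked for more items than the population holds.

```python
import random

def char_share(words, char, sample_size=500):
    """Fraction of characters in a random sample of words that equal char."""
    k = min(sample_size, len(words))           # random.sample needs k <= len(words)
    sample = set(random.sample(words, k))      # drop duplicate words
    total = sum(len(word) for word in sample)  # total characters, not words
    if total == 0:
        return 0.0
    return sum(word.count(char) for word in sample) / total

lexicon = ['yuo', 'want', 'to', 'sioo', 'D6', 'bUk', 'lUk']
share_k = char_share(lexicon, 'k')  # the whole list is sampled here: 2 / 21
```

Calling char_share inside the range(100) loop would give one estimate per iteration on a full-size lexicon.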
