Best way to count char occurrences in a string

Hello, I am trying to write these Python lines as a single line, but I am getting errors because the code modifies a dictionary.

for i in range(len(string)):
    if string[i] in dict:
        dict[string[i]] += 1

The general syntax, I believe, is:

abc = [i for i in range(len(x)) if x[i] in array]

Could someone tell me how this might work, given that I am adding 1 to a value in a dictionary?

Thanks

What you're trying to do can be done with dict, a generator expression, and str.count():

abc = dict((c, string.count(c)) for c in string)

Alternative using set(string) (from a comment below by soulcheck):

abc = dict((c, string.count(c)) for c in set(string))
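As a quick sanity check, here is a minimal sketch showing that the two one-liners build the same dictionary. The variable names `text`, `per_char`, and `per_distinct` are made up for illustration. The only difference is how many times `count()` is called: once per character of the string versus once per distinct character.

```python
text = 'asdfdffa'

# One count() call per character in the string (8 calls here).
per_char = dict((c, text.count(c)) for c in text)

# One count() call per distinct character (4 calls here).
per_distinct = dict((c, text.count(c)) for c in set(text))

print(per_char == per_distinct)   # True
print(len(text), len(set(text)))  # 8 4
```

On long inputs with few distinct characters, the set version does far less repeated work.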

Timing

Prompted by the comments below, I ran a small benchmark of this and the other answers (with Python 3.2).

Test functions:

from collections import Counter, defaultdict

@time_me
def test_dict(string, iterations):
    """dict((c, string.count(c)) for c in string)"""
    for i in range(iterations):
        dict((c, string.count(c)) for c in string)

@time_me
def test_set(string, iterations):
    """dict((c, string.count(c)) for c in set(string))"""
    for i in range(iterations):
        dict((c, string.count(c)) for c in set(string))

@time_me
def test_counter(string, iterations):
    """Counter(string)"""
    for i in range(iterations):
        Counter(string)

@time_me
def test_for(string, iterations, d):
    """for loop from cha0site"""
    for i in range(iterations):
        for c in string:
            if c in d:
                d[c] += 1

@time_me
def test_default_dict(string, iterations):
    """defaultdict from joaquin"""
    for i in range(iterations):
        mydict = defaultdict(int)
        for mychar in string:
            mydict[mychar] += 1
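The `time_me` decorator used above is not shown in the answer. The following is only a guess at what it might look like, assuming it prints the elapsed time in milliseconds followed by the decorated function's docstring (which is what the result lines suggest). It uses `time.perf_counter`, which needs Python 3.3 or later, and `demo_count` is a made-up function for illustration.

```python
import functools
import time

def time_me(func):
    """Hypothetical stand-in for the timing decorator, which is not shown."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        # Label the timing with the function's docstring, as in the results.
        print('{:.3f} ms -  {}'.format(elapsed_ms, func.__doc__))
        return result
    return wrapper

@time_me
def demo_count(string, iterations):
    """dict((c, string.count(c)) for c in set(string))"""
    for i in range(iterations):
        dict((c, string.count(c)) for c in set(string))

demo_count('hand', 1000)
```

The printed time will vary from machine to machine, so the absolute numbers in the results are only meaningful relative to each other.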

Test execution:

import string  # standard-library module, needed for ascii_letters

d_ini = dict((c, 0) for c in string.ascii_letters)
words = ['hand', 'marvelous', 'supercalifragilisticexpialidocious']

for word in words:
    print('-- {} --'.format(word))
    test_dict(word, 100000)
    test_set(word, 100000)
    test_counter(word, 100000)
    test_for(word, 100000, d_ini)
    test_default_dict(word, 100000)
    print()

# ch holds the text of the chapter; how it was loaded is not shown
print('-- {} --'.format('Pride and Prejudice - Chapter 3 '))

test_dict(ch, 1000)
test_set(ch, 1000)
test_counter(ch, 1000)
test_for(ch, 1000, d_ini)
test_default_dict(ch, 1000)

Test results:

-- hand --
389.091 ms -  dict((c, string.count(c)) for c in string)
438.000 ms -  dict((c, string.count(c)) for c in set(string))
867.069 ms -  Counter(string)
100.204 ms -  for loop from cha0site
241.070 ms -  defaultdict from joaquin

-- marvelous --
654.826 ms -  dict((c, string.count(c)) for c in string)
729.153 ms -  dict((c, string.count(c)) for c in set(string))
1253.767 ms -  Counter(string)
201.406 ms -  for loop from cha0site
460.014 ms -  defaultdict from joaquin

-- supercalifragilisticexpialidocious --
1900.594 ms -  dict((c, string.count(c)) for c in string)
1104.942 ms -  dict((c, string.count(c)) for c in set(string))
2513.745 ms -  Counter(string)
703.506 ms -  for loop from cha0site
935.503 ms -  defaultdict from joaquin

# !!!: Do not compare this last result with the others, because it was timed
#      with 1000 iterations instead of 100000
-- Pride and Prejudice - Chapter 3  --
155315.108 ms -  dict((c, string.count(c)) for c in string)
982.582 ms -  dict((c, string.count(c)) for c in set(string))
4371.579 ms -  Counter(string)
1609.623 ms -  for loop from cha0site
1300.643 ms -  defaultdict from joaquin

Alternative for Python 2.7+:

from collections import Counter

abc = Counter('asdfdffa')
print(abc)
print(abc['a'])

Output:

Counter({'f': 3, 'a': 2, 'd': 2, 's': 1})
2

This is a job for the collections module:


Option 1: collections.defaultdict

>>> from collections import defaultdict
>>> mydict = defaultdict(int)

then your loop becomes:

>>> for mychar in mystring: mydict[mychar] += 1

Option 2: collections.Counter (from Felix's comment)

An alternative that is better for this specific case, and from the same collections module:

>>> from collections import Counter

then you only need:

>>> mydict = Counter(mystring)

Counter is only available from Python 2.7 onward, so for Python < 2.7 you should stay with defaultdict.
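To compare the two options side by side, here is a small sketch (the sample string is arbitrary) showing that both produce the same counts. It also shows that a Counter returns 0 for a character it never saw, instead of raising KeyError.

```python
from collections import Counter, defaultdict

mystring = 'masterofalgorithm'

# Option 1: defaultdict(int) starts every missing key at 0.
mydict = defaultdict(int)
for mychar in mystring:
    mydict[mychar] += 1

# Option 2: Counter does the whole loop for you.
counts = Counter(mystring)

print(dict(mydict) == dict(counts))  # True
print(counts['m'])                   # 2
print(counts['z'])                   # 0 (missing keys count as zero)
```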

That's not a good candidate for a list comprehension. You usually use list comprehensions to build lists, and having side effects (such as changing global state) in them is not a good idea.

On the other hand, your code might be better off like this:

for c in string:
    if c in dict:
        dict[c] += 1

Or, if you really want to get functional (I've renamed dict to d because I need Python's built-in dict function):

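# Visit each distinct character once, and skip characters that are not keys of d
d.update(dict([(c, d[c] + string.count(c)) for c in set(string) if c in d]))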

Notice how I did not change d within the list comprehension, but instead updated d outside of it.
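A runnable sketch of that "compute the new values first, then update" idea is shown below. It assumes `d` is pre-seeded with the characters you care about, as in the question. The names `text` and `updates` are made up for illustration, and a dict comprehension is used in place of `dict([...])`.

```python
d = {'a': 0, 'f': 0, 'z': 0}   # only these characters are tracked
text = 'asdfdffa'

# Build the new values without touching d ...
updates = {c: d[c] + text.count(c) for c in set(text) if c in d}

# ... then apply them in a single step.
d.update(updates)

print(d)  # {'a': 2, 'f': 3, 'z': 0}
```

Characters that are not keys of `d` (here 's' and 'd') are ignored, which matches the behavior of the original for loop.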

>>> def count(s):
    seen = []  # characters that have already been reported
    for ch in s:
        if ch not in seen:
            seen.append(ch)
            k = 0
            for other in s:
                if ch == other:
                    k += 1
            print('count of char {0}:{1}'.format(ch, k))


>>> count('masterofalgorithm')
count of char m:2
count of char a:2
count of char s:1
count of char t:2
count of char e:1
count of char r:2
count of char o:2
count of char f:1
count of char l:1
count of char g:1
count of char i:1
count of char h:1
>>> 

Your original loop is hopelessly unPythonic. There's no need to iterate through range(len(string)) if all you want is to iterate through the letters in string. Do this instead:

for c in my_string:
    if c in my_dict:
        my_dict[c] += 1
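Note that this loop only increments characters that are already keys of `my_dict`. If you also want characters that have not been seen before to be counted (an assumption, since the question does not say), one possible variation uses `dict.get` with a default of 0:

```python
my_string = 'hello'
my_dict = {}

for c in my_string:
    # get() returns 0 for a character that is not in the dict yet.
    my_dict[c] = my_dict.get(c, 0) + 1

print(my_dict)  # {'h': 1, 'e': 1, 'l': 2, 'o': 1}
```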
