Check if specific characters are in a string

I need to find and count how many characters can be found in a string. I have divided the characters into chars1[a:m] and chars2[n:z], and have two counters.

The output should be 0/14, but it is 0/1 instead. I think it only checks to see if one and only one item is contained and then exits out the loop. Is that the case?

Here is the code.

string_1 = "aaabbbbhaijjjm"

def error_printer(s):
    chars1 = "abcdefghijklm"
    chars2 = "nopqrstuvwxyz"
    counter1 = 0
    counter2 = 0

    if ((c in s) for c in chars1):
        counter1 += 1
    elif ((c in s) for c in chars2):
        counter2 += 1
    print(str(counter2) + "/" + str(counter1))

error_printer(string_1)

Number of characters in chars1 / chars2 that occur in s

That makes sense, since you increment inside an if statement. Because the if is not in a loop, it can increment the counter at most once.
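To see why the counter moves exactly once, note that a generator expression is an object, and such objects are truthy whether or not any of their values would be true. A minimal sketch (both function names are made up for illustration):

```python
def generator_is_truthy(s, chars):
    # Hypothetical helper: the generator object itself is truthy,
    # even when no character of `chars` occurs in `s`.
    gen = ((c in s) for c in chars)
    return bool(gen)

def any_char_present(s, chars):
    # any() actually consumes the generator and inspects its values.
    return any(c in s for c in chars)

print(generator_is_truthy("aaabbbbhaijjjm", "nopqrstuvwxyz"))  # True
print(any_char_present("aaabbbbhaijjjm", "nopqrstuvwxyz"))     # False
```

So the first if always succeeds, counter1 becomes 1, and the elif is skipped, which matches the 0/1 output.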

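Now we can unfold the generator into a for loop. This will solve one part of the problem and generate 0/6:

def error_printer(s):
    chars1 = "abcdefghijklm"
    chars2 = "nopqrstuvwxyz"
    counter1 = 0
    counter2 = 0
    # Count each character of chars1 / chars2 that occurs in s.
    for c in chars1:
        if c in s:
            counter1 += 1
    for c in chars2:
        if c in s:
            counter2 += 1
    print(str(counter2) + "/" + str(counter1))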

Nevertheless, this is still not terribly efficient: checking whether a character is in a string takes O(n) in the worst case. You can first construct a set of the characters in the string, and then perform lookups on it (which take O(1) on average):

def error_printer(s):
    sset = set(s)  # set of the distinct characters in s
    chars1 = "abcdefghijklm"
    chars2 = "nopqrstuvwxyz"
    counter1 = 0
    counter2 = 0
    for c in chars1:
        if c in sset:
            counter1 += 1
    for c in chars2:
        if c in sset:
            counter2 += 1
    print(str(counter2) + "/" + str(counter1))

Now we have improved the efficiency, but it is still not very elegant: it takes a lot of code, and one has to inspect the code to know what it does. We can use a sum(..) construct to count the number of elements that satisfy a certain constraint, like:

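def error_printer(s):
    sset = set(s)
    chars1 = "abcdefghijklm"
    chars2 = "nopqrstuvwxyz"
    # Each membership test is a bool; sum() counts the True values.
    counter1 = sum(c in sset for c in chars1)
    counter2 = sum(c in sset for c in chars2)
    print(str(counter2) + "/" + str(counter1))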

This produces 0/6, since six characters in the [a-m] range occur in s and none of the characters in the [n-z] range occur in s.

Number of characters in s that occur in chars1 / chars2

Based on the body of the question, however, you want to count the number of characters in s that occur in the two different ranges.

This is a related but different problem: instead of counting the characters of chars1 / chars2 that occur in s, we count the characters of s that occur in chars1 / chars2. In that case we simply have to swap the loops:

def error_printer(s):
    chars1 = "abcdefghijklm"
    chars2 = "nopqrstuvwxyz"
    # Iterate over s, so repeated characters are counted each time.
    counter1 = sum(c in chars1 for c in s)
    counter2 = sum(c in chars2 for c in s)
    print(str(counter2) + "/" + str(counter1))

This produces 0/14, since there are 14 characters in s that occur in the [a-m] range (if 'a' occurs twice in s, then we count it twice), and none of the characters in s occur in the [n-z] range.
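The two interpretations differ only in which string is iterated. A minimal sketch that contrasts them for one character class (the function names are hypothetical and not part of the question):

```python
def count_distinct(s, chars):
    # How many characters of `chars` occur at least once in `s`.
    present = set(s)
    return sum(c in present for c in chars)

def count_occurrences(s, chars):
    # How many characters of `s` (duplicates included) belong to `chars`.
    allowed = set(chars)
    return sum(c in allowed for c in s)

s = "aaabbbbhaijjjm"
print(count_distinct(s, "abcdefghijklm"))     # 6
print(count_occurrences(s, "abcdefghijklm"))  # 14
```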

Using range checks

Since we are working with ranges, we can use comparisons instead of membership checks, so that each character needs only two comparisons, like:

def error_printer(s):
    # Chained comparisons test whether c lies within the range.
    counter1 = sum('a' <= c <= 'm' for c in s)
    counter2 = sum('n' <= c <= 'z' for c in s)
    print(str(counter2) + "/" + str(counter1))
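Note that range checks compare code points, so only lowercase ASCII letters fall inside these two ranges. A sketch, assuming you would rather return the counts than print them (`count_ranges` is an illustrative name, not something from the question):

```python
def count_ranges(s):
    # Each chained comparison yields a bool; sum() counts the True values.
    first_half = sum('a' <= c <= 'm' for c in s)
    second_half = sum('n' <= c <= 'z' for c in s)
    return first_half, second_half

print(count_ranges("aaabbbbhaijjjm"))  # (14, 0)
print(count_ranges("Hello, World"))    # uppercase letters and punctuation are not counted
```

If uppercase input should count too, calling s.lower() before the checks is one option.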

Try incrementing inside an if condition, with a single loop over s:

for c in s:
    if c in chars1:
        counter1 += 1
    if c in chars2:
        counter2 += 1

An alternative to the for loops:

string_1 = "aaabbbbhaijjjm"

def error_printer(s):
    chars1 = "abcdefghijklm"
    chars2 = "nopqrstuvwxyz"

    counter1 = sum(s.count(c) for c in chars1)
    counter2 = sum(s.count(c) for c in chars2)

    print(str(counter2) + "/" + str(counter1))

error_printer(string_1)

Here you count how many times "a", "b", "c", ... appear in the input string, then sum the results.

It's still inefficient, since s is scanned once per character in chars1 and chars2, but it leverages the str.count and sum functions, making it a bit easier to read and understand what is happening.

A single for loop over s for both counters:

string_1 = "aaabbbbhaijjjm"

def error_printer(s):
    chars1 = "abcdefghijklm"
    chars2 = "nopqrstuvwxyz"
    counter1 = 0
    counter2 = 0
    for c in s:
        if c in chars1:
            counter1 += 1
        if c in chars2:
            counter2 += 1
    print(str(counter2) + "/" + str(counter1))

error_printer(string_1)

You could use collections.Counter combined with sum:

from collections import Counter

def error_printer(s):
    cnts = Counter(s)
    chars1 = "abcdefghijklm"
    chars2 = "nopqrstuvwxyz"
    print(sum(cnts[c] for c in chars2), '/', sum(cnts[c] for c in chars1))

>>> error_printer("aaabbbbhaijjjm")
0 / 14

As was already suggested, for loops are the way to go. You used a generator expression as the condition of the if statement, which is only evaluated once. A generator object is always truthy, so the first condition was true, but that does not make the enclosed code run more than once. And because the first if branch ran, the elif never even evaluated its condition. This is why you want for loops; but rather than looping over chars1 and chars2, you want to loop over s:

for c in s:
    if c in chars1:
        counter1 += 1
    if c in chars2:
        counter2 += 1
print(str(counter2) + "/" + str(counter1))

This points us towards a few even slicker ways to do this, first by using the result of c in charsX directly as the increment:

for c in s:
    counter1 += c in chars1
    counter2 += c in chars2

Now this is getting to be a little bit less clear, but we can make it even cleaner by adding a second for loop:

char = ['abcdefghijklm', 'nopqrstuvwxyz']
counter = [0,0]
for c in s:
    for i in [0,1]:
        counter[i] += c in char[i]

This is probably pushing it a little bit too far, but I hope it helps you see how you can rearrange these things in python!
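The `counter += c in ...` trick works because bool is a subclass of int in Python, so True adds 1 and False adds 0. A minimal sketch of the nested-loop idea as a reusable function (`count_membership` is a hypothetical name):

```python
def count_membership(s, groups):
    # One counter per group; each membership test contributes 0 or 1.
    counters = [0] * len(groups)
    for c in s:
        for i, group in enumerate(groups):
            counters[i] += c in group  # True counts as 1, False as 0
    return counters

print(isinstance(True, int))  # True
print(count_membership("aaabbbbhaijjjm", ["abcdefghijklm", "nopqrstuvwxyz"]))  # [14, 0]
```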

