
How can you check for specific characters in a string?

When I run the program it always prints True. For example, if I enter AAJJ it prints True, because it only checks whether the first letter is valid. Can someone point me in the right direction? Thanks!

squence_str = raw_input("Enter either A DNA, Protein or RNA sequence:")

def DnaCheck():

    for i in (squence_str):
        if string.upper(i) =="A":
            return True
        elif string.upper(i) == "T":
            return True
        elif string.upper(i) == "C":
            return True
        elif string.upper(i) == "G":
            return True
        else:
            return False

print "DNA ", DnaCheck()

You need to check that all of the bases in the DNA sequence are valid.

def DnaCheck(sequence):
    return all(base.upper() in ('A', 'C', 'T', 'G') for base in sequence)
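The same idea can be sketched in Python 3 syntax with a small usage example. The name `is_dna` and the `VALID_BASES` constant are illustrative choices, not part of the original answer. Note that `all()` on an empty sequence returns True, so an empty string counts as valid here.

```python
# Minimal sketch (Python 3), assuming only A, C, G, T count as valid bases.
# `is_dna` and `VALID_BASES` are hypothetical names chosen for illustration.
VALID_BASES = ("A", "C", "G", "T")

def is_dna(sequence):
    # all() stops at the first base that fails the membership test
    return all(base.upper() in VALID_BASES for base in sequence)

print(is_dna("ACTG"))  # True
print(is_dna("aajj"))  # False
print(is_dna(""))      # True: all() of an empty iterable is True
```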

I like @Alexander's answer, but for variety you could see if

def dna_check(sequence):
    return set(sequence.upper()).issubset("ACGT")
    # another possibility:
    # return set(sequence).issubset("ACGTacgt")

might be faster on long sequences, especially if most inputs are likely to be legal sequences (i.e., most of the time you will have to iterate over the whole sequence anyway).
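Whether the set-based version is actually faster depends on the input and the interpreter, so it is worth measuring. The following is a rough Python 3 sketch of such a comparison using the standard `timeit` module; the function names, the sequence length, and the repeat count are arbitrary assumptions for illustration, and no particular timing result is implied.

```python
import timeit

VALID_BASES = frozenset("ACGT")

def dna_check_all(sequence):
    # Short-circuits: stops at the first invalid base
    return all(base.upper() in VALID_BASES for base in sequence)

def dna_check_set(sequence):
    # Always processes the whole string, but does so in C-level loops
    return set(sequence.upper()) <= VALID_BASES

if __name__ == "__main__":
    valid = "ACGT" * 25000      # long, fully valid sequence
    invalid = "X" + valid       # invalid at the very first position
    for label, seq in (("valid", valid), ("invalid at start", invalid)):
        t_all = timeit.timeit(lambda: dna_check_all(seq), number=20)
        t_set = timeit.timeit(lambda: dna_check_set(seq), number=20)
        print(label, "| all():", round(t_all, 4), "| set:", round(t_set, 4))
```

The second test input is included because the `all()` version can stop immediately when an invalid base appears early, which is the case where the set-based version has no advantage.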

More at the level of your current learning ...

You have the logic reversed. You have to check every position. If any one of them is not a nucleotide in "ACTG", you immediately return False for the whole string. Only once every character has passed the test can you confidently return True.

import string

def DnaCheck(squence_str):

    for i in (squence_str):
        if string.upper(i) not in "ACTG":
            return False

    return True

test_cases = ["", "AAJJ", "ACTG", "AACTGTCAA", "AACTGTCAX"]
for strand in test_cases:
    print strand, DnaCheck(strand)

Output:

 True
AAJJ False
ACTG True
AACTGTCAA True
AACTGTCAX False

Here is how this function works, step by step (the original answer illustrated it with a picture):

def DnaCheck(sequence):
    for base in sequence:
        if base.upper() in ('A', 'C', 'G', 'T'):
            continue
        else:
            print('False')
            return
    print('True')

We start by iterating through each base in the given sequence with a for loop. If a base belongs to ('A', 'C', 'G', 'T') (green signal), continue jumps back to the top of the loop and the next base is checked, skipping the rest of the loop body. This repeats until a base fails the test (red signal), at which point the else branch prints 'False' and return ends the function, so print('True') is never executed. For a valid sequence, the loop ends after the last base has been checked and print('True') is executed.
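To make the early exit visible, here is a small Python 3 sketch of the same loop that reports where it stopped instead of printing. The helper name `first_invalid_base` and its return convention (an index, or None for a valid sequence) are assumptions made for this illustration and are not part of the answer above.

```python
def first_invalid_base(sequence):
    """Return the index of the first invalid base, or None if all are valid."""
    for index, base in enumerate(sequence):
        if base.upper() in ("A", "C", "G", "T"):
            continue          # green signal: move on to the next base
        return index          # red signal: stop here, later bases are never checked
    return None               # loop finished, so every base was valid

print(first_invalid_base("ACTG"))       # None
print(first_invalid_base("AAJJ"))       # 2
print(first_invalid_base("AACTGTCAX"))  # 8
```

Returning a value rather than printing also lets the caller use the result, for example `first_invalid_base(seq) is None` as a validity test.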
