
Multiple conditions to compare values of two dictionaries in python

I have the following two dictionaries:

fastaDict1 = {'seq1': 'NNNACACGT', 'seq2': 'NNNACACGT'}
fastaDict2 = {'seq1': 'NNNGCACGT'}

What I want to do is pick a key in fastaDict1 and check whether that key exists in fastaDict2. If it does, I want to loop through each character in the fastaDict1 value for that key and check whether it matches the character at the same position in the fastaDict2 value.

I wrote the following code:

for keys in fastaDict1.keys():
    print keys
    for base in range(0, len(fastaDict1[keys])):
        if keys in fastaDict2 and fastaDict1[keys][base] != 'N' and fastaDict1[keys][base] == fastaDict2[keys][base]:
            print fastaDict1[keys][base] + '\t' + fastaDict2[keys][base] + '\t' + str(base)
        else:
            print 'Bases do not match' + '\t' + str(base)

But I get this output:

seq2
Bases do not match  0
Bases do not match  1
Bases do not match  2
Bases do not match  3
Bases do not match  4
Bases do not match  5
Bases do not match  6
Bases do not match  7
Bases do not match  8
seq1
Bases do not match  0
Bases do not match  1
Bases do not match  2
A   A   3
C   C   4
A   A   5
C   C   6
G   G   7
T   T   8

What I expect to get is:

seq1
Bases do not match  0
Bases do not match  1
Bases do not match  2
Bases do not match  3
C   C   4
A   A   5
C   C   6
G   G   7
T   T   8

I think there is a problem with the logic of my conditionals that I can't figure out. Any help would be much appreciated, thanks!

Your condition is slightly off. From what you describe, you do not want to iterate over the value at all when the key is not in fastaDict2. In that case, move the check if keys in fastaDict2: so that it comes before the inner for loop.

Example -

for keys in fastaDict1.keys():
    print keys
    if keys in fastaDict2:
        for base in range(0, len(fastaDict1[keys])):
            if fastaDict1[keys][base] != 'N' and fastaDict1[keys][base] == fastaDict2[keys][base]:
                print fastaDict1[keys][base] + '\t' + fastaDict2[keys][base] + '\t' + str(base)
            else:
                print 'Bases do not match' + '\t' + str(base)

Example/Demo -

>>> fastaDict1 = {'seq1': 'NNNACACGT', 'seq2': 'NNNACACGT'}
>>> fastaDict2 = {'seq1': 'NNNGCACGT'}
>>> for keys in fastaDict1.keys():
...     print(keys)
...     if keys in fastaDict2:
...         for base in range(0, len(fastaDict1[keys])):
...             if fastaDict1[keys][base] != 'N' and fastaDict1[keys][base] == fastaDict2[keys][base]:
...                 print(fastaDict1[keys][base] + '\t' + fastaDict2[keys][base] + '\t' + str(base))
...             else:
...                 print('Bases do not match' + '\t' + str(base))
...
seq2
seq1
Bases do not match      0
Bases do not match      1
Bases do not match      2
Bases do not match      3
C       C       4
A       A       5
C       C       6
G       G       7
T       T       8
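
Note that the demo above still prints seq2, because the key is printed before the membership test. If the goal is exactly the expected output from the question, one option is to skip the key entirely when it is missing from the second dictionary. Below is a minimal sketch in Python 3 syntax. The function name compare_bases is made up for illustration, it returns the lines instead of printing them so they are easy to check, and it assumes both sequences for a shared key have the same length.

```python
def compare_bases(dict1, dict2):
    """Return report lines for every key of dict1 that is also in dict2."""
    lines = []
    for key in dict1:
        if key not in dict2:
            continue  # skip keys missing from dict2, including the key header
        lines.append(key)
        seq1, seq2 = dict1[key], dict2[key]
        # Assumption: seq1 and seq2 have equal length for shared keys.
        for pos in range(len(seq1)):
            if seq1[pos] != 'N' and seq1[pos] == seq2[pos]:
                lines.append(seq1[pos] + '\t' + seq2[pos] + '\t' + str(pos))
            else:
                lines.append('Bases do not match' + '\t' + str(pos))
    return lines


fastaDict1 = {'seq1': 'NNNACACGT', 'seq2': 'NNNACACGT'}
fastaDict2 = {'seq1': 'NNNGCACGT'}
print('\n'.join(compare_bases(fastaDict1, fastaDict2)))
```

With the dictionaries from the question, only the seq1 block is reported.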

If you're curious, I think this code is a bit simpler:

fastaDict1 = {'seq1': 'NNNACACGT', 'seq2': 'NNNACACGT'}
fastaDict2 = {'seq1': 'NNNGCACGT'}
for key in set(fastaDict1.keys()).intersection(fastaDict2.keys()):
    print(key)
    for i, s in enumerate(fastaDict1[key]):
        if s != 'N' and s == fastaDict2[key][i]:
            print('{}\t{}\t{}'.format(s, s, i))
        else:
            print('Bases do not match\t{}'.format(i))

Produces:

seq1
Bases do not match  0
Bases do not match  1
Bases do not match  2
Bases do not match  3
C   C   4
A   A   5
C   C   6
G   G   7
T   T   8
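
A possible variation on the same idea, assuming Python 3: there, dict.keys() returns a set-like view, so the shared keys can be taken with & directly, and zip can pair up the characters so that sequences of unequal length do not raise an IndexError. This is only a sketch; the name matching_report is hypothetical, and sorting the keys is an added choice to make the order predictable.

```python
def matching_report(dict1, dict2):
    """Return report lines for keys present in both dictionaries."""
    lines = []
    # Python 3 key views are set-like, so & yields the shared keys.
    for key in sorted(dict1.keys() & dict2.keys()):
        lines.append(key)
        # zip stops at the shorter sequence, so extra trailing bases are ignored.
        for i, (a, b) in enumerate(zip(dict1[key], dict2[key])):
            if a != 'N' and a == b:
                lines.append('{}\t{}\t{}'.format(a, b, i))
            else:
                lines.append('Bases do not match\t{}'.format(i))
    return lines


fastaDict1 = {'seq1': 'NNNACACGT', 'seq2': 'NNNACACGT'}
fastaDict2 = {'seq1': 'NNNGCACGT'}
print('\n'.join(matching_report(fastaDict1, fastaDict2)))
```

Whether silently ignoring the extra bases of a longer sequence is acceptable depends on the data; raising an error on a length mismatch may be the safer choice for real FASTA input.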

If your "problem" is the "Bases do not match" lines printed for seq2, that happens because the key check is done AFTER you have already entered the inner loop, so the loop runs even when fastaDict2 does not have that key. Move the if before the loop, like this:

for keys in fastaDict1.keys():
    print keys
    if keys in fastaDict2:  # dict.has_key() was removed in Python 3; "in" works in both
        for base in range(0, len(fastaDict1[keys])):
            if fastaDict1[keys][base] != 'N' and fastaDict1[keys][base] == fastaDict2[keys][base]:
                print fastaDict1[keys][base] + '\t' + fastaDict2[keys][base] + '\t' + str(base)
            else:
                print 'Bases do not match' + '\t' + str(base)

and your output will be:

seq2
seq1
Bases do not match      0
Bases do not match      1
Bases do not match      2
Bases do not match      3
C       C       4
A       A       5
C       C       6
G       G       7
T       T       8
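
As a side note on why the code in the question printed nine "Bases do not match" lines for seq2 instead of raising a KeyError: Python's and evaluates its operands left to right and stops at the first false one, so the lookup in fastaDict2 never happens and the else branch runs for every position. A small sketch of that behaviour, in Python 3 syntax; the variable names combined and lookup_failed are only for illustration.

```python
fastaDict2 = {'seq1': 'NNNGCACGT'}
key = 'seq2'

# "key in fastaDict2" is False, so the right-hand side is never evaluated
# and fastaDict2[key] is not looked up.
combined = key in fastaDict2 and fastaDict2[key][0] == 'N'
print(combined)  # False

# Without the membership test in front, the same lookup fails.
try:
    fastaDict2[key][0]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)  # True
```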
