How to compare keys in two different lists of tuples in Python?

I have two separate collections of tuple keys and values, structured like dictionaries. The keys are in parentheses and the values are in brackets.

File1:

('8', '158116110')
['0.00509']
('6', '44338625')
['0.00525']
('3', '127518469')
['2.56E-05']
('9', '141754441')
['0.00585']

File2:

('9', '154331672')
['0.165435473']
('8', '100949929')
['0.493410385']
('9', '120747803')
['0.969364472']
('1', '12152579')
['0.669831913']

Working only with the keys, I would like to count how many keys in one file are within 10000 of a key in the other.

Note that each key has two parts. I would like my code to work like this: if the first part of a key in File1 (for example '8') equals the first part of a key in File2 (for example '8') AND the second parts of those keys (for example '158116110' and '100949929') are within 10000 of each other, then count += 1.

This is what I have thus far:

with open('filename1.txt') as f1, open('filename2.txt') as f2:
    x, y = f1, f2
    count = 0
    for x, y in (f1, f2):
        if ((f2 - f1) < 10000) and (digit1_f1 == digit1_f2):
            count +=1
        break

However the code fails. I get this error:

Traceback (most recent call last):
  File "/Users/macbookpro/Desktop/compareDict.py", line 4, in <module>
    for x, y in (f1, f2):
ValueError: too many values to unpack (expected 2)

Both lists are of equal length containing 9524 rows each.

Why am I getting this error?

First of all, when you write for x, y in (f1, f2): you are creating a tuple of two file objects and iterating over that tuple, not over the files themselves. Each iteration therefore yields a single file object, and Python then tries to unpack that file object into the two variables x and y. Unpacking a file object means reading its lines, and since each file has far more than two lines, you get "too many values to unpack".

Secondly, f2 - f1 tries to subtract one file object from another, which is not a supported operation. You need to read and parse the numbers from each line before you can compare them.
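To see the first point in isolation, here is a minimal sketch. It assumes two short iterators as stand-ins for the open file objects (the names and line contents are made up for illustration); like files, they yield one line at a time.

```python
# Two short iterators stand in for the open file objects f1 and f2.
# (Assumption: any iterable yielding more than two items behaves the same way.)
f1 = iter(["line a1", "line a2", "line a3"])
f2 = iter(["line b1", "line b2", "line b3"])

def try_unpack(pair):
    """Run `for x, y in pair` and return the error message, or None if it succeeds."""
    try:
        for x, y in pair:
            pass
    except ValueError as exc:
        return str(exc)
    return None

# The loop iterates over the tuple, so the first item it gets is f1 itself.
# Unpacking f1 into x, y fails because f1 yields three items, not two.
message = try_unpack((f1, f2))
print(message)
```

The same thing happens with real files: the loop never reaches a single line of data, because it fails while unpacking the first file object.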

According to your example, rows with the same first key part can appear on different lines in the two files, so I think it would be best to first build a dictionary for each file. Each dictionary can have a format like -

d1 = {<first key> : { <second key one>: value , <second key two>: value .... }}

Example -

d1 = {'8' : { '158116110' : '0.00509' } , '9' : { '141754441' : '0.00585' } ... }
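A rough sketch of how such a dictionary might be built is below. It assumes the file really contains alternating lines exactly as shown in the question (a tuple literal, then a one-element list literal); if your files are formatted differently, the parsing step will need to change. The function name build_nested_dict is made up for this example.

```python
import ast
from collections import defaultdict

def build_nested_dict(lines):
    """Build {first_key: {second_key: value}} from alternating key/value lines."""
    nested = defaultdict(dict)
    # Drop blank lines, then pair each key line with the value line after it.
    rows = [line.strip() for line in lines if line.strip()]
    for key_line, value_line in zip(rows[0::2], rows[1::2]):
        first, second = ast.literal_eval(key_line)   # e.g. ('8', '158116110')
        (value,) = ast.literal_eval(value_line)      # e.g. ['0.00509']
        nested[first][second] = value
    return dict(nested)

# Sample text copied from File1 in the question; with a real file you could
# pass the open file object instead of sample.splitlines().
sample = """\
('8', '158116110')
['0.00509']
('9', '141754441')
['0.00585']
"""
d1 = build_nested_dict(sample.splitlines())
print(d1)
```

Grouping by the first key part means that only keys sharing that part ever need to be compared with each other.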

Once both dictionaries are created, you can loop over one of them, look up the same first key in the other, and check whether any pair of second keys under that first key is within 10000 of each other.

Example code -

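d = {}  # maps each shared first key to its number of close pairs
for k, v in d1.items():
    v1 = d2.get(k)
    if v1:
        # Compare every second key under k in d1 with every one in d2.
        for k1 in v:
            for k2 in v1:
                if abs(int(k1) - int(k2)) < 10000:
                    # Start from 0 on the first match, then increment.
                    d[k] = d.get(k, 0) + 1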

That isn't the right way to iterate over two iterables in parallel. Instead, try:

for x, y in zip(f1, f2):

This should fix the unpacking error, though I can't comment on the body of the loop, because you haven't shown what digit1_f1 and digit1_f2 are.
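For illustration, here is one way the loop body could look once the keys have been parsed into tuples. This is only a sketch: the function name is made up, and the second list contains invented values chosen so that one row matches. Note that zip pairs row 1 with row 1, row 2 with row 2, and so on, so this counts only same-position matches, not matches between arbitrary rows.

```python
def count_aligned_matches(keys1, keys2, max_distance=10000):
    """Count same-position key pairs with equal first part and close second part."""
    count = 0
    for (first1, second1), (first2, second2) in zip(keys1, keys2):
        if first1 == first2 and abs(int(second1) - int(second2)) < max_distance:
            count += 1
    return count

# keys1 is taken from File1 in the question; keys2 is hypothetical data.
keys1 = [('8', '158116110'), ('6', '44338625')]
keys2 = [('8', '158120000'), ('9', '44338625')]
print(count_aligned_matches(keys1, keys2))
```

If a key in one file may match a key on any row of the other file, row-by-row pairing will miss those matches, and grouping by the first key part is the safer approach.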
