intersection of two lists of dictionaries by key/value pair

Question

I have two lists of dictionaries in the format:

systolic_sex = [
        {'attribute': u'bp', 'value_d': 133.0, 'value_s': u'133', 'sid': 6}, 
        {'attribute': u'bp', 'value_d': 127.0, 'value_s': u'127', 'sid': 17}, 
        {'attribute': u'bp', 'value_d': 121.0, 'value_s': u'121', 'sid': 18}, 
        {'attribute': u'bp', 'value_d': 127.0, 'value_s': u'127', 'sid': 27}, 
        {'attribute': u'bp', 'value_d': 120.0, 'value_s': u'120', 'sid': 42},
        {'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 6},      
        {'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 17},   
        {'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 18},
        {'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 27},   
        {'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 42}
    ]



sex = [
        {'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 6},      
        {'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 17},   
        {'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 42}
    ]

I want to match these lists by the value of the key 'sid,' so that if the same value of 'sid' is in both, I have a match, otherwise, I do not. If I have a match, I then append the matching dictionaries by 'sid' from both sets to a new list accordingly like so

new_set = [
        {'attribute': u'bp', 'value_d': 133.0, 'value_s': u'133', 'sid': 6}, 
        {'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 6},
        {'attribute': u'bp', 'value_d': 127.0, 'value_s': u'127', 'sid': 17}, 
        {'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 17},
        {'attribute': u'bp', 'value_d': 120.0, 'value_s': u'120', 'sid': 42},
        {'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 42}
    ]

I've tried various methods of intersecting these, including modifying answers from Match set of dictionaries , but I am looking to create a new list of dictionaries that have the matching sids, not replacing values between the two lists.

Answer 1

You may be interested in using pandas if you're dealing with data like this a lot. Your dictionaries are already in the form pandas likes, so you can do this:

import pandas

systolic_sex = pandas.DataFrame(systolic_sex)
sex = pandas.DataFrame(sex)

matches = systolic_sex[systolic_sex.sid.isin(sex.sid)]

If you want the data back in the same format as you supplied them, you can to

output = matches.to_dict(orient='records')

Answer 2

Going off the answer in the post you linked:

systolic_sex = dict((e['sid'], e) for e in systolic_sex)
sex = set(e['sid'] for e in sex)

matches = []
for sid,v in systolic_sex.items():
    if sid not in sex: continue
    matches.append(v)

Answer 3

>>> uniq=set(e['sid'] for e in sex) 
>>> filter(lambda d: d['sid'] in uniq, systolic_sex)
[{'attribute': u'bp', 'sid': 6L, 'value_s': u'133', 'value_d': 133.0},        
 {'attribute': u'bp', 'sid': 17L, 'value_s': u'127', 'value_d': 127.0},  
 {'attribute': u'bp', 'sid': 42L, 'value_s': u'120', 'value_d': 120.0}, 
 {'attribute': u'SEX', 'sid': 6L, 'value_s': u'M', 'value_d': 0.0}, 
 {'attribute': u'SEX', 'sid': 17L, 'value_s': u'M', 'value_d': 0.0}, 
 {'attribute': u'SEX', 'sid': 42L, 'value_s': u'M', 'value_d': 0.0}]

Answer 4

I ended up using the following (as per @chtohnicdaemon):

import pandas
#-----> code snipped here
#----->
# iterate over record sets returned by SQLAlchemy to populate list
    for result in query_right:
        data = {'sid': result.patient_sid,
                'value_s': result.string_value,
                'value_d': result.double_value,
                'attribute': result.attribute_value}

                result_right.append(data)

    for result in left_child:
        data = {'sid': result.patient_sid,
                'value_s': result.string_value,
                'value_d': result.double_value,
                'attribute': result.attribute_value}

                result_left.append(data)

# convert list of dictionaries to data frames
right = pandas.DataFrame(right_result)
left = pandas.DataFrame(left_result)

# get matches
matches_right  = right[right.sid.isin(left.sid)]
matches_left  = left[left.sid.isin(right.sid)]

# combine matched sets into single set
frames = [matches_right,matches_left]

# concatenate data, drop duplicates and convert back to a list of dictionaries
result = pd.concat(frames).drop_duplicates().to_dict(orient='records')

Worked like a charm!

intersection of two lists of dictionaries by key/value pair

Question

4 answers

solution1
3 ACCPTED 2015-10-25 04:15:46

solution2
1 2015-10-25 03:33:22

solution3
1 2015-10-25 04:04:32

solution4
1 2015-10-26 14:40:21

intersection of two lists of dictionaries by key/value pair

Question

4 answers

solution1 3 ACCPTED 2015-10-25 04:15:46

solution2 1 2015-10-25 03:33:22

solution3 1 2015-10-25 04:04:32

solution4 1 2015-10-26 14:40:21

solution1
3 ACCPTED 2015-10-25 04:15:46

solution2
1 2015-10-25 03:33:22

solution3
1 2015-10-25 04:04:32

solution4
1 2015-10-26 14:40:21