I have two lists of dictionaries in the format:
systolic_sex = [
{'attribute': u'bp', 'value_d': 133.0, 'value_s': u'133', 'sid': 6},
{'attribute': u'bp', 'value_d': 127.0, 'value_s': u'127', 'sid': 17},
{'attribute': u'bp', 'value_d': 121.0, 'value_s': u'121', 'sid': 18},
{'attribute': u'bp', 'value_d': 127.0, 'value_s': u'127', 'sid': 27},
{'attribute': u'bp', 'value_d': 120.0, 'value_s': u'120', 'sid': 42},
{'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 6},
{'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 17},
{'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 18},
{'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 27},
{'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 42}
]
sex = [
{'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 6},
{'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 17},
{'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 42}
]
I want to match these lists on the value of the key 'sid': if the same 'sid' value appears in both lists, that is a match; otherwise it is not. For every match, I want to append the dictionaries with that 'sid' from both lists to a new list, like so:
new_set = [
{'attribute': u'bp', 'value_d': 133.0, 'value_s': u'133', 'sid': 6},
{'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 6},
{'attribute': u'bp', 'value_d': 127.0, 'value_s': u'127', 'sid': 17},
{'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 17},
{'attribute': u'bp', 'value_d': 120.0, 'value_s': u'120', 'sid': 42},
{'attribute': u'SEX', 'value_d': 0.0, 'value_s': u'M', 'sid': 42}
]
I've tried various methods of intersecting these, including modifying answers from "Match set of dictionaries", but I am looking to create a new list of the dictionaries that have matching sids, not to replace values between the two lists.
You may be interested in using pandas if you're dealing with data like this a lot. Your dictionaries are already in the form pandas likes, so you can do this:
import pandas
systolic_sex = pandas.DataFrame(systolic_sex)
sex = pandas.DataFrame(sex)
matches = systolic_sex[systolic_sex.sid.isin(sex.sid)]
If you want the data back in the same format as you supplied it, you can do:
output = matches.to_dict(orient='records')
Going off the answer in the post you linked:
# systolic_sex has several records per sid, so a dict keyed on sid would
# keep only one of them; filter the list directly instead
sex_sids = set(e['sid'] for e in sex)
matches = []
for e in systolic_sex:
    if e['sid'] not in sex_sids:
        continue
    matches.append(e)
>>> uniq=set(e['sid'] for e in sex)
>>> filter(lambda d: d['sid'] in uniq, systolic_sex)
[{'attribute': u'bp', 'sid': 6L, 'value_s': u'133', 'value_d': 133.0},
{'attribute': u'bp', 'sid': 17L, 'value_s': u'127', 'value_d': 127.0},
{'attribute': u'bp', 'sid': 42L, 'value_s': u'120', 'value_d': 120.0},
{'attribute': u'SEX', 'sid': 6L, 'value_s': u'M', 'value_d': 0.0},
{'attribute': u'SEX', 'sid': 17L, 'value_s': u'M', 'value_d': 0.0},
{'attribute': u'SEX', 'sid': 42L, 'value_s': u'M', 'value_d': 0.0}]
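Note that the filtered list keeps the order of systolic_sex, so all the bp records come before the SEX records. If the grouped-by-sid ordering shown in the question is wanted, one option is a stable sort on 'sid' after filtering. A minimal sketch in plain Python using the question's data (the helper name match_by_sid is made up for illustration, and this is only one possible way to get that ordering):

```python
def match_by_sid(records, reference):
    """Return the records whose 'sid' also occurs in reference, grouped by sid."""
    wanted = {d['sid'] for d in reference}
    kept = [d for d in records if d['sid'] in wanted]
    # sorted() is stable, so records sharing a sid keep their original
    # relative order (here: bp before SEX).
    return sorted(kept, key=lambda d: d['sid'])


systolic_sex = [
    {'attribute': 'bp', 'value_d': 133.0, 'value_s': '133', 'sid': 6},
    {'attribute': 'bp', 'value_d': 127.0, 'value_s': '127', 'sid': 17},
    {'attribute': 'bp', 'value_d': 121.0, 'value_s': '121', 'sid': 18},
    {'attribute': 'bp', 'value_d': 127.0, 'value_s': '127', 'sid': 27},
    {'attribute': 'bp', 'value_d': 120.0, 'value_s': '120', 'sid': 42},
    {'attribute': 'SEX', 'value_d': 0.0, 'value_s': 'M', 'sid': 6},
    {'attribute': 'SEX', 'value_d': 0.0, 'value_s': 'M', 'sid': 17},
    {'attribute': 'SEX', 'value_d': 0.0, 'value_s': 'M', 'sid': 18},
    {'attribute': 'SEX', 'value_d': 0.0, 'value_s': 'M', 'sid': 27},
    {'attribute': 'SEX', 'value_d': 0.0, 'value_s': 'M', 'sid': 42},
]
sex = [
    {'attribute': 'SEX', 'value_d': 0.0, 'value_s': 'M', 'sid': 6},
    {'attribute': 'SEX', 'value_d': 0.0, 'value_s': 'M', 'sid': 17},
    {'attribute': 'SEX', 'value_d': 0.0, 'value_s': 'M', 'sid': 42},
]

new_set = match_by_sid(systolic_sex, sex)
```

Building the set of sids first keeps each membership test cheap, which matters more as the lists grow.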
I ended up using the following (as per @chtohnicdaemon):
import pandas
#-----> code snipped here
#----->
# iterate over record sets returned by SQLAlchemy to populate list
for result in query_right:
    data = {'sid': result.patient_sid,
            'value_s': result.string_value,
            'value_d': result.double_value,
            'attribute': result.attribute_value}
    result_right.append(data)
for result in left_child:
    data = {'sid': result.patient_sid,
            'value_s': result.string_value,
            'value_d': result.double_value,
            'attribute': result.attribute_value}
    result_left.append(data)
# convert list of dictionaries to data frames
right = pandas.DataFrame(result_right)
left = pandas.DataFrame(result_left)
# get matches
matches_right = right[right.sid.isin(left.sid)]
matches_left = left[left.sid.isin(right.sid)]
# combine matched sets into single set
frames = [matches_right, matches_left]
# concatenate data, drop duplicates and convert back to a list of dictionaries
result = pandas.concat(frames).drop_duplicates().to_dict(orient='records')
Worked like a charm!
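For anyone who would rather avoid the pandas dependency, roughly the same two-way match can be sketched in plain Python. This is an approximation under assumptions, not a drop-in replacement: the function name two_way_matches and the sample data are invented for illustration, and it assumes every value in the dictionaries is hashable.

```python
def two_way_matches(left, right):
    """Collect records from both lists whose 'sid' occurs in both, without duplicates."""
    common = {d['sid'] for d in left} & {d['sid'] for d in right}
    seen = set()
    result = []
    # Right-hand records first, mirroring the order of the concatenated frames.
    for d in right + left:
        if d['sid'] not in common:
            continue
        # A dict is not hashable, so use its sorted items as a duplicate key.
        key = tuple(sorted(d.items()))
        if key in seen:
            continue
        seen.add(key)
        result.append(d)
    return result


# Small made-up sample in the same shape as the question's data.
right_records = [
    {'attribute': 'bp', 'value_d': 133.0, 'value_s': '133', 'sid': 6},
    {'attribute': 'bp', 'value_d': 121.0, 'value_s': '121', 'sid': 18},
    {'attribute': 'SEX', 'value_d': 0.0, 'value_s': 'M', 'sid': 6},
]
left_records = [
    {'attribute': 'SEX', 'value_d': 0.0, 'value_s': 'M', 'sid': 6},
    {'attribute': 'SEX', 'value_d': 0.0, 'value_s': 'M', 'sid': 42},
]

result = two_way_matches(left_records, right_records)
```

Records for sid 18 and sid 42 are dropped because each appears on one side only, and the SEX record for sid 6 is kept once even though it is present in both lists.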