I'm trying to create a list of dictionaries where each dictionary key is a job and each value is a list of abilities associated with that job.
Ex:
[{'clerk': ['math ability','writing ability',...etc]},{'salesman':['charisma','writing ability','etc']}]
This is the data that I'm working with:
O*NET-SOC Code | Element ID | Element Name | Scale ID | Data Value | N | Standard Error | Lower CI Bound | Upper CI Bound | Recommend Suppress | Not Relevant | Date | Domain Source
11-1011.00 | 1.A.1.a.1 | Oral Comprehension | IM | 4.5 | 8 | 0.19 | 4.13 | 4.87 | N | n/a | Jun-06 | Analyst
11-1011.00 | 1.A.1.a.1 | Oral Comprehension | LV | 4.75 | 8 | 0.25 | 4.26 | 5.24 | N | N | Jun-06 | Analyst
11-1011.00 | 1.A.1.a.2 | Written Comprehension | IM | 4.38 | 8 | 0.18 | 4.02 | 4.73 | N | n/a | Jun-06 | Analyst
And this is what I've done so far:
First I create a list of dictionaries, each representing a row in the data above, with keys equal to the column names and values equal to the column values. Sample:
OrderedDict([('Domain Source', 'Analyst'), ('Recommend Suppress', 'N'), ('Standard Error', '0.19'), ('Element ID', '1.A.1.a.1'), ('N', '8'), ('Scale ID', 'IM'), ('Not Relevant', 'n/a'), ('Element Name', 'Oral Comprehension'), ('Lower CI Bound', '4.13'), ('Date', '06/2006'), ('Data Value', '4.50'), ('Upper CI Bound', '4.87'), ('O*NET-SOC Code', '11-1011.00')]), OrderedDict([('Domain Source', 'Analyst'), ('Recommend Suppress', 'N'), ('Standard Error', '0.25'), ('Element ID', '1.A.1.a.1'), ('N', '8'), ('Scale ID', 'LV'), ('Not Relevant', 'N'), ('Element Name', 'Oral Comprehension'), ('Lower CI Bound', '4.26'), ('Date', '06/2006'), ('Data Value', '4.75'), ('Upper CI Bound', '5.24'), ('O*NET-SOC Code', '11-1011.00')]), OrderedDict([('Domain Source', 'Analyst'), ('Recommend Suppress', 'N'), ('Standard Error', '0.18'), ('Element ID', '1.A.1.a.2'), ('N', '8'), ('Scale ID', 'IM'), ('Not Relevant', 'n/a'), ('Element Name', 'Written Comprehension'), ('Lower CI Bound', '4.02'), ('Date', '06/2006'), ('Data Value', '4.38'), ('Upper CI Bound', '4.73'), ('O*NET-SOC Code', '11-1011.00')]), OrderedDict([('Domain Source', 'Analyst'), ('Recommend Suppress', 'N'), ('Standard Error', '0.32'), ('Element ID', '1.A.1.a.2'), ('N', '8'), ('Scale ID', 'LV'),
And then I try to merge the dictionaries into fewer dictionaries, where each key is a job code and each value is a list of abilities associated with that job.
def add_abilites(abilities_m_l):
    jobs_list = []
    for ind, dict in enumerate(abilities_m_l):
        activities_list = []
        if abilities_m_l[ind-1]['O*NET-SOC Code'] == abilities_m_l[ind]['O*NET-SOC Code']:
            if abilities_m_l[ind]['Element Name'] != abilities_m_l[ind-1]['Element Name']:
                activities_list.append(abilities_m_l[ind]['Element Name'])
            else: pass
        else: jobs_list.append({abilities_m_l[ind]['O*NET-SOC Code']: activities_list})
    return jobs_list

a_l_with_abilities = add_abilites(abilities_m_l)
print a_l_with_abilities
I get the following output:
[{'11-1011.00': []}, {'11-1021.00': []}, {'11-2011.00': []}, {'11-2021.00': []}, {'11-2022.00': []}, {'11-2031.00': []}, {'11-3011.00': []}, {'11-3021.00': []}, {'11-3031.01': []}, {'11-3031.02': []}, {'11-3051.00': []}, {'11-3051.01': []}, {'11-3051.02': []}, {'11-3051.04': []}, {'11-3061.00': []}, {'11-3071.01': []}, {'11-3071.02': []}, {'11-3071.03': []}, {'11-3111.00': []}, {'11-3121.00': []}, {'11-3131.00': []}, {'11-9013.01': []}, {'11-9013.03': []}, {'11-9021.00': []}, {'11-9031.00': []}, {'11-9032.00': []}, {'11-9033.00': []}, {'11-9041.00': []}, {'11-.....
In other words, my lists aren't being filled.
The core problem is that you're reassigning activities_list to the empty list for each dictionary in your abilities_m_l. So when you detect a changed 'O*NET-SOC Code' value, you append the empty list you just reassigned.
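To see the effect in isolation, here is a minimal sketch in Python 3. The three sample rows and the names rows and collect_reset_each_row are made up for illustration; the loop only mimics the structure of the code in the question.

```python
# Hypothetical sample rows, reduced to the two columns the loop looks at.
rows = [
    {"O*NET-SOC Code": "11-1011.00", "Element Name": "Oral Comprehension"},
    {"O*NET-SOC Code": "11-1011.00", "Element Name": "Written Comprehension"},
    {"O*NET-SOC Code": "11-1021.00", "Element Name": "Oral Comprehension"},
]


def collect_reset_each_row(rows):
    """Mimic the question's loop: the list is re-created on every iteration."""
    result = []
    for ind, row in enumerate(rows):
        activities = []  # reset for every row, so earlier appends are lost
        if rows[ind - 1]["O*NET-SOC Code"] == row["O*NET-SOC Code"]:
            if row["Element Name"] != rows[ind - 1]["Element Name"]:
                activities.append(row["Element Name"])  # thrown away next iteration
        else:
            # The code changed, but `activities` was emptied a few lines above.
            result.append({row["O*NET-SOC Code"]: activities})
    return result


buggy = collect_reset_each_row(rows)
print(buggy)  # [{'11-1011.00': []}, {'11-1021.00': []}]
```

Every dictionary that gets appended holds a list that was created in the same iteration, which is why all the lists come out empty.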
Here's a cleaner way to do this:
from collections import OrderedDict

def add_abilities(abilities_m_l):
    jobs_dict = OrderedDict()
    for data_dict in abilities_m_l:
        o_code = data_dict['O*NET-SOC Code']
        activity = data_dict['Element Name']
        # One inner OrderedDict per job code, used as an ordered set of activities
        activities_so_far = jobs_dict.setdefault(o_code, OrderedDict())
        activities_so_far[activity] = True
    return [{o_code: activities.keys()} for o_code, activities in jobs_dict.iteritems()]
Or if you're on Python 3, where the keys, values and items calls return view objects rather than lists, replace the last line with:
    return [{o_code: list(activities.keys())} for o_code, activities in jobs_dict.items()]
Or, if you don't need the order of the activities preserved, use a set for the activities. That's preferable, but Python unfortunately does not have a native OrderedSet, so I approximated it above with an OrderedDict containing True for the activities found for a code.
from collections import OrderedDict

def add_abilities(abilities_m_l):
    jobs_dict = OrderedDict()
    for data_dict in abilities_m_l:
        o_code = data_dict['O*NET-SOC Code']
        activity = data_dict['Element Name']
        # Pass a new set instance (set()), not the set type itself
        activities_so_far = jobs_dict.setdefault(o_code, set())
        activities_so_far.add(activity)
    # On Python 3, use jobs_dict.items() instead of iteritems()
    return [{o_code: list(activities)} for o_code, activities in jobs_dict.iteritems()]
The point is to let Python's dictionaries gather the information about the shared keys, and to maintain uniqueness of the activities for each code.
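As a rough Python 3 sketch of the same idea: on Python 3.7 and later, plain dicts keep insertion order, so they can stand in for OrderedDict here. The function name group_abilities and the sample rows below are assumptions made for this example, not part of the original data.

```python
def group_abilities(rows):
    """Group unique 'Element Name' values by 'O*NET-SOC Code', keeping first-seen order."""
    jobs = {}  # code -> dict used as an ordered set of activity names
    for row in rows:
        code = row["O*NET-SOC Code"]
        activity = row["Element Name"]
        # setdefault creates the inner dict once per code; re-adding a key is a no-op
        jobs.setdefault(code, {})[activity] = True
    return [{code: list(activities)} for code, activities in jobs.items()]


# Hypothetical rows: the IM and LV scales repeat the same element name.
sample_rows = [
    {"O*NET-SOC Code": "11-1011.00", "Element Name": "Oral Comprehension"},
    {"O*NET-SOC Code": "11-1011.00", "Element Name": "Oral Comprehension"},
    {"O*NET-SOC Code": "11-1011.00", "Element Name": "Written Comprehension"},
    {"O*NET-SOC Code": "11-1021.00", "Element Name": "Oral Comprehension"},
]

grouped = group_abilities(sample_rows)
print(grouped)
# [{'11-1011.00': ['Oral Comprehension', 'Written Comprehension']},
#  {'11-1021.00': ['Oral Comprehension']}]
```

Unlike the loop in the question, this does not depend on rows for the same code being adjacent, since the dictionary lookup finds the right entry regardless of row order.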