
Not appending to list

I'm trying to create a list of dictionaries where each dictionary key is a job and each value is a list of abilities associated with that job.

Ex:

[{'clerk': ['math ability','writing ability',...etc]},{'salesman':['charisma','writing ability','etc']}]

This is the data that I'm working with:

O*NET-SOC Code  Element ID  Element Name    Scale ID    Data Value  N   Standard Error  Lower CI Bound  Upper CI Bound  Recommend Suppress  Not Relevant    Date    Domain Source
11-1011.00  1.A.1.a.1   Oral Comprehension  IM  4.5 8   0.19    4.13    4.87    N   n/a Jun-06  Analyst
11-1011.00  1.A.1.a.1   Oral Comprehension  LV  4.75    8   0.25    4.26    5.24    N   N   Jun-06  Analyst
11-1011.00  1.A.1.a.2   Written Comprehension   IM  4.38    8   0.18    4.02    4.73    N   n/a Jun-06  Analyst

And this is what I've done so far:

First I create a list of dictionaries, each representing a row in the data above, with keys equal to the column names and values equal to the column values. Sample:

OrderedDict([('Domain Source', 'Analyst'), ('Recommend Suppress', 'N'), ('Standard Error', '0.19'), ('Element ID', '1.A.1.a.1'), ('N', '8'), ('Scale ID', 'IM'), ('Not Relevant', 'n/a'), ('Element Name', 'Oral Comprehension'), ('Lower CI Bound', '4.13'), ('Date', '06/2006'), ('Data Value', '4.50'), ('Upper CI Bound', '4.87'), ('O*NET-SOC Code', '11-1011.00')]), OrderedDict([('Domain Source', 'Analyst'), ('Recommend Suppress', 'N'), ('Standard Error', '0.25'), ('Element ID', '1.A.1.a.1'), ('N', '8'), ('Scale ID', 'LV'), ('Not Relevant', 'N'), ('Element Name', 'Oral Comprehension'), ('Lower CI Bound', '4.26'), ('Date', '06/2006'), ('Data Value', '4.75'), ('Upper CI Bound', '5.24'), ('O*NET-SOC Code', '11-1011.00')]), OrderedDict([('Domain Source', 'Analyst'), ('Recommend Suppress', 'N'), ('Standard Error', '0.18'), ('Element ID', '1.A.1.a.2'), ('N', '8'), ('Scale ID', 'IM'), ('Not Relevant', 'n/a'), ('Element Name', 'Written Comprehension'), ('Lower CI Bound', '4.02'), ('Date', '06/2006'), ('Data Value', '4.38'), ('Upper CI Bound', '4.73'), ('O*NET-SOC Code', '11-1011.00')]), OrderedDict([('Domain Source', 'Analyst'), ('Recommend Suppress', 'N'), ('Standard Error', '0.32'), ('Element ID', '1.A.1.a.2'), ('N', '8'), ('Scale ID', 'LV'),

And then I try to merge the dictionaries into fewer dictionaries where each key is job code and each value is a list of abilities associated with that job.

def add_abilites(abilities_m_l):
    jobs_list = []
    for ind, dict in enumerate(abilites_m_l):
        activities_list = []
        if abilities_m_l[ind-1]['O*NET-SOC Code'] == abilities_m_l[ind]['O*NET-SOC Code']: 
            if abilities_m_l[ind]['Element Name'] != abilities_m_l[ind-1]['Element Name']:
                activities_list.append(abilities_m_l[ind]['Element Name'])
            else: pass
        else: jobs_list.append({abilities_m_l[ind]['O*NET-SOC Code']:activities_list})
    return jobs_list
a_l_with_abilities = add_abilites(abilities_m_l)
print a_l_with_abilities

I get the following output:

[{'11-1011.00': []}, {'11-1021.00': []}, {'11-2011.00': []}, {'11-2021.00': []}, {'11-2022.00': []}, {'11-2031.00': []}, {'11-3011.00': []}, {'11-3021.00': []}, {'11-3031.01': []}, {'11-3031.02': []}, {'11-3051.00': []}, {'11-3051.01': []}, {'11-3051.02': []}, {'11-3051.04': []}, {'11-3061.00': []}, {'11-3071.01': []}, {'11-3071.02': []}, {'11-3071.03': []}, {'11-3111.00': []}, {'11-3121.00': []}, {'11-3131.00': []}, {'11-9013.01': []}, {'11-9013.03': []}, {'11-9021.00': []}, {'11-9031.00': []}, {'11-9032.00': []}, {'11-9033.00': []}, {'11-9041.00': []}, {'11-.....

In other words, my lists aren't being filled.

The core problem is that you reassign activities_list to an empty list on every iteration over abilities_m_l. So when you detect a changed 'O*NET-SOC Code' value, you append the empty list you just created, and anything appended on earlier iterations has already been discarded.
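To make that concrete, here is a minimal sketch of how the original loop could be repaired while keeping its row-by-row approach. It assumes the rows are already sorted by job code, as in the sample data. The function name add_abilities_loop and the fourth sample row are made up for illustration, and the code is written for Python 3.

```python
def add_abilities_loop(rows):
    """Group 'Element Name' values by 'O*NET-SOC Code'.

    Assumes rows with the same code are adjacent (sorted input).
    """
    jobs_list = []
    current_code = None
    activities_list = None
    for row in rows:
        code = row['O*NET-SOC Code']
        if code != current_code:
            # New job code: create the list ONCE and register it right away,
            # so later appends land in the list that jobs_list already holds.
            activities_list = []
            jobs_list.append({code: activities_list})
            current_code = code
        name = row['Element Name']
        if name not in activities_list:
            activities_list.append(name)
    return jobs_list


# Illustrative rows; only the two columns the function reads are included.
rows = [
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Oral Comprehension'},
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Oral Comprehension'},
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Written Comprehension'},
    {'O*NET-SOC Code': '11-1021.00', 'Element Name': 'Oral Comprehension'},  # hypothetical row
]
result = add_abilities_loop(rows)
print(result)
```

The key difference from the original is that the list is reset only when the code changes, not on every row.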

Here's a cleaner way to do this:

from collections import OrderedDict

def add_abilities(abilities_m_l):
    jobs_dict = OrderedDict()
    for data_dict in abilities_m_l:
        o_code = data_dict['O*NET-SOC Code']
        activity = data_dict['Element Name']
        # One inner OrderedDict per job code; its keys act as an ordered set.
        activities_so_far = jobs_dict.setdefault(o_code, OrderedDict())
        activities_so_far[activity] = True
    return [{o_code: activities.keys()} for o_code, activities in jobs_dict.iteritems()]

Or if you're on Python 3, where the keys, values and items calls return view objects rather than lists (and iteritems no longer exists):

    return [{o_code: list(activities.keys())} for o_code, activities in jobs_dict.items()]

Or, if you don't need the order of the activities preserved, use a set for the activities, which is the more natural container here. Python unfortunately has no built-in OrderedSet, which is why I approximated one above with an OrderedDict that maps each activity found for a code to True.

def add_abilities(abilities_m_l):
    jobs_dict = OrderedDict()
    for data_dict in abilities_m_l:
        o_code = data_dict['O*NET-SOC Code']
        activity = data_dict['Element Name']
        # Pass a new set instance, set(), not the set type itself.
        activities_so_far = jobs_dict.setdefault(o_code, set())
        activities_so_far.add(activity)
    # On Python 3, use jobs_dict.items() instead of iteritems().
    return [{o_code: list(activities)} for o_code, activities in jobs_dict.iteritems()]

The point is to let Python's dictionaries gather the information about the shared keys, and to maintain uniqueness of the activities for each code.
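As a usage sketch of that idea on Python 3.7 or later, where plain dicts are guaranteed to keep insertion order, the same grouping can be written without OrderedDict. The name add_abilities_py3 and the sample rows are illustrative assumptions, not part of the original data. The rows are deliberately interleaved to show that, unlike comparing neighbouring rows, this approach does not need the input sorted by code.

```python
def add_abilities_py3(rows):
    jobs = {}
    for row in rows:
        # setdefault returns the inner dict already stored for this code,
        # or stores and returns a new empty one on first sight of the code.
        seen = jobs.setdefault(row['O*NET-SOC Code'], {})
        # Dict keys are unique, so repeated activities collapse to one entry.
        seen[row['Element Name']] = True
    return [{code: list(names)} for code, names in jobs.items()]


# Illustrative, unsorted rows with only the two columns that are read.
sample_rows = [
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Oral Comprehension'},
    {'O*NET-SOC Code': '11-1021.00', 'Element Name': 'Oral Comprehension'},  # hypothetical row
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Written Comprehension'},
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Oral Comprehension'},
]
grouped = add_abilities_py3(sample_rows)
print(grouped)
```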

