Getting a nested dictionary from an Excel table in Python

I have Excel data like this:

    Dep request date Code   Reason
    P41 15.02.2018  0060    Data Incomplete
    P41 02.02.2018  0060    Data Incomplete
    P21 11.01.2018  0060    Data Incomplete
    P41 14.02.2018  0060    Data Incomplete
    P01 13.03.2018  0060    Data Incomplete
    P21 09.02.2018  0030    Typing error -> technical mix-up
    P41 07.02.2018  0030    Typing error -> technical mix-up
    P31 28.02.2018  0030    Typing error -> technical mix-up

and this is my code:

def get_reasons(readfilename):
    act_sheet = read_excelfile(readfilename)
    deps = []
    reasons = []
    item_dict = {}
#    create a list of uppercase letters for A-Z
    col_header = [chr(one).upper() for one in range(97,123)]

    for idx, header in enumerate(col_header):
        head = header + str(1)

        if act_sheet[head].value == 'Dep':
            for j in range(2, act_sheet.max_row+1):
                deps.append(act_sheet[header + str(j)].value)

        if act_sheet[head].value == 'Reason':
            for m in range(2, act_sheet.max_row+1):
                items = act_sheet[header + str(m)].value
                reasons.append(items)           
                item_dict.setdefault(items, {})

                item_dict[items].setdefault('Departments', deps)

    amounts = Counter(reasons) 
    for k,v in amounts.items():
        item_dict[k]['Quantity'] = v

    return item_dict

I am trying to return a dictionary in this format:

{u'Data Incomplete': {'Departments': [P41, P41, P21, P41, P01], 'Quantity': 1},
 u'Typing error -> technical mix-up': {'Departments': [P21, P41, P31], 'Quantity': 1}}

I am struggling to get the correct code, especially the part to get the list of departments. Can somebody help me?

The best way to do this would be to use a database. However, for a one-off task this is pretty easy to do with openpyxl. You should really study the examples in the documentation more closely so that you don't have to write code as verbose as what you currently have; the verbosity makes it difficult to understand exactly what you're trying to do.

The following should help you along.

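from collections import defaultdict

# ws is an openpyxl worksheet, e.g. act_sheet in the question
# Map each header in row 1 to its 1-based column index
headers = {c.value: c.col_idx for c in ws[1]}
# Row tuples from iter_rows() are 0-based, hence the - 1
reason_col = headers['Reason'] - 1
dep_col = headers['Dep'] - 1

# A set keeps each department only once per reason
reasons = defaultdict(set)

for row in ws.iter_rows(min_row=2):
    reason = row[reason_col].value
    dep = row[dep_col].value
    reasons[reason].add(dep)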
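
The snippet above collects departments into a set, so repeated departments are dropped and no 'Quantity' is recorded. If the nested format from the question is what you need, a minimal sketch along the same lines could use a list instead of a set. It works on plain row tuples so it runs without a workbook; with openpyxl the rows could probably come from ws.iter_rows(min_row=2, values_only=True). The function name group_reasons is made up for illustration, and 'Quantity' is assumed to mean the number of rows per reason, as the Counter in the question suggests.

```python
from collections import defaultdict


def group_reasons(header, rows):
    """Group department codes by reason (hypothetical helper).

    header: sequence of column names taken from the first row.
    rows: iterable of row tuples, one per data row.
    """
    dep_col = header.index('Dep')
    reason_col = header.index('Reason')

    # A list (not a set) keeps repeated departments in sheet order
    departments = defaultdict(list)
    for row in rows:
        departments[row[reason_col]].append(row[dep_col])

    # Assumption: 'Quantity' is the number of rows with that reason
    return {
        reason: {'Departments': deps, 'Quantity': len(deps)}
        for reason, deps in departments.items()
    }


# Sample data copied from the table in the question
header = ('Dep', 'request date', 'Code', 'Reason')
rows = [
    ('P41', '15.02.2018', '0060', 'Data Incomplete'),
    ('P41', '02.02.2018', '0060', 'Data Incomplete'),
    ('P21', '11.01.2018', '0060', 'Data Incomplete'),
    ('P41', '14.02.2018', '0060', 'Data Incomplete'),
    ('P01', '13.03.2018', '0060', 'Data Incomplete'),
    ('P21', '09.02.2018', '0030', 'Typing error -> technical mix-up'),
    ('P41', '07.02.2018', '0030', 'Typing error -> technical mix-up'),
    ('P31', '28.02.2018', '0030', 'Typing error -> technical mix-up'),
]

result = group_reasons(header, rows)
```

Each reason gets its own list here, which avoids the problem in the question's code where every reason ends up pointing at the same deps list.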
