How to convert CSV to a more structured dictionary in Python?

Assuming I have the following CSV:

Type   Name        Application  

Vegetable   Lettuce    StoreA
Fruit       Apple      StoreB
Vegetable   Orange     StoreB
Fruit       Pear       StoreC
Dairy       Milk       StoreA
Fruit       Plum       StoreB
Fruit       Plum       StoreA

Is there an easy way in Python to generate a nested dict by "collapsing" certain fields in a given order? For example, specifying "Type", then "Application", then "Name", in that order, would create a dict with only three top-level keys: "Vegetable", "Fruit", and "Dairy".

Vegetable would have only "StoreA" and "StoreB"; Fruit would have "StoreA", "StoreB", and "StoreC" (with no duplicate "StoreB", even though both Apple and Plum are in StoreB).

The deepest level of the dict would hold the item names. What is the best way to accomplish this? Example code is appreciated.

Since this doesn't seem like a problem about parsing CSV, I'm going to assume you can get your data into the following format using csv.DictReader or some other method:

rows = [{'Type': 'Vegetable', 'Name': 'Lettuce', 'Application': 'StoreA'},
        {'Type': 'Fruit', 'Name': 'Apple', 'Application': 'StoreB'},
        {'Type': 'Vegetable', 'Name': 'Orange', 'Application': 'StoreB'},
        {'Type': 'Fruit', 'Name': 'Pear', 'Application': 'StoreC'},
        {'Type': 'Dairy', 'Name': 'Milk', 'Application': 'StoreA'},
        {'Type': 'Fruit', 'Name': 'Plum', 'Application': 'StoreB'},
        {'Type': 'Fruit', 'Name': 'Plum', 'Application': 'StoreA'}]
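For completeness, here is a minimal sketch of building rows with csv.DictReader. It assumes the data is actually comma-delimited with a header line (the sample in the question is shown with whitespace-aligned columns, so the delimiter is a guess), and it reads from an in-memory string rather than a file so that it runs on its own. The name csv_text is only for illustration.

```python
import csv
import io

# Hypothetical comma-delimited version of the sample data. With a real file
# you would use open(path, newline='') instead of io.StringIO.
csv_text = """Type,Name,Application
Vegetable,Lettuce,StoreA
Fruit,Apple,StoreB
Vegetable,Orange,StoreB
Fruit,Pear,StoreC
Dairy,Milk,StoreA
Fruit,Plum,StoreB
Fruit,Plum,StoreA
"""

# DictReader takes field names from the first line and yields one mapping
# per data row; dict() turns each into a plain dictionary.
rows = [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]
```

If the file uses tabs or another separator, passing delimiter='\t' (or the appropriate character) to csv.DictReader should handle it.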

Once you have that, here is one option for creating the nested dictionary you are looking for:

result = {}
for row in rows:
    stores = result.setdefault(row['Type'], {})
    names = stores.setdefault(row['Application'], [])
    names.append(row['Name'])

>>> import pprint
>>> pprint.pprint(result)
{'Dairy': {'StoreA': ['Milk']},
 'Fruit': {'StoreA': ['Plum'],
           'StoreB': ['Apple', 'Plum'],
           'StoreC': ['Pear']},
 'Vegetable': {'StoreA': ['Lettuce'],
               'StoreB': ['Orange']}}

You could of course put the contents of the for loop onto a single line:

for row in rows:
    result.setdefault(row['Type'], {}).setdefault(row['Application'], []).append(row['Name'])
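Since the question asks about specifying the fields in an arbitrary order, the same setdefault idea can be generalized. The following is only a sketch: the function name nest and its keys parameter are made up for this example, and it assumes at least two field names are given, with the last one collected into lists.

```python
def nest(rows, keys):
    """Group rows into nested dicts by keys[:-1]; collect keys[-1] values in lists."""
    *group_keys, leaf_key = keys  # requires at least two field names
    result = {}
    for row in rows:
        level = result
        # Descend through (and create as needed) one dict per grouping key,
        # stopping before the last grouping key.
        for key in group_keys[:-1]:
            level = level.setdefault(row[key], {})
        # The last grouping key maps to the list of leaf values.
        level.setdefault(row[group_keys[-1]], []).append(row[leaf_key])
    return result


# Same sample data as above, rebuilt here so the snippet runs on its own.
fields = ('Type', 'Name', 'Application')
data = [('Vegetable', 'Lettuce', 'StoreA'), ('Fruit', 'Apple', 'StoreB'),
        ('Vegetable', 'Orange', 'StoreB'), ('Fruit', 'Pear', 'StoreC'),
        ('Dairy', 'Milk', 'StoreA'), ('Fruit', 'Plum', 'StoreB'),
        ('Fruit', 'Plum', 'StoreA')]
rows = [dict(zip(fields, values)) for values in data]

by_type = nest(rows, ['Type', 'Application', 'Name'])
by_store = nest(rows, ['Application', 'Type', 'Name'])
```

Here by_type has the same contents as result above, while by_store groups by store first and then by type. If duplicate names within a group should be dropped, a set could be used in place of the list.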
