Python List Comprehension - extracting from nested data

I'm new to Python and am trying to extract some nested data.

Here is the JSON for two products. A product can belong to zero or more categories.

 {  
   "Item":[  
      {   
         "ID":"170",
         "InventoryID":"170",
         "Categories":[  
            {  
               "Category":[  
                  {  
                    "CategoryID":"444",
                    "Priority":"0",
                    "CategoryName":"Paper Mache"
                  },
                  {  
                     "CategoryID":"479",
                     "Priority":"0",
                     "CategoryName":"Paper Mache"
                  },
                  {  
                     "CategoryID":"515",
                     "Priority":"0",
                     "CategoryName":"Paper Mache"
                  }
               ]
            }
         ],
         "Description":"Approximately 9cm wide x 4cm deep.",
         "SKU":"111931"
      },
      {  
         "ID":"174",
         "InventoryID":"174",
         "Categories":[  
            {  
                "Category":{  
                  "CategoryID":"888",
                  "Priority":"0",
                  "CategoryName":"Plaster"
                }
            }
         ],
         "Description":"Plaster Mould - Australian Animals",
         "SKU":"110546"
      }
   ],
   "CurrentTime":"2016-08-22 11:52:27",
   "Ack":"Success"
}

I want to work out which categories a product belongs to.

My extraction code is as follows:

        for x in products: 
            productsInCategory = []
            for y in x['Categories']:
                for z in y['Category']:
                    if z['CategoryID'] == categories[i]['CategoryID']:
                        productsInCategory.append(x)

The issue is that the second item contains only one Category, not an array of categories, so this line

for z in y['Category']:

loops over the properties (keys) of a single Category rather than over an array of Categories, which causes my code to fail.

How can I protect against this? And can this be written more elegantly with list comprehension syntax?

That's a very poor document structure in that case; you shouldn't have to deal with this. If an item can contain multiple values, it should always be a list.

Be that as it may, you can still deal with it in your code by checking if it is a list or not.

for x in products: 
    productsInCategory = []
    for y in x['Categories']:
        category = y['Category']
        if isinstance(category, dict):
            category = [category]
        for z in category:
            ...

(You might want to consider using more descriptive variable names generally; x, y and z are not very helpful for people reading the code.)
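To address the list comprehension part of the question, here is a minimal sketch of the same normalization idea. The helper names (as_list, category_ids, products_in_category) are made up for illustration, and the sample data is a trimmed version of the JSON in the question; it assumes "Category" is always either a dict or a list of dicts.

```python
def as_list(value):
    """Wrap a lone dict in a list so callers can always iterate over it."""
    return [value] if isinstance(value, dict) else value


def category_ids(product):
    """Return the set of CategoryID strings a product belongs to."""
    return {
        category["CategoryID"]
        for group in product.get("Categories", [])
        for category in as_list(group["Category"])
    }


def products_in_category(products, category_id):
    """List comprehension replacing the nested loops from the question."""
    return [p for p in products if category_id in category_ids(p)]


# Trimmed sample data: first product has a list, second a single dict.
products = [
    {"ID": "170", "Categories": [{"Category": [
        {"CategoryID": "444"}, {"CategoryID": "479"}, {"CategoryID": "515"}]}]},
    {"ID": "174", "Categories": [{"Category": {"CategoryID": "888"}}]},
]

matches = products_in_category(products, "888")
print([p["ID"] for p in matches])  # prints ['174']
```

Keeping the dict-or-list check in one small helper means the comprehension itself stays readable, and a product with no "Categories" key simply yields an empty set.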

I've run into this issue frequently before in JSON structures...frequently enough that I wrote a small library for it a few weeks ago...

nested key retriever (nkr)

Try the generator and see if it solves your problem. You should be able to simply do:

for x in products: 
    if product_id_searching_for in list(nkr.find_nested_key_values(x, 'CategoryID')):
        productsInCategory.append(x)
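If you would rather not add a dependency, the general idea of a nested key search can be sketched with a short recursive generator. This is not the nkr implementation, only an illustration of the technique; find_key_values is a hypothetical name, and it assumes the data contains nothing but dicts, lists and scalar values, as parsed JSON does.

```python
def find_key_values(node, key):
    """Yield every value stored under `key`, at any depth of dicts and lists."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            # Keep descending: the value itself may contain further matches.
            yield from find_key_values(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key_values(item, key)
    # Scalars (str, int, None, ...) contain no keys, so nothing is yielded.


# Works whether "Category" holds one dict or a list of dicts.
single = {"Categories": [{"Category": {"CategoryID": "888"}}]}
several = {"Categories": [{"Category": [
    {"CategoryID": "444"}, {"CategoryID": "479"}]}]}

print(list(find_key_values(single, "CategoryID")))   # prints ['888']
print(list(find_key_values(several, "CategoryID")))  # prints ['444', '479']
```

Because the search ignores the shape of the document, it sidesteps the dict-versus-list problem entirely, at the cost of also matching a "CategoryID" key anywhere else in the product.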
