
Flatten nested JSON inside arrays with Python

I want to convert this:

{} Json
  {} 0
    [] variants
      {} 0
         fileName
         id
         {} mediaType
            baseFilePath
            id
            name
         sortOrder
      {} 1
         fileName
         id
         {} mediaType
            baseFilePath
            id
            name
         sortOrder

Into this:

{} Json
  {} 0
   [] variants
     {} 0
         fileName
         id
         mediaType_baseFilePath
         mediaType_id
         mediaType_name
         sortOrder
     {} 1
         fileName
         id
         mediaType_baseFilePath
         mediaType_id
         mediaType_name
         sortOrder

Basically, each nested pair of objects

 {}
   {}

should be merged into one, but the list indices (row numbers) should not become part of the key names.

This is the code I wrote:

def flatten_json(y):
    out = {}
    def flatten(x, name=''):
        if type(x) is dict:
            print type(x), name
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            print type(x), name
            out[name[:-1]] = x
        else:
            out[name[:-1]] = x
    flatten(y)
    return out

def generatejson(response2):
    # response2 is [(first data set), (second data set)]; convert it to a dictionary {0: (first data set), 1: (second data set)}
    sample_object = {i: data for i, data in enumerate(response2)}
    # flatten each data set (merge nested sub-objects)
    flat = {k: flatten_json(v) for k, v in sample_object.items()}
    return json.dumps(flat, sort_keys=True)

This is the result of the code on my sample data: [image not included]

As you can see manufacturer was merged but mediaType was not. The code prints:

<type 'dict'> 
<type 'list'> additionalLocaleInfos_
<type 'list'> variants_
<type 'dict'> manufacturer_

My aim was for a list to be investigated further in the recursion. The code is supposed to detect that inside the variants list there is also a mediaType dict, but it doesn't.

Data sample for generatejson(response2) - is a list of this structure:

[{"additionalLocaleInfos": [], "approved": false, "approvedBy": null, "approvedOn": null, "catalogId": 4, "code": "611", 
"createdOn": "2018-03-24 09:39", "customsCode": null, "deletedOn": null, "id": 1, "invariantName": "Leisure Ali Baba Trousers", "isPermanent": false, "locale": null, "madeIn": null, 
"manufacturer": {"createdOn": "2018-02-23 18:20", "deletedOn": null, "id": 1, "invariantName": "Unknown", "updatedOn": "2018-02-23 18:20"},
 "onNoStockShowComingSoon": false, "season": "", "updatedOn": "2018-03-24 09:39",
 "variants": [{"assets": [{"fileName": "mu/2016/05/16/leisure-ali-baba-trousers-32956-0.jpg", "id": 1, 
 "mediaType": {"baseFilePath": "Catalog", "id": 7, "name": "Product Main Image"}, "sortOrder": 0}]} ]}]

The full example can be found here (not required for the question): http://www.filedropper.com/file_389

How can I make it look inside the list to check whether it contains more objects?

This code works only when there are no arrays. For some reason it doesn't look inside an array to see what objects it contains.

Something like this will flatten a dict structure containing dicts, lists and tuples into a flat dict.

The json_data blob is an excerpt from the data you posted.

import json
import collections

json_data = """
{"additionalLocaleInfos":[],"approved":false,"approvedBy":null,"approvedOn":null,"catalogId":4,"code":"611","createdOn":"2018-03-24 09:39","customsCode":null,"deletedOn":null,"id":1,"invariantName":"Leisure Ali Baba Trousers","isPermanent":false,"locale":null,"madeIn":null,"manufacturer":{"createdOn":"2018-02-23 18:20","deletedOn":null,"id":1,"invariantName":"Unknown","updatedOn":"2018-02-23 18:20"},"onNoStockShowComingSoon":false,"season":"","updatedOn":"2018-03-24 09:39","variants":[{"assets":[{"fileName":"mu/2016/05/16/leisure-ali-baba-trousers-32956-0.jpg","id":1,"mediaType":{"baseFilePath":"Catalog","id":7,"name":"Product Main Image"},"sortOrder":0},{"fileName":"080113/3638.jpg","id":2,"mediaType":{"baseFilePath":"Catalog","id":8,"name":"Product Additional Image"},"sortOrder":0},{"fileName":"mu/2016/05/16/leisure-ali-baba-trousers-32956-1.jpg","id":3,"mediaType":{"baseFilePath":"Catalog","id":8,"name":"Product Additional Image"},"sortOrder":0},{"fileName":"mu/2015/07/21/leisure-ali-baba-trousers-13730-0.jpg","id":4,"mediaType":{"baseFilePath":"Catalog","id":8,"name":"Product Additional Image"},"sortOrder":0},{"fileName":"mu/2016/05/16/leisure-ali-baba-trousers-32956-2.jpg","id":5,"mediaType":{"baseFilePath":"Catalog","id":8,"name":"Product Additional Image"},"sortOrder":0},{"fileName":"mu/2015/07/29/leisure-ali-baba-trousers-13853-0.jpg","id":6,"mediaType":{"baseFilePath":"Catalog","id":8,"name":"Product Additional Image"},"sortOrder":0}],"attributes":[{"attribute":{"code":"COL","cultureNeutralName":"Color","id":1,"useAsFilter":false},"code":"BLACK","groupId":0,"id":3,"invariantValue":"BLACK","locale":null,"sortOrder":0,"valueLocale":null},{"attribute":{"code":"SZ","cultureNeutralName":"Size","id":2,"useAsFilter":false},"code":"ONE SIZE","groupId":0,"id":7,"invariantValue":"ONE SIZE","locale":null,"sortOrder":0,"valueLocale":null},{"attribute":{"code":"WEIGHT","cultureNeutralName":"WEIGHT","id":14,"useAsFilter":false},"code":"0.30","groupId":0,"id":2,"invariantValue":"0.30","locale":null,"sortOrder":0,"valueLocale":null},{"attribute":{"code":"STLPTND","cultureNeutralName":"OsStyleOptionId","id":25,"useAsFilter":false},"code":"2","groupId":0,"id":6,"invariantValue":"2","locale":null,"sortOrder":0,"valueLocale":null},{"attribute":{"code":"STLNMBR","cultureNeutralName":"OsStyleNumber","id":26,"useAsFilter":false},"code":"611-1412","groupId":0,"id":1,"invariantValue":"611-1412","locale":null,"sortOrder":0,"valueLocale":null},{"attribute":{"code":"SZFCTEN","cultureNeutralName":"SizeFacetEn","id":35,"useAsFilter":true},"code":"S","groupId":0,"id":8,"invariantValue":"S","locale":null,"sortOrder":0,"valueLocale":null},{"attribute":{"code":"SZFCTEN","cultureNeutralName":"SizeFacetEn","id":35,"useAsFilter":true},"code":"M","groupId":0,"id":9,"invariantValue":"M","locale":null,"sortOrder":0,"valueLocale":null},{"attribute":{"code":"SZFCTEN","cultureNeutralName":"SizeFacetEn","id":35,"useAsFilter":true},"code":"L","groupId":0,"id":10,"invariantValue":"L","locale":null,"sortOrder":0,"valueLocale":null}],"cost":0,"createdOn":"2018-03-24 09:39","deletedOn":null,"eaN1":"2500002822528","eaN2":null,"eaN3":null,"id":1,"isDefault":false,"locale":null,"sku":"611-1412-28","sortOrder":0,"upC1":null,"upC2":null,"upC3":null,"updatedOn":"2018-03-24 09:39","variantInventories":[{"defectiveQty":0,"id":1,"lastUpdate":"2018-03-24 09:39","orderLevelQty":0,"preorderQty":0,"qtyInStock":0,"reorderQty":0,"reservedQty":100,"transferredQty":0,"variantId":1,"warehouseId":1}],"variantPrices":[{"id":1,"price":5,"priceListId":1,"priceType":{"code":"Base price","id":1,"remarks":null},"validFrom":"2018-03-24 09:39","validUntil":"2068-03-24 09:39","variantId":1}]}]}
""".strip()

data = json.loads(json_data)

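def flatten_object(d, out=None, name_path=()):
    # Create the result dict only on the outermost call. Testing "out or ..."
    # would replace an empty dict passed down by a parent call and lose
    # the values collected in the nested call.
    if out is None:
        out = collections.OrderedDict()
    # Dicts are walked by key, lists and tuples by position.
    iterator = (d.items() if isinstance(d, dict) else enumerate(d))
    for index, value in iterator:
        i_path = name_path + (index,)
        if isinstance(value, (list, dict, tuple)):
            flatten_object(value, out, i_path)
        else:
            out[i_path] = value
    return out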

for key, value in flatten_object(data).items():
    print('_'.join(str(atom) for atom in key), value)

The output here will be something like

approved False
approvedBy None
approvedOn None
[...]
variants_0_cost 0
variants_0_createdOn 2018-03-24 09:39
variants_0_deletedOn None
variants_0_eaN1 2500002822528
variants_0_eaN2 None
variants_0_eaN3 None
variants_0_assets_0_fileName mu/2016/05/16/leisure-ali-baba-trousers-32956-0.jpg
variants_0_assets_0_id 1
variants_0_assets_0_mediaType_baseFilePath Catalog
variants_0_assets_0_mediaType_id 7
variants_0_assets_0_mediaType_name Product Main Image
variants_0_assets_0_sortOrder 0
variants_0_assets_1_fileName 080113/3638.jpg
variants_0_assets_1_id 2
variants_0_assets_1_mediaType_baseFilePath Catalog
variants_0_assets_1_mediaType_id 8
variants_0_assets_1_mediaType_name Product Additional Image
variants_0_assets_1_sortOrder 0
variants_0_assets_2_fileName mu/2016/05/16/leisure-ali-baba-trousers-32956-1.jpg
[...]
variants_0_attributes_0_attribute_code COL
variants_0_attributes_0_attribute_cultureNeutralName Color
variants_0_attributes_0_attribute_id 1
variants_0_attributes_0_attribute_useAsFilter False
variants_0_attributes_0_code BLACK
variants_0_attributes_0_groupId 0
variants_0_attributes_0_id 3
variants_0_attributes_0_invariantValue BLACK
variants_0_attributes_0_locale None
variants_0_attributes_0_sortOrder 0
variants_0_attributes_0_valueLocale None
variants_0_attributes_1_attribute_code SZ
variants_0_attributes_1_attribute_cultureNeutralName Size
variants_0_attributes_1_attribute_id 2
variants_0_attributes_1_attribute_useAsFilter False
variants_0_attributes_1_code ONE SIZE
variants_0_attributes_1_groupId 0
variants_0_attributes_1_id 7
variants_0_attributes_1_invariantValue ONE SIZE
variants_0_attributes_1_locale None
variants_0_attributes_1_sortOrder 0
variants_0_attributes_1_valueLocale None
variants_0_attributes_2_attribute_code WEIGHT
variants_0_attributes_2_attribute_cultureNeutralName WEIGHT
variants_0_attributes_2_attribute_id 14
variants_0_attributes_2_attribute_useAsFilter False
variants_0_attributes_2_code 0.30
variants_0_attributes_2_groupId 0
[...]

but you'll probably only want to run this on a single object within variants, or on a list of attributes.

variant = data['variants'][0]
merged_flattened_assets = dict()
for asset in variant['assets']:
    merged_flattened_assets.update({
        '_'.join(key): value
        for (key, value)
        in flatten_object(asset).items()
    })

for key, value in merged_flattened_assets.items():
    print(key, value)

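outputs the following. Every asset has the same keys, so each update() call overwrites the values of the previous asset and only the last asset remains: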

fileName mu/2015/07/29/leisure-ali-baba-trousers-13853-0.jpg
id 6
mediaType_baseFilePath Catalog
mediaType_id 8
mediaType_name Product Additional Image
sortOrder 0
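
If the goal is the shape shown in the question (nested dicts merged into their parent, but lists kept as lists without indices in the key names), a variant along these lines might be closer. This is only a sketch: flatten_keep_lists and its sep parameter are names made up for illustration, the sample dict is a trimmed excerpt of the posted data, and it assumes all dict keys are strings, as they are after json.loads.

```python
import json

def flatten_keep_lists(value, sep="_"):
    """Merge nested dicts into their parent with joined keys; keep lists as lists."""
    if isinstance(value, list):
        # Lists stay lists; each element is flattened on its own.
        return [flatten_keep_lists(item, sep) for item in value]
    if not isinstance(value, dict):
        # Scalars (str, int, bool, None) pass through unchanged.
        return value
    out = {}
    for key, child in value.items():
        flat_child = flatten_keep_lists(child, sep)
        if isinstance(flat_child, dict):
            # A nested dict is merged into the parent under prefixed keys.
            for sub_key, sub_value in flat_child.items():
                out[key + sep + sub_key] = sub_value
        else:
            out[key] = flat_child
    return out

# Trimmed excerpt of the posted data, used only for illustration.
sample = {
    "id": 1,
    "manufacturer": {"id": 1, "invariantName": "Unknown"},
    "variants": [
        {"assets": [
            {"fileName": "mu/2016/05/16/leisure-ali-baba-trousers-32956-0.jpg",
             "id": 1,
             "mediaType": {"baseFilePath": "Catalog", "id": 7,
                           "name": "Product Main Image"},
             "sortOrder": 0},
        ]},
    ],
}

flat = flatten_keep_lists(sample)
print(json.dumps(flat, indent=2, sort_keys=True))
```

With this approach manufacturer becomes manufacturer_id and manufacturer_invariantName at the top level, while each entry of assets stays a separate dict with mediaType_baseFilePath, mediaType_id and mediaType_name next to fileName, id and sortOrder. One caveat of this sketch is that an empty nested dict contributes no keys, so it disappears from the result.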
