Python - Method needed to flatten highly nested JSON into dotted key paths, e.g. "class.properties.name.properties.firstname"

I need to take a highly nested JSON file (e.g. an Elasticsearch mapping for an index) and produce a list of dotted key paths, one per leaf field.
Example Elasticsearch mapping:

{
    "mappings": {
        "properties": {
            "class": {
                "properties": {
                    "name": {
                        "properties": {
                            "firstname": {
                                "type": "text"
                            },
                            "lastname": {
                                "type": "text"
                            }
                        }
                    },
                    "age": {
                        "type": "text"
                    }
                }
            }
        }
    }
}

Example Desired Result:

["mappings.properties.class.properties.name.properties.firstname",
 "mappings.properties.class.properties.name.properties.lastname",
 "mappings.properties.class.properties.age"]

pandas.json_normalize() doesn't quite do what I want, and neither does glom().

You should be able to do this with a fairly short recursive generator. I'm assuming you want to collect keys along each branch until you reach a dict that contains a 'type' key:

d = {
    "mappings": {
        "properties": {
            "class": {
                "properties": {
                    "name": {
                        "properties": {
                            "firstname": {
                                "type": "text"
                            },
                            "lastname": {
                                "type": "text"
                            }
                        }
                    },
                    "age": {
                        "type": "text"
                    }
                }
            }
        }
    }
}

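def all_keys(d, path=None):
    """Yield the dotted path to every leaf field in a nested mapping."""
    if path is None:
        path = []
    # Stop at non-dict values and at field definitions (dicts with a 'type' key).
    if not isinstance(d, dict) or 'type' in d:
        yield '.'.join(path)
        return
    for k, v in d.items():
        # Build a new list per branch so sibling branches do not share state.
        yield from all_keys(v, path + [k])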

list(all_keys(d))

Which gives:

['mappings.properties.class.properties.name.properties.firstname',
 'mappings.properties.class.properties.name.properties.lastname',
 'mappings.properties.class.properties.age']
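
One edge case worth noting: the `'type' in d` check also fires when a field is itself named "type", because the enclosing "properties" dict then has a "type" key and the walk stops one level too early. The following is a minimal sketch of one possible workaround, assuming that a real field definition always stores its type as a plain string. The name `leaf_paths` and the sample `mapping` are hypothetical and not part of the original answer.

```python
def leaf_paths(node, path=()):
    """Yield dotted paths to leaf field definitions in a mapping-like dict."""
    # Assumption: a field definition has a string under "type"; a dict under
    # "type" means there is a field *named* "type", so keep descending.
    if not isinstance(node, dict) or isinstance(node.get("type"), str):
        yield ".".join(path)
        return
    for key, value in node.items():
        # Tuples are immutable, so each branch gets its own path.
        yield from leaf_paths(value, path + (key,))


# Hypothetical mapping containing a field that is literally called "type".
mapping = {
    "mappings": {
        "properties": {
            "type": {"type": "keyword"},
            "name": {
                "properties": {
                    "firstname": {"type": "text"},
                }
            },
        }
    }
}

paths = list(leaf_paths(mapping))
print(paths)
```

Like the original generator, this sketch still treats any dict with a string "type" as a leaf, so an Elasticsearch field declared as "type": "nested" alongside its own "properties" would not be descended into.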
