
Convert Pandas Dataframe or csv file to Custom Nested JSON

I have a CSV file loaded into a pandas DataFrame with the following structure:

[image: enter image description here]

I want to convert the data to the following JSON format using Python. I looked at a couple of links, but I got lost in the nested part. The links I checked:

How to convert pandas dataframe to uniquely structured nested json

convert dataframe to nested json

{
  "PHI": 2,
  "firstname": "john",
  "medicalHistory": {
    "allergies": "egg",
    "event": {
      "inPatient": {
        "hospitalized": {
          "visit": "7-20-20",
          "noofdays": "5",
          "test": {
            "modality": "xray"
          },
          "vitalSign": {
            "temperature": "32",
            "heartRate": "80"
          },
          "patientcondition": {
            "headache": "1",
            "cough": "0"
          }
        },
        "icu": {
          "visit": "",
          "noofdays": ""
        }
      },
      "outpatient": {
        "visit": "5-20-20",
        "vitalSign": {
          "temperature": "32",
          "heartRate": "80"
        },
        "patientcondition": {
          "headache": "1",
          "cough": "1"
        },
        "test": {
          "modality": "blood"
        }
      }
    }
  }
}

If anyone can help me with the nested array, that would be really helpful.

You need one or more helper functions to unpack the data in a table like this. Write the main helper function to accept two arguments: the df and a schema. The schema is used to unpack each row of the df into a nested structure. The schema below is an example of how to achieve this for a subset of the logic you describe. Although it is not exactly what you specified in your example, it should be enough of a hint for you to complete the rest of the task on your own.

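from operator import itemgetter

# Column names are assumed to match the CSV header.
groupby_idx = ['PHI', 'firstName']
# groupby() has no `drop` argument; iterating the result yields (key, sub-frame) pairs.
groups = df.groupby(groupby_idx)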
schema = {
    "event": {
        "eventType": itemgetter('event'), 
        "visit": itemgetter('visit'),
        "noOfDays": itemgetter('noofdays'),
        "test": {
            "modality": itemgetter('test')
        },
        "vitalSign": {
            "temperature": itemgetter('temperature'),
            "heartRate": itemgetter('heartRate')
        },
        "patientCondition": {
            "headache": itemgetter('headache'),
            "cough": itemgetter('cough')
        }
    }
}

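def unpack(obj, schema):
    # Walk the schema recursively: nested dicts become nested output,
    # callables are applied to the row to fetch a value.
    tmp = {}
    for k, v in schema.items():
        if isinstance(v, dict):
            tmp[k] = unpack(obj, v)
        elif callable(v):
            tmp[k] = v(obj)
    return tmp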

def apply_unpack(groups, schema):
    # Map each group key, e.g. (PHI, firstName), to a list of unpacked rows.
    results = {}
    for gidx, group_df in groups:
        events = []
        for ridx, obj in group_df.iterrows():
            d = unpack(obj, schema)
            events.append(d)
        results[gidx] = events
    return results

unpacked = apply_unpack(groups, schema)
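
One thing to watch for when writing the result out: the dictionary returned by apply_unpack is keyed by tuples such as (2, 'john'), and json.dumps does not accept tuples as keys. A minimal sketch of one way around this is to emit a list of records instead, with the group keys stored as ordinary fields. The sample rows, the to_records and to_builtin helpers, and the "events" field name are all hypothetical and only there for illustration; the real column names come from your CSV header.

```python
import json
from operator import itemgetter

import pandas as pd

# Hypothetical sample rows; replace with pd.read_csv(...) on the real file.
df = pd.DataFrame(
    {
        "PHI": [2, 2],
        "firstName": ["john", "john"],
        "event": ["hospitalized", "outpatient"],
        "visit": ["7-20-20", "5-20-20"],
        "temperature": ["32", "32"],
        "heartRate": ["80", "80"],
    }
)

# A reduced schema: keys are output field names, callables pull values from a row.
schema = {
    "eventType": itemgetter("event"),
    "visit": itemgetter("visit"),
    "vitalSign": {
        "temperature": itemgetter("temperature"),
        "heartRate": itemgetter("heartRate"),
    },
}


def to_builtin(value):
    """Convert NumPy scalars to plain Python values so json can serialize them."""
    return value.item() if hasattr(value, "item") else value


def unpack(row, schema):
    """Build a nested dict from one row by walking the schema recursively."""
    out = {}
    for key, value in schema.items():
        if isinstance(value, dict):
            out[key] = unpack(row, value)
        elif callable(value):
            out[key] = to_builtin(value(row))
    return out


def to_records(df, schema, keys):
    """Return one JSON-ready dict per group of rows sharing the same `keys`."""
    records = []
    for group_key, group in df.groupby(keys):
        # With a list of two or more keys, group_key is a tuple in the same order.
        record = {k: to_builtin(v) for k, v in zip(keys, group_key)}
        record["events"] = [unpack(row, schema) for _, row in group.iterrows()]
        records.append(record)
    return records


records = to_records(df, schema, ["PHI", "firstName"])
print(json.dumps(records, indent=2))
```

From here, reshaping "events" into the inPatient/outpatient layout from the question is a matter of keying each unpacked row by its eventType instead of appending it to a list.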
