{
"_index" : "my-index-000001",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"region" : "USA",
"manager" : {
"age" : 3,
"name" : {
"First_name":[
{"first" : "Joh"},
{"first" : "Lion"}
],
"Last_name" :[
{"last" : "Johm"},
{"last" : "Smihg"}
]
}
}
}
}
I have been trying to do this for a long time. Please help me. Desired output format:
region First_name.first Last_name.last
USA Joh Johm
USA Lion Smihg
You can try:
import pandas as pd
# d is the JSON document from the question, parsed into a dict
df = pd.json_normalize(d['_source']).drop(columns='manager.age').set_index(['region']).apply(pd.Series.explode).reset_index()
df['manager.name.First_name'] = df['manager.name.First_name'].str['first']
df['manager.name.Last_name'] = df['manager.name.Last_name'].str['last']
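For reference, here is a self-contained sketch of the same steps, run on a trimmed copy of the document from the question. The final rename is an addition that is not part of the answer above; it only maps the flattened column names onto the headers requested in the question. It assumes a pandas version where json_normalize is available as pd.json_normalize (1.0 or later).

```python
import pandas as pd

# Trimmed copy of the question's document; only "_source" is needed here.
d = {
    "_source": {
        "region": "USA",
        "manager": {
            "age": 3,
            "name": {
                "First_name": [{"first": "Joh"}, {"first": "Lion"}],
                "Last_name": [{"last": "Johm"}, {"last": "Smihg"}],
            },
        },
    }
}

df = (
    pd.json_normalize(d["_source"])   # one row; the two name columns hold lists
    .drop(columns="manager.age")      # age is not wanted in the result
    .set_index("region")              # keep region out of the explode step
    .apply(pd.Series.explode)         # one row per list element, column by column
    .reset_index()
)

# Each cell is now a dict such as {"first": "Joh"}; .str[key] pulls the value out.
df["manager.name.First_name"] = df["manager.name.First_name"].str["first"]
df["manager.name.Last_name"] = df["manager.name.Last_name"].str["last"]

# Added step (not in the answer): rename to the headers asked for in the question.
df = df.rename(columns={
    "manager.name.First_name": "First_name.first",
    "manager.name.Last_name": "Last_name.last",
})
print(df)
```

Note that exploding column by column only lines up when both lists have the same length, as they do in this sample.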
You could use pd.json_normalize to extract the relevant data, then concatenate the results into a DataFrame:
df = pd.DataFrame()
df = pd.concat([df, pd.json_normalize(data['_source']['manager']['name']['First_name'])], axis=1)
df = pd.concat([df, pd.json_normalize(data['_source']['manager']['name']['Last_name'])], axis=1)
df['region'] = data['_source']['region']
The output df is:
first last region
0 Joh Johm USA
1 Lion Smihg USA
However, the code would be simpler if your raw data had the following format, with one entry per person:
data = {
"_index" : "my-index-000001",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"region" : "USA",
"manager" : {
"age" : 3,
"name" : [
{ "First_name": {"first" : "Joh"},
"Last_name" : {"last" : "Johm"} },
{ "First_name": {"first" : "Lion"},
"Last_name" : {"last" : "Smihg"} }
]
}
}
}
df = pd.json_normalize(data['_source']['manager']['name'])
Output
First_name.first Last_name.last
0 Joh Johm
1 Lion Smihg
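If you cannot change how the data is produced, one possible way to get from the original layout to the row-wise layout is to pair the two lists up in plain Python first. This is only a sketch: to_row_wise is a hypothetical helper name, and it assumes the i-th entry of First_name belongs with the i-th entry of Last_name.

```python
def to_row_wise(name):
    """Pair the i-th First_name entry with the i-th Last_name entry.

    Hypothetical helper. zip() stops at the shorter list, so unequal
    lengths would silently drop the extra entries.
    """
    return [
        {"First_name": first, "Last_name": last}
        for first, last in zip(name["First_name"], name["Last_name"])
    ]


# The "name" object from the original question.
name = {
    "First_name": [{"first": "Joh"}, {"first": "Lion"}],
    "Last_name": [{"last": "Johm"}, {"last": "Smihg"}],
}

rows = to_row_wise(name)
print(rows)
```

The resulting list has the same shape as the "name" list in the reformatted data, so it can be passed to pd.json_normalize in the same way.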