I'm trying to normalize some JSON to flatten it for a SQL table. The problem I've come across is that everything I've read assumes a standard name for nested items, but I am working with unique ids as the keys, and I want to put those ids in as values in my dataframe.
Here's a sample of the JSON:
{
"data": {
"138082239": [
{
"id": 275,
"name": "Sue",
"abbreviation": "SJ",
"active": true,
"changedByUserId": "11710250",
"statusUpdated": "2020-11-23T18:48:28+00:00",
"leadCreated": "2020-11-23T18:48:28+00:00",
"leadModified": "2020-11-23T18:48:29+00:00"
}
],
"138082238": [
{
"id": 276,
"name": "John",
"abbreviation": "JC",
"active": true,
"changedByUserId": "11710250",
"statusUpdated": "2020-11-23T18:48:25+00:00",
"leadCreated": "2020-11-23T18:48:25+00:00",
"leadModified": "2020-11-23T18:48:25+00:00"
}
],
I want to flatten this and add the index title (ex: 138082239) as a value [LeadId] in my dataframe. When I try to use pd.json_normalize() I just get a bunch of columns titled data.138082239, data.138082238, etc.
I'm using requests to pull this JSON from an API.
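import pandas as pd
import requests

# url, payload, and headers are defined elsewhere (not shown here)
r = requests.request("GET", url, data=payload, headers=headers)
j = r.json()
df = pd.json_normalize(j)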
I want the dataframe to look like this:
LeadId id name abbreviation active
138082239 275 Sue SJ TRUE
138082238 276 John JC TRUE
Thanks in advance for your help!
One way, using a list comprehension over the dict items with pandas.DataFrame.assign:
df = pd.concat([pd.DataFrame(v).assign(LeadId=k) for k, v in j["data"].items()])
print(df[["LeadId", "id", "name", "abbreviation", "active"]])
Output:
LeadId id name abbreviation active
0 138082239 275 Sue SJ True
0 138082238 276 John JC True
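Note that every row in the output above has index 0, because each per-lead frame is built separately before being concatenated. If a clean 0..n-1 index is preferred, a possible variation is to build flat records first and create the DataFrame once. This is only a sketch: the sample dict below is trimmed to the fields in the desired output, and the names records, lead_id, and items are illustrative, not part of the original post.

```python
import pandas as pd

# Sample payload mirroring the structure in the question
# (trimmed to two leads and the columns the question asks for).
j = {
    "data": {
        "138082239": [
            {"id": 275, "name": "Sue", "abbreviation": "SJ", "active": True},
        ],
        "138082238": [
            {"id": 276, "name": "John", "abbreviation": "JC", "active": True},
        ],
    }
}

# One flat record per nested item, carrying the outer key along as LeadId.
records = [
    {"LeadId": lead_id, **item}
    for lead_id, items in j["data"].items()
    for item in items
]

# Building the frame in one go yields a fresh RangeIndex (0, 1, ...).
df = pd.DataFrame(records)
print(df)
```

JSON object keys are always strings, so LeadId comes out as a string column here. If the SQL column is numeric, something like df["LeadId"] = df["LeadId"].astype(int) before loading should work, assuming every key is a valid integer.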