Assume that I have a pandas DataFrame called df that looks something like:

source  tables  columns
src1    table1  col1
src1    table1  col2
src1    table2  col1
src2    table1  col1
src2    table1  col2
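To reproduce this, the frame can be built as follows. This is a minimal sketch: the column names are taken from the header above, and the variable name df matches the rest of the post.

```python
import pandas as pd

# Same five rows as the table above.
df = pd.DataFrame(
    {
        "source": ["src1", "src1", "src1", "src2", "src2"],
        "tables": ["table1", "table1", "table2", "table1", "table1"],
        "columns": ["col1", "col2", "col1", "col1", "col2"],
    }
)

# Note: a column literally named "columns" has to be accessed as
# df["columns"], because df.columns is the index of column labels.
print(df)
```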
My current code below can iterate through the list of sources and nest the list of tables within each source as an object:
import json

# Group by source and map each table name to an empty object.
data = [
    {k: v}
    for k, v in df.groupby('source')['tables'].agg(
        lambda x: {v: {} for v in x}).items()
]

with open('data.json', 'w') as f:
    json.dump(data, f, indent=2)
The output I'm receiving with this code is as follows:
[
  {
    "src1": {
      "table1": {},
      "table2": {}
    }
  },
  {
    "src2": {
      "table1": {}
    }
  }
]
My desired output:
[
  {
    "src1": {
      "table1": {
        "col1": {},
        "col2": {}
      },
      "table2": {
        "col1": {}
      }
    }
  },
  {
    "src2": {
      "table1": {
        "col1": {},
        "col2": {}
      }
    }
  }
]
Any assistance in converting my 2-layer nested JSON file to 3 layers as shown above would be greatly appreciated. Thank you in advance.
Since you have multiple levels of grouping here, I'd recommend just using a for loop to iterate over your data.
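import json
from collections import defaultdict

def make_nested(df):
    # A defaultdict whose missing keys are themselves nested defaultdicts.
    f = lambda: defaultdict(f)
    data = f()
    for row in df.to_numpy().tolist():
        t = data
        # Walk down through every key except the last, creating levels as needed.
        for r in row[:-1]:
            t = t[r]
        # The innermost key maps to an empty object.
        t[row[-1]] = {}
    return data

print(json.dumps(make_nested(df), indent=2))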
{
  "src1": {
    "table1": {
      "col1": {},
      "col2": {}
    },
    "table2": {
      "col1": {}
    }
  },
  "src2": {
    "table1": {
      "col1": {},
      "col2": {}
    }
  }
}
This assumes your columns are ordered left to right from the outermost key to the innermost key.
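If you need the exact shape from the question (a list with one single-key object per source), you can wrap each top-level entry afterwards. Below is a minimal sketch, assuming the column names source, tables and columns from the question; make_nested_list and its keys parameter are names made up for illustration. Passing the key order explicitly also means the result no longer depends on how the columns happen to be laid out.

```python
import json
from collections import defaultdict

import pandas as pd


def make_nested_list(df, keys):
    """Nest rows by the columns in `keys` (outermost first) and return
    a list with one single-key dict per outermost value."""
    tree = lambda: defaultdict(tree)
    data = tree()
    # Selecting df[keys] fixes the nesting order regardless of column layout.
    for row in df[keys].to_numpy().tolist():
        node = data
        for key in row[:-1]:
            node = node[key]
        node[row[-1]] = {}
    # Split the top-level dict into one {source: ...} object per source.
    return [{outer: inner} for outer, inner in data.items()]


# Sample frame with the columns deliberately out of order.
df = pd.DataFrame({
    "columns": ["col1", "col2", "col1", "col1", "col2"],
    "source": ["src1", "src1", "src1", "src2", "src2"],
    "tables": ["table1", "table1", "table2", "table1", "table1"],
})

result = make_nested_list(df, ["source", "tables", "columns"])
print(json.dumps(result, indent=2))
```

Because defaultdict is a dict subclass, json.dumps serializes the nested result without any conversion step.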