A very similar question was asked there, and was brilliantly answered by user1609452 in R. Still, that was a specific problem, and I'd like to expand the question. Let's take almost the same table (MyData):
ID  Location  L_size  L_color  Station  S_size  S_color  Category  C_size  C_color
1   Alpha     6       #000000  Zeta     3       #333333  Big       0.63    #306100
2   Alpha     6       #000000  Zeta     3       #333333  Medium    0.43    #458b00
3   Alpha     6       #000000  Zeta     3       #333333  small     0.47    #6aa232
4   Alpha     6       #000000  Yota     3       #4c4c4c  Big       0.85    #306100
5   Alpha     6       #000000  Yota     3       #4c4c4c  Medium    0.19    #458b00
6   Alpha     6       #000000  Yota     3       #4c4c4c  small     0.89    #6aa232
7   Beta      6       #191919  Theta    4       #666666  Big       0.09    #306100
8   Beta      6       #191919  Theta    4       #666666  Medium    0.33    #458b00
9   Beta      6       #191919  Theta    4       #666666  small     0.79    #6aa232
10  Beta      6       #191919  Theta    4       #666666  Big       0.89    #306100
11  Beta      6       #191919  Meta     3       #7f7f7f  Medium    0.71    #458b00
12  Beta      6       #191919  Meta     3       #7f7f7f  small     0.59    #6aa232
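For anyone who wants to reproduce this, the table can be loaded roughly as follows. This is only a sketch: it assumes pandas is installed and that the columns are separated by whitespace, and the variable name `raw` is just for illustration.

```python
import io
import pandas as pd

# The table from the question, pasted as whitespace-separated text.
raw = """ID Location L_size L_color Station S_size S_color Category C_size C_color
1 Alpha 6 #000000 Zeta 3 #333333 Big 0.63 #306100
2 Alpha 6 #000000 Zeta 3 #333333 Medium 0.43 #458b00
3 Alpha 6 #000000 Zeta 3 #333333 small 0.47 #6aa232
4 Alpha 6 #000000 Yota 3 #4c4c4c Big 0.85 #306100
5 Alpha 6 #000000 Yota 3 #4c4c4c Medium 0.19 #458b00
6 Alpha 6 #000000 Yota 3 #4c4c4c small 0.89 #6aa232
7 Beta 6 #191919 Theta 4 #666666 Big 0.09 #306100
8 Beta 6 #191919 Theta 4 #666666 Medium 0.33 #458b00
9 Beta 6 #191919 Theta 4 #666666 small 0.79 #6aa232
10 Beta 6 #191919 Theta 4 #666666 Big 0.89 #306100
11 Beta 6 #191919 Meta 3 #7f7f7f Medium 0.71 #458b00
12 Beta 6 #191919 Meta 3 #7f7f7f small 0.59 #6aa232
"""

# sep=r"\s+" splits on any run of whitespace; the "#" in the colors is kept
# because read_csv does not treat any character as a comment by default.
MyData = pd.read_csv(io.StringIO(raw), sep=r"\s+")
```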
Each category has one or more attributes (here, only one: size). What I'd like is to report the size for each parent/child in the JSON file:
{
  "name":"MyData",
  "size":12,
  "color":"#ffffff",
  "children":[
    {
      "name":"Alpha",
      "size":6,
      "color":"#000000",
      "children":[
        {
          "name":"Zeta",
          "size":3,
          "color":"#333333",
          "children":[
            {
              "name":"Big",
              "size":0.63,
              "color":"#306100"
            },
            ...
and so on. I couldn't manage it in R or in pandas. Any ideas?
EDIT: My goal is to link various pieces of information to the children, not only size. I added a color column for each main column. My initial dataframe is big and has a lot of information, but I can't paste it here, for clarity's sake.
SECOND EDIT: In reply to chrisb's answer: it almost worked! Great update. Still, the JSON file isn't loaded properly by my JavaScript file. The file seems to be upside down (MyData is at the end), and each parent's information appears both before and after its children's information:
{
  "children":[
    {
      "color":"#000000",
      "children":[
        {
          "color":"#4c4c4c",
          "children":{
            "color":"#306100",
            "name":"Big",
            "size":0.85
          },
          "name":"Yota",
          "size":3
        },
        {
          "color":"#333333",
          "children":{
            "color":"#306100",
            "name":"Big",
            "size":0.63
          },
          "name":"Zeta",
          "size":3
        }
      ],
      "name":"Alpha",
      "size":6
    },
    {
      "color":"#191919",
      "children":[
        {
          "color":"#7f7f7f",
          "children":{
            "color":"#458b00",
            "name":"Medium",
            "size":0.71
          },
          "name":"Meta",
          "size":3
        },
        {
          "color":"#666666",
          "children":{
            "color":"#306100",
            "name":"Big",
            "size":0.09
          },
          "name":"Theta",
          "size":4
        }
      ],
      "name":"Beta",
      "size":6
    }
  ],
  "name":"MyData",
  "size":12
}
LAST EDIT: Works fine. Chris removed the last part of the script he wrote when he updated it, so here it is. Thanks Chris!
import json

data = {'name': 'MyData',
        'size': len(MyData),
        'children': make_children(MyData, levels)}
print(json.dumps(data))
First, you need some kind of mapping of what makes up each level. I'm using tuples of the column that defines the "name" and the prefix of the other attributes you want from that level, like this:
levels = [('Location', 'L_'),
          ('Station', 'S_'),
          ('Category', 'C_')]
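To make the mapping concrete, here is a small sketch of which columns each (name column, prefix) pair picks up and what key each one gets in the output. The helper `level_keys` is hypothetical and only for illustration; it uses plain Python on the column names and is not part of the function that follows.

```python
# Column names of the example table.
columns = ["ID", "Location", "L_size", "L_color", "Station", "S_size",
           "S_color", "Category", "C_size", "C_color"]
levels = [("Location", "L_"), ("Station", "S_"), ("Category", "C_")]

def level_keys(columns, name, prefix):
    """Map each source column of one level to its key in the output node."""
    mapping = {name: "name"}
    for col in columns:
        if col.startswith(prefix):
            # Slice the prefix off, e.g. "L_size" -> "size".
            mapping[col] = col[len(prefix):]
    return mapping

mappings = [level_keys(columns, name, prefix) for name, prefix in levels]
for (name, _), mapping in zip(levels, mappings):
    print(name, mapping)
```

Note that "Location" itself is not matched by the "L_" prefix, because the prefix includes the underscore.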
Then, it's a similar recursive function, only now the extra columns are picked up at each step (by finding columns that start with the prefix) and added to the tree by zipping the columns and values. There's room to clean this up, but it should at least give an idea.
def make_children(df, levels):
    if len(levels) == 1:
        # Leaf level: build a single dict from the first row of the group.
        name, prefix = levels[0]
        level_cols = [name] + [c for c in df if c.startswith(prefix)]
        df = df[level_cols]
        # Slice the prefix off; str.strip(prefix) strips characters, not a prefix.
        key_names = ['name'] + [c[len(prefix):] for c in level_cols[1:]]
        return dict(zip(key_names, df.values[0]))
    else:
        h, tail = levels[0], levels[1:]
        name, prefix = h
        level_cols = [name] + [c for c in df if c.startswith(prefix)]
        data = []
        # One node per distinct (name, attributes...) combination at this level.
        for keys, df_gb in df.groupby(level_cols):
            key_names = ['name'] + [c[len(prefix):] for c in level_cols[1:]]
            d = dict(zip(key_names, keys))
            d['children'] = make_children(df_gb, tail)
            data.append(d)
        return data
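One thing to be aware of: at the last level the function above returns a single dict built from the first row of each group, so "children" there is one object rather than a list, and the other categories are dropped. If you want every category listed, as in the desired output in the question, a variant along these lines might work. It is a sketch and not the original code: `make_tree`, `to_native`, and the small `sample` frame are names I made up for illustration, and the conversion of NumPy scalars is there because `json.dumps` may reject them on Python 3.

```python
import json
import pandas as pd

def to_native(value):
    """Convert NumPy scalars to plain Python types so json.dumps accepts them."""
    return value.item() if hasattr(value, "item") else value

def make_tree(df, levels):
    """Return a list of nodes for the first level, recursing into the rest."""
    (name, prefix), tail = levels[0], levels[1:]
    level_cols = [name] + [c for c in df.columns if c.startswith(prefix)]
    key_names = ["name"] + [c[len(prefix):] for c in level_cols[1:]]
    nodes = []
    if not tail:
        # Leaf level: emit one node per row instead of only the first row.
        for row in df[level_cols].itertuples(index=False, name=None):
            nodes.append({k: to_native(v) for k, v in zip(key_names, row)})
        return nodes
    # sort=False keeps groups in order of first appearance in the table.
    for keys, group in df.groupby(level_cols, sort=False):
        if not isinstance(keys, tuple):
            keys = (keys,)
        node = {k: to_native(v) for k, v in zip(key_names, keys)}
        node["children"] = make_tree(group, tail)
        nodes.append(node)
    return nodes

# A cut-down, two-level version of the table, just to exercise the function.
sample = pd.DataFrame({
    "Location": ["Alpha", "Alpha", "Beta"],
    "L_size": [6, 6, 6],
    "L_color": ["#000000", "#000000", "#191919"],
    "Category": ["Big", "Medium", "Big"],
    "C_size": [0.63, 0.43, 0.09],
    "C_color": ["#306100", "#458b00", "#306100"],
})
sample_levels = [("Location", "L_"), ("Category", "C_")]
tree = {"name": "MyData", "size": len(sample),
        "children": make_tree(sample, sample_levels)}
tree_json = json.dumps(tree, indent=1)
print(tree_json)
```

On Python 3.7 and later, dicts keep insertion order, so each node is written with "name" first and "children" last, which is easier to read even though JSON parsers do not care about key order.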