
Creating a flare json to be used in D3 from pandas dataframe

I have a dataframe that I want to convert to a hierarchical flare json to be used in a D3 visualization like this: D3 sunburst

My dataframe contains hierarchical data such as this:

[image: sample dataframe]

And the output I want should look like this:

{"name": "flare","children": 
    [
        {"name": "Animal", "children": 
            [
                {"name": "Mammal", "children":
                    [
                        {"name": "Fox","value":35000}, 
                        {"name": "Lion","value":25000}
                    ]
                },
                {"name": "Fish", "children":
                    [
                        {"name": "Cod","value":15000}
                    ]
                }
            ]
        },
        {"name": "Plant", "children": 
            [
                {"name": "Tree", "children":
                    [
                        {"name": "Oak","value":1500}
                    ]
                }
            ]
        }
     ]
} 

I have tried several approaches, but can't get it right. Here is my non-working code, inspired by this post: Pandas to D3. Serializing dataframes to JSON

from collections import defaultdict
import pandas as pd
df = pd.DataFrame({'group1':["Animal", "Animal", "Animal", "Plant"],'group2':["Mammal", "Mammal", "Fish", "Tree"], 'group3':["Fox", "Lion", "Cod", "Oak"],'value':[35000,25000,15000,1500]  })
tree = lambda: defaultdict(tree)  
d = tree()
for _, (group0,group1, group2, group3, value) in df.iterrows():
    d['name'][group0]['children'] = group1
    d['name'][group1]['children'] = group2
    d['name'][group2]['children'] = group3
    d['name'][group3]['children'] = value


json.dumps(d)

I am working on a similar visualization project that requires moving data from a Pandas DataFrame to a JSON file that works with D3.

I came across your post while looking for a solution and ended up writing something based on this GitHub repository and with input from the link you provided in this post.

The code is not pretty and is a bit hacky and slow. But based on my project, it seems to work just fine for any amount of data as long as it has three levels and a value field. You should be able to simply fork the D3 Sunburst notebook and replace the flare.json file with this code's output.

The modification that I made here, based on the original GitHub post, is to account for three levels of data. If the name of the level 0 node already exists, append from level 1 on. Likewise, if the name of the level 1 node also exists, append only the level 2 node (the third level). Otherwise, append the full path of data. If you need more levels, some kind of recursion might do the trick, or just keep hacking it to add more levels.

# code snippet to format a Pandas DataFrame as json for a D3 Sunburst chart

# libraries
import json
import pandas as pd

# example data with three levels and a single value field
data = {'group1': ['Animal', 'Animal', 'Animal', 'Plant'],
        'group2': ['Mammal', 'Mammal', 'Fish', 'Tree'],
        'group3': ['Fox', 'Lion', 'Cod', 'Oak'],
        'value': [35000, 25000, 15000, 1500]}

df = pd.DataFrame.from_dict(data)

print(df)

""" The sample dataframe
   group1  group2 group3  value
0  Animal  Mammal    Fox  35000
1  Animal  Mammal   Lion  25000
2  Animal    Fish    Cod  15000
3   Plant    Tree    Oak   1500
"""

# initialize a flare dictionary
flare = {"name": "flare", "children": []}

# iterate through dataframe values
for row in df.values:
    level0 = row[0]
    level1 = row[1]
    level2 = row[2]
    value = row[3]
    
    # collect the names of the existing first-level nodes
    key0 = [node['name'] for node in flare['children']]

    # collect the second-level names under the matching first-level node
    # only, so that key1 indexes line up with that node's children
    key1 = []
    if level0 in key0:
        parent = flare['children'][key0.index(level0)]
        key1 = [node['name'] for node in parent['children']]

    # add the full row of data if the root is not in key0
    if level0 not in key0:
        d = {'name': level0,
             'children': [{'name': level1,
                           'children': [{'name': level2,
                                         'value': value}]}]}
        flare['children'].append(d)

    # if the root exists but the second level does not, append the
    # second-level node together with its leaf
    elif level1 not in key1:
        d = {'name': level1,
             'children': [{'name': level2,
                           'value': value}]}
        flare['children'][key0.index(level0)]['children'].append(d)

    # if both the root and the second level exist, append only the leaf
    else:
        d = {'name': level2,
             'value': value}
        flare['children'][key0.index(level0)]['children'][key1.index(level1)]['children'].append(d)

# uncomment the next two lines to save the result as a json file
# with open('filename_here.json', 'w') as outfile:
#     json.dump(flare, outfile)

print(json.dumps(flare, indent=2))

""" the expected output of this json data
{
  "name": "flare",
  "children": [
    {
      "name": "Animal",
      "children": [
        {
          "name": "Mammal",
          "children": [
            {
              "name": "Fox",
              "value": 35000
            },
            {
              "name": "Lion",
              "value": 25000
            }
          ]
        },
        {
          "name": "Fish",
          "children": [
            {
              "name": "Cod",
              "value": 15000
            }
          ]
        }
      ]
    },
    {
      "name": "Plant",
      "children": [
        {
          "name": "Tree",
          "children": [
            {
              "name": "Oak",
              "value": 1500
            }
          ]
        }
      ]
    }
  ]
}
"""
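
The answer mentions that more than three levels would need recursion or more hand-written branches. As a rough sketch of that idea, the "find the node by name, otherwise append it" step can be written once and repeated for every level of a row. A plain loop over the path does the same job as recursion here. The names `get_or_add_child` and `build_flare` are made up for this illustration and are not part of the original code or of any library. The sketch assumes each row is a full path of level names followed by one value column.

```python
import json
import pandas as pd


def get_or_add_child(node, name):
    """Return the child of `node` called `name`, creating it if missing."""
    children = node.setdefault('children', [])
    for child in children:
        if child['name'] == name:
            return child
    child = {'name': name}
    children.append(child)
    return child


def build_flare(df, level_cols, value_col, root_name='flare'):
    """Nest the rows of `df` into a flare-style dict, one level per column."""
    root = {'name': root_name, 'children': []}
    # convert to plain Python lists so that json can serialize the values
    paths = df[level_cols].values.tolist()
    values = df[value_col].tolist()
    for path, value in zip(paths, values):
        node = root
        # walk down the tree, creating any node that does not exist yet
        for name in path:
            node = get_or_add_child(node, name)
        # the last node on the path is the leaf and carries the value
        node['value'] = value
    return root


# same sample data as above, with the rows deliberately not grouped
df = pd.DataFrame({'group1': ['Animal', 'Plant', 'Animal', 'Animal'],
                   'group2': ['Mammal', 'Tree', 'Mammal', 'Fish'],
                   'group3': ['Fox', 'Oak', 'Lion', 'Cod'],
                   'value': [35000, 1500, 25000, 15000]})

flare = build_flare(df, ['group1', 'group2', 'group3'], 'value')
print(json.dumps(flare, indent=2))
```

Passing a longer or shorter list of level columns changes the depth of the tree without touching the function. The linear search in `get_or_add_child` keeps the sketch short. For large inputs a dictionary keyed by name would likely be faster.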
