Need help formatting a pandas DataFrame from a JSON file

Question

Hi, I need help formatting a JSON file that I converted to a pandas DataFrame.

The JSON looks like this:

{
  "test":
    {
       "1":["test1_a", "test1_b", "test1_c"],
       "2":["test2_a", "test2_b", "test2_c"],
       "3":["test3_a", "test3_b", "test3_c"]
     }
}

I need this JSON to be converted to a pandas DataFrame and printed like this:

col1     col2     col3
test1_a  test1_b  test1_c
test2_a  test2_b  test2_c
test3_a  test3_b  test3_c

How would I do this? It needs to be a pandas DataFrame, and I need to define the column names.

So far I have tried:

import json
import pandas as pd

with open(json_file_path, 'r') as json_file:
    data = json.load(json_file)
pandasDataframe = pd.DataFrame.from_dict(data)
print(pandasDataframe)

And it prints this, which I don't want:

1 ["test1_a", "test1_b", "test1_c"]
2 ["test2_a", "test2_b", "test2_c"]
3 ["test3_a", "test3_b", "test3_c"]

Update: when I do

pd.DataFrame(data['test'])

it looks like this (not quite what I want, but it's getting there):

     1        2        3
0 test1_a   test2_a  test3_a
1 test1_b   test2_b  test3_b
2 test1_c   test2_c  test3_c

Update #2: when I transpose, it looks like this:

        0               2
1 test1_a test1_b test1_c
2 test2_a test2_b test2_c
3 test3_a test3_b test3_c

How would I get rid of the 0 and 2 at the top, and what do they mean? Also, how do I get rid of the 1, 2, 3 (i.e. the first column altogether)?

Desired output (the column names col1, col2, col3 need to be added, but I don't know how):

col1     col2     col3
test1_a  test1_b  test1_c
test2_a  test2_b  test2_c
test3_a  test3_b  test3_c

Answer 1

IIUC, you need add_prefix:

import pandas as pd

pd.DataFrame(data['test']).add_prefix('col')

      col1     col2     col3
0  test1_a  test2_a  test3_a
1  test1_b  test2_b  test3_b
2  test1_c  test2_c  test3_c
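For anyone who wants to reproduce this without a file, here is a minimal sketch that assumes the parsed JSON is already in a dict called `data` (written inline below). The variable names `by_key` and `prefixed` are only for illustration. It shows that add_prefix works here because the keys "1", "2", "3" become the column labels. Note that each column then holds one key's list, so the rows are the a/b/c variants rather than the test1/test2/test3 rows asked for in the question.

```python
import pandas as pd

# Inline copy of the question's data, so the snippet runs without a file.
data = {
    "test": {
        "1": ["test1_a", "test1_b", "test1_c"],
        "2": ["test2_a", "test2_b", "test2_c"],
        "3": ["test3_a", "test3_b", "test3_c"],
    }
}

# Each key of data["test"] becomes a column label: "1", "2", "3".
by_key = pd.DataFrame(data["test"])

# add_prefix prepends the given string to every column label.
prefixed = by_key.add_prefix("col")
print(prefixed)
```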

Answer 2

You could try with:

pd.DataFrame(data['test']).T.rename(columns={0:'col1',1:'col2',2:'col3'})

Output:

      col1     col2     col3
1  test1_a  test1_b  test1_c
2  test2_a  test2_b  test2_c
3  test3_a  test3_b  test3_c
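The question also asks about the numbers along the top and down the side. After a transpose, the labels along the top are the default integer column labels (the positions within each list), and 1, 2, 3 are the JSON keys used as the row index. As a possible alternative to transposing and renaming, the sketch below uses DataFrame.from_dict with orient="index" and names the columns in the same call. It assumes `data` is the parsed JSON (written inline here); `df` and `table` are illustrative names.

```python
import pandas as pd

# Inline copy of the question's data, so the snippet runs without a file.
data = {
    "test": {
        "1": ["test1_a", "test1_b", "test1_c"],
        "2": ["test2_a", "test2_b", "test2_c"],
        "3": ["test3_a", "test3_b", "test3_c"],
    }
}

# orient="index" turns each key into a row; columns= names the list positions.
df = pd.DataFrame.from_dict(
    data["test"], orient="index", columns=["col1", "col2", "col3"]
)

# Replace the "1", "2", "3" row labels with a default 0..n-1 index.
df = df.reset_index(drop=True)

# A DataFrame always has an index; index=False only hides it when printing.
table = df.to_string(index=False)
print(table)
```

The index cannot be removed from the DataFrame itself, so hiding it at print time with to_string(index=False) is one way to get output without a leading column.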


2 answers

Answer 1
2 2020-07-27 03:45:32

Answer 2
1 accepted 2020-07-27 02:21:48
