Need help formatting pandas data frame from json file
Hi, I need help formatting a JSON file that I converted to a pandas dataframe.
The JSON looks like this:
{
"test":
{
"1":["test1_a", "test1_b", "test1_c"],
"2":["test2_a", "test2_b", "test2_c"],
"3":["test3_a", "test3_b", "test3_c"]
}
}
I need this JSON to be converted to a pandas dataframe and printed like this:
col1 col2 col3
test1_a test1_b test1_c
test2_a test2_b test2_c
test3_a test3_b test3_c
How would I do this? I need it to be a pandas dataframe, and I need to define the column names.
So far I have tried:
import json
import pandas as pd

with open(json_file_path, 'r') as json_file:
    data = json.load(json_file)
pandasDataframe = pd.DataFrame.from_dict(data)
print(pandasDataframe)
And it prints this, which I don't want :(
1 ["test1_a", "test1_b", "test1_c"]
2 ["test2_a", "test2_b", "test2_c"]
3 ["test3_a", "test3_b", "test3_c"]
Update: when I do
pd.DataFrame(data['test'])
It looks like this (not quite what I want, but it's getting there):
1 2 3
0 test1_a test2_a test3_a
1 test1_b test2_b test3_b
2 test1_c test2_c test3_c
Update #2: when I transpose, it looks like this:
0 2
1 test1_a test1_b test1_c
2 test2_a test2_b test2_c
3 test3_a test3_b test3_c
How would I get rid of the 0 and 2 at the top, and what do they mean? Also, how do I get rid of the 1, 2, 3 (i.e. the first column altogether)?
Desired output (the column names col1, col2, col3 need to be added, but I don't know how):
col1 col2 col3
test1_a test1_b test1_c
test2_a test2_b test2_c
test3_a test3_b test3_c
IIUC, you need add_prefix:
import pandas as pd
pd.DataFrame(data['test']).add_prefix('col')
col1 col2 col3
0 test1_a test2_a test3_a
1 test1_b test2_b test3_b
2 test1_c test2_c test3_c
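For reference, here is a self-contained sketch of the same idea. It assumes the JSON is held in an inline string (the variable name raw is made up for illustration) instead of being read from a file. Note that add_prefix only relabels the columns, so each row still holds the n-th element of every list and the layout is not yet transposed.

```python
import json
import pandas as pd

# Assumption: the JSON from the question, inlined as a string for illustration.
raw = """
{
  "test": {
    "1": ["test1_a", "test1_b", "test1_c"],
    "2": ["test2_a", "test2_b", "test2_c"],
    "3": ["test3_a", "test3_b", "test3_c"]
  }
}
"""
data = json.loads(raw)

# The dict keys "1", "2", "3" become column labels; add_prefix turns them
# into "col1", "col2", "col3" without touching the values.
df = pd.DataFrame(data["test"]).add_prefix("col")
print(df)
```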
You could try with:
pd.DataFrame(data['test']).T.rename(columns={0:'col1',1:'col2',2:'col3'})
Output:
col1 col2 col3
1 test1_a test1_b test1_c
2 test2_a test2_b test2_c
3 test3_a test3_b test3_c
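If the 1, 2, 3 row labels should go away as well, one possible sketch (not taken from either answer) is to build the frame with DataFrame.from_dict using orient="index", name the columns directly, and then drop the index. The inline data dict below is an assumption standing in for the loaded JSON file.

```python
import pandas as pd

# Assumption: the parsed JSON from the question, written inline for illustration.
data = {
    "test": {
        "1": ["test1_a", "test1_b", "test1_c"],
        "2": ["test2_a", "test2_b", "test2_c"],
        "3": ["test3_a", "test3_b", "test3_c"],
    }
}

# orient="index" makes every dict key a row, so no transpose is needed;
# columns= names the list positions.
df = pd.DataFrame.from_dict(
    data["test"], orient="index", columns=["col1", "col2", "col3"]
)

# Replace the "1", "2", "3" labels with a default 0..n-1 index.
df = df.reset_index(drop=True)

# A DataFrame always has an index; index=False only hides it when printing.
print(df.to_string(index=False))
```

The numbers across the top of the transposed frame in the question are the default integer column labels that pandas assigns to the list positions, which is why they can be renamed or supplied through columns=.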