
Creating a JSON structure from a DataFrame

I have two Excel files:

2020-01-consumption.xlsx
2020-01-production.xlsx
print(pd.read_excel('2020-01-production.xlsx', index_col=0).head(4))


                         build_1   build_2               build_3       build_4  ...  
date                                                                            ...
2020-01-01 00:00:00          1.2       4.2                   4.3           7.0  ...
2020-01-01 01:00:00          3.3       1.9                   5.3           3.5  ...
2020-01-01 02:00:00          4.1       2.7                   6.0           2.6  ...
2020-01-01 03:00:00          3.6       6.0                   7.1           7.2  ...



print(pd.read_excel('2020-01-consumption.xlsx', index_col=0).head(4))


                         build_1   build_2               build_3       build_4  ...  
date                                                                            ...
2020-01-01 00:00:00          0.4       1.0                   0.1           1.0  ...
2020-01-01 01:00:00          0.3       0.9                   0.0           0.4  ...
2020-01-01 02:00:00          0.3       0.5                   0.0           0.4  ...
2020-01-01 03:00:00          0.1       0.5                   0.4           0.4  ...

The columns and indexes are the same in both files. I'm trying to set up a for loop that saves each column as its own JSON file. I want to change the structure of the data to this:

with open('build_1.json', encoding="utf8") as f:  # The name of the new file must be the column name.
    content = json.load(f)

print(content)

{'build_1': {  # The key is the column name.
    'date': ['2020-01-01 00:00:00', '2020-01-01 01:00:00', '2020-01-01 02:00:00', '2020-01-01 03:00:00', ...],  # The index name becomes a key.
    'production': [1.2, 3.3, 4.1, 3.6, ...],  # The Excel file name becomes a key.
    'consumption': [0.4, 0.3, 0.3, 0.1, ...]}}  # The Excel file name becomes a key.

I have many DataFrames like production and consumption; I am showing only two as an example. How can I achieve this structure? Is this possible?

You can use concat with the keys parameter to get a MultiIndex in the columns:

import pandas as pd

df1 = pd.read_excel('2020-01-production.xlsx', index_col=0)
df2 = pd.read_excel('2020-01-consumption.xlsx', index_col = 0)

df = pd.concat([df1, df2], keys=['production','consumption'], axis=1)
print (df)

                    production                         consumption          \
                       build_1 build_2 build_3 build_4     build_1 build_2   
date                                                                         
2020-01-01 00:00:00        1.2     4.2     4.3     7.0         0.4     1.0   
2020-01-01 01:00:00        3.3     1.9     5.3     3.5         0.3     0.9   
2020-01-01 02:00:00        4.1     2.7     6.0     2.6         0.3     0.5   
2020-01-01 03:00:00        3.6     6.0     7.1     7.2         0.1     0.5   

                                     
                    build_3 build_4  
date                                 
2020-01-01 00:00:00     0.1     1.0  
2020-01-01 01:00:00     0.0     0.4  
2020-01-01 02:00:00     0.0     0.4  
2020-01-01 03:00:00     0.4     0.4  
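
Since the question mentions many such files, the same concat call can be fed from a mapping instead of a hand-written list. This is only a minimal sketch: it assumes every file follows the YYYY-MM-<name>.xlsx pattern seen above, the helper names key_from_filename and combine are hypothetical, and small in-memory frames stand in for the Excel files so the snippet runs on its own.

```python
import pandas as pd


def key_from_filename(filename):
    """'2020-01-production.xlsx' -> 'production' (assumes YYYY-MM-<name>.xlsx)."""
    stem = filename.rsplit(".", 1)[0]   # drop the extension
    return stem.split("-", 2)[-1]       # drop the 'YYYY-MM-' prefix


def combine(frames):
    """Concatenate a {key: DataFrame} mapping side by side under MultiIndex columns."""
    return pd.concat(list(frames.values()), keys=list(frames.keys()), axis=1)


# Stand-ins for pd.read_excel(name, index_col=0); values copied from the sample above.
idx = pd.to_datetime(["2020-01-01 00:00:00", "2020-01-01 01:00:00"]).rename("date")
files = {
    "2020-01-production.xlsx": pd.DataFrame(
        {"build_1": [1.2, 3.3], "build_2": [4.2, 1.9]}, index=idx
    ),
    "2020-01-consumption.xlsx": pd.DataFrame(
        {"build_1": [0.4, 0.3], "build_2": [1.0, 0.9]}, index=idx
    ),
}

# Derive the top-level key of each frame from its file name.
frames = {key_from_filename(name): frame for name, frame in files.items()}
combined = combine(frames)
print(combined.columns.tolist())
```

With real files, the files dictionary would be filled by calling pd.read_excel on each path; the rest stays the same.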

Then loop over the second level of the columns, selecting with DataFrame.xs. If necessary, convert the datetimes to strings (or to whatever format you need), build a dictionary, and finally write it to a file:

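import json

# Second column level holds the original column names (build_1, build_2, ...).
for lvl in df.columns.levels[1]:
    print(lvl)
    # Select this column from every source frame; 'date' becomes a regular column.
    part = df.xs(lvl, axis=1, level=1).reset_index()
    # Timestamps are not JSON serializable, so store them as strings.
    part['date'] = part['date'].astype(str)
    d = {lvl: part.to_dict(orient='list')}
    # print(d)

    # One file per column, named after the column.
    with open(f'{lvl}.json', mode='w', encoding="utf8") as f:
        json.dump(d, f)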
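
To check the result without any Excel files, the loop can be wrapped in a function and run on small in-memory frames. The following is a sketch under stated assumptions rather than a definitive implementation: write_column_files is a hypothetical helper name, the files go to a temporary directory, and the labels are taken from get_level_values(1).unique() instead of levels[1], which follows the actual column order and skips labels that are no longer in use.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def write_column_files(df, out_dir):
    """Write one <column>.json per second-level column label; return the paths."""
    paths = []
    for col in df.columns.get_level_values(1).unique():
        part = df.xs(col, axis=1, level=1).reset_index()
        part["date"] = part["date"].astype(str)  # Timestamps are not JSON serializable
        path = Path(out_dir) / f"{col}.json"
        with open(path, mode="w", encoding="utf8") as f:
            json.dump({col: part.to_dict(orient="list")}, f)
        paths.append(path)
    return paths


# Two rows copied from the sample data above, built in memory.
idx = pd.to_datetime(["2020-01-01 00:00:00", "2020-01-01 01:00:00"]).rename("date")
production = pd.DataFrame({"build_1": [1.2, 3.3], "build_2": [4.2, 1.9]}, index=idx)
consumption = pd.DataFrame({"build_1": [0.4, 0.3], "build_2": [1.0, 0.9]}, index=idx)
df = pd.concat([production, consumption], keys=["production", "consumption"], axis=1)

with tempfile.TemporaryDirectory() as tmp:
    paths = write_column_files(df, tmp)
    # Read the first file back to inspect the structure.
    with open(paths[0], encoding="utf8") as f:
        content = json.load(f)

print(content)
```

The loaded content has the column name as its only top-level key, with date, production and consumption lists underneath, which matches the structure asked for in the question.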


 