Excel with multiple column indices and header rows into a Python dictionary via pandas
I am working with Pyomo and need to supply 4-D data for some parameters.
I have the data in an Excel spreadsheet that looks like this:
A link to the original data can be found here:
I would like to import the data into Python as a dictionary whose keys are tuples made of the two header values and the two index values of each cell, and whose values are the cell values.
Essentially, the expected output should look like:
p = {('Heat', 'Site 1', 1, 1): 14,
('Heat', 'Site 1', 1, 2): 16,
('Heat', 'Site 1', 1, 3): 10,
('Heat', 'Site 1', 2, 1): 13,
('Heat', 'Site 1', 2, 2): 13,
('Heat', 'Site 1', 2, 3): 13,
('Cool', 'Site 1', 1, 1): 5,
('Cool', 'Site 1', 1, 2): 6,
...
('Elec', 'Site 2', 2, 1): 11,
('Elec', 'Site 2', 2, 2): 15,
('Elec', 'Site 2', 2, 3): 15}
My idea was to import the Excel file with pandas first and then use the to_dict method.
What I did is the following:
import pandas as pd
Loads = pd.read_excel("Time_series_parameters.xlsx", index_col=[0,1], header = [0,1])
That works well, and I get a data frame with two index columns and two header rows:
             Heat    Cool    Elec    Heat    Cool    Elec
   Time    Site 1  Site 1  Site 1  Site 2  Site 2  Site 2
1  1           14       5      13      10      20      14
   2           16       6      11      10      14      10
   3           10       7      14      11      18      11
2  1           13       8      14      20      19      11
   2           13       7      11      14      15      15
   3           13       6      13      12      19      15
However, everything I have tried from there to reach the expected result has failed. None of the settings of the to_dict method gives me the expected result.
Hence, I would appreciate it if someone could be of some help here.
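To illustrate why to_dict on its own falls short here, the following is a minimal sketch on a small hypothetical frame named `small`, built in memory from two of the columns and two of the rows shown above (the variable names are assumptions, not part of the original post). With the default orientation, `DataFrame.to_dict()` returns a nested dictionary keyed first by the column tuple and then by the index tuple, so the four labels never end up in a single key unless the result is flattened afterwards.

```python
import pandas as pd

# Hypothetical miniature of the frame in the question: two index levels, two header levels.
index = pd.MultiIndex.from_product([[1], [1, 2]])
columns = pd.MultiIndex.from_tuples([("Heat", "Site 1"), ("Cool", "Site 1")])
small = pd.DataFrame([[14, 5], [16, 6]], index=index, columns=columns)

# Default orient="dict": {column tuple: {index tuple: value}}, i.e. two levels of nesting.
nested = small.to_dict()

# Flatten the nesting by joining the column tuple and the index tuple into one key.
flat = {
    (*col, *row): value
    for col, inner in nested.items()
    for row, value in inner.items()
}
```

The flattening step is one possible way to reach the 4-tuple keys from the nested result; it is not the only one.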
My solution for this would be:
import pandas as pd

Loads = pd.read_excel("Time_series_parameters.xlsx", index_col=[0, 1], header=[0, 1])
out = {}
# iteritems() was removed in pandas 2.0; items() is its replacement.
for col, inner in Loads.items():      # col is the (load type, site) column label
    for row, value in inner.items():  # row is the tuple of the two index levels
        out[col[0], col[1], row[0], row[1]] = value
The resulting output is:
{('Heat', 'Site 1', 1, 1): 14,
('Cool', 'Site 1', 1, 1): 5,
('Elec', 'Site 1', 1, 1): 13,
('Heat', 'Site 2', 1, 1): 10,
('Cool', 'Site 2', 1, 1): 20,
('Elec', 'Site 2', 1, 1): 14,
('Heat', 'Site 1', 1, 2): 16,
('Cool', 'Site 1', 1, 2): 6,
('Elec', 'Site 1', 1, 2): 11,
('Heat', 'Site 2', 1, 2): 10,
...
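For anyone who wants to try this without the spreadsheet, here is a hedged, self-contained sketch that rebuilds the frame in memory from the values printed in the question and applies the same nested iteration as a dict comprehension. The index level names "Day" and "Time" and the variable names `loads` and `p` are assumptions for illustration; the question only shows "Time".

```python
import pandas as pd

# Rebuild the frame from the values printed in the question (no Excel file needed).
# The level names "Day" and "Time" are assumed; only "Time" appears in the question.
index = pd.MultiIndex.from_product([[1, 2], [1, 2, 3]], names=["Day", "Time"])
columns = pd.MultiIndex.from_tuples(
    [("Heat", "Site 1"), ("Cool", "Site 1"), ("Elec", "Site 1"),
     ("Heat", "Site 2"), ("Cool", "Site 2"), ("Elec", "Site 2")]
)
data = [
    [14, 5, 13, 10, 20, 14],
    [16, 6, 11, 10, 14, 10],
    [10, 7, 14, 11, 18, 11],
    [13, 8, 14, 20, 19, 11],
    [13, 7, 11, 14, 15, 15],
    [13, 6, 13, 12, 19, 15],
]
loads = pd.DataFrame(data, index=index, columns=columns)

# Outer loop over columns, inner loop over rows; each key joins both label tuples.
p = {
    (*col, *row): int(value)
    for col, series in loads.items()
    for row, value in series.items()
}
```

Unpacking with `*col` and `*row` avoids hard-coding the number of levels, which may help if the sheet later gains another header row or index column.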
I have also found another answer that achieves essentially the same result using other pandas functionality. The code can be seen below:
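Loads = pd.read_excel("Time_series_parameters.xlsx", sheet_name="Loads", index_col=[0, 1], header=[0, 1])
# Each stack() moves the innermost column level into the row index, leaving a Series.
Loads = Loads.stack().stack()
# Reorder the four index levels to (load type, site, first index, second index).
Loads = Loads.reorder_levels([3, 2, 0, 1])
p = Loads.to_dict()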
The output again looks like this:
{('Cool', 'Site 1', 1, 1): 18,
('Elec', 'Site 1', 1, 1): 18,
('Heat', 'Site 1', 1, 1): 19,
('Cool', 'Site 2', 1, 1): 17,
...
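As a rough cross-check that the stack-based route and the explicit loop agree, the sketch below runs both on a frame rebuilt in memory from the values printed in the question. The names `loads`, `p_stack` and `p_loop` are assumptions for illustration. Depending on the installed pandas version, calling `stack()` on multi-level columns may emit a FutureWarning about its new implementation, and the key order of the two dictionaries may differ, but dictionary equality does not depend on order.

```python
import pandas as pd

# Frame rebuilt from the values printed in the question (no Excel file needed).
index = pd.MultiIndex.from_product([[1, 2], [1, 2, 3]])
columns = pd.MultiIndex.from_tuples(
    [("Heat", "Site 1"), ("Cool", "Site 1"), ("Elec", "Site 1"),
     ("Heat", "Site 2"), ("Cool", "Site 2"), ("Elec", "Site 2")]
)
loads = pd.DataFrame(
    [[14, 5, 13, 10, 20, 14],
     [16, 6, 11, 10, 14, 10],
     [10, 7, 14, 11, 18, 11],
     [13, 8, 14, 20, 19, 11],
     [13, 7, 11, 14, 15, 15],
     [13, 6, 13, 12, 19, 15]],
    index=index,
    columns=columns,
)

# Stack route: after two stack() calls the index levels are
# (first index, second index, site, load type); reorder them to the target key order.
stacked = loads.stack().stack().reorder_levels([3, 2, 0, 1])
p_stack = stacked.to_dict()

# Loop route: iterate over columns, then over the rows of each column.
p_loop = {
    (*col, *row): value
    for col, series in loads.items()
    for row, value in series.items()
}

# Same keys and values, regardless of insertion order.
same = p_stack == p_loop
```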