How can I create a Pandas frame from data stored in multiple nested dictionaries?
I have a Python program in which I sweep multiple parameters, and at each point I calculate a few results. I then want to export the results in the form of a CSV (or Excel) report that, on each row, contains the parameters and results. For example, here I sweep two parameters i and j and calculate res1 and res2 as functions of i and j. (This is a completely silly MWE though!)
res1 = dict()
res2 = dict()
for i in range(5):
    res1[i] = dict()
    res2[i] = dict()
    for j in range(5):
        res1[i][j] = i+j
        res2[i][j] = i*j
I would like to create a CSV with 25 rows and 4 columns, where the first two columns are the (i, j) combinations for which res1 and res2 are calculated and the last two columns are res1 and res2 respectively. A naive way of exporting such a CSV is as follows:
# Naive CSV writing
print(', '.join(['i', 'j', 'res1', 'res2']))
for i in range(5):
    for j in range(5):
        print(', '.join([str(i), str(j), str(res1[i][j]), str(res2[i][j])]))
I was wondering if there is a way to create a pandas frame from the dictionaries so that I can then export the reports more easily.
I know that the pandas.DataFrame constructor accepts a dictionary that maps column headers to column values. So, for example, the following is a possible solution:
import pandas as pd
import sys
# generate results as before
d = dict([('i', list()),
          ('j', list()),
          ('res1', list()),
          ('res2', list())])
for i in range(5):
    for j in range(5):
        d['i'].append(i)
        d['j'].append(j)
        d['res1'].append(res1[i][j])
        d['res2'].append(res2[i][j])
df = pd.DataFrame(data=d)
df.to_csv(sys.stdout, index=False)
Yet, the above does not look so elegant (and I think it is not efficient either). Is there a better way to do so?
You could create a normal list of rows:
data = []
for i in range(5):
    for j in range(5):
        data.append([i, j, res1[i][j], res2[i][j]])
And then convert it to a DataFrame:
import pandas as pd
df = pd.DataFrame(data, columns=['i', 'j', 'res1', 'res2'])
print(df)
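As a variation on the list approach above, here is a minimal sketch that walks the keys of the nested dictionaries instead of hard-coding range(5). It assumes res2 has exactly the same (i, j) keys as res1; the name rows is only illustrative.

```python
import pandas as pd

# Rebuild the nested result dictionaries from the question.
res1 = {i: {j: i + j for j in range(5)} for i in range(5)}
res2 = {i: {j: i * j for j in range(5)} for i in range(5)}

# Iterate over the keys actually present in res1.
# Assumption: res2 contains the same (i, j) keys as res1.
rows = [(i, j, value, res2[i][j])
        for i, inner in res1.items()
        for j, value in inner.items()]

# One tuple per row; the column names are given explicitly.
df = pd.DataFrame(rows, columns=['i', 'j', 'res1', 'res2'])
print(df.head())
```

Because the loop follows the dictionaries themselves, the same code should keep working if the sweep ranges change, as long as both dictionaries are filled for the same points.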
Or write it directly using the csv module:
import csv
# newline='' lets the csv module control line endings itself
with open("output.csv", 'w', newline='') as fh:
    csvwriter = csv.writer(fh)
    csvwriter.writerow(['i', 'j', 'res1', 'res2'])
    for i in range(5):
        for j in range(5):
            csvwriter.writerow([i, j, res1[i][j], res2[i][j]])
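The same csv-module idea can be sketched with writerows and a generator, so the nested loops collapse into one expression. The helper name write_report is hypothetical, and the sketch writes to an in-memory buffer so the result can be inspected without touching the disk.

```python
import csv
import io

# Rebuild the nested result dictionaries from the question.
res1 = {i: {j: i + j for j in range(5)} for i in range(5)}
res2 = {i: {j: i * j for j in range(5)} for i in range(5)}

def write_report(fh):
    """Write a header plus one row per (i, j) pair to an open text stream."""
    writer = csv.writer(fh)
    writer.writerow(['i', 'j', 'res1', 'res2'])
    writer.writerows((i, j, res1[i][j], res2[i][j])
                     for i in range(5) for j in range(5))

# In-memory buffer for demonstration; for a real file one could use
#   with open('output.csv', 'w', newline='') as fh: write_report(fh)
buffer = io.StringIO()
write_report(buffer)
text = buffer.getvalue()
print(text)
```

Passing the stream in as an argument keeps the writing logic independent of where the report ends up.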
How about this:
import numpy as np
import pandas as pd
from itertools import product
# all (i, j) combinations as a 25x2 array
p = np.array(list(product(range(5), range(5))))
df = pd.DataFrame(data={'i': p[:,0], 'j':p[:,1]})
def res(row):
    # res1 and res2 are nested dicts, so index them rather than call them
    row['res1'] = res1[row['i']][row['j']]
    row['res2'] = res2[row['i']][row['j']]
    return row
df = df.apply(res, axis=1)
Now you can write the DataFrame directly to a CSV file.
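For completeness, a minimal sketch of that final export step. The frame is rebuilt inline here so the snippet runs on its own; the file names in the comments are placeholders, not taken from the original post.

```python
import pandas as pd

# Stand-in for the frame built above: one row per (i, j) pair.
df = pd.DataFrame(
    [(i, j, i + j, i * j) for i in range(5) for j in range(5)],
    columns=['i', 'j', 'res1', 'res2'],
)

# With no path given, to_csv returns the CSV text as a string;
# index=False leaves out the row-index column.
csv_text = df.to_csv(index=False)
print(csv_text)

# To write a file instead (placeholder file names):
#   df.to_csv('output.csv', index=False)
#   df.to_excel('output.xlsx', index=False)  # needs an Excel engine such as openpyxl
```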