List of lists to pandas DataFrame
I have a dataset that follows this format:
data = [[[1, 0, 1000], [2, 1000, 2000]],
        [[1, 0, 1500], [2, 1500, 2500], [2, 2500, 4000]]]
var1 = [10.0, 20.0]
var2 = ['ref1','ref2']
I want to convert it to a DataFrame:
import pandas as pd
dic = {'var1': var1, 'var2': var2, 'data': data}
pd.DataFrame(dic)
The result:
However, I'm trying to get something like this:
I've been trying to flatten the dictionary/list, but with no success:
pd.DataFrame([[col1, col2] for col1, d in dic.items() for col2 in d])
See the result:
The different sizes of the inner lists make it complicated to unpack another level. I'm not sure whether pandas can take care of this or whether it needs to be done before importing into pandas.
Building an appropriately flattened list first works:
new_data = []
for x, v1, v2 in zip(data, var1, var2):
    new_data.extend([y + [v1] + [v2] for y in x])
pd.DataFrame(new_data, columns=['data', 'min', 'max', 'var1', 'var2'])
gives:
   data   min   max  var1  var2
0     1     0  1000    10  ref1
1     2  1000  2000    10  ref1
2     1     0  1500    20  ref2
3     2  1500  2500    20  ref2
4     2  2500  4000    20  ref2
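As an alternative to building the list by hand, here is a minimal sketch that uses `DataFrame.explode`, assuming pandas 0.25 or later (where `explode` was introduced). The names `exploded`, `triples` and `flat_exploded` are illustrative only and not part of the original answer.

```python
import pandas as pd

data = [[[1, 0, 1000], [2, 1000, 2000]],
        [[1, 0, 1500], [2, 1500, 2500], [2, 2500, 4000]]]
var1 = [10.0, 20.0]
var2 = ['ref1', 'ref2']

df = pd.DataFrame({'var1': var1, 'var2': var2, 'data': data})

# One row per inner [data, min, max] triple; var1/var2 are repeated as needed.
exploded = df.explode('data').reset_index(drop=True)

# Split each triple into its own three columns.
triples = pd.DataFrame(exploded['data'].tolist(), columns=['data', 'min', 'max'])

# Put the split columns next to the repeated var1/var2 columns.
flat_exploded = pd.concat([triples, exploded[['var1', 'var2']]], axis=1)
print(flat_exploded)
```

This should give the same five rows as the loop above, without any explicit Python-level iteration over the outer list.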
You can iterate over the rows of your temporary DataFrame.
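df = pd.DataFrame(dic)
result = []
for i, d in df.iterrows():
    # Expand this row's nested list into its own small DataFrame
    temp = pd.DataFrame(d['data'], columns=['data', 'min', 'max'])
    # Broadcast the row's scalar values across the new rows
    temp['var1'] = d['var1']
    temp['var2'] = d['var2']
    result.append(temp)
pd.concat(result)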
This produces:
   data   min   max  var1  var2
0     1     0  1000    10  ref1
1     2  1000  2000    10  ref1
0     1     0  1500    20  ref2
1     2  1500  2500    20  ref2
2     2  2500  4000    20  ref2
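Note that the concatenated frame keeps each piece's own index (0, 1, 0, 1, 2). If a continuous index is preferred, passing `ignore_index=True` to `pd.concat` should do it. The following is a minimal, self-contained sketch of that variation; it zips the input lists directly instead of going through the temporary DataFrame, and the names `frames` and `flat` are illustrative.

```python
import pandas as pd

data = [[[1, 0, 1000], [2, 1000, 2000]],
        [[1, 0, 1500], [2, 1500, 2500], [2, 2500, 4000]]]
var1 = [10.0, 20.0]
var2 = ['ref1', 'ref2']

frames = []
for rows, v1, v2 in zip(data, var1, var2):
    # One small DataFrame per outer element, with its scalars broadcast.
    frame = pd.DataFrame(rows, columns=['data', 'min', 'max'])
    frame['var1'] = v1
    frame['var2'] = v2
    frames.append(frame)

# ignore_index=True discards the per-piece indexes and renumbers from 0.
flat = pd.concat(frames, ignore_index=True)
print(flat)
```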