DataFrame from dictionary with nested lists
I have a Python dictionary with nested lists that I would like to turn into a pandas DataFrame:
a = {'A': [1,2,3], 'B':['a','b','c'],'C':[[1,2],[3,4],[5,6]]}
I would like the final DataFrame to look like this:
> A B C
> 1 a 1
> 1 a 2
> 2 b 3
> 2 b 4
> 3 c 5
> 3 c 6
When I use the DataFrame command it looks like this:
pd.DataFrame(a)
> A B C
>0 1 a [1, 2]
>1 2 b [3, 4]
>2 3 c [5, 6]
Is there any way I can make the data long by the elements of C?
This is what I came up with:
In [53]: df
Out[53]:
A B C
0 1 a [1, 2]
1 2 b [3, 4]
2 3 c [5, 6]
In [58]: s = df.C.apply(Series).unstack().reset_index(level=0, drop = True)
In [59]: s.name = 'C2'
In [61]: df.drop('C', axis = 1).join(s)
Out[61]:
A B C2
0 1 a 1
0 1 a 2
1 2 b 3
1 2 b 4
2 3 c 5
2 3 c 6
apply(Series) gives me a DataFrame with two columns. To join them into one while keeping the original index, I use unstack. reset_index removes the first level of the index, which basically holds the position of each value in the original list that was in C. Then I join it back into the df.
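For anyone who wants to run the steps above outside an IPython session, here is a minimal self-contained sketch of the same chain. It assumes pandas is imported as pd (so Series becomes pd.Series); the intermediate names wide and s are only there for illustration.

```python
import pandas as pd

a = {'A': [1, 2, 3], 'B': ['a', 'b', 'c'], 'C': [[1, 2], [3, 4], [5, 6]]}
df = pd.DataFrame(a)

# Spread each list in C across columns: one row per original row,
# one column per position in the list.
wide = df['C'].apply(pd.Series)

# unstack() moves the columns into an outer index level, giving a Series
# indexed by (list position, original row label).
s = wide.unstack()

# Drop the list-position level so only the original row label remains.
s = s.reset_index(level=0, drop=True)
s.name = 'C2'

# Joining on the index repeats A and B once per element of C.
result = df.drop('C', axis=1).join(s)
print(result)
```

Because the join is on the index, the row labels in result repeat (0, 0, 1, 1, 2, 2); result.reset_index(drop=True) gives a fresh index if that matters.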
Yes, one way is to deal with your dictionary first. (I assume each dictionary value is either a flat list of values or a list of nested lists, but not a list that mixes plain values and lists.) Step by step:
from functools import reduce  # reduce is not a builtin in Python 3
def f(x, y): return x + y
res = {k: reduce(f, v) if any(isinstance(i, list) for i in v) else v for k, v in a.items()}
will give you: {'A': [1, 2, 3], 'C': [1, 2, 3, 4, 5, 6], 'B': ['a', 'b', 'c']}
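Now you need to extend the shorter lists in your dictionary so that every list has the same length:
m = max(len(v) for v in res.values())
# Integer division: in Python 3, m / len(v) is a float and cannot multiply a list
res1 = {k: reduce(f, [(m // len(v)) * [i] for i in v]) for k, v in res.items()}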
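To see what this step does for a single key, here is a small sketch. It assumes, as the code above implicitly does, that the longest list's length is an exact multiple of every other list's length. repeat_each is a hypothetical helper name, not part of the original answer.

```python
def repeat_each(values, target_len):
    """Repeat every element in place so the result has target_len items.

    Assumes target_len is an exact multiple of len(values).
    """
    times = target_len // len(values)
    return [item for item in values for _ in range(times)]


print(repeat_each([1, 2, 3], 6))        # [1, 1, 2, 2, 3, 3]
print(repeat_each(['a', 'b', 'c'], 6))  # ['a', 'a', 'b', 'b', 'c', 'c']
```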
And finally:
pd.DataFrame(res1)
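As an alternative that is not part of either answer above: pandas 0.25 and later provide DataFrame.explode, which makes a frame long by the elements of a list column directly. A minimal sketch, assuming such a version is installed:

```python
import pandas as pd

a = {'A': [1, 2, 3], 'B': ['a', 'b', 'c'], 'C': [[1, 2], [3, 4], [5, 6]]}
df = pd.DataFrame(a)

# explode() emits one row per element of each list in C and repeats A and B.
# The original row labels are kept, so the index becomes 0, 0, 1, 1, 2, 2.
long_df = df.explode('C')
print(long_df)
```

The exploded column typically keeps object dtype, so long_df['C'].astype(int) may be needed for numeric work, and reset_index(drop=True) gives a fresh index.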