DataFrame from a dictionary with nested lists

Question

I have a Python dictionary with nested lists that I would like to turn into a pandas DataFrame:

a = {'A': [1,2,3], 'B':['a','b','c'],'C':[[1,2],[3,4],[5,6]]}

I would like the final DataFrame to look like this:

> A  B  C
> 1  a  1
> 1  a  2
> 2  b  3
> 2  b  4
> 3  c  5
> 3  c  6

When I use the DataFrame constructor, it looks like this:

pd.DataFrame(a)

>   A   B     C
>0  1   a   [1, 2]
>1  2   b   [3, 4]
>2  3   c   [5, 6]

Is there any way to make the data long by the elements of C?

Answer 1

This is what I came up with:

In [53]: df
Out[53]: 
   A  B       C
0  1  a  [1, 2]
1  2  b  [3, 4]
2  3  c  [5, 6]

In [58]: s = df.C.apply(pd.Series).unstack().reset_index(level=0, drop=True)

In [59]: s.name = 'C2'

In [61]: df.drop('C', axis = 1).join(s)
Out[61]: 
   A  B  C2
0  1  a   1
0  1  a   2
1  2  b   3
1  2  b   4
2  3  c   5
2  3  c   6

Applying Series to each list in column C gives me a DataFrame with two columns. To combine them into a single column while keeping the original index, I use unstack. reset_index then drops the first level of the resulting index, which holds the position of each value within its original list in C. Finally, I join the result back onto df.
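For readers who want to run these steps outside an IPython session, here is a minimal self-contained sketch. The names wide, result, and alt are my own and do not appear in the answer. The last lines show DataFrame.explode, which is available in pandas 0.25 and later; it is an alternative technique, not the method used in this answer.

```python
import pandas as pd

a = {'A': [1, 2, 3], 'B': ['a', 'b', 'c'], 'C': [[1, 2], [3, 4], [5, 6]]}
df = pd.DataFrame(a)

# Expand each list in C into its own columns (labelled 0 and 1).
wide = df['C'].apply(pd.Series)

# unstack() moves the column labels into the index, producing a Series
# indexed by (position in list, original row). Dropping level 0 leaves
# only the original row label, repeated once per list element.
s = wide.unstack().reset_index(level=0, drop=True)
s.name = 'C2'

# Joining on the index repeats each row of A and B once per element.
result = df.drop('C', axis=1).join(s)
print(result)

# Alternative, not part of the original answer (pandas >= 0.25):
alt = df.explode('C')
print(alt)
```

Note that explode keeps the column name C and leaves it with object dtype, so a cast such as astype(int) may be wanted afterwards.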

Answer 2

Yes, one way is to process your dictionary first. (I assume each dictionary value is either a flat list of values or a list of nested lists, but not a mix of plain values and lists.) Step by step:

from functools import reduce  # required on Python 3; reduce is a builtin on Python 2

def f(x, y): return x + y

res={k: reduce(f, v) if any(isinstance(i, list) for i in v) else v for k,v in a.items()}

This will give you: {'A': [1, 2, 3], 'C': [1, 2, 3, 4, 5, 6], 'B': ['a', 'b', 'c']}

Now you need to extend the shorter lists in your dictionary so that they all have the same length:

m = max([len(v) for v in res.values()])

res1 = {k: reduce(f, [(m // len(v)) * [i] for i in v]) for k, v in res.items()}

And finally:

pd.DataFrame(res1)
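Put together, the steps above can be sketched as a single script for Python 3. This assumes every nested list has the same length, so that the longest flattened list is an exact multiple of each shorter one; the names concat and df_long are illustrative and not from the answer.

```python
from functools import reduce

import pandas as pd

a = {'A': [1, 2, 3], 'B': ['a', 'b', 'c'], 'C': [[1, 2], [3, 4], [5, 6]]}


def concat(x, y):
    # List concatenation, used by reduce to flatten a list of lists.
    return x + y


# Step 1: flatten any value that contains nested lists.
res = {k: reduce(concat, v) if any(isinstance(i, list) for i in v) else v
       for k, v in a.items()}

# Step 2: repeat each element of the shorter lists so all lengths match.
# Integer division (//) is needed on Python 3, since list repetition
# requires an int. Assumption: m is a multiple of every len(v).
m = max(len(v) for v in res.values())
res1 = {k: reduce(concat, [(m // len(v)) * [i] for i in v])
        for k, v in res.items()}

# Step 3: build the long DataFrame.
df_long = pd.DataFrame(res1)
print(df_long)
```

If the nested lists had different lengths, the repetition factor would no longer line rows up correctly, so this approach would need a different expansion step.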
