Python Pandas: create a DataFrame from a dictionary whose values are lists of lists

Question

I have a dictionary like the one below:

d = {"key_1": [[1, 2], [3, 4]], "key_2": [[1, 2], [3, 4]]}

I want to convert this to a DataFrame like the one below:

      column_1 column_2
key_1   1       2 
key_1   3       4 
key_2   1       2 
key_2   3       4

What is the most efficient way to do this? Thanks for your help =)

Answer 1

Let us use a comprehension to unnest the key-value pairs:

pd.DataFrame((k, *l) for k, v in d.items() for l in v).set_index(0)

       1  2
0          
key_1  1  2
key_1  3  4
key_2  1  2
key_2  3  4
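
The one-liner above leaves the columns labelled 1 and 2 and the index labelled 0. A minimal, self-contained sketch of the same idea with named columns follows; the names "key", "column_1" and "column_2" are assumptions taken from the layout sketched in the question, not something pandas requires.

```python
import pandas as pd

# Sample data from the question, with the keys written as strings.
d = {"key_1": [[1, 2], [3, 4]], "key_2": [[1, 2], [3, 4]]}

# Illustrative column names; "key" is a temporary label for the future index.
columns = ["key", "column_1", "column_2"]

# One row per inner list, prefixed with the dictionary key it came from.
rows = [(k, *inner) for k, v in d.items() for inner in v]

df = pd.DataFrame(rows, columns=columns).set_index("key")
df.index.name = None  # drop the index label to match the layout in the question
print(df)
```

This assumes every inner list has exactly two elements; lists of other lengths would need a matching number of column names.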

Answer 2

If I understand correctly, you could use:

cols = ['col1', 'col2']
df = pd.DataFrame({k: zip(*v) for k,v in d.items()}, index=cols).T.explode(cols)

Output:

      col1 col2
key_1    1    2
key_1    3    4
key_2    1    2
key_2    3    4

Answer 3

Using pandas methods

Here is a pure pandas way of doing this, without any list or dict comprehensions, for anyone looking for one:

d = {"key_1":[[1, 2], [3, 4]], "key_2":[[1, 2], [3, 4]]}
df = pd.DataFrame(d).T.stack().droplevel(-1).apply(pd.Series)
print(df)

       0  1
key_1  1  2
key_1  3  4
key_2  1  2
key_2  3  4

Benchmarks:

%%timeit
pd.DataFrame(d).T.stack().droplevel(-1).apply(pd.Series)

100 loops, best of 5: 2.56 ms per loop

%%timeit
pd.DataFrame((k, *l) for k, v in d.items() for l in v).set_index(0)

1000 loops, best of 5: 719 µs per loop

%%timeit
cols = ['col1', 'col2']
pd.DataFrame({k: zip(*v) for k,v in d.items()}, index=cols).T.explode(cols)

100 loops, best of 5: 6.53 ms per loop
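
The %%timeit cells above only run inside IPython or Jupyter. As a rough sketch, the same comparison can be reproduced in a plain script with the standard-library timeit module. The function names and the repeat count of 200 are arbitrary choices made for this example, and the timings will differ from the figures quoted above depending on machine and pandas version (exploding several columns at once needs pandas 1.3 or later).

```python
import timeit

import pandas as pd

d = {"key_1": [[1, 2], [3, 4]], "key_2": [[1, 2], [3, 4]]}
cols = ["col1", "col2"]


def with_comprehension():
    # Answer 1: flatten key and inner list into one row each.
    return pd.DataFrame((k, *l) for k, v in d.items() for l in v).set_index(0)


def with_explode():
    # Answer 2: transpose each value with zip, then explode both columns.
    return pd.DataFrame({k: zip(*v) for k, v in d.items()}, index=cols).T.explode(cols)


def with_stack():
    # Answer 3: stack the list cells, then expand each list into columns.
    return pd.DataFrame(d).T.stack().droplevel(-1).apply(pd.Series)


number = 200  # arbitrary repeat count, chosen only to keep the run short
results = {}
for func in (with_comprehension, with_explode, with_stack):
    # Average seconds per call over `number` runs.
    results[func.__name__] = timeit.timeit(func, number=number) / number
    print(f"{func.__name__}: {results[func.__name__] * 1e6:.0f} µs per call")
```

With only four rows these numbers mostly measure fixed pandas overhead, so it is worth rerunning the comparison on data of realistic size before choosing an approach.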


Answer 1: score 4, posted 2022-03-21 14:56:07

Answer 2 (accepted): score 2, posted 2022-03-21 14:54:59

Answer 3: score 1, posted 2022-03-21 15:16:15
