List of lists of tuples to a pandas DataFrame

Question

I have this array (the result of a similarity calculation). It's a list of lists of tuples, like this:

example = [[(a,b), (c,d)], [(a1,b1), (c1,d2)] …]

example contains 121044 lists of 30 tuples each.

I want a pandas DataFrame containing just the second value of each tuple (i.e. b, d, b1, d2), without spending too much time computing it.

Do you have any ideas?

Answer 1

Use a nested list comprehension:

import pandas as pd

df = pd.DataFrame([[y[1] for y in x] for x in example])
print(df)
    0   1
0   b   d
1  b1  d2

df = pd.DataFrame([[y[1] for y in x] for x in example], columns=['col1','col2'])
print(df)
  col1 col2
0    b    d
1   b1   d2
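
The snippets above use the question's placeholder names (a, b, ...), which are not defined anywhere. A minimal runnable sketch of the same comprehension, assuming made-up (id, score) pairs stand in for the real similarity results:

```python
import pandas as pd

# Hypothetical stand-in for the similarity results: each inner list holds
# (item_id, score) tuples. The values are invented for illustration.
example = [
    [("a", 0.9), ("c", 0.7)],
    [("a1", 0.8), ("c1", 0.6)],
]

# Keep only the second element of every tuple, one row per inner list.
second_values = [[pair[1] for pair in row] for row in example]

df = pd.DataFrame(second_values, columns=["col1", "col2"])
print(df)
```

With the real data there would be 30 columns per row, so the columns argument would need 30 names, or it can simply be omitted.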

Answer 2

For numeric data, you can use NumPy indexing directly. This should be more efficient than a list comprehension, since pandas uses NumPy internally to store data in contiguous memory blocks. It does require every inner list to have the same length, as in the question (30 tuples each).

import pandas as pd, numpy as np

example = [[(1,2), (3,4)], [(5,6), (7,8)]]

df = pd.DataFrame(np.array(example)[..., 1],
                  columns=['col1', 'col2'])

print(df)

   col1  col2
0     2     4
1     6     8
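
Rather than taking the efficiency claim on trust, you can time both approaches on your own data. The following is only a rough sketch: the synthetic data, the function names, and the reduced size (1,000 lists instead of 121,044) are assumptions for illustration, and timings will vary by machine.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic numeric data shaped like the question's: lists of 30 (id, score)
# tuples. n_lists is kept small here; raise it to 121044 to mirror the question.
n_lists, n_tuples = 1000, 30
rng = np.random.default_rng(0)
scores = rng.random((n_lists, n_tuples))
example = [
    [(j, float(scores[i, j])) for j in range(n_tuples)]
    for i in range(n_lists)
]


def with_comprehension(data):
    # Answer 1: pick the second tuple element in pure Python.
    return pd.DataFrame([[pair[1] for pair in row] for row in data])


def with_numpy(data):
    # Answer 2: needs equal-length inner lists and numeric values.
    return pd.DataFrame(np.array(data)[..., 1])


# Both approaches should produce the same values.
assert np.allclose(
    with_comprehension(example).to_numpy(), with_numpy(example).to_numpy()
)

print("comprehension:", timeit.timeit(lambda: with_comprehension(example), number=5))
print("numpy:        ", timeit.timeit(lambda: with_numpy(example), number=5))
```

Note that np.array still has to walk the nested Python lists once to build the array, so which approach wins is worth measuring rather than assuming.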

Answer 1
1 accepted 2018-05-22 12:11:56

Answer 2
1 2018-05-22 12:17:36
