List of list of tuples to pandas dataframe

I have this array (the result of a similarity calculation). It is a list of lists of tuples, like this:

example = [[(a,b), (c,d)], [(a1,b1), (c1,d2)] …]

In example there are 121044 lists of 30 tuples each.

I want a pandas DataFrame containing just the second value of each tuple (i.e. b, d, b1, d2), without spending too much time computing it.

Do you have any ideas?

Use a nested list comprehension:

df = pd.DataFrame([[y[1] for y in x] for x in example])
print(df)
    0   1
0   b   d
1  b1  d2

df = pd.DataFrame([[y[1] for y in x] for x in example], columns=['col1', 'col2'])
print(df)
  col1 col2
0    b    d
1   b1   d2
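
The snippets above use placeholder names (a, b, ...), so here is a self-contained variant as a minimal sketch. The sample values and the generated `col1`, `col2`, ... names are assumptions for illustration; with the data from the question the same code would produce 30 columns.

```python
import pandas as pd

# Hypothetical sample data: each inner list holds (label, score) tuples.
example = [[("a", 0.9), ("c", 0.8)], [("a1", 0.7), ("c1", 0.6)]]

# Keep only the second element of every tuple, row by row.
rows = [[t[1] for t in row] for row in example]

# Build one column name per tuple position instead of typing them by hand.
columns = [f"col{i + 1}" for i in range(len(rows[0]))]

df = pd.DataFrame(rows, columns=columns)
print(df)
```

Generating the column names from the row width avoids a length mismatch error when the inner lists hold more than two tuples.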

For numeric data, you can use numpy indexing directly. This should be more efficient than a list comprehension, as pandas uses numpy internally to store data in contiguous memory blocks.

import pandas as pd, numpy as np

example = [[(1,2), (3,4)], [(5,6), (7,8)]]

df = pd.DataFrame(np.array(example)[..., 1],
                  columns=['col1', 'col2'])

print(df)

   col1  col2
0     2     4
1     6     8
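
To check that the numpy indexing and the list comprehension give the same values, here is a small sketch on generated data. The 1000 x 30 shape, the random scores and the variable names are assumptions chosen to mimic the question at a reduced size.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_rows, n_cols = 1000, 30

# Hypothetical data: each tuple is (position, similarity score).
example = [[(j, float(rng.random())) for j in range(n_cols)]
           for _ in range(n_rows)]

# Approach 1: nested list comprehension.
df_list = pd.DataFrame([[t[1] for t in row] for row in example])

# Approach 2: build an (n_rows, n_cols, 2) array, then take index 1 on the last axis.
df_numpy = pd.DataFrame(np.array(example)[..., 1])

same = np.array_equal(df_list.to_numpy(), df_numpy.to_numpy())
print(df_list.shape, df_numpy.shape, same)
```

The numpy route needs every inner list to have the same length and every tuple the same size, otherwise `np.array` cannot build a regular 3-D array. Whether it is actually faster depends on the data, since `np.array` still has to walk the nested Python lists, so it is worth timing both approaches with `timeit` on the real input.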
