How to efficiently select specific columns with a condition in Python

For example, I have the following 2D array:

ls = [
    [1,2,3,4,'A',5],
    [1,2,3,4,'A',5],
    [1,2,3,4,'A',5],
    [-1,-2,-3,-4,'B',-5],
    [-1,-2,-3,-4,'B',-5],
    [-1,-2,-3,-4,'B',-5]
]

I want to select the 1st, 3rd, and 4th columns of ls and save each column into a new list. Moreover, I want the selection to be conditioned on the 5th column, i.e. on whether it is 'A' or 'B', as follows:

la1 = [int(x[0]) for x in ls if 'A' in x[4]]
la2 = [int(x[2]) for x in ls if 'A' in x[4]]
la3 = [float(x[3]) for x in ls if 'A' in x[4]]
lb1 = [int(x[0]) for x in ls if 'B' in x[4]]
lb2 = [int(x[2]) for x in ls if 'B' in x[4]]
lb3 = [float(x[3]) for x in ls if 'B' in x[4]]

I know my implementation is not efficient for large arrays. Is there a better implementation? Thank you all for helping me!

You can merge your six list comprehensions into two:

la1, la2, la3= zip(*((x[0], x[2], float(x[3])) for x in ls if 'A' in x[4]))
lb1, lb2, lb3= zip(*((x[0], x[2], float(x[3])) for x in ls if 'B' in x[4]))

This first creates a sequence of 3-tuples (x[0], x[2], float(x[3])) with a generator expression, then uses the old zip(*values) trick to transpose it and unpack it into the la1, la2, la3 variables. Note that the results are tuples, not lists.
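To see the zip(*values) transposition in isolation, here is a minimal sketch using a few made-up rows (the names rows, first, third, fourth, and empty are illustrative and not from the question). Unpacking fails when no row matches the condition, so the sketch also shows one possible guard for an empty result.

```python
# Hypothetical sample rows: (column 1, column 3, column 4 as float)
rows = [(1, 3, 4.0), (1, 3, 4.0), (-1, -3, -4.0)]

# zip(*rows) pairs up the i-th element of every row, i.e. it transposes
first, third, fourth = zip(*rows)
print(first)   # (1, 1, -1)
print(fourth)  # (4.0, 4.0, -4.0)

# With no matching rows, zip(*...) yields nothing, so unpacking into three
# names would raise ValueError; fall back to three empty tuples instead.
empty = []
e1, e2, e3 = tuple(zip(*empty)) or ((), (), ())
```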


More efficient than that would be a simple loop, which passes over ls only once:

la1, la2, la3 = [], [], []
lb1, lb2, lb3 = [], [], []
for x in ls:
    if 'A' in x[4]:
        la1.append(x[0])
        la2.append(x[2])
        la3.append(float(x[3]))
    if 'B' in x[4]:
        lb1.append(x[0])
        lb2.append(x[2])
        lb3.append(float(x[3]))
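One detail worth noting about the condition used above: 'A' in x[4] is a substring test, not an equality test. A small sketch, with made-up labels that do not appear in the question, of how the two can differ:

```python
# Hypothetical labels, not from the question
labels = ['A', 'AB', 'B']

substring_hits = [lab for lab in labels if 'A' in lab]  # substring test
equality_hits = [lab for lab in labels if lab == 'A']   # exact match

print(substring_hits)  # ['A', 'AB']
print(equality_hits)   # ['A']
```

With the question's data, where the label is always exactly 'A' or 'B', both tests select the same rows.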

You can try NumPy, a highly efficient array library for Python:

import numpy as np

ls = np.array([  # wrap ls into numpy array
    [1,2,3,4,'A',5],
    [1,2,3,4,'A',5],
    [1,2,3,4,'A',5],
    [-1,-2,-3,-4,'B',-5],
    [-1,-2,-3,-4,'B',-5],
    [-1,-2,-3,-4,'B',-5]
])

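a_rows = ls[:,4] == 'A'  # boolean mask: rows with 'A' in the 5th column (index 4)
b_rows = ls[:,4] == 'B'
# mixing ints and strings makes every element of ls a string,
# so convert the selected columns back to numbers
col_1 = ls[:,0].astype(int)  # select first column
col_3 = ls[:,2].astype(int)
col_4 = ls[:,3].astype(float)
la1 = col_1[a_rows]  # first column restricted to the rows with 'A'
la2 = col_3[a_rows]
la3 = col_4[a_rows]
lb1 = col_1[b_rows]
lb2 = col_3[b_rows]
lb3 = col_4[b_rows]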
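When a NumPy array is built from rows that mix numbers and strings, every element is stored as a string. Below is a minimal sketch of one way around this, assuming the label always sits at index 4: keep the labels and the numeric columns in separate arrays. The names rows, mixed, labels, nums, and is_a are illustrative only.

```python
import numpy as np

rows = [
    [1, 2, 3, 4, 'A', 5],
    [1, 2, 3, 4, 'A', 5],
    [-1, -2, -3, -4, 'B', -5],
]

mixed = np.array(rows)
print(mixed.dtype.kind)  # U (every element became a unicode string)

# Keep the labels and the numbers in separate, homogeneous arrays
labels = np.array([r[4] for r in rows])
nums = np.array([r[:4] + r[5:] for r in rows], dtype=float)

is_a = labels == 'A'                        # boolean row mask
la1, la2, la3 = nums[is_a][:, [0, 2, 3]].T  # one array per selected column
print(la1.tolist())  # [1.0, 1.0]
```

All values in nums are floats here; a column that must hold integers could be converted afterwards with .astype(int).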

I think storing your lists in a dictionary would be wise if you have many of them. A for loop might also be quicker, since you are essentially splitting the data based on a condition:

d = {'la1': [],
     'la3': [],
     'la4': [],
     'lb1': [],
     'lb3': [],
     'lb4': []}

ls = [[1,2,3,4,'A',5],
      [1,2,3,4,'A',5],
      [1,2,3,4,'A',5],
      [-1,-2,-3,-4,'B',-5],
      [-1,-2,-3,-4,'B',-5],
      [-1,-2,-3,-4,'B',-5]]

for sublist in ls:
    if sublist[4] == "A":
        d['la1'].append(int(sublist[0]))
        d['la3'].append(int(sublist[2]))
        d['la4'].append(float(sublist[3]))
    elif sublist[4] == "B":
        d['lb1'].append(int(sublist[0]))
        d['lb3'].append(int(sublist[2]))
        d['lb4'].append(float(sublist[3]))

print(d)

# Output (key order as inserted, Python 3.7+):
#{'la1': [1, 1, 1], 'la3': [3, 3, 3], 'la4': [4.0, 4.0, 4.0], 'lb1': [-1, -1, -1], 'lb3': [-3, -3, -3], 'lb4': [-4.0, -4.0, -4.0]}
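If the set of labels is not known in advance, the same single-pass idea can be sketched with collections.defaultdict, grouping by whatever label appears in the 5th column. The function name split_columns and its parameters are hypothetical, chosen for illustration only.

```python
from collections import defaultdict

def split_columns(rows, label_index=4, columns=(0, 2, 3)):
    """Group the chosen columns by the label found at label_index."""
    # One list per selected column, created on first sight of a label
    groups = defaultdict(lambda: [[] for _ in columns])
    for row in rows:
        target = groups[row[label_index]]
        for out, col in zip(target, columns):
            out.append(row[col])
    return dict(groups)

ls = [
    [1, 2, 3, 4, 'A', 5],
    [-1, -2, -3, -4, 'B', -5],
    [1, 2, 3, 4, 'A', 5],
]
result = split_columns(ls)
la1, la2, la3 = result['A']
print(la1, la2, la3)  # [1, 1] [3, 3] [4, 4]
```

This keeps a single pass over the data and avoids hard-coding one branch per label.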

Use NumPy arrays; for vectorized operations on large data they are typically faster than normal lists. Try running each line of the code provided below:

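import numpy as np

# mixed ints and strings: NumPy stores every element as a string
ls = np.array([[1,2,3,4,'A',5],[1,2,3,4,'A',5],[1,2,3,4,'A',5],[-1,-2,-3,-4,'B',-5],[-1,-2,-3,-4,'B',-5],[-1,-2,-3,-4,'B',-5]])
filterA = (ls[:,4]=='A')  # boolean mask for rows labelled 'A'
filterB = (ls[:,4]=='B')
newarrayA=ls[filterA]
newarrayB=ls[filterB]
selectedcolumnsA=newarrayA[:,(0,2,3)]
selectedcolumnsB=newarrayB[:,(0,2,3)]
# convert the string columns back to numbers
la1,la2,la3=selectedcolumnsA[:,0].astype(int),selectedcolumnsA[:,1].astype(int),selectedcolumnsA[:,2].astype(float)
lb1,lb2,lb3=selectedcolumnsB[:,0].astype(int),selectedcolumnsB[:,1].astype(int),selectedcolumnsB[:,2].astype(float)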

Hope it helps. If you are not comfortable with it, try learning NumPy. It will surely help you in the future.
