
Remove rows of a numpy array based on a specific condition

I have an array of four rows, A = array([[-1, -1, -1, -1], [-1, -1, 1, 2], [-1, -1, 1, 1], [2, 1, -1, 2]]). Each row contains 4 numbers. How do I remove row #3 and row #4? In row #3 and row #4, the values 1 and 2 respectively appear more than once.

Is there a faster way to do this for an arbitrary number of rows and columns? The main aim is to remove those rows in which a non-negative number appears more than once.

You can use something like this: first build a dictionary of the occurrences of each value in each row using np.unique, then keep only the rows in which no positive number appears more than once.

import numpy as np

A = np.array([[-1, -1, -1, -1], [-1, -1, 1, 2], [-1, -1, 1, 1], [2, 1, -1, 2]])

new_array = []

# loop through each array
for array in A:
    # Get a dictionary of the counts of each value
    unique, counts = np.unique(array, return_counts=True)
    counts = dict(zip(unique, counts))
    # Find the number of occurrences of positive numbers
    positive_occurences = [value for key, value in counts.items() if key > 0]
    # Append to new_array if no positive number appears more than once
    if any(y > 1 for y in positive_occurences):
        continue
    else:
        new_array.append(array)

new_array = np.array(new_array)

This returns:

array([[-1, -1, -1, -1],
       [-1, -1,  1,  2]])
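
For a single row, the counting step can be inspected on its own. This is a minimal sketch using the last row of A; the names `row` and `has_repeat` are only illustrative and are not part of the answer above.

```python
import numpy as np

row = np.array([2, 1, -1, 2])  # last row of A

# np.unique returns the sorted distinct values; with return_counts=True
# it also returns how many times each of them occurs.
unique, counts = np.unique(row, return_counts=True)
# unique is [-1, 1, 2] and counts is [1, 1, 2]

# The row is dropped if any strictly positive value occurs more than once.
has_repeat = bool(np.any(counts[unique > 0] > 1))
print(has_repeat)  # True
```

Indexing `counts` with the mask `unique > 0` does the same filtering as the dictionary comprehension in the loop, without building the dictionary.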

My fully vectorized approach:

  • sort each row
  • detect duplicates by shifting the sorted array to the left by one and comparing it with itself
  • mark rows with positive duplicates
  • drop the marked rows

import numpy as np
a = np.array([[-1, -1, -1, -1], [-1, -1, 1, 2], [-1, -1, 1, 1], [2, 1, -1, 2]])

# sort each row
b = np.sort(a)

# mark positive duplicates
drop = np.any((b[:,1:]>0) & (b[:,1:] == b[:,:-1]), axis=1)

# drop
aa = a[~drop, :]

Output:
array([[-1, -1, -1, -1],
       [-1, -1,  1,  2]])
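
To make the steps easier to follow, here is the same computation with the intermediate arrays spelled out. This is only a sketch; the names `same_as_left`, `positive` and `dup` are illustrative and do not appear in the answer.

```python
import numpy as np

a = np.array([[-1, -1, -1, -1], [-1, -1, 1, 2], [-1, -1, 1, 1], [2, 1, -1, 2]])

# np.sort works along the last axis by default, i.e. within each row.
b = np.sort(a)
# b is now:
# [[-1 -1 -1 -1]
#  [-1 -1  1  2]
#  [-1 -1  1  1]
#  [-1  1  2  2]]

# After sorting, equal values are adjacent, so a duplicate shows up
# as an element that equals its left neighbour.
same_as_left = b[:, 1:] == b[:, :-1]
positive = b[:, 1:] > 0          # only strictly positive values count
dup = same_as_left & positive

drop = dup.any(axis=1)
print(drop.tolist())  # [False, False, True, True]
```

The question says "non-negative"; if a repeated 0 should also remove a row, `> 0` would become `>= 0`.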

I also modified it to store the indices of the removed rows:

import numpy as np

A = np.array([[-1, -1, -1, -1], [-1, -1, 1, 2], [-1, -1, 1, 1], [2, 1, -1, 2]])

new_array = []
indiceStore = []  # indices of the rows that get removed

# loop through each row together with its index
for i, array in enumerate(A):
    # Get a dictionary of the counts of each value
    unique, counts = np.unique(array, return_counts=True)
    counts = dict(zip(unique, counts))
    # Find the number of occurrences of positive numbers
    positive_occurences = [value for key, value in counts.items() if key > 0]
    # Record the index if a positive number appears more than once, otherwise keep the row
    if any(y > 1 for y in positive_occurences):
        # int(array) raises a TypeError for a whole row, so store the row index instead
        indiceStore.append(i)
        continue
    else:
        new_array.append(array)

new_array = np.array(new_array)
indiceStore = np.array(indiceStore)

Let me know if this is right.
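
If only the indices are needed, the boolean mask from the vectorized approach can give them directly, without a Python loop. This is a sketch rather than a definitive answer; `removed_idx` and `kept` are illustrative names.

```python
import numpy as np

A = np.array([[-1, -1, -1, -1], [-1, -1, 1, 2], [-1, -1, 1, 1], [2, 1, -1, 2]])

# Same mask as in the vectorized approach: True for rows in which
# a strictly positive value appears more than once.
b = np.sort(A)
drop = np.any((b[:, 1:] > 0) & (b[:, 1:] == b[:, :-1]), axis=1)

removed_idx = np.flatnonzero(drop)  # integer indices of the dropped rows
kept = A[~drop]                     # rows that survive

print(removed_idx.tolist())  # [2, 3]
```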

Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish, please credit this site's URL or the original source address.