How to find the longest consecutive occurrence of non-zero elements in a 2D numpy array

I am simulating protein folding on a 2D grid where every angle is either ±90° or 0°, and have the following problem:

I have an n-by-n numpy array filled with zeros, except for certain places where the value is an integer from 1 to n. Every integer appears exactly once. Integer k is always a nearest neighbour of k-1 and k+1, except at the endpoints. The array is stored as an attribute of the class Grid, which I created for doing energy calculations and folding the protein. Example array, with n=5:

>>> from Grid import Grid
>>> a = Grid(5)
>>> a.show()
[[0 0 0 0 0]
 [0 0 0 0 0]
 [1 2 3 4 5]
 [0 0 0 0 0]
 [0 0 0 0 0]]

My goal is to find the longest consecutive line of non-zero elements without any bends. In the above case, the result should be 5.

My idea so far is something like this:

def getDiameter(self):
    indexes = np.zeros((self.n, 2))
    for i in range(1, self.n + 1):
        indexes[i - 1] = np.argwhere(self.array == i)[0]

    for i in range(self.n):
        j = 1
        currentDiameter = 1
        while indexes[0][i] == indexes[0][i + j] and i + j <= self.n:
            currentDiameter += 1
            j += 1

        while indexes[i][0] == indexes[i + j][0] and i + j <= self.n:
            currentDiameter += 1
            j += 1

        if currentDiameter > diameter:
            diameter = currentDiameter

    return diameter

This has two problems: (1) it doesn't work, and (2) it would be horribly inefficient even if I got it to work. I am wondering if anybody has a better way of doing this. If anything is unclear, please let me know.

Edit: A less trivial example

[[ 0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0 10  0  0  0]
 [ 0  0  0  0  0  0  9  0  0  0]
 [ 0  0  0  0  0  0  8  0  0  0]
 [ 0  0  0  4  5  6  7  0  0  0]
 [ 0  0  0  3  0  0  0  0  0  0]
 [ 0  0  0  2  1  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0]
 [ 0  0  0  0  0  0  0  0  0  0]]

The correct answer here is 4 (both the longest column and the longest row have four non-zero elements).

What I understood from your question is that you need to find the length of the longest run of consecutive elements in a numpy array (row by row).

So for the array below, the output should be 5:

[[1 2 3 4 0]
 [0 0 0 0 0]
 [10 11 12 13 14]
 [0 1 2 3 0]
 [1 0 0 0 0]]

Because [10 11 12 13 14] are consecutive elements, and that run is longer than any run of consecutive elements in any other row.

If this is what you are expecting, consider this:

import numpy as np
from itertools import groupby

a = np.array([[1, 2, 3, 4, 0],
              [0, 0, 0, 0, 0],
              [10, 11, 12, 13, 14],
              [0, 1, 2, 3, 0],
              [1, 0, 0, 0, 0]])

a = a.astype(float)
a[a == 0] = np.nan   # zeros become NaN so that 0 -> 1 is not counted as a consecutive pair
b = np.diff(a)       # row-wise differences; consecutive numbers differ by 1
counter = []
for line in b:       # for each row
    if 1 in line:    # the row contains at least one consecutive pair
        # longest run of 1's in this row, plus one to count elements rather than gaps
        counter.append(max(sum(1 for _ in g) for k, g in groupby(line) if k == 1) + 1)
print(max(counter))  # longest run over all rows
# 5

For your particular example:

[[0 0 0 0 0]
 [0 0 0 0 0]
 [1 2 3 4 5]
 [0 0 0 0 0]
 [0 0 0 0 0]]
# 5
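
The snippet above scans rows only and only matches ascending runs (a difference of exactly +1). The less trivial example in the question also has a vertical run, 10 9 8 7, that descends when read top to bottom. A minimal sketch of one way to cover both cases follows, assuming it is acceptable to compare absolute differences and to repeat the scan on the transposed array; the function name `longest_consecutive_run` is a hypothetical helper, not part of the original answer.

```python
import numpy as np
from itertools import groupby

def longest_consecutive_run(grid):
    """Longest straight run of chain-consecutive values in any row or column."""
    grid = np.asarray(grid, dtype=float)
    if not np.any(grid):
        return 0                                  # no non-zero cells at all
    grid = np.where(grid == 0, np.nan, grid)      # NaN never differs from a value by 1
    best = 1                                      # a lone non-zero cell is a run of 1
    for view in (grid, grid.T):                   # rows first, then columns
        # True wherever two horizontally adjacent cells hold k and k +/- 1
        is_step = np.abs(np.diff(view, axis=1)) == 1
        for line in is_step:
            for flag, group in groupby(line):
                if flag:
                    # a run of m adjacent steps joins m + 1 cells
                    best = max(best, sum(1 for _ in group) + 1)
    return best

# The 10x10 example from the question's edit.
example = np.array([
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 10, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 9, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 8, 0, 0, 0],
    [0, 0, 0, 4, 5, 6, 7, 0, 0, 0],
    [0, 0, 0, 3, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 2, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
])
print(longest_consecutive_run(example))  # 4
```

Transposing reuses the same row scan for columns, so the run-finding logic is written only once.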

Start by finding the longest run of consecutive non-zero elements in a list:

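def find_longest(l):
    counter = 0      # length of the current run of non-zero elements
    counters = []    # lengths of all runs that have ended so far
    for i in l:
        if i == 0:
            # a zero ends the current run
            counters.append(counter)
            counter = 0
        else:
            counter += 1
    counters.append(counter)  # the last run may reach the end of the list
    return max(counters)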
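
The same run-length idea can probably be written more compactly with `itertools.groupby`. The sketch below is one possible formulation, not the answer's own code, and `find_longest_groupby` is a hypothetical name. It groups the sequence by truthiness and keeps the longest group of non-zero values.

```python
from itertools import groupby

def find_longest_groupby(values):
    """Length of the longest run of non-zero entries in a 1D sequence."""
    return max(
        # groupby(key=bool) splits the sequence into alternating zero / non-zero runs
        (sum(1 for _ in run) for nonzero, run in groupby(values, key=bool) if nonzero),
        default=0,  # no non-zero entries (or an empty sequence)
    )

print(find_longest_groupby([0, 0, 0, 4, 5, 6, 7, 0, 0, 0]))  # 4
print(find_longest_groupby([0, 0, 0]))                       # 0
```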

Now you can apply this function to each row and each column of the array, and take the maximum:

longest_occurrences = [find_longest(row) for row in a] + [find_longest(col) for col in a.T]
longest_occurrence = max(longest_occurrences)
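
One caveat: counting non-zero cells per row and column also counts cells that touch on the grid without being neighbours in the chain. If "without any bends" is meant to follow the chain itself, which is an assumption about the question's intent, a sketch along the following lines may be closer. It sorts the occupied cells into chain order, takes the step between successive residues, and looks for the longest stretch of identical steps. The name `longest_straight_segment` is hypothetical.

```python
import numpy as np

def longest_straight_segment(grid):
    """Longest stretch of the chain 1..n that runs in a single direction."""
    grid = np.asarray(grid)
    rows, cols = np.nonzero(grid)
    if rows.size == 0:
        return 0
    order = np.argsort(grid[rows, cols])                  # chain order 1, 2, ..., n
    coords = np.column_stack((rows[order], cols[order]))  # (row, col) of each residue
    steps = np.diff(coords, axis=0)                       # move from residue k to k + 1
    best = current = 1
    for k in range(len(steps)):
        if k > 0 and np.array_equal(steps[k], steps[k - 1]):
            current += 1      # same direction as the previous step: extend the segment
        else:
            current = 2       # a bend (or the first step) starts a two-residue segment
        best = max(best, current)
    return best

# A folded chain where cells 1, 4, 5 share a column but 1 and 4 are not chain neighbours.
snake = np.array([[1, 2],
                  [4, 3],
                  [5, 6]])
print(longest_straight_segment(snake))  # 2
```

For this `snake` array the row-and-column count above would report 3 for the first column, whereas following the chain gives 2. For the two examples in the question both approaches agree (5 and 4).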
