Get all component stats of multiple arrays labeled by one of them

I already asked a similar question, which was answered, but this one goes into more detail:

I need a really fast way to get all the important component stats of two arrays, where one array is labeled with OpenCV (cv2) and defines the component regions for both arrays. The stats for all components, masked on the two arrays, should then be saved to a dictionary. My approach works, but it is much too slow. Is there a way to avoid the loop, or a better approach than ndimage.labeled_comprehension?

from scipy import ndimage
import numpy as np
import cv2

def calculateMeanMaxMin(val):
    return np.array([np.mean(val),np.max(val),np.min(val)])

def getTheStatsForComponents(array1,array2):
    ret, thresholded= cv2.threshold(array2, 120, 255, cv2.THRESH_BINARY)
    thresholded= thresholded.astype(np.uint8)
    numLabels, labels, stats, centroids = cv2.connectedComponentsWithStats(thresholded, 8, cv2.CV_8UC1)
    allComponentStats=[]
    meanmaxminArray2 = ndimage.labeled_comprehension(array2, labels, np.arange(1, numLabels+1), calculateMeanMaxMin, np.ndarray, 0)
    meanmaxminArray1 = ndimage.labeled_comprehension(array1, labels, np.arange(1, numLabels+1), calculateMeanMaxMin, np.ndarray, 0)
    for position, label in enumerate(range(1, numLabels)):
        currentLabel = np.uint8(labels== label)
        contour, _ = cv2.findContours(currentLabel, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
        (side1,side2)=cv2.minAreaRect(contour[0])[1]
        componentStat = stats[label]
        allstats = {'position':centroids[label,:],'area':componentStat[4],'height':componentStat[3],
                              'width':componentStat[2],'meanArray1':meanmaxminArray1[position][0],'maxArray1':meanmaxminArray1[position][1],
                              'minArray1':meanmaxminArray1[position][2],'meanArray2':meanmaxminArray2[position][0],'maxArray2':meanmaxminArray2[position][1],
                              'minArray2':meanmaxminArray2[position][2]}

        if side1 >= side2 and side1 > 0:
            allstats['elongation'] = np.float32(side2 / side1)
        elif side2 > side1 and side2 > 0:
            allstats['elongation'] = np.float32(side1 / side2)
        else:
            allstats['elongation'] = np.float32(0)
        allComponentStats.append(allstats)
    return allComponentStats

EDIT

The two arrays are 2D arrays:

array1= np.random.choice(255,(512,512)).astype(np.uint8)
array2= np.random.choice(255,(512,512)).astype(np.uint8)

EDIT2

A small example of two arrays and the labelArray with two components (1 and 2, plus background 0). Calculate the min, max and mean with ndimage.labeled_comprehension.

from scipy import ndimage
import numpy as np

labelArray = np.array([[0,1,1,1],[2,2,1,1],[2,2,0,1]])
data = np.array([[0.1,0.2,0.99,0.2],[0.34,0.43,0.87,0.33],[0.22,0.53,0.1,0.456]])
data2 = np.array([[0.1,0.2,0.99,0.2],[0.1,0.2,0.99,0.2],[0.1,0.2,0.99,0.2]])
numLabels = 2

minimumDataForAllLabels = ndimage.labeled_comprehension(data, labelArray, np.arange(1, numLabels+1), np.min, np.ndarray, 0)
minimumData2ForallLabels = ndimage.labeled_comprehension(data2, labelArray, np.arange(1, numLabels+1), np.min, np.ndarray, 0)
print(minimumDataForAllLabels)
print(minimumData2ForallLabels)
print(bin_and_do_simple_stats(labelArray.flatten(),data.flatten()))

Output:

[0.2 0.22] ##minimum of component 1 and 2 from data
[0.2 0.1] ##minimum of component 1 and 2 from data2
[0.1  0.2  0.22] ##minimum output of bin_and_do_simple_stats from data

labeled_comprehension is definitely slow.

At least the simple stats can be computed much faster, based on the linked post. For simplicity I'm only handling one data array, but since the procedure returns the sort indices, it can easily be extended to multiple arrays:

import numpy as np    
from scipy import sparse
try:
    from stb_pthr import sort_to_bins as _stb_pthr
    HAVE_PYTHRAN = True
except ImportError:
    HAVE_PYTHRAN = False

# fallback if pythran not available

def sort_to_bins_sparse(idx, data, mx=-1):
    if mx==-1:
        mx = idx.max() + 1    
    aux = sparse.csr_matrix((data, idx, np.arange(len(idx)+1)), (len(idx), mx)).tocsc()
    return aux.data, aux.indices, aux.indptr

def sort_to_bins_pythran(idx, data, mx=-1):
    indices, indptr = _stb_pthr(idx, mx)
    return data[indices], indices, indptr

# pick best available

sort_to_bins = sort_to_bins_pythran if HAVE_PYTHRAN else sort_to_bins_sparse

# example data

idx = np.random.randint(0,10,(100000))
data = np.random.random(100000)

# if possible compare the two methods

if HAVE_PYTHRAN:
    dsp,isp,psp = sort_to_bins_sparse(idx,data)
    dph,iph,pph = sort_to_bins_pythran(idx,data)

    assert (dsp==dph).all()
    assert (isp==iph).all()
    assert (psp==pph).all()

# example how to do simple vectorized calculations

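def simple_stats(data,iptr):
    # data is sorted by bin; iptr[i] is the start of bin i, iptr[-1] == len(data)
    # reduceat reduces every segment [iptr[i], iptr[i+1]) in a single call
    minima = np.minimum.reduceat(data,iptr[:-1])
    means = np.add.reduceat(data,iptr[:-1]) / np.diff(iptr)
    return minima, means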

def bin_and_do_simple_stats(idx,data,mx=-1):
    data,indices,indptr = sort_to_bins(idx,data,mx)
    return simple_stats(data,indptr)

print("minima: {}\n mean values: {}".format(*bin_and_do_simple_stats(idx,data)))
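The extension to several arrays mentioned above could look roughly like the sketch below. It is not part of the original answer: the helper name grouped_min_max_mean is hypothetical, and it swaps the sparse/pythran binning for a plain stable np.argsort so that it only needs NumPy. It assumes the labels are non-negative integers and that every label from 0 up to the largest one occurs at least once (reduceat gives misleading values for empty bins). The labels are sorted once, and the resulting order is reused for each data array.

```python
import numpy as np

def grouped_min_max_mean(labels, *arrays):
    """Per-label (min, max, mean) for each array, reusing one sort of the labels.

    Hypothetical helper; assumes every label 0..labels.max() is present.
    """
    flat = labels.ravel()
    order = np.argsort(flat, kind="stable")   # sort indices, computed once
    counts = np.bincount(flat)                # number of pixels per label
    starts = np.concatenate(([0], np.cumsum(counts)))[:-1]  # start of each bin
    results = []
    for arr in arrays:
        # cast to float so sums of uint8 images cannot overflow
        s = arr.ravel()[order].astype(float)
        results.append((np.minimum.reduceat(s, starts),
                        np.maximum.reduceat(s, starts),
                        np.add.reduceat(s, starts) / counts))
    return results

# the small example from EDIT2 of the question
labelArray = np.array([[0, 1, 1, 1], [2, 2, 1, 1], [2, 2, 0, 1]])
data = np.array([[0.1, 0.2, 0.99, 0.2], [0.34, 0.43, 0.87, 0.33], [0.22, 0.53, 0.1, 0.456]])
data2 = np.array([[0.1, 0.2, 0.99, 0.2], [0.1, 0.2, 0.99, 0.2], [0.1, 0.2, 0.99, 0.2]])

stats_data, stats_data2 = grouped_min_max_mean(labelArray, data, data2)
print(stats_data[0])   # minima of data, index 0 is the background
print(stats_data2[0])  # minima of data2
```

Entry 0 of each result belongs to the background label; entries 1 and 2 correspond to the two components, so dropping the first entry gives the values that labeled_comprehension returned in the question.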

If you have pythran (not required, but a bit faster), compile this as <stb_pthr.py>:

import numpy as np

#pythran export sort_to_bins(int[:], int)

def sort_to_bins(idx, mx):
    if mx==-1:
        mx = idx.max() + 1
    cnts = np.zeros(mx + 2, int)
    for i in range(idx.size):
        cnts[idx[i]+2] += 1
    for i in range(2, cnts.size):
        cnts[i] += cnts[i-1]
    res = np.empty_like(idx)
    for i in range(idx.size):
        res[cnts[idx[i]+1]] = i
        cnts[idx[i]+1] += 1
    return res, cnts[:-1]
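To see what this counting sort returns, here is a small check in plain Python, added for illustration only. The function name sort_to_bins_reference is hypothetical; its body repeats the loops above without compilation. The first return value is the stable sort order of idx, and the second is the array of bin boundaries (indptr), so both can be compared against np.argsort and a cumulative np.bincount.

```python
import numpy as np

def sort_to_bins_reference(idx, mx=-1):
    # uncompiled copy of the counting sort above (hypothetical name)
    if mx == -1:
        mx = idx.max() + 1
    cnts = np.zeros(mx + 2, int)
    for i in range(idx.size):          # histogram, shifted by two slots
        cnts[idx[i] + 2] += 1
    for i in range(2, cnts.size):      # prefix sum: cnts[k+1] = start of bin k
        cnts[i] += cnts[i - 1]
    res = np.empty_like(idx)
    for i in range(idx.size):          # scatter positions into their bins
        res[cnts[idx[i] + 1]] = i
        cnts[idx[i] + 1] += 1
    return res, cnts[:-1]

idx = np.array([2, 0, 1, 0, 2, 1, 0])
order, indptr = sort_to_bins_reference(idx)

# the same quantities from NumPy built-ins
expected_order = np.argsort(idx, kind="stable")
expected_indptr = np.concatenate(([0], np.cumsum(np.bincount(idx))))

print(order, indptr)
print((order == expected_order).all(), (indptr == expected_indptr).all())
```

Bin k then covers order[indptr[k]:indptr[k+1]], which is the segment layout that the reduceat calls in simple_stats rely on.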
