
How to group elements of a numpy array with the same value in separate numpy arrays

The usual intro: I am a tyro in Python. However, I have quite a big project to code: a surface flow model based on cellular automata. I also want to include building roofs in my model. Imagine you have an ASCII file that marks buildings with 1s, while everything else is 0. There are just those two states. Now I want to find all adjacent cells that belong to the same building and store them (or rather their y and x coordinates plus one more value, maybe elevation, so 3 columns) in an individual array per building. Keep in mind that buildings can have any shape, but diagonally connected cells do not belong to the same building. So only the northern, southern, western and eastern neighbours of a cell can belong to the same building.

I did my homework and googled it, but so far I couldn't find a satisfying answer.

Example: initial land-cover array:

([0,0,0,0,0,0,0]
 [0,0,1,0,0,0,0]
 [0,1,1,1,0,1,1]
 [0,1,0,1,0,0,1]
 [0,0,0,0,0,0,0])

Output (I need to know the coordinates of the cells in my initial array):

 building_1=([1,2],[2,1],[2,2],[2,3],[3,1],[3,3])
 building_2=([2,5],[2,6],[3,6])

Any help is greatly appreciated!

You can use the label function from scipy.ndimage to identify the distinct buildings.

Here's your example array, containing two buildings:

In [57]: a
Out[57]: 
array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0],
       [0, 1, 1, 1, 0, 1, 1],
       [0, 1, 0, 1, 0, 0, 1],
       [0, 0, 0, 0, 0, 0, 0]])

Import label:

In [58]: from scipy.ndimage import label

Apply label to a. It returns two values: the array of labeled positions, and the number of distinct objects (buildings, in this case) found.

In [59]: lbl, nlbls = label(a)

In [60]: lbl
Out[60]: 
array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0],
       [0, 1, 1, 1, 0, 2, 2],
       [0, 1, 0, 1, 0, 0, 2],
       [0, 0, 0, 0, 0, 0, 0]], dtype=int32)

In [61]: nlbls
Out[61]: 2
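
The question rules out diagonal neighbours. As far as I know, label's default structuring element in 2-D already connects only the north, south, west and east neighbours, but it can be spelled out with the structure argument. The sketch below does that and, using a small made-up array (diag, not part of the question), shows what changes when diagonals are allowed:

```python
import numpy as np
from scipy.ndimage import label, generate_binary_structure

a = np.array([[0, 0, 0, 0, 0, 0, 0],
              [0, 0, 1, 0, 0, 0, 0],
              [0, 1, 1, 1, 0, 1, 1],
              [0, 1, 0, 1, 0, 0, 1],
              [0, 0, 0, 0, 0, 0, 0]])

# Connectivity 1 in 2-D: only N, S, W, E neighbours count as connected.
four_conn = generate_binary_structure(2, 1)
# Connectivity 2 in 2-D: diagonal neighbours count as connected too.
eight_conn = generate_binary_structure(2, 2)

lbl4, n4 = label(a, structure=four_conn)

# Made-up array with two cells that touch only diagonally.
diag = np.array([[1, 0],
                 [0, 1]])
n_diag_four = label(diag, structure=four_conn)[1]    # two separate objects
n_diag_eight = label(diag, structure=eight_conn)[1]  # merged into one object

print(n4, n_diag_four, n_diag_eight)
```

For the building problem the four-neighbour structure is the one that matches the stated rule.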

To get the coordinates of a building, np.where can be used. For example,

In [64]: np.where(lbl == 2)
Out[64]: (array([2, 2, 3]), array([5, 6, 6]))

It returns a tuple of arrays; the kth array holds the coordinates along the kth dimension. You can use, for example, np.column_stack to combine these into an array:

 In [65]: np.column_stack(np.where(lbl == 2))
 Out[65]: 
 array([[2, 5],
        [2, 6],
        [3, 6]])

You might want a list of all the coordinate arrays. Here's one way to create such a list.

For convenience, first create a list of labels:

In [66]: labels = list(range(1, nlbls+1))

In [67]: labels
Out[67]: [1, 2]

Use a list comprehension to create the list of coordinate arrays.

In [68]: coords = [np.column_stack(np.where(lbl == k)) for k in labels]

In [69]: coords
Out[69]: 
[array([[1, 2],
       [2, 1],
       [2, 2],
       [2, 3],
       [3, 1],
       [3, 3]]),
 array([[2, 5],
       [2, 6],
       [3, 6]])]

Now your building data is in labels and coords. For example, the first building was labeled labels[0], and its coordinates are in coords[0]:

In [70]: labels[0]
Out[70]: 1

In [71]: coords[0]
Out[71]: 
array([[1, 2],
       [2, 1],
       [2, 2],
       [2, 3],
       [3, 1],
       [3, 3]])
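
The question also asks for a third column (maybe elevation) next to y and x. A minimal sketch of one way to add it, assuming an elevation grid with the same shape as the land-cover array; the elevation values here are made up purely for illustration:

```python
import numpy as np
from scipy.ndimage import label

a = np.array([[0, 0, 0, 0, 0, 0, 0],
              [0, 0, 1, 0, 0, 0, 0],
              [0, 1, 1, 1, 0, 1, 1],
              [0, 1, 0, 1, 0, 0, 1],
              [0, 0, 0, 0, 0, 0, 0]])

# Hypothetical elevation grid (made-up values), same shape as `a`.
elevation = np.arange(a.size, dtype=float).reshape(a.shape)

lbl, nlbls = label(a)

buildings = []
for k in range(1, nlbls + 1):
    rows, cols = np.where(lbl == k)
    # One row per cell: y index, x index, elevation at that cell.
    buildings.append(np.column_stack((rows, cols, elevation[rows, cols])))

print(buildings[1])
```

Because the elevation column is floating point, the stacked array is float as well, so the indices need a cast back to int before they are used for indexing.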

Thank you for the great answers! Here is a little correction. In my actual landcover array the background value is not 0 but -9999 (0 is too precious for GIS people). I forgot to mention that. Thanks to machine yearning's hint, I made a work-around by replacing every -9999 with 0 through landcover = np.where(landcover > -9999, landcover, 0). After that I can use label. The actual aim was to find the lowest cell of each building and to assign it as the outlet. If somebody has a more efficient way, please let me know!

import numpy as np
from scipy.ndimage import label

The original data set has -9999 as background and 1 for building cells.

landcover = np.array([[-9999,-9999,-9999,-9999,-9999,-9999,1], 
                       [-9999,-9999,1,-9999,-9999,-9999,-9999],
                       [-9999,1,1,1,-9999,1,1], 
                       [-9999,1,-9999,1,-9999,-9999,1], 
                       [-9999,-9999,-9999,-9999,-9999,-9999,-9999]],dtype=int)

Here is a random digital elevation map (DEM).

DEM = np.array([[7,4,3,2,4,5,4], 
               [4,5,5,3,5,6,7],
               [2,6,4,7,4,4,4],
               [3,7,8,8,10,9,7],
               [2,5,7,7,9,10,8]],dtype=float)

I changed all -9999 entries to 0 in order to use label (thanks to machine yearning).

 landcover = np.where(landcover > -9999, landcover, 0)
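
A possible alternative to rewriting the no-data value, sketched under the assumption that building cells are exactly the cells equal to 1: label treats every non-zero entry as foreground, so a boolean mask can be passed to it directly. The names NODATA and building_mask are only illustrative.

```python
import numpy as np
from scipy.ndimage import label

NODATA = -9999  # background value used in the land-cover grid
landcover = np.array([[NODATA, NODATA, NODATA, NODATA, NODATA, NODATA, 1],
                      [NODATA, NODATA, 1, NODATA, NODATA, NODATA, NODATA],
                      [NODATA, 1, 1, 1, NODATA, 1, 1],
                      [NODATA, 1, NODATA, 1, NODATA, NODATA, 1],
                      [NODATA, NODATA, NODATA, NODATA, NODATA, NODATA, NODATA]],
                     dtype=int)

# True where there is a building, False everywhere else.
building_mask = landcover == 1

# The original landcover array stays untouched.
lbl, nlbls = label(building_mask)

print(nlbls)
```

This keeps the -9999 values in landcover available for any later step that still needs them.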

Then I labeled the distinct buildings and counted them (thanks to Warren Weckesser; the rest is pretty much yours).

 lbl, nlbls = label(landcover)
 bldg_number = range(1, nlbls+1)
 bldg_coord = [np.column_stack(np.where(lbl == k)) for k in bldg_number]
 outlets = np.zeros([nlbls, 3])

I iterate over the bldg_coord list to determine the lowest cell of each building, which will be assigned as its outlet.

 for i in range(0, nlbls):
     building=np.zeros([bldg_coord[i].shape[0],3])
     for j in range(0,bldg_coord[i].shape[0]):
         building[j][0]=bldg_coord[i][j][0]
         building[j][1]=bldg_coord[i][j][1]
         building[j][2]=DEM[bldg_coord[i][j][0],bldg_coord[i][j][1]]

Still inside the loop over i, I sort the building array in ascending order of the DEM value of each building cell, to find the lowest-lying building cell.

     building=building[building[:,2].argsort()]

The lowest building cell will be used as the roof outlet for rainwater.

     outlets[i][0]=building[0][0]
     outlets[i][1]=building[0][1]
     outlets[i][2]=bldg_coord[i].shape[0]

Here is the output. The first two columns are indices into the landcover array and the last is the number of cells in the building.

>>> outlets
array([[ 0.,  6.,  1.],
       [ 2.,  2.,  6.],
       [ 2.,  5.,  3.]])
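
On the request for a more efficient way: the inner loop and the sort can probably be dropped, since only the lowest cell is needed. A hedged sketch that picks it with np.argmin on the DEM values of each building, using the same data as above. When several cells share the minimum, argmin returns the first one in row-major order.

```python
import numpy as np
from scipy.ndimage import label

landcover = np.array([[-9999, -9999, -9999, -9999, -9999, -9999, 1],
                      [-9999, -9999, 1, -9999, -9999, -9999, -9999],
                      [-9999, 1, 1, 1, -9999, 1, 1],
                      [-9999, 1, -9999, 1, -9999, -9999, 1],
                      [-9999, -9999, -9999, -9999, -9999, -9999, -9999]],
                     dtype=int)

DEM = np.array([[7, 4, 3, 2, 4, 5, 4],
                [4, 5, 5, 3, 5, 6, 7],
                [2, 6, 4, 7, 4, 4, 4],
                [3, 7, 8, 8, 10, 9, 7],
                [2, 5, 7, 7, 9, 10, 8]], dtype=float)

lbl, nlbls = label(landcover == 1)

# Columns: row index, column index, number of cells in the building.
outlets = np.zeros((nlbls, 3))
for k in range(1, nlbls + 1):
    rows, cols = np.where(lbl == k)
    lowest = np.argmin(DEM[rows, cols])  # position of the lowest cell
    outlets[k - 1] = rows[lowest], cols[lowest], rows.size

print(outlets)
```

There is still one Python-level loop over the buildings, but the work per building is done by NumPy.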

It looks like this function does exactly what you're looking for (from the numpy documentation):

numpy.argwhere(a):

Find the indices of array elements that are non-zero, grouped by element.

>>> x = np.arange(6).reshape(2,3)
>>> x
array([[0, 1, 2],
       [3, 4, 5]])
>>> np.argwhere(x>1)
array([[0, 2],
       [1, 0],
       [1, 1],
       [1, 2]])

Alternatively, it seems like your use case requires using the returned coordinates to index arrays.

The output of argwhere is not suitable for indexing arrays. For this purpose use where(a) instead.

You might want to try numpy.where instead.
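
A small sketch of the difference, reusing the array x from the documentation excerpt. The tuple returned by np.where can index the array directly, while the (N, 2) array from np.argwhere has to be transposed and converted to a tuple first:

```python
import numpy as np

x = np.arange(6).reshape(2, 3)

idx_rows = np.argwhere(x > 1)   # one row of (row, col) per match
idx_tuple = np.where(x > 1)     # tuple of (row indices, column indices)

# The tuple from np.where indexes the array directly.
values = x[idx_tuple]

# The argwhere result needs transposing into a tuple before indexing.
same_values = x[tuple(idx_rows.T)]

print(values, same_values)
```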
