How to sample a huge 2D array in Python using 2x2 arrays to create a dictionary? (Stencil Algorithm for Python)

I am rather new to programming, so I apologise if this is a classic and trivial question. I have a 100x100 2D array of values which is plotted with matplotlib. In this image, each cell has a value (ranging from 0.0 to 1.0) and an ID (ranging from 0 to 9999, starting from the upper left corner). I want to sample the matrix with a 2x2 moving window that produces two dictionaries:

  • 1st dictionary: the key represents the intersection of 4 cells; the value is a tuple with the IDs of the 4 neighboring cells (see image below; the intersection is represented by "N");
  • 2nd dictionary: the key represents the intersection of 4 cells; the value is the mean value of the 4 neighboring cells (see image below).

In the example below (upper left panel), where N has ID=0, the 1st dictionary would yield {'0': (0,1,100,101)}, since the cells are numbered 0 to 99 toward the right-hand side and 0 to 9900 (step=100) downward. The 2nd dictionary would yield {'0': 0.775}, as 0.775 is the average value of the 4 neighboring cells of N. Of course, these dictionaries must have as many keys as there are "intersections" in the 2D array.

How can this be accomplished? And are dictionaries the best "tool" in this case? Thank you guys!

[image: enter image description here]

PS: I tried my own way, but my code is incomplete and wrong, and I cannot get my head around it:

a=... #The 2D array which contains the cell values ranging 0.0 to 1.0
neigh=numpy.zeros(4)
mean_neigh=numpy.zeros(10000/4)
for k in range(len(neigh)):
    for i in a.shape[0]:
        for j in a.shape[1]:
            neigh[k]=a[i][j]
            ...

Well, dictionaries may in fact be the way to go in your case.

Are you sure that the numpy.array format you're using is correct? I can't find any array((int, int)) form in the API. Anyway...

What to do once you have your 2D array declared

To keep things tidy, let's write two functions that work with any square 2D array and return the two dictionaries that you need:

# Returns the first dictionary: window index -> the four cell values of that 2x2 window
def dictionarize1(array):
    dict1 = {}
    count = 0
    for x in range(len(array) - 1):         # rows
        for y in range(len(array[0]) - 1):  # columns
            dict1[count] = [array[x][y], array[x][y+1], array[x+1][y], array[x+1][y+1]]
            count = count + 1
    return dict1

# Returns the second dictionary: window index -> mean of the four cell values
def dictionarize2(array):
    dict2 = {}
    counter = 0
    for a in range(len(array) - 1):         # rows
        for b in range(len(array[0]) - 1):  # columns
            # 4.0 keeps the division floating-point even for integer input
            dict2[counter] = (array[a][b] + array[a][b+1] + array[a+1][b] + array[a+1][b+1]) / 4.0
            counter = counter + 1
    return dict2

#here's a little trial code to see them working

eighties = [[2.0, 2.2, 2.6, 5.7, 4.7], [2.1, 2.3, 2.3, 5.8, 1.6], [2.0, 2.2, 2.6, 5.7, 4.7],[2.0, 2.2, 2.6, 5.7, 4.7],[2.0, 2.2, 2.6, 5.7, 4.7]]
print("Dictionarize1: \n")
print(dictionarize1(eighties))
print("\n\n")
print("Dictionarize2: \n")
print(dictionarize2(eighties))
print("\n\n")
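The question asks for the first dictionary to hold the IDs of the four cells rather than their values. A minimal sketch of that variant follows, assuming the row-major numbering described in the question (ID = row * number of columns + column). The name `neighbor_ids` and the placeholder `grid` are hypothetical and not part of the original answer.

```python
def neighbor_ids(array):
    """Map each 2x2 window (numbered row by row) to the IDs of its four cells.

    Assumes cells are numbered row-major: ID = row * n_cols + col.
    """
    n_rows = len(array)
    n_cols = len(array[0])
    ids = {}
    count = 0
    for r in range(n_rows - 1):
        for c in range(n_cols - 1):
            top_left = r * n_cols + c  # ID of the window's upper-left cell
            ids[count] = (top_left, top_left + 1,
                          top_left + n_cols, top_left + n_cols + 1)
            count += 1
    return ids


grid = [[0.0] * 100 for _ in range(100)]  # placeholder 100x100 grid
print(neighbor_ids(grid)[0])  # (0, 1, 100, 101)
```

Only the shape of the input matters here, so the same function works for non-square grids as well.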

Compared to the first version of this code, I preferred using an integer as the key because the dictionary then prints in key order here. Dictionaries are not sorted by key; since Python 3.7 they preserve insertion order, and the counter inserts the keys in increasing order. However, you can change the key back to a string just by using str(count), as I did before.

I hope this helps. I'm not very experienced with math libraries, but the code I wrote should work well with any square 2D array that you pass as input!

Let's say data is the original numpy.array, with dimensions dr and dc for rows and columns.

dr = data.shape[0]
dc = data.shape[1]

You could produce Keys as a function that returns the indices of interest, and Values as a flat array with the computed mean of the 4 neighbouring cells. In that case, Keys is equal to:

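def Keys(x):
    # x counts 2x2 windows row by row; each window row holds dc - 1 windows,
    # so the flat ID of the upper-left cell is x plus the window's row index.
    xmod = x + x // (dc - 1)  # dc is in scope
    return [xmod, xmod + 1, xmod + dc, xmod + 1 + dc]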

The length of Values is equal to (dr-1) * (dc-1), since the last row and the last column do not start a window. We can compute it as a moving average and reshape it to 1D afterwards (inspiration from link):

Values = ((data[:-1,:-1] + data[1:,:-1] + data[:-1,1:] + data[1:,1:])/4).reshape((dr-1)*(dc-1))

Example:

dr = 3
dc = 5

In: np.array(range(dc*dr)).reshape((dr, dc))  # data
Out: 
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In: [Keys(x) for x in range((dr-1)*(dc-1))]
Out: 
    [[0, 1, 5, 6],
     [1, 2, 6, 7],
     [2, 3, 7, 8],
     [3, 4, 8, 9],
     [5, 6, 10, 11],
     [6, 7, 11, 12],
     [7, 8, 12, 13],
     [8, 9, 13, 14]]

In: Values
Out: array([ 3,  4,  5,  6,  8,  9, 10, 11])
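Putting the pieces together, the two dictionaries from the question can be built directly from the sliced arrays. This is a minimal sketch under the same assumptions as above (row-major cell IDs, windows numbered row by row); the function name `window_dicts` is hypothetical.

```python
import numpy as np


def window_dicts(data):
    """Return (ids, means) for every 2x2 window of a 2D array.

    Windows are numbered row by row; cell IDs are row-major.
    """
    dr, dc = data.shape
    # Flat ID of every cell, laid out like the data itself.
    cell_ids = np.arange(dr * dc).reshape(dr, dc)
    # ID of the upper-left cell of each window, flattened row by row.
    top_left = cell_ids[:-1, :-1].ravel()
    # Mean of the four cells in each window (4.0 forces float division).
    means = (data[:-1, :-1] + data[:-1, 1:]
             + data[1:, :-1] + data[1:, 1:]) / 4.0
    means = means.ravel()
    ids = {k: (int(t), int(t) + 1, int(t) + dc, int(t) + dc + 1)
           for k, t in enumerate(top_left)}
    mean_by_key = {k: float(m) for k, m in enumerate(means)}
    return ids, mean_by_key


data = np.arange(15).reshape(3, 5)
ids, means = window_dicts(data)
print(ids[4])    # (5, 6, 10, 11)
print(means[4])  # 8.0
```

Whether dictionaries are needed at all depends on how the result is used: because the keys are consecutive integers, the flat arrays can be indexed by the same key without building a dictionary.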


 