
How can I specify a subset of a numpy matrix for placement of a smaller matrix?

I have an image pyramid with a downsampling rate of 2. That is, the bottom of my pyramid is an image of shape (256, 256), the next level is (128, 128), and so on.

My goal is to display this pyramid in a single image. The first image is placed on the left. The second is placed in the top right corner. Each subsequent image must be placed beneath the previous one and wedged into the corner.

Here is my current function:

def pyramid2img(pmd):
    '''
    Given a pre-constructed pyramid, this is a helper
    function to display the pyramid in a single image.
    '''

    # original shape (pyramid goes from biggest to smallest)
    org_img_shp = pmd[0].shape

    # the output will have to have 1.5 times the width
    out_shp = tuple(int(x*y) \
        for (x,y) in zip(org_img_shp, (1, 1.5)))
    new_img = np.zeros(out_shp, dtype=np.int8)

    # keep track of the top-left corner of where I want to
    # place the current image matrix
    origin = [0, 0]
    for lvl, img_mtx in enumerate(pmd):

        # trying to specify the subset to place the next img_mtx in
        sub = new_img[origin[0]:origin[0]+pmd[lvl].shape[0],
            origin[1]:origin[1]+pmd[lvl].shape[1]]# = img_mtx

        # some prints to see exactly what's being called above ^
        print 'level {}, sub {}, mtx {}'.format(
            lvl, sub.shape, img_mtx.shape)
        print 'sub = new_img[{}:{}, {}:{}]'.format(
            origin[0], origin[0]+pmd[lvl].shape[0],
            origin[1], origin[1]+pmd[lvl].shape[1])

        # first shift moves the origin to the right
        if lvl == 0:
            origin[0] += pmd[lvl].shape[0]
        # the rest move the origin downward
        else:
            origin[1] += pmd[lvl].shape[1]

    return new_img

Output from the print statements:

level 0, sub (256, 256), mtx (256, 256)
sub = new_img[0:256, 0:256]


level 1, sub (0, 128), mtx (128, 128)
sub = new_img[256:384, 0:128]


level 2, sub (0, 64), mtx (64, 64)
sub = new_img[256:320, 128:192]


level 3, sub (0, 32), mtx (32, 32)
sub = new_img[256:288, 192:224]


level 4, sub (0, 16), mtx (16, 16)
sub = new_img[256:272, 224:240]


level 5, sub (0, 8), mtx (8, 8)
sub = new_img[256:264, 240:248]


level 6, sub (0, 4), mtx (4, 4)
sub = new_img[256:260, 248:252]

If you look at the output, you can see that I am trying to reference a 2D slice of the output image so that I can place the next level of the pyramid inside it.

The problem is that the slicing I am performing does not give a 2D array with the shape I expect. NumPy thinks I am trying to put an (n, n) matrix into a (0, n) matrix.

Why does a slice like new_img[256:320, 128:192] return an object with shape (0, 64) and not (64, 64)?

Is there an easier way to do what I am trying to do?
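
A likely explanation, sketched under the assumption that new_img has shape (256, 384) as built in the function above: NumPy indexes a 2D array as [row, column], and a slice that runs past the end of an axis is silently clipped instead of raising an error. Rows 256:320 lie entirely outside the 256 available rows, so that axis comes back with length 0. The snippet below uses a stand-in array of that shape (the dtype is an arbitrary choice).

```python
import numpy as np

# Stand-in for the output canvas from the question: 256 rows, 384 columns.
new_img = np.zeros((256, 384), dtype=np.uint8)

# Axis 0 has only 256 rows, so a row range starting at 256 is clipped to empty.
print(new_img[256:320, 128:192].shape)  # (0, 64)

# With the offsets swapped (rows first, then columns) the block has the expected size.
print(new_img[128:192, 256:320].shape)  # (64, 64)
```

In other words, the first shift in the loop advances the row offset (origin[0]) when it was meant to advance the column offset (origin[1]).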

Here is an example.

Create a pyramid first:

import numpy as np
import pylab as pl
import cv2

img = cv2.imread("earth.jpg")[:, :, ::-1]

size = 512
imgs = []
while size >= 2:
    imgs.append(cv2.resize(img, (size, size)))
    size //= 2
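
If earth.jpg or OpenCV is not at hand, a comparable pyramid can be built from synthetic data. This is only a sketch: downsample2 is a hypothetical helper that averages 2x2 blocks (not what cv2.resize does internally), and the gradient image is a made-up stand-in for the photo.

```python
import numpy as np

def downsample2(img):
    """Halve height and width by averaging each 2x2 block (assumes even sizes)."""
    h, w = img.shape[:2]
    # Split both spatial axes into (index, offset-within-block) pairs.
    blocks = img.reshape(h // 2, 2, w // 2, 2, *img.shape[2:])
    return blocks.mean(axis=(1, 3)).astype(img.dtype)

# Synthetic RGB test image: a horizontal gradient standing in for earth.jpg.
size = 512
ramp = np.linspace(0, 255, size).astype(np.uint8)
img = np.repeat(np.tile(ramp, (size, 1))[:, :, None], 3, axis=2)

# Same level sizes as the loop above: 512, 256, ..., 2.
imgs = [img]
while imgs[-1].shape[0] > 2:
    imgs.append(downsample2(imgs[-1]))

print([im.shape[0] for im in imgs])  # [512, 256, 128, 64, 32, 16, 8, 4, 2]
```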

And here is the code to merge the images:

def align(size, width, loc):
    if loc in ("left", "top"):
        return 0
    elif loc in ("right", "bottom"):
        return size - width
    elif loc == "center":
        return (size - width) // 2

def resize_canvas(img, shape, loc, fill=255):
    new_img = np.full(shape + img.shape[2:], fill, dtype=img.dtype)
    y = align(shape[0], img.shape[0], loc[0])
    x = align(shape[1], img.shape[1], loc[1])
    new_img[y:y+img.shape[0], x:x+img.shape[1], ...] = img
    return new_img

def vbox(imgs, align="right", fill=255):
    width = max(img.shape[1] for img in imgs)
    return np.concatenate([
            resize_canvas(img, (img.shape[0], width), ("top", align), fill=fill) 
            for img in imgs
        ])

def hbox(imgs, align="top", fill=255):
    height = max(img.shape[0] for img in imgs)
    return np.concatenate([
            resize_canvas(img, (height, img.shape[1]), (align, "left"), fill=fill) 
            for img in imgs
        ], axis=1)

To display the merged pyramid:

pl.imshow(hbox([imgs[0], vbox(imgs[1:])]))
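
For comparison, the slice-assignment approach from the question also appears to work once the axes are handled consistently: NumPy indexes as [row, column], so moving right means advancing the column offset (axis 1) and moving down means advancing the row offset (axis 0). The sketch below assumes 2D (grayscale) levels and a downsampling rate of 2, so that the smaller levels stacked vertically fit within the height of the first. The name pyramid2img_fixed and the synthetic pyramid are illustrative, not part of the original post.

```python
import numpy as np

def pyramid2img_fixed(pmd):
    """Lay out a pyramid (largest level first) in a single 2D image."""
    rows, cols = pmd[0].shape
    # Canvas width: level 0 plus the widest remaining level (level 1).
    width = cols + (pmd[1].shape[1] if len(pmd) > 1 else 0)
    out = np.zeros((rows, width), dtype=pmd[0].dtype)

    row, col = 0, 0  # top-left corner of the next placement
    for lvl, img in enumerate(pmd):
        h, w = img.shape
        out[row:row + h, col:col + w] = img
        if lvl == 0:
            col += w  # first shift: move right (axis 1)
        else:
            row += h  # later shifts: move down (axis 0)
    return out

# Synthetic pyramid: level k is a (256 >> k) square filled with the value k + 1.
pmd = [np.full((256 >> k, 256 >> k), k + 1, dtype=np.uint8) for k in range(7)]
out = pyramid2img_fixed(pmd)
print(out.shape)  # (256, 384)
```

This places the smaller levels left-aligned at the column where the first image ends. To right-align them instead, as vbox does by default, the column offset for each level after the first would be width - w.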

