PyTorch: How to apply the same random transformation to multiple images?

I am writing a simple transformation for a dataset that contains many pairs of images. As data augmentation, I want to apply a random transformation to each pair, but both images in a pair should be transformed in the same way. For example, given a pair of images A and B, if A is flipped horizontally, B must be flipped horizontally as well. The next pair, C and D, should be transformed differently from A and B, but C and D should again be transformed identically. I am trying it in the following way:

import random
import numpy as np
import torchvision.transforms as transforms
from PIL import Image

img_a = Image.open("sample_a.jpg") # note that two images have the same size
img_b = Image.open("sample_b.png")
img_c, img_d = Image.open("sample_c.jpg"), Image.open("sample_d.png")

transform = transforms.RandomChoice(
    [transforms.RandomHorizontalFlip(), 
     transforms.RandomVerticalFlip()]
)
random.seed(0)
display(transform(img_a))
display(transform(img_b))

random.seed(1)
display(transform(img_c))
display(transform(img_d))

Yet the above code does not choose the same transformation and, as I tested, the result depends on the number of times transform is called.

Is there any way to force transforms.RandomChoice to use the same transform when specified?

Usually a workaround is to apply the transform to the first image, retrieve the parameters of that transform, then apply a deterministic transform with those parameters to the remaining images. However, RandomChoice does not provide an API to get the parameters of the applied transform, since it involves a variable number of transforms. In those cases, I usually implement an override of the original function.

Looking at the torchvision implementation, it's as simple as:

class RandomChoice(RandomTransforms):
    def __call__(self, img):
        t = random.choice(self.transforms)
        return t(img)

Here are two possible solutions.

  1. You can either sample from the transform list on __init__ instead of on __call__:

     import random
     import torch
     import torchvision.transforms as T

     class RandomChoice(torch.nn.Module):
         def __init__(self, transforms):
             super().__init__()
             # Pick the transform once, at construction time
             self.t = random.choice(transforms)

         def __call__(self, img):
             return self.t(img)

    So you can do:

     # p=1 so the chosen flip is always applied; with the default p=0.5
     # each call would still decide independently whether to flip
     transform = RandomChoice([
         T.RandomHorizontalFlip(p=1),
         T.RandomVerticalFlip(p=1)
     ])
     display(transform(img_a)) # both img_a and img_b will
     display(transform(img_b)) # have the same transform

     transform = RandomChoice([
         T.RandomHorizontalFlip(p=1),
         T.RandomVerticalFlip(p=1)
     ])
     display(transform(img_c)) # both img_c and img_d will
     display(transform(img_d)) # have the same transform

  2. Or better yet, transform the images in batch:

     import random
     import torch
     import torchvision.transforms as T

     class RandomChoice(torch.nn.Module):
         def __init__(self, transforms):
             super().__init__()
             self.transforms = transforms

         def __call__(self, imgs):
             # Pick one transform per call and apply it to every image
             t = random.choice(self.transforms)
             return [t(img) for img in imgs]

    Which allows you to do:

     # p=1 so the chosen flip is always applied to every image
     transform = RandomChoice([
         T.RandomHorizontalFlip(p=1),
         T.RandomVerticalFlip(p=1)
     ])

     img_at, img_bt = transform([img_a, img_b])
     display(img_at) # both img_a and img_b will
     display(img_bt) # have the same transform

     img_ct, img_dt = transform([img_c, img_d])
     display(img_ct) # both img_c and img_d will
     display(img_dt) # have the same transform

I don't know of a function to fix the random output. Maybe try a different logic, such as doing the randomization yourself so that you can reuse the same transformation. Logic:

  • generate a random number
  • based on the number, apply a transformation to both images
  • generate another random number
  • do the same for the other two images

Try this:

import random
import torchvision.transforms.functional as TF
from PIL import Image

img_a = Image.open("sample_a.jpg") # note that two images have the same size
img_b = Image.open("sample_b.png")
img_c, img_d = Image.open("sample_c.jpg"), Image.open("sample_d.png")

# One random draw per pair decides which flip both images get
if random.random() > 0.5:
    image_a_flipped = TF.vflip(img_a)
    image_b_flipped = TF.vflip(img_b)
else:
    image_a_flipped = TF.hflip(img_a)
    image_b_flipped = TF.hflip(img_b)

if random.random() > 0.5:
    image_c_flipped = TF.vflip(img_c)
    image_d_flipped = TF.vflip(img_d)
else:
    image_c_flipped = TF.hflip(img_c)
    image_d_flipped = TF.hflip(img_d)

display(image_a_flipped)
display(image_b_flipped)

display(image_c_flipped)
display(image_d_flipped)

Simply take the randomization out of PyTorch and into an if statement. The code below uses vflip; the same applies to horizontal flips or other transforms.

import random
import torchvision.transforms.functional as TF

if random.random() > 0.5:
    image = TF.vflip(image)
    mask  = TF.vflip(mask)

This issue has been discussed in the PyTorch forum. The pros and cons of several solutions were discussed on the official GitHub repository page. PyTorch maintainers have suggested this simple approach.

Do not use torchvision.transforms.RandomVerticalFlip(p=1). Use torchvision.transforms.functional.vflip instead.

Functional transforms give you fine-grained control of the transformation pipeline. As opposed to the transformations above, functional transforms don't contain a random number generator for their parameters. That means you have to specify/generate all parameters, but you can reuse the functional transform.

I realize the OP requested a solution using torchvision, and I think @Ivan's answer does a good job of addressing this.

However, for those not tied to a specific augmentation library, I wanted to point out that Albumentations appears to handle this kind of situation natively by allowing the user to pass multiple source images, boxes, etc. into the same transform. The return value is structured as a dict:

import albumentations as A

transform = A.Compose(
    transforms=[
        A.VerticalFlip(p=0.5),
        A.HorizontalFlip(p=0.5)],
    additional_targets={'image0': 'image', 'image1': 'image'}
)
transformed = transform(image=image, image0=image0, image1=image1)

Now you can access transformed['image0'], transformed['image1'], etc., and all of them will have the same random parameters applied.

Referencing "Random transforms for both input and target?", I think this is probably the cleanest way to do it. Save the random state before applying any transformation, then restore it before each subsequent call:

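import torch
import torchvision.transforms as transforms

# x and y are the two images (e.g. input and target)
t = transforms.RandomRotation(degrees=360)
state = torch.get_rng_state()
x = t(x)
torch.set_rng_state(state)  # rewind the RNG so y gets the same angle
y = t(y)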

I think I have a simple solution: if the images are concatenated, the transformations are applied to all of them identically:

import torch
import torchvision.transforms as T

# Create two fake images (identical for test purposes):
image = torch.randn((3, 128, 128))
target = image.clone()

# This is the trick (concatenate the images):
both_images = torch.cat((image.unsqueeze(0), target.unsqueeze(0)),0)

# Apply the transformations to both images simultaneously:
transformed_images = T.RandomRotation(180)(both_images)

# Get the transformed images:
image_trans = transformed_images[0]
target_trans = transformed_images[1]

# Compare the transformed images:
torch.all(image_trans == target_trans).item()

>> True
