Tensorflow 2: apply one hot encoding on masks for semantic segmentation
Semantic Image Segmentation with colored masks
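So I have a set of images with colored masks, e.g. blue for chair, red for lamp, and so on.
Since I'm new to all of this, I tried to do it with a unet model. I processed the images with keras like this: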
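import os
import random

import cv2
import numpy as np


def data_generator(img_path, mask_path, batch_size):
    c = 0
    n = os.listdir(img_path)
    m = os.listdir(mask_path)
    # Only the image list is shuffled; the mask list keeps its original order.
    random.shuffle(n)
    while True:
        img = np.zeros((batch_size, 256, 256, 3)).astype("float")
        mask = np.zeros((batch_size, 256, 256, 1)).astype("float")
        for i in range(c, c + batch_size):
            # Read the image, scale it to [0, 1] and resize it to 256x256.
            train_img = cv2.imread(img_path + "/" + n[i]) / 255.
            train_img = cv2.resize(train_img, (256, 256))
            img[i - c] = train_img
            # Read the mask as a single grayscale channel, scaled to [0, 1].
            train_mask = cv2.imread(mask_path + "/" + m[i], cv2.IMREAD_GRAYSCALE) / 255.
            train_mask = cv2.resize(train_mask, (256, 256))
            train_mask = train_mask.reshape(256, 256, 1)
            mask[i - c] = train_mask
        c += batch_size
        # Start over and reshuffle once there is no full batch left.
        if c + batch_size >= len(os.listdir(img_path)):
            c = 0
            random.shuffle(n)
        yield img, mask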
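Looking at it more closely now, I don't think this approach works for my masks. I tried processing the masks as RGB color instead, but my model won't train that way.
The model: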
# Imports assume the Keras API bundled with TensorFlow 2.
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Dropout, UpSampling2D, concatenate
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam


def unet(pretrained_weights = None, input_size = (256,256,3)):
    inputs = Input(input_size)
    # Contracting path
    conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(inputs)
    conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool1)
    conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool2)
    conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
    conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool3)
    conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv4)
    drop4 = Dropout(0.5)(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(drop4)
    # Bottleneck
    conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool4)
    conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv5)
    drop5 = Dropout(0.5)(conv5)
    # Expanding path with skip connections
    up6 = Conv2D(512, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(drop5))
    merge6 = concatenate([drop4,up6], axis = 3)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge6)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv6)
    up7 = Conv2D(256, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv6))
    merge7 = concatenate([conv3,up7], axis = 3)
    conv7 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge7)
    conv7 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv7)
    up8 = Conv2D(128, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv7))
    merge8 = concatenate([conv2,up8], axis = 3)
    conv8 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge8)
    conv8 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv8)
    up9 = Conv2D(64, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv8))
    merge9 = concatenate([conv1,up9], axis = 3)
    conv9 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge9)
    conv9 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv9)
    conv9 = Conv2D(2, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv9)
    # Single sigmoid channel: a binary (foreground/background) output.
    conv10 = Conv2D(1, 1, activation = 'sigmoid')(conv9)
    # Model takes `inputs=` / `outputs=` (plural) keyword arguments.
    model = Model(inputs = inputs, outputs = conv10)
    model.compile(optimizer = Adam(learning_rate = 1e-4), loss = 'binary_crossentropy', metrics = ['accuracy'])
    #model.summary()
    if pretrained_weights:
        model.load_weights(pretrained_weights)
    return model
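So my question is: how do I train the model with colored image masks?
Edit: an example of the data I have.
And the percentage of each class in a mask like this: {"water": 4.2, "building": 33.5, "road": 0.0}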
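In a semantic segmentation problem, every pixel belongs to one of the target output classes/labels. Your output layer conv10 should therefore use the total number of classes (n_classes) as its number of kernels and softmax as its activation function, like this: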
conv10 = Conv2D(n_classes, 1, activation = 'softmax')(conv9)
In that case, the loss should also be changed to categorical_crossentropy when compiling the u-net model.
model.compile(optimizer = Adam(learning_rate = 1e-4), loss = 'categorical_crossentropy', metrics = ['accuracy'])
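To make the shape requirement concrete, here is a minimal NumPy sketch of a per-pixel softmax followed by categorical cross-entropy. It follows the textbook definition rather than Keras's exact implementation, and the batch size, image size, and `n_classes = 3` are illustrative assumptions. The point is that the prediction and the one-hot target must both have shape (batch, height, width, n_classes).

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def pixelwise_categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over all pixels of -sum_c y_true[c] * log(y_pred[c])."""
    y_pred = np.clip(y_pred, eps, 1.0)
    per_pixel = -(y_true * np.log(y_pred)).sum(axis=-1)
    return per_pixel.mean()


n_classes = 3  # illustrative value, not taken from the question
rng = np.random.default_rng(0)

# Stand-in for the raw network output before the softmax activation.
logits = rng.normal(size=(2, 4, 4, n_classes))
probs = softmax(logits)  # each pixel's class scores now sum to 1

# Integer class ids per pixel, one-hot encoded to match the output shape.
labels = rng.integers(0, n_classes, size=(2, 4, 4))
y_true = np.eye(n_classes)[labels]

loss = pixelwise_categorical_crossentropy(y_true, probs)
print(probs.shape, y_true.shape, loss)
```

A single-channel sigmoid output cannot satisfy this shape once there are more than two classes, which is why the output layer and the loss change together.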
Also, you should not normalize your ground-truth label/mask images; instead, you can encode them as follows:
# `img` is assumed to be the mask as a 2-D array of integer class ids (0 .. n_classes - 1).
train_mask = np.zeros((height, width, n_classes))
for c in range(n_classes):
    train_mask[:, :, c] = (img == c).astype(int)
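As a runnable illustration of this encoding, the sketch below wraps the same loop in a function and adds the inverse step, which turns a one-hot (or softmax) array back into a class map with argmax. The function names and the tiny 2x2 example are hypothetical and only there for demonstration.

```python
import numpy as np


def one_hot_encode(class_map, n_classes):
    """Turn an (H, W) array of integer class ids into an (H, W, n_classes) one-hot array."""
    height, width = class_map.shape
    encoded = np.zeros((height, width, n_classes), dtype=np.float32)
    for c in range(n_classes):
        encoded[:, :, c] = (class_map == c).astype(np.float32)
    return encoded


def one_hot_decode(encoded):
    """Recover class ids by taking the index of the largest channel at each pixel."""
    return np.argmax(encoded, axis=-1)


# Tiny hand-made mask with three classes (0, 1, 2).
class_map = np.array([[0, 1],
                      [2, 1]])

train_mask = one_hot_encode(class_map, n_classes=3)
recovered = one_hot_decode(train_mask)
print(train_mask.shape)
print(recovered)
```

The same argmax step can be applied to the model's softmax output to get a predicted class map per image.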
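[I am assuming you have more than two ground-truth output classes/labels, since you mentioned that your masks use different colors for water, road, building and so on; if you only have two classes, your model configuration is fine apart from the train_mask processing.]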
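The encoding above assumes the mask already stores integer class ids. For color-coded masks, one possible preprocessing step is to map each color to a class id first. The sketch below is only an illustration: the color table `COLOR_TO_CLASS` is entirely hypothetical and must be replaced with the colors actually used in the dataset, and it expects RGB channel order. Since `cv2.imread` returns BGR, a mask loaded that way would need `cv2.cvtColor(mask, cv2.COLOR_BGR2RGB)` first, and it should be loaded in color rather than with `cv2.IMREAD_GRAYSCALE`.

```python
import numpy as np

# Hypothetical color table in RGB order; the real colors depend on the dataset.
COLOR_TO_CLASS = {
    (0, 0, 0): 0,      # background (assumed)
    (0, 0, 255): 1,    # water (assumed blue)
    (255, 0, 0): 2,    # building (assumed red)
    (0, 255, 0): 3,    # road (assumed green)
}


def rgb_mask_to_class_ids(mask_rgb, color_to_class, unknown_class=0):
    """Map an (H, W, 3) uint8 color mask to an (H, W) array of class ids.

    Pixels whose color is not in the table fall back to `unknown_class`.
    """
    class_ids = np.full(mask_rgb.shape[:2], unknown_class, dtype=np.int64)
    for color, class_id in color_to_class.items():
        # True where all three channels match this color exactly.
        matches = np.all(mask_rgb == np.array(color, dtype=mask_rgb.dtype), axis=-1)
        class_ids[matches] = class_id
    return class_ids


# Tiny synthetic mask: one pixel per non-background color, one background pixel.
mask_rgb = np.zeros((2, 2, 3), dtype=np.uint8)
mask_rgb[0, 0] = (0, 0, 255)
mask_rgb[0, 1] = (255, 0, 0)
mask_rgb[1, 0] = (0, 255, 0)

class_ids = rgb_mask_to_class_ids(mask_rgb, COLOR_TO_CLASS)
print(class_ids)
```

Exact color matching only works if the mask colors are not altered on the way in, so any resizing of masks would need nearest-neighbor interpolation (for example `cv2.INTER_NEAREST`) rather than the default, which blends neighboring colors.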