
How can I calculate accuracy for a keypoint-detection CNN model in PyTorch?

Can someone help me with this, please?

import numpy as np
import torch

# net, criterion, optimizer, device, train_loader and test_loader are
# assumed to be defined before this function is called.
def train_net(n_epochs):
    valid_loss_min = np.inf  # best (lowest) validation loss seen so far
    history = {'train_loss': [], 'valid_loss': [], 'epoch': []}

    for epoch in range(n_epochs):  
        train_loss = 0.0
        valid_loss = 0.0
        net.train()
        for batch_i, data in enumerate(train_loader):
            images = data['image']
            key_pts = data['keypoints']
            key_pts = key_pts.view(key_pts.size(0), -1)
            key_pts = key_pts.type(torch.FloatTensor).to(device)
            images = images.type(torch.FloatTensor).to(device)
            output_pts = net(images)
            loss = criterion(output_pts, key_pts)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            train_loss += loss.item()*images.data.size(0)      
        net.eval() 

        with torch.no_grad():
            for batch_i, data in enumerate(test_loader):
                images = data['image']
                key_pts = data['keypoints']
                key_pts = key_pts.view(key_pts.size(0), -1)
                key_pts = key_pts.type(torch.FloatTensor).to(device)
                images = images.type(torch.FloatTensor).to(device)
                output_pts = net(images)
                loss = criterion(output_pts, key_pts)          
                valid_loss += loss.item()*images.data.size(0) 
        train_loss = train_loss/len(train_loader.dataset)
        valid_loss = valid_loss/len(test_loader.dataset) 
        print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(epoch+1, train_loss, valid_loss))

        if valid_loss <= valid_loss_min:
            print('Validation loss decreased ({:.6f} --> {:.6f}).  Saving model ...'.format(valid_loss_min,valid_loss))    
            # saves the whole model object, not just its state_dict
            torch.save(net,f'X:\\xxxx\\xxx\\xxx\\epoch{epoch + 1}_loss{valid_loss}.pth')
            valid_loss_min = valid_loss
        history['epoch'].append(epoch + 1)
        history['train_loss'].append(train_loss)
        history['valid_loss'].append(valid_loss)
    print('Finished Training')
    return history

Above is the training code for reference!

Perhaps with the Euclidean distance. True keypoint: (x, y); predicted keypoint: (x_, y_); distance d = sqrt((x_ - x)^2 + (y_ - y)^2). From that you have to get a percentage: if d == 0 you have 100% accuracy for that keypoint. But what is 0%? I would say the distance from the true keypoint to the corner of the image that is farthest away from it. Let's call that distance R. So your accuracy for the point is 1 - d / R. Do that for every keypoint and take the average. I just came up with this, so it might have some flaws, but I think you can work with it and check whether it's the right solution for you.
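A minimal sketch of that idea (the function name keypoint_accuracy and the image-size arguments are mine, not something from the question):

    import numpy as np

    def keypoint_accuracy(true_pts, pred_pts, img_h, img_w):
        """Per-keypoint accuracy 1 - d / R, averaged over all keypoints.

        true_pts, pred_pts: arrays of shape (N, 2) with (x, y) coordinates.
        R is the distance from each true keypoint to its farthest image corner.
        """
        true_pts = np.asarray(true_pts, dtype=float)
        pred_pts = np.asarray(pred_pts, dtype=float)
        # Euclidean distance d for every keypoint
        d = np.linalg.norm(pred_pts - true_pts, axis=1)
        corners = np.array([[0, 0], [img_w, 0], [0, img_h], [img_w, img_h]], dtype=float)
        # R: distance from each true keypoint to its farthest image corner
        R = np.linalg.norm(true_pts[:, None, :] - corners[None, :, :], axis=2).max(axis=1)
        # d == 0 -> accuracy 1.0; d == R -> accuracy 0.0
        acc = np.clip(1.0 - d / R, 0.0, 1.0)
        return float(acc.mean())

Inside the validation loop above you could accumulate this per batch, just like valid_loss, and divide by the dataset size at the end.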

This is funny, I was just working on this minutes ago myself. As you probably realise, simply calculating the Euclidean distance between two sets of keypoints doesn't generalise well to cases where you need to compare across body shapes and sizes, so I would recommend using the Object Keypoint Similarity (OKS) score, which measures the body-joint distance normalised by the scale of the person. As described in this blog, OKS is defined as:

OKS = Σ_i [ exp( -d_i² / (2 s² k_i²) ) · δ(v_i > 0) ] / Σ_i δ(v_i > 0)

where d_i is the Euclidean distance between the detected and ground-truth keypoint i, v_i is the visibility flag of the ground truth, s is the object scale (the square root of the segment area), and k_i is a per-keypoint constant that controls falloff.
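Written out directly, the formula could look like this sketch (the argument names are mine; sigmas are the COCO per-keypoint constants, so k_i = 2·sigma_i, and area plays the role of s²):

    import numpy as np

    def oks(gt_xy, dt_xy, vis, area, sigmas):
        """Object Keypoint Similarity between one ground truth and one detection."""
        gt_xy = np.asarray(gt_xy, dtype=float)
        dt_xy = np.asarray(dt_xy, dtype=float)
        d2 = np.sum((dt_xy - gt_xy) ** 2, axis=1)      # squared distances d_i^2
        k2 = (2.0 * np.asarray(sigmas)) ** 2           # per-keypoint constants k_i^2
        e = d2 / (2.0 * area * k2 + np.spacing(1))     # exponent d_i^2 / (2 s^2 k_i^2)
        labelled = np.asarray(vis) > 0                 # delta(v_i > 0): labelled keypoints only
        if not labelled.any():
            return 0.0
        return float(np.mean(np.exp(-e[labelled])))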

Here (function computeOks, line 313) is Facebook Research's implementation:

def computeOks(self, imgId, catId):
    p = self.params
    # dimention here should be Nxm
    gts = self._gts[imgId, catId]
    dts = self._dts[imgId, catId]
    inds = np.argsort([-d['score'] for d in dts], kind='mergesort')
    dts = [dts[i] for i in inds]
    if len(dts) > p.maxDets[-1]:
        dts = dts[0:p.maxDets[-1]]
    # if len(gts) == 0 and len(dts) == 0:
    if len(gts) == 0 or len(dts) == 0:
        return []
    ious = np.zeros((len(dts), len(gts)))
    sigmas = np.array([.26, .25, .25, .35, .35, .79, .79, .72, .72, .62, .62, 1.07, 1.07, .87, .87, .89, .89]) / 10.0
    vars = (sigmas * 2)**2
    k = len(sigmas)
    # compute oks between each detection and ground truth object
    for j, gt in enumerate(gts):
        # create bounds for ignore regions(double the gt bbox)
        g = np.array(gt['keypoints'])
        xg = g[0::3]; yg = g[1::3]; vg = g[2::3]
        k1 = np.count_nonzero(vg > 0)
        bb = gt['bbox']
        x0 = bb[0] - bb[2]; x1 = bb[0] + bb[2] * 2
        y0 = bb[1] - bb[3]; y1 = bb[1] + bb[3] * 2
        for i, dt in enumerate(dts):
            d = np.array(dt['keypoints'])
            xd = d[0::3]; yd = d[1::3]
            if k1 > 0:
                # measure the per-keypoint distance if keypoints visible
                dx = xd - xg
                dy = yd - yg
            else:
                # measure minimum distance to keypoints in (x0,y0) & (x1,y1)
                z = np.zeros((k))
                dx = np.max((z, x0 - xd), axis=0) + np.max((z, xd - x1), axis=0)
                dy = np.max((z, y0 - yd), axis=0) + np.max((z, yd - y1), axis=0)
            e = (dx**2 + dy**2) / vars / (gt['area'] + np.spacing(1)) / 2
            if k1 > 0:
                e = e[vg > 0]
            ious[i, j] = np.sum(np.exp(-e)) / e.shape[0]
    return ious
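If you can export your ground truth and predictions in COCO's keypoint JSON format, you don't have to call this by hand: pycocotools runs it as part of its keypoint evaluation (a sketch; the two file names are placeholders):

    from pycocotools.coco import COCO
    from pycocotools.cocoeval import COCOeval

    coco_gt = COCO('keypoint_annotations.json')           # ground truth (placeholder file name)
    coco_dt = coco_gt.loadRes('model_predictions.json')   # detections (placeholder file name)
    coco_eval = COCOeval(coco_gt, coco_dt, iouType='keypoints')
    coco_eval.evaluate()     # calls computeOks for every image
    coco_eval.accumulate()
    coco_eval.summarize()    # prints AP/AR averaged over OKS thresholds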


 