I recently took an ImageNet-pretrained VGG11 network and made predictions on the ImageNet test dataset. After submitting the predictions file to the evaluation server, I received an email with the following text:
Error: 0.99607 (top-5) 0.99898 (top-1)
Per-class error (classes 1-1000):
1 1
1 1
1 1
...
Does this mean that my top-5 accuracy is 1 - 0.99607 = 0.393%? If so, the score is far too low for a pretrained model.
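For reference, the arithmetic can be checked with a short sketch. The two error figures are copied from the email; the random-guessing baseline is an assumption added here for comparison only (uniform guessing over 1000 classes), not something the server reports.

```python
# Error rates copied from the evaluation email.
top5_error = 0.99607
top1_error = 0.99898

# Accuracy is the complement of the error rate.
top5_accuracy = 1.0 - top5_error
top1_accuracy = 1.0 - top1_error

# Assumed baseline: uniform random guessing over 1000 classes.
# Five distinct guesses hit the true class with probability 5/1000.
num_classes = 1000
chance_top5 = 5 / num_classes
chance_top1 = 1 / num_classes

print(f"top-5 accuracy: {top5_accuracy:.3%} (random guessing: {chance_top5:.3%})")
print(f"top-1 accuracy: {top1_accuracy:.3%} (random guessing: {chance_top1:.3%})")
```

Under that assumption, both figures sit at roughly the level of random guessing, which suggests a systematic problem rather than a weak model.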
Could you please point out where I could be going wrong? Here is the code for reference.
PS: I have checked that the images are loaded and predicted upon in alphabetical order.
import torch
from torchvision import datasets, models, transforms
from tqdm import tqdm

# Load the ImageNet-pretrained VGG11 and switch to inference mode.
vgg11 = models.vgg11(pretrained=True)
vgg11.to(torch.device("cuda"))
vgg11.eval()

# Preprocessing: resize, center crop, scale to [0, 1], then normalize per channel.
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
test_loader = torch.utils.data.DataLoader(
    datasets.ImageFolder("test_dataset",
                         transforms.Compose([
                             transforms.Resize(256),
                             transforms.CenterCrop(224),
                             transforms.ToTensor(),
                             normalize,
                         ])),
    batch_size=32, shuffle=False)

# Write the top-5 predicted class indices for each image, one image per line.
with open("predictions.txt", "w") as fp:
    for a, b in tqdm(test_loader):
        preds = vgg11(a.cuda())
        _, preds = torch.topk(preds, k=5, dim=1)
        preds = preds.cpu().detach().numpy()
        for i in range(len(preds)):
            fp.write(" ".join(str(j) for j in preds[i]) + "\n")
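One way to sanity-check a pipeline like this before submitting might be to compute the same metric locally on a few images whose labels are known. The following is a minimal sketch, not the evaluation server's code: `topk_error` is a hypothetical helper, the toy predictions and labels are made up for illustration, and the labels are assumed to use the same class indexing as the model's outputs.

```python
def topk_error(predictions, labels, k=5):
    """Fraction of samples whose true label is not among the first k predictions."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("need at least one sample")
    misses = sum(1 for preds, label in zip(predictions, labels)
                 if label not in list(preds)[:k])
    return misses / len(labels)


# Toy data (illustrative only): each row holds the top-5 class indices for one image.
toy_predictions = [
    [281, 285, 282, 287, 283],
    [806, 539, 48, 1, 2],
    [288, 281, 285, 290, 291],
]
toy_labels = [285, 281, 288]

print("top-5 error:", topk_error(toy_predictions, toy_labels, k=5))
print("top-1 error:", topk_error(toy_predictions, toy_labels, k=1))
```

If a local check on labeled images gives a sensible error while the server still reports nearly 100%, the model itself is probably fine and the problem is more likely in how the submission file is produced.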
Based on your code, I believe the reported error is genuine and is caused by a lack of normalization. I don't have an environment for testing on the ImageNet test set, so I made a small example with 4 random cat images from the internet. (Links: image1, image2, image3, image4.)
The test code is below:
import os

import cv2
import numpy as np
import torch
from torchvision import models

vgg11 = models.vgg11(pretrained=True)
vgg11.eval()

# Per-channel mean and standard deviation used for normalization (RGB order).
mean = torch.tensor([0.485, 0.456, 0.406])
std = torch.tensor([0.229, 0.224, 0.225])

def read_image(image_path, size=224):
    image = cv2.imread(image_path)  # OpenCV loads images as BGR, HxWxC
    image = cv2.resize(image, (size, size))
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32)
    # HxWxC -> 1xCxHxW, scaled to [0, 1]
    image = torch.tensor(image).permute(2, 0, 1).unsqueeze(0) / 255.
    # Per-channel normalization: the step being compared in this test
    image = (image - mean[None, :, None, None]) / std[None, :, None, None]
    return image

from_path = './../test_image/'
cat_name = ['cat1', 'cat2', 'cat3', 'cat4']

# Stack the four images into one batch of shape (4, 3, 224, 224).
images = torch.empty(0, 3, 224, 224)
for name in cat_name:
    image_path = os.path.join(from_path, f'{name}.png')
    image = read_image(image_path)
    images = torch.cat((images, image), 0)

with torch.no_grad():
    preds = vgg11(images.float()).detach().cpu().numpy()
result = np.argmax(preds, axis=1)  # top-1 class index per image
print(result)
Without normalization, the result is ['Egyptian cat', 'sock', 'Komodo dragon', 'doormat'] ([285, 806, 48, 539]).
With normalization, the result is ['tabby cat', 'tabby cat', 'leopard', 'Egyptian cat'] ([281, 281, 288, 285]).
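To see why the two runs can differ so much, it may help to look at what normalization does to the numbers the network receives. The sketch below applies the same mean and std as above to single RGB pixels in plain Python; `scale_pixel` and `normalize_pixel` are hypothetical helper names, and the sample pixels are arbitrary.

```python
# Per-channel statistics, same values as in the test code above (RGB order).
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def scale_pixel(rgb):
    """Scale 8-bit channel values to [0, 1] without normalizing."""
    return tuple(c / 255.0 for c in rgb)


def normalize_pixel(rgb):
    """Scale to [0, 1], then subtract the channel mean and divide by the channel std."""
    return tuple((c / 255.0 - m) / s for c, m, s in zip(rgb, MEAN, STD))


# Arbitrary sample pixels: black, a mid-gray tone, and white.
for rgb in [(0, 0, 0), (124, 116, 104), (255, 255, 255)]:
    scaled = [round(v, 3) for v in scale_pixel(rgb)]
    normalized = [round(v, 3) for v in normalize_pixel(rgb)]
    print(rgb, "scaled:", scaled, "normalized:", normalized)
```

Without normalization every input value lies in [0, 1], whereas the normalized values are roughly centered on zero and span about -2 to 2. A network whose weights were trained on the normalized range sees a quite different input distribution when that step is skipped.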