Fastest way to convert list to dict with list values as keys, and list index as values
We can convert a list to a dict, using the list values as the keys and the list indexes as the values, like this:
classes = ['car', 'bus', 'van']
reverse_classes = dict.fromkeys(classes)
for i, key in enumerate(reverse_classes):
    reverse_classes[key] = i
print(reverse_classes)
{'car': 0, 'bus': 1, 'van': 2}
The question is: is this the fastest way?
This is used to quickly get the index of the class in the __getitem__ implementation of a custom torch.utils.data.Dataset while training:
import os

import natsort
from PIL import Image
from torch.utils.data import Dataset

## Load all training data from https://github.com/PKU-IMRE/VERI-Wild; use all but the last image of each class for training, and keep the last one for testing
class VeRIWild(Dataset):
    def __init__(self, main_dir, transform, train=True, debug=False):
        self.root_dir = main_dir
        self.transform = transform
        self.train = train
        # One sub-directory per class, sorted in natural order
        self.classes = natsort.natsorted(os.listdir(os.path.join(self.root_dir, 'images')))
        self.total_imgs = []
        for car in self.classes:
            imgs = natsort.natsorted(os.listdir(os.path.join(self.root_dir, 'images', car)))
            if train:
                for im in imgs[:-1]:  # keep the last image for test
                    self.total_imgs.append(os.path.join(car, im))
            else:
                self.total_imgs.append(os.path.join(car, imgs[-1]))
        # Map each class name to its index in self.classes
        self.reverse_classes = dict.fromkeys(self.classes)
        for i, key in enumerate(self.reverse_classes):
            self.reverse_classes[key] = i

    def __len__(self):
        return len(self.total_imgs)

    ## Returns: Tuple (image, target) where target is the index of the target category.
    def __getitem__(self, idx):
        img_loc = os.path.join(self.root_dir, 'images', self.total_imgs[idx])
        image = Image.open(img_loc).convert("RGB")
        tensor_image = self.transform(image)
        car_name = os.path.dirname(self.total_imgs[idx])
        return (tensor_image, self.reverse_classes[car_name])
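As a rough illustration of why the reverse dictionary is worth building for __getitem__, the sketch below compares a dict lookup with list.index on the same data. The class names and the list size are made up for the example and are not taken from VERI-Wild; timings will vary by machine.

```python
from timeit import timeit

# Hypothetical class names; 10_000 is an arbitrary size chosen for illustration.
classes = [f"car_{i:05d}" for i in range(10_000)]
reverse_classes = {name: i for i, name in enumerate(classes)}

# Worst case for list.index: the name sits at the end of the list.
target = classes[-1]

# Both approaches must agree on the index.
assert classes.index(target) == reverse_classes[target]

t_list = timeit(lambda: classes.index(target), number=1_000)   # linear scan
t_dict = timeit(lambda: reverse_classes[target], number=1_000)  # hash lookup

print(f"list.index:  {t_list:.6f} s")
print(f"dict lookup: {t_dict:.6f} s")
```

Since the dictionary is built once in __init__ but read on every sample, the lookup cost usually matters more than the construction cost discussed in this question.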
Maybe you can try zip with range:
out = dict(zip(data, range(len(data))))
Simple benchmark (try it on a larger dataset):
from timeit import timeit
from itertools import count
# classes = ['car', 'bus', 'van']
classes = set('''Est voluptatum fuga natus ea officiis eveniet facere aut. Nihil eaque quia dolor officia. Et dolorem et aut laborum impedit accusantium consequatur. Atque tempora facilis iusto. Sit neque eligendi et accusantium et. Ut veritatis in voluptatum'''.split())
def f1(data):
    reverse_classes = {c: i for i, c in enumerate(data)}
    return reverse_classes

def f2(data):
    reverse_classes = dict.fromkeys(data)
    for i, key in enumerate(reverse_classes):
        reverse_classes[key] = i
    return reverse_classes

def f3(data):
    return dict(zip(data, range(len(data))))

def f4(data):
    return dict(zip(data, count()))
t1 = timeit(lambda: f1(classes), number=1000)
t2 = timeit(lambda: f2(classes), number=1000)
t3 = timeit(lambda: f3(classes), number=1000)
t4 = timeit(lambda: f4(classes), number=1000)
print(t1)
print(t2)
print(t3)
print(t4)
Prints:
0.006092605064623058
0.007285483996383846
0.004913415992632508
0.0048480971017852426
EDIT: Added version with itertools.count (Thanks @HeapOverflow)
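To repeat the comparison on a larger input, something like the following sketch could be used. The generated class names, the input size and the repeat count are arbitrary assumptions, and the function names are only illustrative; the ranking may differ between Python versions and machines, so no timings are claimed here.

```python
from itertools import count
from timeit import timeit

# Hypothetical class names; 50_000 is an arbitrary size for illustration.
classes = [f"class_{i}" for i in range(50_000)]

def by_comprehension(data):
    return {c: i for i, c in enumerate(data)}

def by_zip_range(data):
    return dict(zip(data, range(len(data))))

def by_zip_count(data):
    return dict(zip(data, count()))

candidates = {
    "dict comprehension": by_comprehension,
    "zip + range": by_zip_range,
    "zip + count": by_zip_count,
}

expected = by_comprehension(classes)
for name, fn in candidates.items():
    # Check that every variant builds the same mapping before timing it.
    assert fn(classes) == expected
    seconds = timeit(lambda: fn(classes), number=10)
    print(f"{name:>20}: {seconds:.4f} s")
```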
Define "fastest"?
The simplest, and probably most Pythonic, way would be to use a dict comprehension:
classes = [...]
reverse_classes = {item: idx for idx, item in enumerate(classes)}
Note that if the list contains duplicates, this keeps only the last index of each item:
>>> classes = ['car', 'bus', 'van', 'car']
>>> {item: idx for idx, item in enumerate(classes)}
{'car': 3, 'bus': 1, 'van': 2}
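If the first index of a repeated item is wanted instead, one possible sketch is to use dict.setdefault, which only writes a key that is not present yet. The variable names below are illustrative. The second variant assumes consecutive class indices are preferred and relies on dicts preserving insertion order (Python 3.7+).

```python
classes = ['car', 'bus', 'van', 'car']

# Variant 1: keep the index of the first occurrence of each item.
first_index = {}
for idx, item in enumerate(classes):
    first_index.setdefault(item, idx)  # ignored if item is already a key
print(first_index)  # {'car': 0, 'bus': 1, 'van': 2}

# Variant 2: drop duplicates first (order preserved), then number the
# unique items consecutively.
unique_classes = list(dict.fromkeys(classes))
consecutive = {item: idx for idx, item in enumerate(unique_classes)}
print(consecutive)  # {'car': 0, 'bus': 1, 'van': 2}
```

The two variants give the same result here only because the duplicate comes last; with a duplicate in the middle of the list, the first variant leaves gaps in the indices while the second does not.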