
How to fix training error launch:90 in YOLOX?

!python tools/train.py -f exps/example/yolox_voc/yolox_voc_s.py -d 1 -b 16 --fp16  -c /content/yolox_s.pth
2022-03-29 19:24:57 | INFO     | apex.amp.frontend:356 - enabled                : True
2022-03-29 19:24:57 | INFO     | apex.amp.frontend:356 - opt_level              : O1
2022-03-29 19:24:57 | INFO     | apex.amp.frontend:356 - cast_model_type        : None
2022-03-29 19:24:57 | INFO     | apex.amp.frontend:356 - patch_torch_functions  : True
2022-03-29 19:24:57 | INFO     | apex.amp.frontend:356 - keep_batchnorm_fp32    : None
2022-03-29 19:24:57 | INFO     | apex.amp.frontend:356 - master_weights         : None
2022-03-29 19:24:57 | INFO     | apex.amp.frontend:356 - loss_scale             : dynamic
2022-03-29 19:24:57 | INFO     | yolox.core.trainer:297 - loading checkpoint for fine tuning
2022-03-29 19:24:57 | WARNING  | yolox.utils.checkpoint:27 - Shape of head.cls_preds.0.weight in checkpoint is torch.Size([80, 128, 1, 1]), while shape of head.cls_preds.0.weight in model is torch.Size([3, 128, 1, 1]).
2022-03-29 19:24:57 | WARNING  | yolox.utils.checkpoint:27 - Shape of head.cls_preds.0.bias in checkpoint is torch.Size([80]), while shape of head.cls_preds.0.bias in model is torch.Size([3]).
2022-03-29 19:24:57 | WARNING  | yolox.utils.checkpoint:27 - Shape of head.cls_preds.1.weight in checkpoint is torch.Size([80, 128, 1, 1]), while shape of head.cls_preds.1.weight in model is torch.Size([3, 128, 1, 1]).
2022-03-29 19:24:57 | WARNING  | yolox.utils.checkpoint:27 - Shape of head.cls_preds.1.bias in checkpoint is torch.Size([80]), while shape of head.cls_preds.1.bias in model is torch.Size([3]).
2022-03-29 19:24:57 | WARNING  | yolox.utils.checkpoint:27 - Shape of head.cls_preds.2.weight in checkpoint is torch.Size([80, 128, 1, 1]), while shape of head.cls_preds.2.weight in model is torch.Size([3, 128, 1, 1]).
2022-03-29 19:24:57 | WARNING  | yolox.utils.checkpoint:27 - Shape of head.cls_preds.2.bias in checkpoint is torch.Size([80]), while shape of head.cls_preds.2.bias in model is torch.Size([3]).
2022-03-29 19:24:57 | ERROR    | yolox.core.launch:90 - An error has been caught in function 'launch', process 'MainProcess' (1549), thread 'MainThread' (140243385931648):
Traceback (most recent call last):

  File "tools/train.py", line 125, in <module>
    args=(exp, args),
          │    └ Namespace(batch_size=16, ckpt='/content/yolox_s.pth', devices=1, dist_backend='nccl', dist_url=None, exp_file='exps/example/y...
          └ ╒══════════════════╤═════════════════════════════════════════════════════════════════════════════════════════════════════════...

> File "/content/apex/YOLOX/yolox/core/launch.py", line 90, in launch
    main_func(*args)
    │          └ (╒══════════════════╤════════════════════════════════════════════════════════════════════════════════════════════════════════...
    └ <function main at 0x7f8cf10d3e60>

  File "tools/train.py", line 104, in main
    trainer.train()
    │       └ <function Trainer.train at 0x7f8bf2234d40>
    └ <yolox.core.trainer.Trainer object at 0x7f8bec4a7a90>

  File "/content/apex/YOLOX/yolox/core/trainer.py", line 69, in train
    self.before_train()
    │    └ <function Trainer.before_train at 0x7f8bec969710>
    └ <yolox.core.trainer.Trainer object at 0x7f8bec4a7a90>

  File "/content/apex/YOLOX/yolox/core/trainer.py", line 150, in before_train
    no_aug=self.no_aug,
           │    └ False
           └ <yolox.core.trainer.Trainer object at 0x7f8bec4a7a90>

  File "exps/example/yolox_voc/yolox_voc_s.py", line 36, in get_data_loader
    max_labels=50,

  File "/content/apex/YOLOX/yolox/data/datasets/voc.py", line 115, in __init__
    os.path.join(rootpath, "ImageSets", "Main", name + ".txt")
    │  │    │    │                              └ 'trainval'
    │  │    │    └ '/content/apex/YOLOX/datasets/VOCdevkit/VOC2007'
    │  │    └ <function join at 0x7f8cf31177a0>
    │  └ <module 'posixpath' from '/usr/lib/python3.7/posixpath.py'>
    └ <module 'os' from '/usr/lib/python3.7/os.py'>

FileNotFoundError: [Errno 2] No such file or directory: '/content/apex/YOLOX/datasets/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt'

The file is present at /content/YOLOX/datasets/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt, but not at /content/apex/YOLOX/datasets/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt.

How do I fix this?

This is a path problem rather than a machine learning problem. The traceback shows the code running from /content/apex/YOLOX, so the dataset is looked up under that copy and not under /content/YOLOX. If the content folder holding the .pth checkpoints is inside your YOLOX folder, run the training command from inside the YOLOX folder (check your current directory with the pwd command).

Assuming you want to train on a custom dataset, follow the YOLOX guideline for training on custom data; for example, if your data is in COCO format, put it in the ./datasets folder.
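Before launching training, it can help to confirm that the dataset folder actually contains the files the loader will open. The sketch below is a minimal, standard-library-only check. `missing_voc_files` is a hypothetical helper, not part of YOLOX. The ImageSets/Main path is copied from the traceback in the question, while the Annotations and JPEGImages folders are an assumption based on the standard VOC layout.

```python
import os
import tempfile


def missing_voc_files(datasets_root, year="2007", split="trainval"):
    """Return the expected VOC paths under datasets_root that do not exist.

    Hypothetical helper (not part of YOLOX). The split-file path mirrors the
    traceback: VOCdevkit/VOC<year>/ImageSets/Main/<split>.txt
    Annotations/ and JPEGImages/ are assumed from the standard VOC layout.
    """
    voc_root = os.path.join(datasets_root, "VOCdevkit", "VOC" + year)
    expected = [
        os.path.join(voc_root, "ImageSets", "Main", split + ".txt"),
        os.path.join(voc_root, "Annotations"),
        os.path.join(voc_root, "JPEGImages"),
    ]
    return [p for p in expected if not os.path.exists(p)]


if __name__ == "__main__":
    # Self-contained demo in a throwaway directory: create only the split file,
    # so the two image/annotation folders are reported as missing.
    with tempfile.TemporaryDirectory() as root:
        main_dir = os.path.join(root, "VOCdevkit", "VOC2007", "ImageSets", "Main")
        os.makedirs(main_dir)
        open(os.path.join(main_dir, "trainval.txt"), "w").close()
        for path in missing_voc_files(root):
            print("missing:", path)
```

In the setup from the question, calling the helper once with /content/apex/YOLOX/datasets and once with /content/YOLOX/datasets should show which copy of the repository really holds the data.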

Now, if the downloaded weights are in the ./content/ folder, the following command starts training from yolox_s.pth on the images inside ./datasets, assuming they are in the format the exp file expects (VOC for yolox_voc_s.py).

python tools/train.py -f exps/example/yolox_voc/yolox_voc_s.py -d 1 -b 16 --fp16 -c content/yolox_s.pth

Note: a leading / makes the path absolute, starting from the root of the file system, whereas ./ (or no prefix at all) makes it relative to the current folder.
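The difference between the two spellings can be seen with a short sketch. `resolve` is a hypothetical helper written for illustration, and /content/YOLOX is only an assumed working directory taken from the question. `posixpath` is used so the result is the same on any operating system.

```python
import posixpath


def resolve(path, cwd):
    """Return the absolute location `path` refers to when run from `cwd`."""
    # posixpath.join drops `cwd` entirely when `path` starts with "/",
    # which is exactly why a leading slash ignores the current folder.
    return posixpath.normpath(posixpath.join(cwd, path))


if __name__ == "__main__":
    cwd = "/content/YOLOX"  # assumed working directory, for illustration only
    for p in ("/content/yolox_s.pth", "content/yolox_s.pth", "./content/yolox_s.pth"):
        print(p, "->", resolve(p, cwd))
```

With that working directory, only the first spelling points outside the YOLOX folder; the other two both land on the content folder inside it.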
