Error indices[0] = 0 is not in [0, 0) while training an object-detection model with TensorFlow

I am trying to train a custom object-detection model with TensorFlow to recognize images of a Raspberry Pi 2. Everything is set up and runs on my own hardware, but because of my GPU's limitations I moved the training to the cloud. I have uploaded my data (train and test records and CSV files) and my checkpoint model. This is what I get from the logs:

tensorflow:Restoring parameters from /mobilenet/model.ckpt

tensorflow:Starting Session.

tensorflow:Saving checkpoint to path training/model.ckpt

tensorflow:Starting Queues.

tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, indices[0] = 0 is not in [0, 0)

I also have a folder called images containing the actual .jpg files, and it is on the cloud as well. For some reason I must specify every directory with a preceding forward slash /, which might be the problem: I do not know whether some of the files try to load these images and fail to find the path because of a missing /. If anyone can share a solution I would be really thankful.

EDIT: I fixed it by downloading an older version of the models folder in tensorflow, and the model started training, so this is a note to the TF team.

Changing the way I created the TF Records worked for me. Have a look at the following code:

import tensorflow as tf
from object_detection.utils import dataset_util  # TF Object Detection API helpers

# height, width, filename_str, image_format, image_data, the bbox lists and
# the label lists are assumed to be prepared earlier for the current image.
example = tf.train.Example(
    features=tf.train.Features(
        feature={
            'image/height': dataset_util.int64_feature(height),
            'image/width': dataset_util.int64_feature(width),
            'image/filename': dataset_util.bytes_feature(filename_str.encode('utf-8')),
            'image/source_id': dataset_util.bytes_feature(filename_str.encode('utf-8')),
            'image/format': dataset_util.bytes_feature(image_format),
            'image/encoded': dataset_util.bytes_feature(image_data),
            'image/object/bbox/xmin': dataset_util.float_list_feature(xmin),
            'image/object/bbox/xmax': dataset_util.float_list_feature(xmax),
            'image/object/bbox/ymin': dataset_util.float_list_feature(ymin),
            'image/object/bbox/ymax': dataset_util.float_list_feature(ymax),
            'image/object/class/text': dataset_util.bytes_list_feature(labels_text),
            'image/object/class/label': dataset_util.int64_list_feature(labels),
        }
    )
)

Make sure that your TF Records have the same keys as the ones shown above, because the model you use expects keys like these.

Earlier, I had used the following, which did not work:

example = tf.train.Example(
    features=tf.train.Features(
        feature={
            'image/height': dataset_util.int64_feature(shape[0]),
            'image/width': dataset_util.int64_feature(shape[1]),
            'image/channels': dataset_util.int64_feature(shape[2]),
            'image/shape': dataset_util.int64_list_feature(shape),
            'image/object/bbox/xmin': dataset_util.float_list_feature(xmin),
            'image/object/bbox/xmax': dataset_util.float_list_feature(xmax),
            'image/object/bbox/ymin': dataset_util.float_list_feature(ymin),
            'image/object/bbox/ymax': dataset_util.float_list_feature(ymax),
            'image/object/bbox/class/label': dataset_util.int64_list_feature(labels),
            'image/object/bbox/class/text': dataset_util.bytes_list_feature(labels_text),
            'image/object/bbox/difficult': dataset_util.int64_list_feature(difficult),
            'image/object/bbox/truncated': dataset_util.int64_list_feature(truncated),
            'image/format': dataset_util.bytes_feature(image_format),
            'image/encoded': dataset_util.bytes_feature(image_data),
            'image/filename': dataset_util.bytes_feature(filename_str.encode('utf-8')),
            'image/source_id': dataset_util.bytes_feature(filename_str.encode('utf-8'))
        }
    )
)

As you can see, I had written image/object/bbox/class/label instead of image/object/class/label (and likewise image/object/bbox/class/text instead of image/object/class/text). I hope this helps.
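A mismatch like this is easy to miss by eye, so here is a minimal sketch that compares two sets of key names mechanically. It uses only the standard library and does not read a real record file. The helper name compare_keys is made up for illustration, and treating the keys of my working example as "what the model expects" is an assumption based on that example, not an official list.

```python
# Keys used by the working example above. Treating this set as the keys the
# model expects is an assumption drawn from that example only.
EXPECTED_KEYS = {
    "image/height",
    "image/width",
    "image/filename",
    "image/source_id",
    "image/format",
    "image/encoded",
    "image/object/bbox/xmin",
    "image/object/bbox/xmax",
    "image/object/bbox/ymin",
    "image/object/bbox/ymax",
    "image/object/class/text",
    "image/object/class/label",
}


def compare_keys(written_keys, expected_keys=EXPECTED_KEYS):
    """Return (missing, unexpected) key lists, each sorted alphabetically."""
    written = set(written_keys)
    expected = set(expected_keys)
    missing = sorted(expected - written)
    unexpected = sorted(written - expected)
    return missing, unexpected


# Keys from the earlier version of the record writer that did not work.
failing_keys = [
    "image/height", "image/width", "image/channels", "image/shape",
    "image/object/bbox/xmin", "image/object/bbox/xmax",
    "image/object/bbox/ymin", "image/object/bbox/ymax",
    "image/object/bbox/class/label", "image/object/bbox/class/text",
    "image/object/bbox/difficult", "image/object/bbox/truncated",
    "image/format", "image/encoded", "image/filename", "image/source_id",
]

missing, unexpected = compare_keys(failing_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

For the failing key list, the missing keys are exactly the two image/object/class/... entries. An unexpected key is not necessarily harmful; in my case the missing class keys were the relevant difference.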

You can check the expected keys at the following link: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/using_your_own_dataset.md

I had the same issue with the centernet_mobilenetv2 model. Deleting the num_keypoints parameter from the pipeline.config file fixed it, and everything then worked fine. I don't know what the problem with that parameter is, but I was able to run the training without it.
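If you would rather not edit the file by hand, here is a rough sketch of the same change done as a plain text filter. It assumes the parameter sits on its own line in the form `num_keypoints: <value>`. The helper name drop_field and the sample fragment (including the value 17 and its placement) are hypothetical and not copied from a real pipeline.config.

```python
import re


def drop_field(config_text, field_name="num_keypoints"):
    """Remove every line of the form '<field_name>: <value>' from the text."""
    pattern = re.compile(r"^\s*" + re.escape(field_name) + r"\s*:.*$")
    kept = [line for line in config_text.splitlines() if not pattern.match(line)]
    return "\n".join(kept) + "\n"


# Hypothetical fragment for illustration only; a real pipeline.config has
# many more fields.
sample = """train_input_reader {
  label_map_path: "PATH_TO_BE_CONFIGURED"
  num_keypoints: 17
}
"""

cleaned = drop_field(sample)
print(cleaned)
```

Keep a backup of the original file before writing the cleaned text back, since this is a line-based filter and not a real parser for the config format.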
