
Tensorflow Object Detection API `indices[3] = 3 is not in [0, 3)` error

I am retraining the TF Object Detection API's MobileNet(v1)-SSD model and am having trouble with the following error at the training step.

INFO:tensorflow:Starting Session.
INFO:tensorflow:Saving checkpoint to path xxxx/model.ckpt
INFO:tensorflow:Starting Queues.
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, indices[3] = 3 is not in [0, 3)
         [[Node: cond_2/RandomCropImage/PruneCompleteleyOutsideWindow/Gather/Gather_1 = Gather[Tindices=DT_INT64, Tparams=DT_INT64, validate_indices=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](cond_2/Switch_3:1, cond_2/RandomCropImage/PruneCompleteleyOutsideWindow/Reshape)]]
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Caught OutOfRangeError. Stopping Training.
INFO:tensorflow:Finished training! Saving model to disk.
Traceback (most recent call last):
  File "object_detection/train.py", line 168, in <module>
    tf.app.run()
  File "/home/khatta/.virtualenvs/dl/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 124, in run
    _sys.exit(main(argv))
  File "object_detection/train.py", line 165, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "xxxx/research/object_detection/trainer.py", line 361, in train
    saver=saver)
  File "/home/khatta/.virtualenvs/dl/lib/python3.5/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 782, in train
    ignore_live_threads=ignore_live_threads)
  File "/home/khatta/.virtualenvs/dl/lib/python3.5/site-packages/tensorflow/python/training/supervisor.py", line 826, in stop
    ignore_live_threads=ignore_live_threads)
  File "/home/khatta/.virtualenvs/dl/lib/python3.5/site-packages/tensorflow/python/training/coordinator.py", line 387, in join
    six.reraise(*self._exc_info_to_raise)
  File "/home/khatta/.virtualenvs/dl/lib/python3.5/site-packages/six.py", line 693, in reraise
    raise value
  File "/home/khatta/.virtualenvs/dl/lib/python3.5/site-packages/tensorflow/python/training/queue_runner_impl.py", line 250, in _run
    enqueue_callable()
  File "/home/khatta/.virtualenvs/dl/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1251, in _single_operation_run
    self._session, None, {}, [], target_list, status, None)
  File "/home/khatta/.virtualenvs/dl/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[3] = 3 is not in [0, 3)
         [[Node: cond_2/RandomCropImage/PruneCompleteleyOutsideWindow/Gather/Gather_1 = Gather[Tindices=DT_INT64, Tparams=DT_INT64, validate_indices=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](cond_2/Switch_3:1, cond_2/RandomCropImage/PruneCompleteleyOutsideWindow/Reshape)]]

This error happens right at the start when I prepare the TFRecord file from a comparatively large amount of data (around 16K images).
When I use a small amount of data (around 1K images), the error happens after around 100 training steps. The error trace is the same in both cases.
The structure of the TFRecord-creating script is shown below. I wanted to tile the large images so that the annotations wouldn't become too small at SSD's 300x300 resize step, which I thought would give better results:
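For context, the message itself only says that a Gather op received an index outside the valid range of its input tensor. A minimal, unrelated sketch (assuming TF 1.x, not the Object Detection pipeline) reproduces the same kind of error:

import tensorflow as tf

params = tf.constant([10, 20, 30])                   # valid indices are 0, 1, 2
indices = tf.constant([0, 1, 2, 3], dtype=tf.int64)  # index 3 is out of range
gathered = tf.gather(params, indices)

with tf.Session() as sess:
    sess.run(gathered)  # raises InvalidArgumentError: indices[3] = 3 is not in [0, 3)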

import tensorflow as tf
import pandas as pd
import cv2  # used below for imread/cvtColor/imencode
import hashlib

def _tiling(image_array, labels, tile_size=(300,300)):  
    '''tile image according to the tile_size argument'''  
    <do stuff>  
    yield tiled_image_array, tiled_label

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _int64_list_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _bytes_list_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=value))

def _float_list_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))

def _make_tfexample(tiled_image_array, tiled_label):
    img_str = cv2.imencode('.jpg', tiled_image_array)[1].tobytes()
    height, width, _ = tiled_image_array.shape
    # tiled_label's contents:
    # ['tilename', ['object_name', 'object_name', ...], 
    #  [xmin, xmin, ...], [ymin, ymin, ...], 
    #  [xmax, xmax, ...], [ymax, ymax, ...]]
    tile_name, object_names, xmins, ymins, xmaxs, ymaxs = tiled_label
    filename = bytes(tile_name, 'utf-8')
    image_format = b'jpeg'
    key = hashlib.sha256(img_str).hexdigest()
    xmins = [xmin/width for xmin in xmins]
    ymins = [ymin/height for ymin in ymins]
    xmaxs = [xmax/width for xmax in xmaxs]
    ymaxs = [ymax/height for ymax in ymaxs]
    classes_text = [bytes(obj, 'utf-8') for obj in object_names]
    # category => {'object_name': #id, ...}
    classes = [category[obj] for obj in object_names]
    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': _int64_feature(height),
        'image/width': _int64_feature(width),
        'image/filename': _bytes_feature(filename),
        'image/source_id': _bytes_feature(filename),
        'image/key/sha256': _bytes_feature(key.encode('utf-8')),
        'image/encoded': _bytes_feature(img_str),
        'image/format': _bytes_feature(image_format),
        'image/object/bbox/xmin': _float_list_feature(xmins),
        'image/object/bbox/ymin': _float_list_feature(ymins),
        'image/object/bbox/xmax': _float_list_feature(xmaxs),
        'image/object/bbox/ymax': _float_list_feature(ymaxs),
        'image/object/class/text': _bytes_list_feature(classes_text),
        'image/object/class/label': _int64_list_feature(classes)
    }))
    return tf_example

def make_tfrecord(image_path, csv_path, tfrecord_path):  
    '''convert image and labels into tfrecord file'''  
    csv = pd.read_csv(csv_path)
    with tf.python_io.TFRecordWriter(tfrecord_path) as writer:
        for row in csv.itertuples():  
            img_array = cv2.imread(image_path + row.filename)
            img_array = cv2.cvtColor(img_array, cv2.COLOR_BGR2RGB)
            tile_generator = _tiling(img_array, row.label)
            for tiled_image_array, tiled_label in tile_generator:
                tf_example = _make_tfexample(tiled_image_array, tiled_label)
                writer.write(tf_example.SerializeToString())

Any suggestions on why this error might be happening are welcome. Thank you in advance!

This was caused by the length of the object_names list not matching the lengths of the other per-object lists (xmins, ymins, xmaxs, ymaxs, classes).
The cause was a bug in my own code, but I'm posting this in case you get a similar error and need a hint for debugging it.

In short, in the _make_tfexample function above you need

xmins = [a_xmin, b_xmin, c_xmin]
ymins = [a_ymin, b_ymin, c_ymin]
xmaxs = [a_xmax, b_xmax, c_xmax]
ymaxs = [a_ymax, b_ymax, c_ymax]
classes_text = [a_class, b_class, c_class]
classes = [a_classid, b_classid, c_classid]

so that the lists' indices correspond to each other, one entry per object. The error above shows up when the lengths of these lists don't match.
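Here is a minimal sketch of the kind of sanity check that would have caught this early. _check_label_lengths is a hypothetical helper, not part of the original script; calling it at the top of _make_tfexample makes the TFRecord writer fail fast instead of the training input pipeline failing much later:

def _check_label_lengths(xmins, ymins, xmaxs, ymaxs, classes_text, classes):
    '''Raise if the per-object lists disagree in length (hypothetical helper).'''
    lengths = {
        'xmins': len(xmins), 'ymins': len(ymins),
        'xmaxs': len(xmaxs), 'ymaxs': len(ymaxs),
        'classes_text': len(classes_text), 'classes': len(classes),
    }
    if len(set(lengths.values())) != 1:
        raise ValueError('per-object feature lists differ in length: %s' % lengths)

Calling it just before building the tf.train.Example points straight at the offending image/tile at record-creation time, rather than surfacing as a Gather error during training.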

I also came across the same error and went from page to page trying to find an answer. Unfortunately, mismatched shapes of the data and labels were not the reason I was getting this error. I found the same question in multiple places on Stack Overflow, so check this to see if it solves your problem.
