How can I make the default keras-ocr model recognize only digits?

Question

I am using Python with keras-ocr. I want keras-ocr to recognize only digits, so I set up the pipeline like this:

    import keras_ocr

    recognizer = keras_ocr.recognition.Recognizer(alphabet="0123456789")
    pipeline = keras_ocr.pipeline.Pipeline(recognizer=recognizer)

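However, unlike a Tesseract whitelist, this does not map letters to digits or improve recognition quality. With this alphabet, the digits are not recognized at all.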

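With the default alphabet the results are better, but some digits are confused with letters. Post-processing the output with something like replace("O", "0") to turn letters into digits seems like a very bad idea.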

The recognition function is simple and reproducible :)


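    import matplotlib.pyplot as plt

    # _path is the path to the input image
    _image = keras_ocr.tools.read(_path)
    plt.figure(figsize=(10, 20))
    plt.imshow(_image)

    # Detect and recognize text; the result for each image is a list of (text, box) pairs
    prediction = pipeline.recognize([_image])[0]
    fig, axs = plt.subplots(1, figsize=(10, 20))
    keras_ocr.tools.drawAnnotations(image=_image, predictions=prediction, ax=axs)
    plt.show()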

Answer 1

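I did not find an easier way than training the model with the keras-ocr tools. However, the text generator for synthetic data draws its text from books, journals, or other meaningful prose (I don't know how to put it in English :)). Digits are therefore rare, and with the alphabet "0123456789" the generator sometimes returns an empty string. So I wrote my own generator that produces strings containing only digits. https://keras-ocr.readthedocs.io/en/latest/examples/end_to_end_training.html https://colab.research.google.com/drive/1PxxXyH3XaBoTgxKIoC9dKIRo4wUo-QDg#scrollTo=I7SF5VeoLulc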
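
The answer does not show its generator, so the following is only a minimal sketch of what a digits-only text generator could look like. The name `digit_text_generator` and its parameters `min_length`, `max_length`, and `seed` are hypothetical and not taken from the answer or from keras-ocr.

```python
import random
import string


def digit_text_generator(min_length=1, max_length=8, seed=None):
    """Yield an endless stream of random strings made only of the digits 0-9.

    Hypothetical helper: the name and parameters are illustrative assumptions.
    """
    rng = random.Random(seed)  # private RNG so a fixed seed gives a repeatable stream
    while True:
        length = rng.randint(min_length, max_length)  # inclusive on both ends
        yield "".join(rng.choice(string.digits) for _ in range(length))


# Draw a few samples to check what the generator produces.
generator = digit_text_generator(min_length=2, max_length=6, seed=0)
samples = [next(generator) for _ in range(5)]
print(samples)
```

Assuming the setup from the linked end-to-end training example, a generator like this would presumably be passed as the `text_generator` argument of `keras_ocr.data_generation.get_image_generator` in place of the default one, with the recognizer's alphabet set to "0123456789". Because every yielded string has at least `min_length` characters, the empty-string case described above should not occur.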
