
AWS Textract custom font

I am coding a project in Python that should be able to read text in an image. I need to be able to detect separate lines and spaces in the image consistently for my program to work.

I am using a font that looks slightly different from most common fonts, and I have decided to try implementing this with AWS Textract. However, some characters such as "k", "j", and ";" are not recognized properly because they look different in this custom font. Spaces are also sometimes detected where none exist.

How would I train Textract to work properly with my custom font? I am willing to change my app design if this is not possible with Textract. However, my application must be able to run on my users' computers without any additional installs.

Given your problem statement, Textract is not your solution. You can't train Textract: it only exposes pre-trained models that you consume as they are. If you want to train your own models, SageMaker is the service for that. With SageMaker you can train a model from scratch or start from a pre-trained one.

However, I don't think SageMaker by itself meets the requirement of running on users' machines. You can, though, export a model from SageMaker and run it locally with TensorFlow. See this thread discussing how to export models from SageMaker: https://github.com/aws/sagemaker-python-sdk/issues/200
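To illustrate the first point, that Textract is only consumed as a pre-trained service, here is a minimal sketch of reading its output in Python. It does not train anything; it only groups the returned LINE blocks in top-to-bottom order and flags words whose confidence is low, which may help locate where the custom font is being misread. The helper names `lines_from_textract` and `detect_lines` and the threshold of 90 are assumptions made for this example, not part of Textract. The call requires network access and AWS credentials, so it does not satisfy the offline requirement.

```python
from typing import Any, Dict, List


def lines_from_textract(response: Dict[str, Any],
                        min_confidence: float = 90.0) -> List[Dict[str, Any]]:
    """Return LINE blocks sorted top to bottom, with low-confidence words flagged.

    `min_confidence` is an arbitrary threshold chosen for illustration.
    """
    blocks = response.get("Blocks", [])
    by_id = {block["Id"]: block for block in blocks}
    lines = []
    for block in blocks:
        if block.get("BlockType") != "LINE":
            continue
        # A LINE block points at its WORD blocks through CHILD relationships.
        child_ids = [
            child_id
            for rel in block.get("Relationships", [])
            if rel.get("Type") == "CHILD"
            for child_id in rel.get("Ids", [])
        ]
        words = [
            by_id[child_id]
            for child_id in child_ids
            if child_id in by_id and by_id[child_id].get("BlockType") == "WORD"
        ]
        suspect = [
            word["Text"]
            for word in words
            if word.get("Confidence", 0.0) < min_confidence
        ]
        lines.append({
            "text": block.get("Text", ""),
            "top": block["Geometry"]["BoundingBox"]["Top"],
            "suspect_words": suspect,
        })
    # Sort by the top edge of each bounding box to get reading order.
    lines.sort(key=lambda line: line["top"])
    return lines


def detect_lines(image_bytes: bytes) -> List[Dict[str, Any]]:
    """Send an image to Textract and return its lines (needs AWS credentials)."""
    import boto3  # imported lazily so the parsing helper works without AWS

    client = boto3.client("textract")
    response = client.detect_document_text(Document={"Bytes": image_bytes})
    return lines_from_textract(response)
```

If the flagged words consistently contain the characters that differ in the custom font, that supports the conclusion above: the fix has to come from a model you train yourself, not from tuning Textract.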

