
Unable to read the EBCDIC 037 decoded image (Java)

I have an EBCDIC file from which I extracted images. The images carry data that is the key source for identifying my transactions. Suppose I have an image, say the Stack Overflow logo, stored as "img1.jpg" on my desktop. Reading it with the following code works:

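import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;

String inputImage = "C:\\Desktop\\img1.jpg";
File imageFile = new File(inputImage);
BufferedImage image1 = ImageIO.read(imageFile); // returns null if no registered reader can decode the file
System.out.println(image1);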

However, when I try the same with an image decoded from the EBCDIC conversion, ImageIO.read returns null.

The difference I observed is that the decoded image has no color associated with it. Is there any way to read these images and retrieve the text on them? The following is not the exact image I am working on; I am sharing a sample from the internet just to give an idea. Note: the image I am working on looks like a scanned (grayscale) image. Example: [image]

I also observed that if I open the decoded file, take a screen capture with the Snipping Tool, save the capture as a JPG file (which the original already is), and read that, the program reads the file. I am not sure where the issue is: compression, color coding, or format.
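One way to narrow down "compression, color coding, or format" is to look at the first few bytes of the decoded file instead of trusting its extension, since ImageIO.read returns null when none of its registered readers recognizes the data. The sketch below is only an illustration: the class name ImageFormatSniffer and the method sniff are hypothetical, and it checks just a handful of well-known signatures (JPEG, TIFF, PNG, GIF, BMP). It also prints the formats the running JVM's ImageIO can read.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of any library: guesses an image format from its magic bytes.
class ImageFormatSniffer {

    // Returns true if data begins with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... magic) {
        if (data.length < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) return false;
        }
        return true;
    }

    // Guesses the container format from the leading bytes, ignoring the file extension.
    static String sniff(byte[] head) {
        if (startsWith(head, 0xFF, 0xD8, 0xFF)) return "JPEG";
        if (startsWith(head, 0x49, 0x49, 0x2A, 0x00)) return "TIFF (little-endian)";
        if (startsWith(head, 0x4D, 0x4D, 0x00, 0x2A)) return "TIFF (big-endian)";
        if (startsWith(head, 0x89, 0x50, 0x4E, 0x47)) return "PNG";
        if (startsWith(head, 0x47, 0x49, 0x46, 0x38)) return "GIF";
        if (startsWith(head, 0x42, 0x4D)) return "BMP";
        return "unknown";
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java ImageFormatSniffer <image file>");
            return;
        }
        byte[] head = new byte[8];
        int n;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            n = in.read(head); // -1 for an empty file
        }
        System.out.println("Detected format: " + sniff(Arrays.copyOf(head, Math.max(n, 0))));
        // Formats for which this JVM has an ImageIO reader registered.
        System.out.println("ImageIO readers: " + Arrays.toString(ImageIO.getReaderFormatNames()));
    }
}
```

If the detected format is missing from the list of ImageIO readers, that would be consistent with ImageIO.read returning null even though the file opens fine in an image viewer.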

Thank you, everyone. I used Tess4J to decode the TIFF image. Unfortunately, the information I was looking for isn't available in the decoded text, but the POC is done. I used the following library and added eng.traineddata to the folder where the images are:

import java.io.File;
import net.sourceforge.tess4j.*;

String inputImage = "C:\\Desktop\\img1.tiff";
File imageFile = new File(inputImage);
ITesseract imageRead = new Tesseract();
imageRead.setDataPath("C:\\Desktop\\"); // folder containing eng.traineddata
imageRead.setLanguage("eng");
String imageText = imageRead.doOCR(imageFile); // throws TesseractException on failure
System.out.println(imageText);
