Easiest way to get both text and images from a PDF file in PHP?

I want to EXTRACT both the text and the images of a PDF file using PHP. All the libraries seem to be about reading, and most of the other solutions either produce only text, produce only images, or are command-line based. I'm looking for a complete solution in PHP. Is this possible?

At this point, I'm also open to other suggestions. Perhaps there is a site with an API that you can submit the file to? Or perhaps someone can give instructions on a modern solution using the OpenOffice command-line tool, if that's even possible?

What about the Google Docs API? They have an OCR that you might be able to work with.

https://developers.google.com/google-apps/documents-list/#uploading_documents_using_optical_character_recognition_ocr
