
How to install textract in Python 3?

I want to extract text from a PDF, but PyPDF2 does not extract all of the information, and textract fails to install on Python 3.7 with the following error:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 1671: character maps to <undefined>
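
The 'charmap' codec in this message suggests that some file was decoded with a legacy Windows code page rather than UTF-8. The sketch below only reproduces that class of error on a two-byte sample. It is not textract's actual setup code, and the helper name decode_or_none is made up for illustration.

```python
# Illustration only: reproduces the kind of failure, not textract's setup code.
# "č" (U+010D) is encoded in UTF-8 as the two bytes 0xC4 0x8D.
sample = "č".encode("utf-8")


def decode_or_none(data, encoding):
    """Hypothetical helper: return the decoded text, or None if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# cp1252 has no character assigned to byte 0x8d, so decoding fails there.
as_cp1252 = decode_or_none(sample, "cp1252")

# The same bytes decode cleanly when read as UTF-8.
as_utf8 = decode_or_none(sample, "utf-8")
```

If this is what happens during installation, the file being read is probably UTF-8 encoded but opened with the platform default encoding.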

  1. Download the source file for textract from: https://pypi.python.org/pypi/textract

  2. pip3 install pdfminer3k

  3. Untar the downloaded file

  4. cd into the directory

  5. Run: python3 setup.py install
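
For anyone who prefers to script steps 2 to 5, here is a rough sketch using only the standard library. The function names and the archive path are placeholders of mine, and it assumes the download is a .tar.gz archive with a single top-level directory containing setup.py.

```python
import subprocess
import sys
import tarfile
from pathlib import Path


def install_pdfminer3k():
    """Step 2: same effect as `pip3 install pdfminer3k` for the running interpreter."""
    subprocess.run([sys.executable, "-m", "pip", "install", "pdfminer3k"], check=True)


def unpack_source(archive, dest):
    """Step 3: untar the archive into dest and return its top-level directory."""
    dest = Path(dest)
    with tarfile.open(archive) as tar:
        # Assumes every member lives under one top-level folder.
        top_level = tar.getnames()[0].split("/")[0]
        tar.extractall(dest)
    return dest / top_level


def install_from_source(source_dir):
    """Steps 4 and 5: run `setup.py install` with source_dir as the working directory."""
    subprocess.run([sys.executable, "setup.py", "install"], cwd=source_dir, check=True)
```

Usage would look like install_pdfminer3k(), then install_from_source(unpack_source("path/to/downloaded-archive.tar.gz", ".")), with the path replaced by wherever the file was saved.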

Hope it works for you :)

I installed textract on Windows 10 with the following steps:

  1. pip install textract
  2. Install poppler
  3. Installation complete
  4. Test with: import textract
  5. textract.process('path_to_file_with_extension')
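
As a slightly fuller version of steps 4 and 5, the sketch below wraps the call in a small function. The name extract_text is my own, and it assumes textract.process returns bytes that are UTF-8 encoded, which is my understanding of its default behaviour.

```python
def extract_text(path, encoding="utf-8"):
    """Hypothetical wrapper: run textract on a file and return the text as str."""
    # Imported here so that a failed installation shows up as an ImportError
    # at call time rather than when this module is loaded.
    import textract

    raw = textract.process(path)  # assumed to return bytes
    return raw.decode(encoding)
```

Calling extract_text('path_to_file_with_extension') should then give a string if both textract and poppler are set up correctly.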

For further reference, you can click here

Hope it will be helpful to you!
