
Python 3.6: Extract text from .msg files

I'm currently using the textract module (which relies on msg-extractor) to get all text content from .msg files. However, some files raise encoding errors, which seem to be related to open issues in textract (based on the link).

Are there other modules I can use to extract text from .msg files? I'm using Python 3.6 for development.

You can use the extract_msg module to extract the metadata from .msg files, as well as the body.

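import extract_msg

filepath = "example.msg"  # placeholder: path to your .msg file

# Opening the message in a with-block closes the file on exit
with extract_msg.Message(filepath) as msg:
    msg_body = msg.body        # message body text
    msg_subject = msg.subject  # subject line (metadata)
    print(msg_body)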
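Since the question mentions that only some files fail, it may help to process a whole folder and skip the files that cannot be parsed instead of stopping at the first error. The following is a minimal sketch under assumptions, not part of the original answer: the helper names `read_msg` and `collect_msg_text` are hypothetical, and it only relies on the `Message`, `subject` and `body` names already used above.

```python
from pathlib import Path


def read_msg(path):
    """Return (subject, body) for one .msg file using extract_msg."""
    # Third-party import kept local so collect_msg_text can be used
    # (or tested) with a different reader when extract_msg is absent.
    import extract_msg

    with extract_msg.Message(str(path)) as msg:
        return msg.subject, msg.body


def collect_msg_text(folder, reader=read_msg):
    """Map each .msg file name in `folder` to its text, or to None on failure."""
    results = {}
    for path in sorted(Path(folder).glob("*.msg")):
        try:
            subject, body = reader(path)
        except Exception as exc:  # broad on purpose: failures vary by file
            print(f"skipped {path.name}: {exc}")
            results[path.name] = None
            continue
        # subject or body may be missing, so fall back to an empty string
        results[path.name] = f"{subject or ''}\n{body or ''}".strip()
    return results
```

Calling `collect_msg_text("path/to/mail")` (a placeholder folder) would then return a dictionary in which the problematic files are marked with `None`, so they can be inspected separately.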
