python imap: how to parse multipart mail content
A mail can contain different blocks like:
--0016e68deb06b58acf04897c624e
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
content_1
...
--0016e68deb06b58acf04897c624e
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
content_2
... and so on
How can I get the content of each block with Python?
And how can I get the properties of each block (content-type, etc.)?
For parsing emails I have used the Message.walk() method like this:
    if msg.is_multipart():
        for part in msg.walk():
            ...
For content you can try: part.get_payload(). For content-type there is: part.get_content_type()
You will find documentation here: http://docs.python.org/library/email.message.html
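As a rough illustration of the two calls above, the sketch below walks a message and collects the content type, transfer encoding, and decoded text of each block. The `RAW` sample and the `list_parts` helper are made up for this example (the sample simply mirrors the two-block layout from the question); only `walk()`, `is_multipart()`, `get_content_type()`, `get_content_charset()` and `get_payload()` come from the `email` package.

```python
import email

# Hypothetical sample message mirroring the two blocks shown in the question.
RAW = (
    "MIME-Version: 1.0\n"
    "Subject: demo\n"
    'Content-Type: multipart/alternative; boundary="0016e68deb06b58acf04897c624e"\n'
    "\n"
    "--0016e68deb06b58acf04897c624e\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "content_1\n"
    "--0016e68deb06b58acf04897c624e\n"
    "Content-Type: text/html; charset=UTF-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "<p>content_2</p>\n"
    "--0016e68deb06b58acf04897c624e--\n"
)


def list_parts(msg):
    """Return (content_type, transfer_encoding, text) for each non-container part."""
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers only hold other parts, they carry no content
        charset = part.get_content_charset() or "utf-8"  # fallback is an assumption
        payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
        parts.append((
            part.get_content_type(),
            part["Content-Transfer-Encoding"],
            payload.decode(charset, errors="replace"),
        ))
    return parts


msg = email.message_from_string(RAW)
parts = list_parts(msg)
for content_type, transfer_encoding, text in parts:
    print(content_type, transfer_encoding, repr(text))
```

Any other header of a block can be read the same way as `Content-Transfer-Encoding`, by indexing the part with the header name.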
You can also try the email module with its iterators.
http://docs.python.org/library/email.html
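A small sketch of what the iterators look like in practice, assuming Python 3. The message is built locally with `email.mime` purely for demonstration; with IMAP you would parse the fetched data instead. `typed_subpart_iterator` and `body_line_iterator` both live in `email.iterators`.

```python
from email.iterators import body_line_iterator, typed_subpart_iterator
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Demonstration message (an assumption, not taken from a real mailbox).
msg = MIMEMultipart("alternative")
msg["Subject"] = "demo"
msg.attach(MIMEText("content_1\n", "plain"))
msg.attach(MIMEText("<p>content_2</p>\n", "html"))

# Only the subparts whose type is text/html.
html_parts = list(typed_subpart_iterator(msg, "text", "html"))

# Every text/* subpart, whatever the subtype.
text_types = [part.get_content_type() for part in typed_subpart_iterator(msg, "text")]

# All string payloads in the message, line by line, headers skipped.
lines = list(body_line_iterator(msg))

print(text_types)
print(lines)
```

Note that `body_line_iterator` only yields string payloads, so it is most useful for parts that need no transfer decoding.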
A very simple example (msg_as_str contains the raw message you got from the imap server):
    import email
    # On Python 3, use email.message_from_bytes() if the data is a bytes object.
    msg = email.message_from_string(msg_as_str)
    print(msg["Subject"])
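On Python 3 the data coming back from the server is a bytes object, and headers such as the subject may be RFC 2047 encoded. The following is a minimal sketch under those assumptions; the `raw` value is a hand-written stand-in for what the server would return, not real IMAP output.

```python
import email
from email.header import decode_header, make_header

# Stand-in for the bytes fetched from the server (hypothetical sample).
raw = (
    b"Subject: =?UTF-8?B?SGVsbG8gd29ybGQ=?=\r\n"
    b"From: alice@example.com\r\n"
    b"\r\n"
    b"body text\r\n"
)

msg = email.message_from_bytes(raw)

# msg["Subject"] still holds the encoded form; decode_header() and
# make_header() turn it into readable text.
subject = str(make_header(decode_header(msg["Subject"])))
sender = msg["From"]

print(subject)
print(sender)
```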
I wrote this code. You can use it if you like for parsing multipart content:
    body = ''  # accumulator; must exist before the first concatenation
    if mime_msg.is_multipart():
        for part in mime_msg.walk():
            if part.is_multipart():
                for subpart in part.get_payload():
                    if subpart.is_multipart():
                        for subsubpart in subpart.get_payload():
                            body = body + str(subsubpart.get_payload(decode=True)) + '\n'
                    else:
                        body = body + str(subpart.get_payload(decode=True)) + '\n'
            else:
                body = body + str(part.get_payload(decode=True)) + '\n'
    else:
        body = body + str(mime_msg.get_payload(decode=True)) + '\n'
    # str() of a bytes payload leaves escape sequences such as \n; turn them back into characters
    body = bytes(body, 'utf-8').decode('unicode-escape')
And if you want the result as plain text, pass body through html2text.HTML2Text().
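Since walk() already descends into nested multiparts, the nested loops can usually be collapsed into a single pass. The sketch below is one possible variant, not the original author's code: `extract_text` is a hypothetical helper, the UTF-8 fallback is an assumption, and the test message is built locally with `email.mime`. It decodes each part with its declared charset instead of going through str().

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def extract_text(msg, wanted="text/plain"):
    """Join the decoded text of every leaf part whose type equals `wanted`."""
    chunks = []
    for part in msg.walk():  # also yields msg itself when it is not multipart
        if part.is_multipart():
            continue  # containers carry no content of their own
        if part.get_content_type() != wanted:
            continue
        if part.get_content_disposition() == "attachment":
            continue  # skip attached files, keep inline bodies only
        raw = part.get_payload(decode=True)  # bytes after base64/quoted-printable decoding
        charset = part.get_content_charset() or "utf-8"  # fallback is an assumption
        chunks.append(raw.decode(charset, errors="replace"))
    return "\n".join(chunks)


# Nested demonstration message: multipart/mixed holding a multipart/alternative.
inner = MIMEMultipart("alternative")
inner.attach(MIMEText("plain body: caf\u00e9", "plain", "utf-8"))
inner.attach(MIMEText("<p>html body</p>", "html", "utf-8"))
outer = MIMEMultipart("mixed")
outer.attach(inner)
outer.attach(MIMEText("second plain part", "plain", "utf-8"))

body = extract_text(outer)
html = extract_text(outer, wanted="text/html")
print(body)
print(html)
```

The `html` string is what you would then hand to an HTML-to-text converter if only the HTML alternative is available.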