python imap: how to parse multipart mail content

A mail can contain different blocks like:

--0016e68deb06b58acf04897c624e
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
content_1
...

--0016e68deb06b58acf04897c624e
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
content_2
... and so on

How can I get the content of each block with python?
And also, how can I get the properties of each block (content-type, etc.)?

For parsing emails I have used the Message.walk() method like this:

if msg.is_multipart():
    for part in msg.walk():
        ...

For the content you can try part.get_payload(); passing decode=True undoes the quoted-printable or base64 transfer encoding. For the content type there is part.get_content_type().
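A minimal sketch of that loop, assuming Python 3. The sample message in RAW is made up for illustration and stands in for what the IMAP server returns, and describe_parts is a hypothetical helper name, not part of the email module.

```python
import email

# Hypothetical sample message (assumption: stands in for the server's reply).
RAW = (
    "MIME-Version: 1.0\r\n"
    "Subject: demo\r\n"
    'Content-Type: multipart/alternative; boundary="BOUND"\r\n'
    "\r\n"
    "--BOUND\r\n"
    "Content-Type: text/plain; charset=UTF-8\r\n"
    "Content-Transfer-Encoding: quoted-printable\r\n"
    "\r\n"
    "caf=C3=A9\r\n"
    "--BOUND\r\n"
    "Content-Type: text/html; charset=UTF-8\r\n"
    "Content-Transfer-Encoding: quoted-printable\r\n"
    "\r\n"
    "<p>caf=C3=A9</p>\r\n"
    "--BOUND--\r\n"
)


def describe_parts(msg):
    """Return (content_type, charset, text) for every non-container part."""
    result = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # the multipart container has no payload of its own
        payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
        charset = part.get_content_charset() or "utf-8"
        text = payload.decode(charset, errors="replace")
        result.append((part.get_content_type(), charset, text))
    return result


parts = describe_parts(email.message_from_string(RAW))
for content_type, charset, text in parts:
    print(content_type, charset, repr(text))
```

Other headers of a block are available with dictionary-style access, for example part["Content-Transfer-Encoding"].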

You will find documentation here: http://docs.python.org/library/email.message.html

You can also try the email module with its iterators.

http://docs.python.org/library/email.html
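As a rough illustration of those iterators, assuming Python 3: typed_subpart_iterator yields only the subparts of a given MIME type, and body_line_iterator yields the lines of every string payload. The sample message in RAW is invented for the example.

```python
import email
from email.iterators import body_line_iterator, typed_subpart_iterator

# Hypothetical sample message used only for this illustration.
RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="BOUND"\n'
    "\n"
    "--BOUND\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "\n"
    "line one\n"
    "line two\n"
    "--BOUND\n"
    "Content-Type: text/html; charset=UTF-8\n"
    "\n"
    "<p>hello</p>\n"
    "--BOUND--\n"
)

msg = email.message_from_string(RAW)

# Only the text/plain subparts.
plain_parts = list(typed_subpart_iterator(msg, "text", "plain"))

# Every subpart whose main type is text, whatever the subtype.
text_parts = list(typed_subpart_iterator(msg, "text"))

# Lines of all string payloads, left in their transfer encoding
# (the default decode=False is used here on purpose).
lines = list(body_line_iterator(msg))

for part in text_parts:
    print(part.get_content_type())
for line in lines:
    print(line.rstrip())
```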

A very simple example (msg_as_str contains the raw message you got from the imap server, as a string; if you have bytes on Python 3, use email.message_from_bytes instead):

import email
msg = email.message_from_string(msg_as_str)
print(msg["Subject"])
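For context, imaplib's fetch returns the message as bytes inside a list of tuples, so on Python 3 email.message_from_bytes is the matching parser. The sketch below assumes that response shape and uses a hand-written fake_data list in place of a real server reply; messages_from_fetch is a hypothetical helper name.

```python
import email


def messages_from_fetch(data):
    """Parse every (envelope, raw_bytes) tuple in an imaplib fetch response."""
    return [
        email.message_from_bytes(item[1])
        for item in data
        if isinstance(item, tuple)  # bare bytes items such as b")" carry no message
    ]


# Stand-in for: typ, data = imap.fetch(num, "(RFC822)")
fake_data = [
    (b"1 (RFC822 {50}", b"Subject: hello\r\nFrom: a@example.com\r\n\r\nbody text\r\n"),
    b")",
]

msgs = messages_from_fetch(fake_data)
for msg in msgs:
    print(msg["Subject"])
```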

I wrote this code. You can use it if you like for parsing multipart content:

body = ''
# walk() already visits every nested part (and the message itself when it
# is not multipart), so no manual recursion over subparts is needed.
for part in mime_msg.walk():
    if part.is_multipart():
        continue  # containers carry no payload of their own
    payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
    if payload is None:
        continue
    # Decode with the charset the part declares, falling back to UTF-8.
    charset = part.get_content_charset() or 'utf-8'
    body = body + payload.decode(charset, errors='replace') + '\n'

And if you want the result as plain text, convert body with html2text.HTML2Text(), e.g. html2text.HTML2Text().handle(body).
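If only the readable body is needed, the standard library may be enough on its own. This sketch assumes Python 3.6+ and the email.policy.default policy, under which the parser returns EmailMessage objects offering get_body() and get_content(). The sample message and the best_text helper are illustrative and not part of the code above.

```python
import email
from email import policy

# Hypothetical sample message used only for this illustration.
RAW = (
    b"MIME-Version: 1.0\r\n"
    b"Subject: demo\r\n"
    b'Content-Type: multipart/alternative; boundary="BOUND"\r\n'
    b"\r\n"
    b"--BOUND\r\n"
    b"Content-Type: text/plain; charset=UTF-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"caf=C3=A9\r\n"
    b"--BOUND\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"<p>caf=C3=A9</p>\r\n"
    b"--BOUND--\r\n"
)


def best_text(raw_bytes):
    """Return the text/plain body if present, else the text/html one, else ''."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    part = msg.get_body(preferencelist=("plain", "html"))
    if part is None:
        return ""
    # get_content() undoes the transfer encoding and decodes the charset.
    return part.get_content()


text = best_text(RAW)
print(text)
```

When the message only has an HTML part, best_text returns the HTML markup, which can then be passed to an HTML-to-text converter.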
