How to get plain text from a Python email using imaplib

Question

I am wondering how to get the plain text of an email in Python using imaplib. What I have so far:

from datetime import datetime
import email
import email.header
import email.utils
import imaplib

IMAP_SERVER = 'imap.gmail.com'
EMAIL_ACCOUNT = "example@gmail.com"
PASSWORD = "password"


# The enclosing function and the connection setup were not shown in the
# original snippet. The name process_mailbox and the login lines at the
# bottom are assumptions added so the code can run.
def process_mailbox(M):
    rv, data = M.search(None, "ALL")
    if rv != 'OK':
        print("No messages found!")
        return

    if data[0]:  # non-empty bytes means messages exist
        for num in data[0].split():
            # Alternative: '(BODY[HEADER.FIELDS (SUBJECT FROM)])'
            rv, msg_data = M.fetch(num, '(RFC822)')
            if rv != 'OK':
                print("ERROR getting message", num)
                return

            message = email.message_from_bytes(msg_data[0][1])
            text = ""
            if message.is_multipart():
                # Each part overwrites the previous one, so the last part wins.
                for payload in message.get_payload():
                    text = payload.get_payload()
            else:
                text = message.get_payload()

            res = {
                'From': email.utils.parseaddr(message['From'])[1],
                'From name': email.utils.parseaddr(message['From'])[0],
                'Time': datetime.fromtimestamp(email.utils.mktime_tz(email.utils.parsedate_tz(message['Date']))),
                'To': message['To'],
                'Subject': email.header.decode_header(message["Subject"])[0][0],
                'Text': text
            }
            print(res['Text'])
    else:
        print("Nothing to work with.")


M = imaplib.IMAP4_SSL(IMAP_SERVER)
M.login(EMAIL_ACCOUNT, PASSWORD)
M.select('INBOX')
process_mailbox(M)
M.logout()

If I do it this way, the code works, but I get

<div dir="ltr">test 3 body</div>

as output. Is there any way to get just "test 3 body" out?

Answer 1

Look for the plain text part of the email message.

for payload in message.walk():
    if payload.get_content_type().lower() == 'text/plain':
        print(payload.get_payload())
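
The loop above prints the payload as stored in the message, which may still be base64 or quoted-printable encoded. A minimal sketch of the same idea that also undoes the transfer encoding and applies the declared charset might look like this. The helper name get_plain_text and the sample message are assumptions made for illustration; they are not part of imaplib or the email package.

```python
import email
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def get_plain_text(message):
    """Return the decoded text/plain parts of a message, joined by newlines.

    Hypothetical helper for illustration only.
    """
    chunks = []
    for part in message.walk():
        if part.get_content_type() != 'text/plain':
            continue
        # Skip plain-text files that were sent as attachments.
        if part.get_content_disposition() == 'attachment':
            continue
        raw = part.get_payload(decode=True)  # bytes with transfer encoding undone
        if raw is None:
            continue
        # Fall back to UTF-8 when the part declares no charset (an assumption).
        charset = part.get_content_charset() or 'utf-8'
        chunks.append(raw.decode(charset, errors='replace'))
    return '\n'.join(chunks)


# Build a small multipart/alternative message to stand in for a fetched one.
sample = MIMEMultipart('alternative')
sample.attach(MIMEText('test 3 body', 'plain', 'utf-8'))
sample.attach(MIMEText('<div dir="ltr">test 3 body</div>', 'html', 'utf-8'))
parsed = email.message_from_bytes(sample.as_bytes())

print(get_plain_text(parsed))
```

In the question's code, the result of email.message_from_bytes(...) could be passed to such a helper in place of the is_multipart() loop. A message that only carries an HTML part would yield an empty string here.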

Answer 2

If you are just stuck on removing HTML tags from a string, you can use a regular expression like this:

import re

s = '<div dir="ltr">test 3 body</div>'
print(re.sub('<[^<]+?>', '', s))

Output: test 3 body

s has to be your res['Text'].
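
A regular expression handles a simple string like the one above, but it leaves HTML entities such as &amp; untouched and can trip over unusual markup. As an alternative to the regex (a different technique, not the one this answer uses), here is a rough sketch based on the standard library's html.parser. The names TextExtractor and html_to_text are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes and ignore tags. Hypothetical helper class."""

    def __init__(self):
        # convert_charrefs=True turns entities like &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return ''.join(parser.chunks).strip()


print(html_to_text('<div dir="ltr">test 3 body</div>'))
```

This sketch keeps every text node, so the contents of script or style elements would be included as well. For real-world HTML mail, taking the text/plain part when one exists is usually the simpler route.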

Answer 1: 2, 2017-10-26 17:07:53
Answer 2: 1, accepted, 2017-04-24 11:23:54
