
Unable to read raw email data using file.readlines()

I am trying to parse raw email data from a file at a specific path, but I get an error whenever I read the file with file.readlines() and pass the result to the email library. If I use file.read() instead, only the data from the first mail sent is parsed. How do I parse and analyze the raw mail data?

with open(file_path, "r") as file:
    content = file.readlines()
    email_to_string = email.message_from_string(content)

    headers = email_to_string._headers

    header_contents = {}
    for header in headers:
        if "From" in header:
            header_contents['From'] = header[-1]
        elif "To" in header:
            header_contents['To'] = header[-1]
        elif "Date" in header:
            header_contents['Date'] = header [-1]
        elif "Subject" in header:
            header_contents['Subject'] = header[-1]
        print("HEADER CONTENTS", header_contents)

    if email_to_string.is_multipart():
        body = []
        for lines in body.get_payload():
            body.append(lines)
        body = " ".join(body)
    else:
        body = email_to_string.get_payload()


    print("HEADER", headers)
    print("HEADER CONTENTS", header_contents)
    print("BODY", body)

**Error**

    Traceback (most recent call last):
      File "test.py", line 7, in <module>
        email_to_string = email.message_from_string(content)
      File "/usr/lib/python3.6/email/__init__.py", line 38, in message_from_string
        return Parser(*args, **kws).parsestr(s)
      File "/usr/lib/python3.6/email/parser.py", line 68, in parsestr
        return self.parse(StringIO(text), headersonly=headersonly)
    TypeError: initial_value must be str or None, not list

The function email.message_from_string() expects a single string, but file.readlines() returns a list of strings (one per line). That mismatch is what raises the TypeError above.
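The type mismatch can be reproduced without a file on disk. The following is a minimal sketch, assuming a made-up two-header message held in an in-memory `io.StringIO` object that stands in for the opened file; the sample addresses and the variable names are illustrative only.

```python
import email
import io

# Hypothetical sample message; a real file would be opened with open() instead.
raw = "From: a@example.com\nSubject: Hi\n\nHello\n"

lines = io.StringIO(raw).readlines()  # list of str, one entry per line
text = io.StringIO(raw).read()        # one str holding the whole message

# message_from_string() accepts the single string...
msg_from_text = email.message_from_string(text)

# ...and a list from readlines() works only after joining it back together.
msg_from_lines = email.message_from_string("".join(lines))

print(type(lines).__name__, type(text).__name__)
print(msg_from_text["Subject"], msg_from_lines["Subject"])
```

Because readlines() keeps the trailing newline on every line, joining with an empty string restores the original text.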

Try using file.read() instead, which returns the entire file as a single string.

import email

with open(file_path, 'r') as file_:
    # Keep the newlines: the parser needs them to split headers from the body.
    content = file_.read()
    email_to_string = email.message_from_string(content)
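The question also reports that file.read() parses only the first mail. One possible cause, and this is an assumption because the file format is not stated, is that the file is an mbox-style archive in which several messages are stored back to back, each starting with a "From " separator line. In that case the standard library's `mailbox.mbox` class can iterate over every message. The sketch below also reads headers through the public `msg[name]` lookup rather than the private `_headers` attribute; the function names `extract_body` and `read_all_messages` are hypothetical.

```python
import mailbox


def extract_body(msg):
    """Collect the text/plain parts of a message into one string."""
    chunks = []
    # walk() yields the message itself and, for multipart mail, every sub-part.
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)  # bytes, or None
            if payload:
                charset = part.get_content_charset() or "utf-8"
                chunks.append(payload.decode(charset, errors="replace"))
    return "\n".join(chunks)


def read_all_messages(mbox_path):
    """Return a list of (headers, body) tuples, one per message in the mbox."""
    results = []
    for msg in mailbox.mbox(mbox_path):
        # A missing header simply comes back as None.
        headers = {name: msg[name] for name in ("From", "To", "Date", "Subject")}
        results.append((headers, extract_body(msg)))
    return results
```

A call such as `read_all_messages(file_path)` would then return one entry per stored mail. If the file is not in mbox format, this approach does not apply and the messages would have to be split by whatever delimiter the file actually uses.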
