
Unable to read raw email data using file.readlines()

I am trying to parse raw email data from a file at a specific path. I get an error whenever I read the file with file.readlines() and pass the result to the email library. If I use file.read() instead, only the first mail in the file is parsed. How do I parse and analyze the raw mail data?

with open(file_path, "r") as file:
    content = file.readlines()
    email_to_string = email.message_from_string(content)

    headers = email_to_string._headers

    header_contents = {}
    for header in headers:
        if "From" in header:
            header_contents['From'] = header[-1]
        elif "To" in header:
            header_contents['To'] = header[-1]
        elif "Date" in header:
            header_contents['Date'] = header [-1]
        elif "Subject" in header:
            header_contents['Subject'] = header[-1]
        print("HEADER CONTENTS", header_contents)

    if email_to_string.is_multipart():
        # For multipart mail, get_payload() returns a list of sub-messages
        body = []
        for part in email_to_string.get_payload():
            body.append(str(part.get_payload()))
        body = " ".join(body)
    else:
        body = email_to_string.get_payload()


    print("HEADER", headers)
    print("HEADER CONTENTS", header_contents)
    print("BODY", body)

**Error**

    Traceback (most recent call last):
      File "test.py", line 7, in <module>
        email_to_string = email.message_from_string(content)
      File "/usr/lib/python3.6/email/__init__.py", line 38, in message_from_string
        return Parser(*args, **kws).parsestr(s)
      File "/usr/lib/python3.6/email/parser.py", line 68, in parsestr
        return self.parse(StringIO(text), headersonly=headersonly)
    TypeError: initial_value must be str or None, not list

The method email.message_from_string() is expecting a string data type but file.readlines() returns a list.
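The mismatch can be reproduced without a file. This is only a minimal sketch: `raw` is a made-up sample message, and `splitlines(keepends=True)` stands in for what file.readlines() would return.

```python
import email

# Hypothetical sample message, used only for illustration.
raw = "From: alice@example.com\nTo: bob@example.com\nSubject: Hi\n\nHello Bob\n"

# Same shape as the result of file.readlines(): a list of lines.
lines = raw.splitlines(keepends=True)

try:
    email.message_from_string(lines)
    failed = False
except TypeError:
    # A list is rejected because the parser wraps its input in io.StringIO.
    failed = True

# Joining the lines back into one str gives the parser what it needs.
msg = email.message_from_string("".join(lines))
print(msg["Subject"])
print(msg.get_payload())
```

Indexing the message by header name (`msg["Subject"]`) is the public way to read headers, so there is no need to loop over the private `_headers` attribute.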

Use file.read() instead, which returns the whole file as a single string.

import email

with open(file_path, 'r') as file_:
    # Keep the newlines: they separate the headers from each other and from the body.
    content = file_.read()
    email_to_string = email.message_from_string(content)
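email.message_from_string() parses its input as a single message, which would explain why only the first mail shows up. If the file happens to be in mbox format (each message starts with a "From " separator line), the standard-library mailbox module can iterate over every message. That format is an assumption here, since the question does not say how the mails are stored. In the sketch below, `summarize_mbox` and the sample data are hypothetical names made up for illustration.

```python
import mailbox
import os
import tempfile


def summarize_mbox(path):
    """Return one dict of selected headers plus body text per message."""
    summaries = []
    box = mailbox.mbox(path, create=False)
    try:
        for msg in box:
            if msg.is_multipart():
                # Keep only text/plain parts; HTML and attachments are skipped.
                # Payloads are not transfer-decoded in this sketch.
                parts = [part.get_payload() for part in msg.walk()
                         if part.get_content_type() == "text/plain"]
                body = "\n".join(parts)
            else:
                body = msg.get_payload()
            summaries.append({
                "From": msg["From"],
                "To": msg["To"],
                "Date": msg["Date"],
                "Subject": msg["Subject"],
                "Body": body,
            })
    finally:
        box.close()
    return summaries


# Hypothetical two-message mbox file, written to a temporary directory.
sample = (
    "From alice@example.com Mon Jan  1 00:00:00 2024\n"
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: First\n"
    "\n"
    "Hello Bob\n"
    "\n"
    "From bob@example.com Mon Jan  1 00:05:00 2024\n"
    "From: bob@example.com\n"
    "To: alice@example.com\n"
    "Subject: Second\n"
    "\n"
    "Hello Alice\n"
)

with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.mbox")
    with open(sample_path, "w", newline="\n") as handle:
        handle.write(sample)
    results = summarize_mbox(sample_path)

for item in results:
    print(item["Subject"], "from", item["From"])
```

If the file is not mbox (for example, messages concatenated with a custom delimiter), the text would have to be split on that delimiter first and each piece passed to email.message_from_string() separately.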
