Python IMAP download all attachments

I need to iterate over all the mail in a Gmail inbox. I also need to download all the attachments for each message (some messages have 4-5 attachments). I found some help here: https://stackoverflow.com/a/27556667/8996442

def save_attachments(self, msg, download_folder="/tmp"):
    for part in msg.walk():
        if part.get_content_maintype() == 'multipart':
            continue
        if part.get('Content-Disposition') is None:
            continue

        filename = part.get_filename()
        print(filename)
        att_path = os.path.join(download_folder, filename)
        if not os.path.isfile(att_path):
            fp = open(att_path, 'wb')
            fp.write(part.get_payload(decode=True))
            fp.close()
        return att_path

But it downloads only one attachment per e-mail (although the author of that post says it should normally download all of them, no?). The print(filename) shows me only one attachment. Any idea why?

from imap_tools import MailBox

# get all attachments from INBOX and save them to files
with MailBox('imap.my.ru').login('acc', 'pwd', 'INBOX') as mailbox:
    for msg in mailbox.fetch():
        for att in msg.attachments:
            print(att.filename, att.content_type)
            with open('C:/1/{}'.format(att.filename), 'wb') as f:
                f.write(att.payload)

https://pypi.org/project/imap-tools/

As already pointed out in comments, the immediate problem is that return exits the for loop and leaves the function, and your code reaches it as soon as the first attachment has been saved.
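To see the effect in isolation, here is a minimal sketch. The function names first_only and all_items are hypothetical and not part of the question's code; they only contrast a return inside the loop with a return after it.

```python
def first_only(items):
    for item in items:
        # return sits inside the loop body, so the function
        # ends during the very first iteration
        return item


def all_items(items):
    collected = []
    for item in items:
        collected.append(item)
    # return sits after the loop, so every item has been visited
    return collected
```

Called with two names, first_only gives back only the first one, while all_items gives back both.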

Depending on what exactly you want to accomplish, change your code so you only return when you have finished all iterations of msg.walk(). Here is one attempt which returns a list of attachment filenames:

import os

def save_attachments(self, msg, download_folder="/tmp"):
    att_paths = []

    for part in msg.walk():
        if part.get_content_maintype() == 'multipart':
            continue
        if part.get('Content-Disposition') is None:
            continue

        filename = part.get_filename()
        if filename is None:
            # Some parts (e.g. inline ones) carry no filename; skip them
            continue
        # Don't print
        # print(filename)
        att_path = os.path.join(download_folder, filename)
        if not os.path.isfile(att_path):
            # Use a context manager for robustness
            with open(att_path, 'wb') as fp:
                fp.write(part.get_payload(decode=True))
            # Then you don't need to explicitly close
            # fp.close()
        # Append this one to the list we are collecting
        att_paths.append(att_path)

    # We are done looping and have processed all attachments now
    # Return the list of file names
    return att_paths

See the inline comments for explanations of what I changed and why.

In general, avoid print()ing stuff from inside a worker function; either use logging to print diagnostics in a way that the caller can control, or just return the information and let the caller decide whether or not to present it to the user.
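As a rough illustration of the logging route, the sketch below uses the standard library's logging module. The logger name "attachments" and the helper note_saved are assumptions made up for this example; they do not come from the question.

```python
import logging

# Module-level logger; the caller decides whether and where messages appear.
logger = logging.getLogger("attachments")  # hypothetical logger name


def note_saved(att_paths):
    """Hypothetical helper: log each saved path instead of printing it."""
    for path in att_paths:
        logger.info("saved attachment to %s", path)
    return len(att_paths)


# A caller that wants to see the diagnostics opts in explicitly:
logging.basicConfig(level=logging.INFO)
note_saved(["/tmp/a.pdf", "/tmp/b.pdf"])
```

A caller that does not configure logging, or sets a higher level, simply sees nothing, which is the control that print() cannot offer.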

Not all MIME parts have a Content-Disposition: header; in fact, I would expect this to miss the majority of attachments, and possibly extract some inline parts. A better approach is probably to check whether the part has Content-Disposition: attachment, and otherwise proceed to extract if either there is no Content-Disposition: or the Content-Type: is neither text/plain nor text/html. Perhaps see also What are the "parts" in a multipart email?
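One possible way to express that selection rule is sketched below with the standard library's email package. The helper name is_attachment is hypothetical. Note the assumption baked in: read literally, the rule would also pick up a text/plain body that has no Content-Disposition:, so this sketch takes the more conservative reading and skips text/plain and text/html parts unless they are explicitly marked as attachment.

```python
from email.message import EmailMessage


def is_attachment(part):
    """Hypothetical helper: decide whether a MIME part should be saved."""
    if part.get_content_maintype() == 'multipart':
        # Containers only hold other parts; they have no payload to save
        return False
    # get_content_disposition() returns 'attachment', 'inline', or None
    if part.get_content_disposition() == 'attachment':
        return True
    # Assumption: unmarked or inline text bodies are not wanted as files
    return part.get_content_type() not in ('text/plain', 'text/html')


# Build a small test message: one text body plus two attachments
msg = EmailMessage()
msg.set_content("See the attached files.")
msg.add_attachment(b"raw bytes", maintype="application",
                   subtype="octet-stream", filename="a.bin")
msg.add_attachment(b"fake image", maintype="image",
                   subtype="png", filename="b.png")

names = [part.get_filename() for part in msg.walk() if is_attachment(part)]
print(names)
```

In save_attachments, such a check could take the place of the two if ... continue tests at the top of the loop.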

