
parse and decode mail text in aiosmtpd, perform string substitution, and reinject

I began with smtpd to process a mail queue: parse inbound emails and send them back to the recipients (using smtplib.sendmail). I switched to aiosmtpd because I needed to process mails concurrently (smtpd is single-threaded and, besides, looks discontinued).

However, I'm puzzled by how aiosmtpd handles the mail envelope contents. It seems much more granular than before, which is good if you need really fine tuning, but overkill if you just want to process the body without modifying the rest.

For example, with smtpd, setting decode_data=True was all it took for process_message to receive a decoded mail body without touching anything else, while the aiosmtpd handle_DATA method does not seem able to decode the envelope content automatically, and often raises exceptions with embedded images, attachments, and so on.

EDIT: added code examples, smtpd first. The following code starts an SMTP server that waits for mail on port 10025 and delivers it to port 10027 via smtplib (both on localhost). It is safe to work on the data variable (basically to perform string substitutions, which is my goal) for all kinds of mail (text/HTML based, with embedded images, attachments...).

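import asyncore
import smtpd
import smtplib

class PROXY_SMTP(smtpd.SMTPServer):
    def process_message(self, peer, mailfrom, rcpttos, data, **kwargs):
        # Relay the (possibly modified) message to the downstream server.
        server = smtplib.SMTP('localhost', 10027)
        server.sendmail(mailfrom, rcpttos, data)
        server.quit()

# decode_data is an argument of the SMTPServer constructor, not of process_message.
server = PROXY_SMTP(('127.0.0.1', 10025), None, decode_data=True)
asyncore.loop()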

The previous code works well, but in a single-threaded fashion (one mail at a time), so I switched to aiosmtpd to process mail concurrently. The same example with aiosmtpd would be roughly:

import asyncio
import smtplib
from aiosmtpd.controller import Controller

class MyHandler:
    async def handle_DATA(self, server, session, envelope):
        peer = session.peer
        mailfrom = envelope.mail_from
        rcpttos = envelope.rcpt_tos
        # envelope.content is bytes; this decode() is what fails on complex mail.
        data = envelope.content.decode()
        server = smtplib.SMTP('localhost', 10027)
        server.sendmail(mailfrom, rcpttos, data)
        server.quit()

my_handler = MyHandler()

async def main(loop):
    my_controller = Controller(my_handler, hostname='127.0.0.1', port=10025)
    my_controller.start()

loop = asyncio.get_event_loop()
loop.create_task(main(loop=loop))
try:
    loop.run_forever()
except KeyboardInterrupt:
    pass

This code works well for plain-text emails, but raises exceptions when decoding envelope.content for any complex mail (MIME content, attachments...).

How could I parse and decode mail text in aiosmtpd, perform string substitution as I did with smtpd, and reinject via smtplib?

You are calling decode() on something whose encoding you can't possibly know or predict in advance. Modifying the raw RFC5322 message is extremely problematic anyway, because you can't easily look inside quoted-printable or base64 body parts if you want to modify the contents. Also watch out for RFC2047 encapsulation in human-visible headers, file names in RFC2231 (or some dastardly non-compliant perversion - many clients don't get this even almost right) etc. See below for an example.
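To illustrate the RFC 2047 point, here is a minimal sketch with a made-up header (the subject text and the variable names are assumptions, not taken from the question). The raw bytes decode without error, yet the human-visible text is only recoverable once the header is parsed:

```python
from email import message_from_bytes
from email.policy import default

# A hypothetical message whose Subject uses RFC 2047 encoded-word syntax.
raw = (
    b"Subject: =?utf-8?q?Caf=C3=A9?= menu\r\n"
    b"To: you <recipient@example.net>\r\n"
    b"\r\n"
    b"body\r\n"
)

# Decoding the raw bytes succeeds, but the visible text is still encoded.
found_in_raw = "Café" in raw.decode("ascii")

# Parsing with the modern policy decodes the header for us.
message = message_from_bytes(raw, policy=default)
subject = str(message["Subject"])

print(found_in_raw)
print(subject)
```

A plain string substitution on the raw text would never match "Café" here, which is why the parsed object is the safer place to work.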

Instead, if I am guessing correctly what you want, I would parse it into an email object, then take it from there.

import smtplib
from email import message_from_bytes
from email.policy import default

class MyHandler:
    async def handle_DATA(self, server, session, envelope):
        peer = session.peer
        mailfrom = envelope.mail_from
        rcpttos = envelope.rcpt_tos
        message = message_from_bytes(envelope.content, policy=default)
        # ... do things with the message,
        # maybe look into the .walk() method to traverse the MIME structure
        server = smtplib.SMTP('localhost', 10027)
        server.send_message(message, mailfrom, rcpttos)
        server.quit()
        return '250 OK'

The policy argument selects the modern email.message.EmailMessage class which replaces the legacy email.message.Message class from Python 3.2 and earlier. (A lot of online examples still promote the legacy API; the new one is more logical and versatile, so you want to target that if you can.)

This also adds the missing return statement which each handler should provide as per the documentation.
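As a rough sketch of what "do things with the message" could look like for string substitution, the hypothetical helper below (replace_in_text_parts is my own name, not part of any library) walks the MIME structure and rewrites only the non-attachment text parts. It assumes re-encoding the modified parts as UTF-8 is acceptable:

```python
from email import message_from_bytes
from email.message import EmailMessage
from email.policy import default

def replace_in_text_parts(message, old, new):
    """Replace `old` with `new` in every non-attachment text/* part, in place."""
    for part in message.walk():
        if part.get_content_maintype() != "text" or part.is_attachment():
            continue
        # get_content() undoes base64 / quoted-printable and the charset.
        text = part.get_content()
        if old in text:
            # set_content() resets the part's Content-* headers and re-encodes it.
            part.set_content(
                text.replace(old, new),
                subtype=part.get_content_subtype(),
                charset="utf-8",
            )
    return message

# Demo message: a base64 plain-text part plus an HTML alternative.
demo = EmailMessage()
demo["Subject"] = "demo"
demo.set_content("Hello, world", cte="base64")
demo.add_alternative("<p>Hello, world</p>", subtype="html")

replace_in_text_parts(demo, "Hello", "Goodbye")

# Round-trip through bytes to check what a recipient would actually see.
reparsed = message_from_bytes(demo.as_bytes(), policy=default)
texts = [
    part.get_content()
    for part in reparsed.walk()
    if part.get_content_maintype() == "text"
]
print(texts)
```

Note that set_content() clears all Content-* headers on the part it rewrites (including a Content-ID, if one was present), so a real proxy may need to preserve some of them. Inside handle_DATA, the helper would be called on the parsed message before send_message().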


Here's an example message which contains the string "Hello" in two places. Because the content-transfer-encoding obscures the content, you need to analyze the message (such as by parsing it into an email object) to be able to properly manipulate it.

From: me <me@example.org>
To: you <recipient@example.net>
Subject: MIME encapsulation demo
Mime-Version: 1.0
Content-type: multipart/alternative; boundary="covfefe"

--covfefe
Content-type: text/plain; charset="utf-8"
Content-transfer-encoding: quoted-printable

You had me at "H=
ello."

--covfefe
Content-type: text/html; charset="utf-8"
Content-transfer-encoding: base64

PGh0bWw+PGhlYWQ+PHRpdGxlPkhlbGxvLCBpcyBpdCBtZSB5b3UncmUgbG9va2luZyBmb3I/PC
90aXRsZT48L2hlYWQ+PGJvZHk+PHA+VGhlIGNvdiBpbiB0aGUgZmUgZmU8L3A+PC9ib2R5Pjwv
aHRtbD4K

--covfefe--
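A short sketch of how the message above behaves in practice (the variable names are mine): searching the raw bytes finds no "Hello" at all, while the parsed object exposes it in both body parts.

```python
from email import message_from_bytes
from email.policy import default

RAW = b"""\
From: me <me@example.org>
To: you <recipient@example.net>
Subject: MIME encapsulation demo
Mime-Version: 1.0
Content-type: multipart/alternative; boundary="covfefe"

--covfefe
Content-type: text/plain; charset="utf-8"
Content-transfer-encoding: quoted-printable

You had me at "H=
ello."

--covfefe
Content-type: text/html; charset="utf-8"
Content-transfer-encoding: base64

PGh0bWw+PGhlYWQ+PHRpdGxlPkhlbGxvLCBpcyBpdCBtZSB5b3UncmUgbG9va2luZyBmb3I/PC
90aXRsZT48L2hlYWQ+PGJvZHk+PHA+VGhlIGNvdiBpbiB0aGUgZmUgZmU8L3A+PC9ib2R5Pjwv
aHRtbD4K

--covfefe--
"""

# A naive search on the wire format misses both occurrences.
raw_hits = RAW.count(b"Hello")

# After parsing, get_content() returns the decoded text of each part.
message = message_from_bytes(RAW, policy=default)
parsed_hits = sum(
    part.get_content().count("Hello")
    for part in message.walk()
    if part.get_content_maintype() == "text"
)

print(raw_hits, parsed_hits)
```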

The OP incorrectly added this text to the question; I'm moving it here as a (half) answer.

--- SOLVED ---

This is what I have got so far. Minor adjustments are still needed (mainly to handle and rebuild MIME content separately), but it solves my main problem: handle each incoming mail concurrently, leave room for text processing, and sleep for a fixed amount of time before final delivery. Thanks to tripleee's answers and comments, I found the right approach.

import asyncio
import smtplib
from email import message_from_bytes
from email.policy import default

from aiosmtpd.controller import Controller

class MyHandler:
    async def handle_DATA(self, server, session, envelope):
        peer = session.peer
        mailfrom = envelope.mail_from
        rcpttos = envelope.rcpt_tos
        message = message_from_bytes(envelope.content, policy=default)
        # It would be safer to walk the parts and modify only the mail body,
        # but no side effects so far with MIME content, attachments...
        # smtplib's sendmail() wants bytes or a string, not an email object.
        messagetostring = message.as_string()
        # Text processing and string substitutions happen here.
        # This was my core need: an async wait for each message.
        await asyncio.sleep(15)
        server = smtplib.SMTP('localhost', 10027)
        # sendmail() takes (from_addr, to_addrs, msg), unlike send_message().
        server.sendmail(mailfrom, rcpttos, messagetostring)
        server.quit()
        return '250 OK'  # added return

my_handler = MyHandler()

async def main(loop):
    my_controller = Controller(my_handler, hostname='127.0.0.1', port=10025)
    my_controller.start()

loop = asyncio.get_event_loop()
loop.create_task(main(loop=loop))
try:
    loop.run_forever()
except KeyboardInterrupt:
    pass
