
How to split single mail with procmail?

I have a quarantine folder that I periodically have to download and split by recipient inbox or, even better, split into one text file per message. I receive about 10,000 mails per day and I'm building something with fetchmail and procmail. The problem is that I can't figure out how to split message by message in procmail; they all end up in the same inbox.

I tried to pass every message to a script via a recipe like:

    :0
    | script_processing_messages.sh

where the script contained

    read varname
    echo "$varname" > test_file

I wanted to see whether I could get a whole message into the $varname variable, but no: I only get a single line of a message each time.

Right now I use

    fetchmail --keep

where .fetchmailrc is

    poll mail.mymta.my protocol pop3 username "my@inbox.com" password "****" mda "procmail /root/.procmailrc"

and .procmailrc is

    VERBOSE=0
    DEFAULT=/root/inbox.quarantine

I would like to obtain a file for each message, so:

    1.txt
    2.txt
    3.txt
    [...]
    10000.txt

I have many recipients and many domains, so I can't, say, write 5000 rules to match every recipient. It would be good if there were some kind of

    ^To: $USER

that redirects to

    /$USER.inbox

so that procmail itself takes care of reading the recipient and dynamically creating these inboxes.

I'm not an expert in fetchmail and procmail recipes; I'm trying hard but not getting very far.

You seem to have two or three different questions; proper etiquette on Stack Overflow would be to ask each one separately - this also helps future visitors who have just one of your problems.

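First off, to split a Berkeley mbox file containing multiple messages and run Procmail on each message separately, try

    formail -s procmail -m /root/.procmailrc <file.mbox

(the -m option requires an rcfile argument; the path here is the one from your question).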

You might need to read up on the mailbox formats supported by Procmail. A Berkeley mailbox is a single file which contains multiple messages, simply separated by a line beginning with "From " (the four letters followed by a space). This separator has to be unique, so a message which contains those five characters at the beginning of a line in the body needs to be escaped somehow (typically by writing a > before From).
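To make the separator rule concrete, here is a minimal sketch that splits a tiny, made-up mbox on its "From " lines with awk. The sample messages and temporary paths are assumptions for illustration only; for real mail, formail is the safer tool, since this sketch does nothing beyond looking for the separator.

```shell
#!/bin/sh
# Illustration only: split a tiny Berkeley mbox on its "From " separator lines.
workdir=$(mktemp -d)

# Hypothetical two-message mbox; the second body contains an escaped ">From ".
cat > "$workdir/sample.mbox" <<'EOF'
From alice@example.org Mon Jan  1 00:00:00 2024
To: bob@yourdomain.example
Subject: first

Hello Bob.

From carol@example.org Mon Jan  1 00:01:00 2024
To: dave@example.info
Subject: second

>From here on, this line is body text, not a separator.
EOF

# Start a new numbered file at every line that begins with "From ".
awk -v dir="$workdir" '
    /^From / { if (out) close(out); n++; out = dir "/" n ".txt" }
    out      { print > out }
' "$workdir/sample.mbox"

count=$(ls "$workdir"/*.txt | wc -l | tr -d ' ')
echo "split into $count messages"
```

The escaped ">From" line stays inside the second message because it no longer begins with the five separator characters.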

To save each message in a separate file, choose a different mailbox format than the single-file Berkeley format. Concretely, if the destination is a directory, Procmail will create a new file in that directory. How exactly the new file is named depends on how the directory is specified: a trailing slash selects Maildir format (the new file is created in the new subdirectory in accordance with Maildir naming conventions), a trailing slash and dot selects MH format (consecutively numbered files), and a plain directory name with neither selects mail directory format.
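As a rough sketch of the one-file-per-message setup, the script below writes a small rcfile whose only recipe delivers into an MH folder (trailing slash and dot), then feeds an mbox through formail if one is given. The temporary directory, the rcfile name split.rc, and the folder name quarantine are all hypothetical; the resulting files would be named 1, 2, 3 and so on, without a .txt suffix.

```shell
#!/bin/sh
# Sketch: deliver every message into an MH-style folder (numbered files).
# All paths below are hypothetical; adjust them to your setup.
base=$(mktemp -d)
rcfile="$base/split.rc"
mkdir -p "$base/quarantine"

cat > "$rcfile" <<EOF
MAILDIR=$base
VERBOSE=0

# Trailing "/." selects MH format: one numbered file per message.
:0
quarantine/.
EOF

echo "wrote $rcfile"

# Only run the split if an input mbox was given and both tools are installed.
mbox=${1:-}
if [ -n "$mbox" ] && command -v formail >/dev/null 2>&1 \
        && command -v procmail >/dev/null 2>&1; then
    formail -s procmail -m "$rcfile" < "$mbox"
    ls "$base/quarantine"
fi
```

Saved as, say, split.sh, it would be called as `sh split.sh /root/inbox.quarantine` to split the mailbox that the existing .procmailrc has already collected.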

Saving to one mailbox per recipient has a number of pesky corner cases. What if the message was sent to more than one of your local recipients? What if the recipient address is not visible in the headers? And so on. (The Procmail Mini-FAQ has a section about this, in the context of virtual hosting of a domain, of which this is basically a variation.) But if we simply ignore these, you might be able to pull it off with something like

    :0  # whitespace before ] is a literal tab
    * ^TO_\/[^ @	]+@(yourdomain\.example|example\.info)\>
    {
        # Trim domain part from captured MATCH
        :0
        * MATCH ?? ^\/[^@]+
        ./$MATCH/
    }

This will capture into $MATCH the first address which matches the regex, then perform another regex match on the captured string to capture just the part before the @ sign. This obviously requires that the addresses you want to match are all in a set of specific domains (here, I used yourdomain.example and example.info; obviously replace those with your actual domain names) and that capturing the first matching address is sufficient (so if a message was To: alice@yourdomain.example and Cc: bob@example.info, whichever one of those is closer to the top of the message will be picked out by this recipe, and the other one will be ignored).

In some more detail, the \/ special token causes Procmail to copy the text which matched the regex after this point into the internal variable MATCH. As this recipe demonstrates, you can then perform a regex match on that variable itself to extract a substring of it (or, in other words, discard part of the captured match).

The action ./$MATCH/ uses the captured string in MATCH as the name of the folder to save into. The leading ./ specifies the current directory (which is equal to the value of the Procmail variable MAILDIR) and the trailing / selects Maildir format.
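To check which folder name such a recipe would pick for a given message, the two-step match can be approximated in plain shell. This is only a sketch: the function name folder_for is made up, the header list is a small subset of what ^TO_ actually covers, and `grep -o` is a common extension rather than POSIX.

```shell
#!/bin/sh
# Approximate, in plain shell, which folder name the recipe would choose.
# The domain list mirrors the placeholder domains used in the recipe.
folder_for() {
    # Read only the header block (up to the first empty line), keep the
    # recipient headers, take the first address in the listed domains,
    # then drop the domain part.
    sed '/^$/q' \
      | grep -iE '^(To|Cc|Resent-To):' \
      | grep -oiE '[^ @<,:]+@(yourdomain\.example|example\.info)' \
      | head -n 1 \
      | sed 's/@.*//'
}

# Hypothetical message: alice appears first, so her local part is chosen.
name=$(printf 'From: x@example.org\nTo: Alice <alice@yourdomain.example>\nCc: bob@example.info\n\nbody bob@example.info\n' | folder_for)
echo "$name"
```

As in the recipe, the address nearest the top of the headers is picked and any later matching address is ignored.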

If your expected recipients cannot be constrained to be in a specific set of domains or otherwise matched by a single regex, my recommendation would be to ask a new question with more limited scope, and enough details to actually identify what you want to accomplish.

I found a solution to a part of my problem.

It seems that there is no way to have procmail itself recognize the recipient without specifying it in a recipe, so I just obtained a list of recipients and created a huge recipe file.

But then I discovered that, to save single mails and avoid huge mailboxes filled with a lot of mails, one can just write a recipe like:

    :0
    * ^To: recipient@mail\.it
    /inbox/folder/recipient@mail.it/

Note the / at the end: this makes procmail create a folder structure instead of writing everything into a single file.
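For what it's worth, the huge recipe file does not have to be written by hand. A minimal sketch of generating it from a list of addresses follows; the file names recipients.txt and generated.rc and the sample addresses are hypothetical, and the condition uses ^To:.* rather than an exact match so that display names before the address are tolerated.

```shell
#!/bin/sh
# Sketch: generate one recipe per recipient from a list of addresses.
# File names (recipients.txt, generated.rc) are hypothetical; the
# /inbox/folder prefix follows the example above.
workdir=$(mktemp -d)
printf '%s\n' 'recipient@mail.it' 'other.user@example.info' > "$workdir/recipients.txt"

while IFS= read -r addr; do
    [ -n "$addr" ] || continue
    # Escape dots so they match literally in the procmail regex.
    regex=$(printf '%s' "$addr" | sed 's/\./\\./g')
    printf ':0\n* ^To:.*%s\n/inbox/folder/%s/\n\n' "$regex" "$addr"
done < "$workdir/recipients.txt" > "$workdir/generated.rc"

cat "$workdir/generated.rc"
```

The generated file could then be pulled into .procmailrc with an INCLUDERC= assignment pointing at it.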
