How to split single mail with procmail?

I have a quarantine folder that I periodically have to download and split by recipient inbox or, even better, split into one text file per message. I have about 10,000 mails per day and I'm coding something with fetchmail and procmail. The problem is that I can't find out how to split message by message in procmail; they all end up in the same inbox.

I tried to pass every message to a script via a recipe like:

    :0
    | script_processing_messages.sh

where the script contained

    read varname
    echo "$varname" > test_file

I wanted to see whether I could obtain a single message in the $varname variable, but no: I only obtain a single line of a message each time.
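For reference, `read` consumes only one line of standard input, while procmail pipes the complete message to the script. A minimal sketch of a script body that stores the whole message in the next free numbered file could look like this; the function name `save_message` and the scratch directory are assumptions made for the illustration, not part of the original setup.

```shell
#!/bin/sh
# Hypothetical replacement for the body of script_processing_messages.sh.
# procmail writes one complete message to the script's standard input,
# so copy all of stdin instead of reading a single line.

save_message() {
    outdir=$1
    mkdir -p "$outdir" || return 1
    # Find the first unused N.txt. This is a linear scan and is not safe
    # against concurrent deliveries; it is only meant as a sketch.
    n=1
    while [ -e "$outdir/$n.txt" ]; do
        n=$((n + 1))
    done
    # cat copies everything on stdin (headers and body) into the file.
    cat > "$outdir/$n.txt"
}

# Demo: feed two small messages on stdin, one after the other.
demo_dir=$(mktemp -d)
printf 'Subject: first\n\nbody line\n' | save_message "$demo_dir"
printf 'Subject: second\n\nother body\n' | save_message "$demo_dir"
ls "$demo_dir"
```

With thousands of messages the linear scan for a free number becomes slow, which is one reason to let procmail name the files itself.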

Right now I use

    fetchmail --keep

where .fetchmailrc is

    poll mail.mymta.my protocol pop3 username "my@inbox.com" password "****" mda "procmail /root/.procmailrc"

and .procmailrc is

    VERBOSE=0
    DEFAULT=/root/inbox.quarantine

I would like to obtain a file for each message, so:

    1.txt
    2.txt
    3.txt
    [...]
    10000.txt

I have many recipients and many domains, so I can't, let's say, write 5000 rules to match every recipient. It would be good if there was some kind of

    ^To: $USER

that redirects to

    /$USER.inbox

so that procmail itself takes care of reading and dynamically creating these inboxes.

I'm not very expert in fetchmail and procmail recipes; I'm trying hard, but I'm not getting very far.

You seem to have two or three different questions; proper etiquette on Stack Overflow would be to ask each one separately. This also helps future visitors who have just one of your problems.

First off, to split a Berkeley mbox file containing multiple messages and run Procmail on each separately, try

    formail -s procmail -m <file.mbox

You might need to read up on the mailbox formats supported by Procmail. A Berkeley mailbox is a single file which contains multiple messages, simply separated by a line beginning with From (with a space after the four alphabetic characters). This separator has to be unique, and so a message which contains those five characters at beginning of a line in the body will need to be escaped somehow (typically by writing a > before From).
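The separator and escaping convention can be illustrated with plain shell tools. This is only a sketch: the helper names `escape_body` and `count_messages` and the sample addresses are invented for the example, and real mbox writers may apply slightly different escaping rules.

```shell
#!/bin/sh
# Escape body lines that begin with "From " by prefixing ">".
escape_body() {
    sed 's/^From />From /'
}

# Count messages by counting separator lines that begin with "From ".
count_messages() {
    grep -c '^From ' "$1"
}

# Build a small two-message mbox. Only the body text goes through
# escape_body; the separator lines themselves must stay untouched.
mbox=$(mktemp)
{
    printf 'From alice@example.com Mon Jan  1 00:00:00 2024\n'
    printf 'Subject: first\n\n'
    printf 'From here on, this body line needs escaping.\n' | escape_body
    printf '\n'
    printf 'From bob@example.com Mon Jan  1 00:01:00 2024\n'
    printf 'Subject: second\n\nplain body\n\n'
} > "$mbox"

count_messages "$mbox"
```

Without the escaping step, the body line would be counted as a third separator and the first message would be cut in two.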

To save each message in a separate file, choose a different mailbox format than the single-file Berkeley format. Concretely, if the destination is a directory, Procmail will create a new file in that directory. How exactly the new file is named depends on the contents of the directory (if it contains the Maildir subdirectories new, tmp, and cur, the new file is created in new in accordance with Maildir naming conventions) and on how exactly the directory is specified (trailing slash and dot selects MH format; otherwise, mail directory format).
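As a rough sketch of how these pieces might fit together, the following shell snippet writes a tiny rcfile whose only recipe delivers to a directory with a trailing slash (Maildir format), then feeds an mbox through formail. The names split.rc and split/ and the scratch directory are assumptions made for illustration, and the formail step is skipped when the tools or file.mbox are not present.

```shell
#!/bin/sh
# Sketch only: split.rc, split/ and the scratch directory are hypothetical.
workdir=$(mktemp -d)

# Minimal rcfile: one unconditional recipe delivering to the folder split/.
# The trailing slash selects Maildir format; "split/." would select MH
# format, and an existing directory named without either suffix would
# get mail directory format.
cat > "$workdir/split.rc" <<EOF
MAILDIR=$workdir

:0
split/
EOF

# formail -s splits the mbox and runs the given command once per message.
# In -m mode procmail expects the rcfile to be named on the command line.
if command -v formail >/dev/null 2>&1 &&
   command -v procmail >/dev/null 2>&1 &&
   [ -f file.mbox ]; then
    formail -s procmail -m "$workdir/split.rc" < file.mbox
    ls "$workdir/split/new"
fi
```

Each message then ends up as its own file under split/new, named by Procmail rather than numbered sequentially.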

Saving to one mailbox per recipient has a number of pesky corner cases. What if the message was sent to more than one of your local recipients? What if the recipient address is not visible in the headers? etc (the Procmail Mini-FAQ has a section about this, in the context of virtual hosting of a domain, which this is basically a variation of). But if we simply ignore these, you might be able to pull it off with something like

    :0  # whitespace before ] is a literal tab
    * ^TO_\/[^ @    ]+@(yourdomain\.example|example\.info)\>
    {
        # Trim domain part from captured MATCH
        :0
        * MATCH ?? ^\/[^@]+
        ./$MATCH/
    }

This will capture into $MATCH the first address which matches the regex, then perform another regex match on the captured string to capture just the part before the @ sign. This requires that the addresses you want to match are all in a set of specific domains (here, I used yourdomain.example and example.info; replace those with your actual domain names) and that capturing the first matching address is sufficient (so if a message was To: alice@yourdomain.example and Cc: bob@example.info, whichever one of those is closer to the top of the message will be picked out by this recipe, and the other one will be ignored).

In some more detail, the \/ special token causes Procmail to copy the text which matched the regex after this point into the internal variable MATCH. As this recipe demonstrates, you can then perform a regex match on that variable itself to extract a substring of it (or, in other words, discard part of the captured match).

The action ./$MATCH/ uses the captured string in MATCH as the name of the folder to save into. The leading ./ specifies the current directory (which is equal to the value of the Procmail variable MAILDIR) and the trailing / selects mail directory format.
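To make the two-step capture concrete, this shell sketch mimics what the nested recipe computes for one address. It is not Procmail code, and the function name `local_part` is made up for the illustration.

```shell
#!/bin/sh
# Mimic the inner recipe: keep everything before the first "@".
local_part() {
    printf '%s\n' "${1%%@*}"
}

# What the outer recipe would capture into MATCH for the example message.
match='alice@yourdomain.example'

# The folder the action ./$MATCH/ would refer to after the domain is trimmed.
folder="./$(local_part "$match")/"
echo "$folder"
```

For the example message from the text, the folder name comes out as ./alice/, relative to MAILDIR.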

If your expected recipients cannot be constrained to be in a specific set of domains or otherwise matched by a single regex, my recommendation would be to ask a new question with more limited scope, and enough details to actually identify what you want to accomplish.

I found a solution to a part of my problem.

It seems that there is no way to let procmail itself recognize the recipient without specifying it in a recipe, so I just obtained a list and created a huge recipe file.

But then I discovered that, to save individual mails and avoid huge mailboxes filled with a lot of mails, one can just write a recipe like:

    :0
    * ^To: recipient@mail.it
    /inbox/folder/recipient@mail.it/

Note the / at the end: this makes procmail create a folder structure instead of writing everything to a single file.
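The large recipe file mentioned above could be generated from the recipient list rather than written by hand. The sketch below assumes one address per line on standard input; the function name `gen_recipes` is hypothetical, only dots are escaped in the regex, and the condition uses `^To:.*` (an assumption, so that a display name before the address does not prevent a match).

```shell
#!/bin/sh
# Emit one procmail recipe per recipient address read from stdin.
gen_recipes() {
    while IFS= read -r addr; do
        # Skip empty lines in the list.
        [ -n "$addr" ] || continue
        # Escape dots so they match literally in the procmail regex.
        pattern=$(printf '%s\n' "$addr" | sed 's/\./\\./g')
        # Trailing slash on the folder selects one file per message.
        printf ':0\n* ^To:.*%s\n/inbox/folder/%s/\n\n' "$pattern" "$addr"
    done
}

# Demo with the address used in the recipe above.
printf 'recipient@mail.it\n' | gen_recipes
```

The output can be redirected to a file and pulled into .procmailrc; addresses containing other regex metacharacters would need additional escaping.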
