Decoding only a portion of a text email file for bash processing

I am scanning the /home/vmail/ subdirectories for received email text files and deleting any file that matches a given string. The optimized script below is thanks to this answer.

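my_new_del() {
    # List files containing "$1" (case-insensitive), NUL-separated so unusual filenames are safe.
    # -Z (grep) and -0 / -r (xargs) are GNU extensions.
    find /home/vmail -type f -name '*.some.file.pattern*' -exec grep -i -l -s -Z -e "$1" {} + |
    xargs -0 -r rm -f --
}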

It works like a charm and deletes the files that match the string I pass in. However, I just realized that some of the files have base64-encoded content. Here is one such spam email; its body is spam, but the file looks as follows:

Return-Path: <Bartybeve@aznetwork.net>
X-Original-To: info@my_domain.com
Delivered-To: info@my_domain.com
Received: by some.qdmn.com (Postfix, from userid 5000)
        id D47C87F8CB; Thu, 11 Oct 2018 04:21:11 -0400 (EDT)
X-Original-To: info@my_domain.com
Delivered-To: info@my_domain.com
Received: from vlan131-44.aznetwork.net (unknown [185.129.1.44])
        by some.qdmn.com (Postfix) with ESMTP id 1F1077F8C9
        for info@my_domain.com Thu, 11 Oct 2018 04:21:05 -0400 (EDT)
Received: from unknown (60.233.87.144)
        by mmx09.tilkbans.com with ESMTP; Thu, 11 Oct 2018 00:16:37 -0700
Received: from unknown (124.156.103.124)
        by mailout.endmonthnow.com with ASMTP; Thu, 11 Oct 2018 00:10:28 -0700
Message-ID: <7B6B9A4E.9D85F307@aznetwork.net>
Date: Thu, 11 Oct 2018 00:10:28 -0700
Reply-To: "Anja" <Bartybeve@aznetwork.net>
From: "Anja" <Bartybeve@aznetwork.net>
User-Agent: Opera/7.02 (Windows ME; U)
MIME-Version: 1.0
To: "Anja" <info@my_domain.com>
Subject: I could not resist and pass by!
Content-Type: text/html;
        charset="iso-8859-1"
Content-Transfer-Encoding: base64

PCFkb2N0eXBlIGh0bWw+DQo8aHRtbD4NCjxoZWFkPg0KPG1ldGEgY2hhcnNldD0idXRmLTgiPg0K
PC9oZWFkPg0KDQo8Ym9keT4NCjxwPjx0YWJsZSB3aWR0aD0iMTMlIiBib3JkZXI9IjAiPjx0Ym9k
eT48dHI+PHRkPjwvdGQ+PHRkPjwvdGQ+PHRkPjwvdGQ+PHRkPjwvdGQ+PHRkPjwvdGQ+PC90cj48
L3Rib2R5PjwvdGFibGU+PC9wPg0KPHA+V2FudCBtZT8gd2FubmEgZnVjayBtZT8gT2hoaGguLi4u
IG9rLCBjb21lIHRvIG1lICkpIEhlcmUgbXkgZm90byBhbmQgYWRkcmVzcywgZmluZCBtZSA6KSA8
L3A+DQo8cD48dGFibGUgd2lkdGg9IjcyJSIgYm9yZGVyPSIwIj48dGJvZHk+PHRyPjx0ZD48L3Rk
PjwvdHI+PC90Ym9keT48L3RhYmxlPjwvcD4NCjxhICAgaHJlZj0iaHR0cDovL2xvdmVmb3J5b3Uu
c3UiIHRhcmdldD0iX2JsYW5rIiBzdHlsZT0iZm9udC13ZWlnaHQ6IG5vcm1hbDtsZXR0ZXItc3Bh
Y2luZzogbm9ybWFsO2xpbmUtaGVpZ2h0OiAxMDAlO3RleHQtZGVjb3JhdGlvbjogbm9uZTtjb2xv
cjogIzc3NzsiPmh0dHA6Ly9sb3ZlZm9yeW91LnN1PC9hPg0KPHA+PHRhYmxlIHdpZHRoPSIyNyUi
IGJvcmRlcj0iMCI+PHRib2R5Pjx0cj48dGQ+PC90ZD48dGQ+PC90ZD48dGQ+PC90ZD48dGQ+PC90
ZD48L3RyPjwvdGJvZHk+PC90YWJsZT48L3A+DQo8YSBocmVmPSJodHRwOi8vbG92ZWZvcnlvdS5z
dSI+PGltZyBzcmM9Imh0dHBzOi8vNzgubWVkaWEudHVtYmxyLmNvbS83ZTU3ZjBlMDUzZWNlYjA2
MGQwZDMyMzQ3NmQxZWI3MS90dW1ibHJfb3kycmd4TkRFYzF3MmtqZGRvMV80MDAuZ2lmIiBhbHQ9
ImNsaWNrIGhlcmUgYW5kIHNlZSBteSBwaG90byIgYm9yZGVyPSIwIiA+PC9hPg0KPHA+PHRhYmxl
IHdpZHRoPSI3NiUiIGJvcmRlcj0iMCI+PHRib2R5Pjx0cj48dGQ+PC90ZD48dGQ+PC90ZD48dGQ+
PC90ZD48dGQ+PC90ZD48dGQ+PC90ZD48L3RyPjwvdGJvZHk+PC90YWJsZT48L3A+DQo8YSBocmVm
PSJodHRwOi8vbG92ZWZvcnlvdS5zdSI+dW5zdWJzY3JpYmU8L2E+DQo8cD48dWw+PC91bD48L3A+
DQo8L2JvZHk+DQo8L2h0bWw+DQo=

Therefore, when I search for files whose contents match a string using the bash function above, email files like this one are not flagged.

I know I can use echo 'some-base64-encoded-text' | base64 --decode to decode the message, and a web decoding tool does show me that the decoded text contains the spam content.
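
As a small illustration of why the plain grep misses such files, the sketch below takes the first 20 characters of the base64 body in the sample message and searches them before and after decoding. The variable names are made up for this example, and base64 --decode is assumed to be the GNU coreutils tool.

```shell
# First 20 characters of the base64 body in the sample message.
encoded='PCFkb2N0eXBlIGh0bWw+'

# Searching the raw text finds nothing: "doctype" is not spelled out in it.
# (grep -c exits non-zero on zero matches, hence the "|| true".)
raw_hits=$(printf '%s\n' "$encoded" | grep -c 'doctype' || true)

# After decoding, the same search succeeds.
decoded=$(printf '%s\n' "$encoded" | base64 --decode)
decoded_hits=$(printf '%s\n' "$decoded" | grep -c 'doctype' || true)

printf 'raw hits: %s\n' "$raw_hits"          # prints: raw hits: 0
printf 'decoded: %s\n' "$decoded"            # prints: decoded: <!doctype html>
printf 'decoded hits: %s\n' "$decoded_hits"  # prints: decoded hits: 1
```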

I was thinking of first grepping for a Content-Transfer-Encoding: base64 match, then finding the position of that string, decoding the message from there on, echoing it out, grepping the output for a match, and deleting the file if a match is found.

But is there a simple way to do it on the fly?

Here's some Perl. It requires MIME::Base64, which normally ships with Perl; if it is missing, run cpan install MIME::Base64.

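#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
use MIME::Base64;

# Paragraph mode: each read returns one blank-line-separated block.
$/ = "";
for my $file (@ARGV) {
    open my $fh, "<", $file;
    my @paragraphs = <$fh>;
    close $fh;
    # The first block is the header; everything after it is the body.
    my $header = shift(@paragraphs) // "";
    my $content;
    if ($header =~ /^Content-Transfer-Encoding:\s*base64/mi) {
        # A base64 body contains no blank lines, so it is a single block.
        $content = decode_base64($paragraphs[0] // "");
    }
    else {
        $content = join "\n\n", @paragraphs;
    }
    # The search pattern is passed in through the environment variable "pattern".
    if ($content =~ /$ENV{pattern}/) {
        print "delete: $file\n";
        ## unlink $file;   # uncomment to really delete the file
    }
}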

And then you can do:

find ... -exec env pattern="$1" perl email_scanner.pl {} +
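
For comparison, the idea from the question (check the header for base64, decode the body, then grep) can also be sketched in plain shell. This is only a rough sketch under several assumptions: the function names mail_body, mail_is_base64 and mail_body_matches are hypothetical, base64 --decode is the GNU coreutils tool, and only single-part messages like the sample are handled (no multipart, no quoted-printable). Like the Perl script, it searches the body only, not the headers.

```shell
# Sketch only: all three function names are hypothetical helpers.

# Print the body of mail file $1: everything after the first empty line.
# Carriage returns are stripped so CRLF files behave like LF files.
mail_body() {
    tr -d '\r' < "$1" | awk 'in_body { print } /^$/ { in_body = 1 }'
}

# Succeed if the header block of $1 declares a base64 transfer encoding.
mail_is_base64() {
    tr -d '\r' < "$1" | awk '/^$/ { exit } { print }' |
        grep -qi '^Content-Transfer-Encoding:[[:space:]]*base64'
}

# Succeed if the body of file $1, decoded when necessary, contains pattern $2
# (case-insensitive, like the original grep -i).
mail_body_matches() {
    if mail_is_base64 "$1"; then
        mail_body "$1" | base64 --decode 2>/dev/null | grep -qi -e "$2"
    else
        mail_body "$1" | grep -qi -e "$2"
    fi
}
```

A dry run on one file could then look like: mail_body_matches /path/to/mail 'some string' && echo "would delete". It starts several processes per file, so on a large mail store the Perl script is likely the faster option.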
