HTML email to plain text with MIME::Entity
I'm using a Perl script to convert HTML mails to plain text.
The current code (for multipart mails) looks like this:
    use MIME::Parser;
    use MIME::Entity;
    use HTML::TreeBuilder;
    use HTML::FormatText;

    my $parser = MIME::Parser->new;
    my $entity = $parser->parse(\*STDIN) or die "parse failed\n";
    for my $part ($entity->parts()) {
        if ($part->mime_type eq 'text/html') {
            my $bh = $part->bodyhandle;
            my $tree = HTML::TreeBuilder->new();
            $tree->utf8_mode(1);   # pass a true value; with no argument this only reads the flag
            $tree->parse($bh->as_string);
            $tree->eof;            # tell the parser the document is complete
            my $formatter = HTML::FormatText->new(leftmargin => 0, rightmargin => 72);
            my $txt = $formatter->format($tree);
            my $txtEntity = MIME::Entity->build(
                Data     => $txt,
                Type     => "text/plain",
                Encoding => "8bit",
            );
            # Inserts the new part at index 0; the HTML part is still there
            $entity->add_part($txtEntity, 0);
        }
    }
    $entity->print(\*STDOUT);
It works, but it just adds the plain text part to the existing parts and doesn't replace the HTML part.
So I came up with this:
    my $head = $entity->head;
    my $txtEntity = MIME::Entity->build(
        Data     => $txt,
        Type     => "text/plain",
        Encoding => "8bit",
        # Only these four headers are carried over from the original message
        From     => $head->get('From', 0),
        To       => $head->get('To', 0),
        Subject  => $head->get('Subject', 0),
        Cc       => $head->get('Cc', 0),
    );
    $txtEntity->print(\*STDOUT);
But that loses every header that isn't copied explicitly. Is there a function to replace the HTML body completely with the plain text one?
Thanks!
If you don't have a way to replace the body instead of adding a new part, this might be a job for the formail utility (part of procmail), which can generate a new email with the headers of the old one while replacing the headers you want to change (such as the encoding and content-type headers).
Also, you might just try changing the content type of the HTML part to text/plain. You will still see the HTML code, but it will not render, and you will also see your text/plain addition, though I grant this is a poor solution.
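For what it's worth, here is a rough sketch of one way to replace the part in place with MIME::Entity itself, rather than adding a new one. It is not tested against real-world mail. The idea is to keep the original entity (and therefore all of its headers), swap its body handle for the converted text, and rewrite only the Content-Type and Content-Transfer-Encoding headers. The helper names html_to_text and replace_html_parts are made up for this example, the sample message is invented, and the code assumes the HTML is UTF-8.

```perl
use strict;
use warnings;
use MIME::Parser;
use MIME::Body;
use HTML::TreeBuilder;
use HTML::FormatText;

# Hypothetical helper: HTML string in, wrapped plain text out.
sub html_to_text {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new;
    $tree->utf8_mode(1);          # assumption: the HTML is UTF-8 encoded
    $tree->parse($html);
    $tree->eof;                   # signal end of document
    my $text = HTML::FormatText->new(leftmargin => 0, rightmargin => 72)
                               ->format($tree);
    $tree->delete;                # free the parse tree
    return $text;
}

# Hypothetical helper: walk the entity and convert every text/html
# part in place, so all other headers and parts stay untouched.
sub replace_html_parts {
    my ($entity) = @_;

    if ($entity->is_multipart) {
        replace_html_parts($_) for $entity->parts;
        return;
    }
    return unless $entity->mime_type eq 'text/html';
    my $bh = $entity->bodyhandle or return;

    # Swap the body and update only the two headers that describe it.
    $entity->bodyhandle(MIME::Body::InCore->new(html_to_text($bh->as_string)));
    my $head = $entity->head;
    $head->replace('Content-Type', 'text/plain; charset=UTF-8');
    $head->replace('Content-Transfer-Encoding', '8bit');
    return;
}

# Invented sample message, only here so the sketch runs on its own.
my $raw = <<'EOF';
From: alice@example.com
To: bob@example.com
Subject: Hello
X-Custom: kept
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XX"

--XX
Content-Type: text/html; charset=UTF-8

<html><body><p>Hello <b>world</b></p></body></html>
--XX
Content-Type: text/plain

attachment text
--XX--
EOF

my $parser = MIME::Parser->new;
$parser->output_to_core(1);       # keep bodies in memory, no temp files
my $entity = $parser->parse_data($raw) or die "parse failed\n";
replace_html_parts($entity);
$entity->print(\*STDOUT);
```

To use it as a filter like the script in the question, swap parse_data($raw) for parse(\*STDIN). Because the same function handles a non-multipart entity, it should also cover single-part HTML mails. One caveat: in a multipart/alternative message that already has a text/plain sibling, you would end up with two plain text parts, so there you may prefer to drop the HTML part by passing a filtered array ref to $entity->parts.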