PHP imap: how to decode and convert Windows-1252 charset emails?

My PHP app processes incoming emails. The processing code usually works fine, but the app recently crashed with the following exception:

Unexpected encoding - UTF-8 or ASCII was expected (View: /home/customer/www/gonativeguide.com/gng2-core/vendor/laravel/framework/src/Illuminate/Mail/resources/views/html/panel.blade.php) {"exception":"[object] (Facade\\Ignition\\Exceptions\\ViewException(code: 0): Unexpected encoding - UTF-8 or ASCII was expected (View: /home/customer/www/gonativeguide.com/gng2-core/vendor/laravel/framework/src/Illuminate/Mail/resources/views/html/panel.blade.php) at /home/customer/www/gonativeguide.com/gng2-core/vendor/league/commonmark/src/Input/MarkdownInput.php:30)

It seems that an incoming email's text was not decoded properly, which made the app crash later on.

I realized that the email was encoded as Windows-1252:

Content-Type: text/html; charset="Windows-1252"
Content-Transfer-Encoding: quoted-printable

The email decoding code currently looks like this:

// DECODE DATA
$data = ($partno)
    ? imap_fetchbody($mbox, $mid, $partno)  // multipart
    : imap_body($mbox, $mid);               // simple
// Any part may be encoded, even plain text messages, so check everything.
if ($p->encoding == 4)
    $data = quoted_printable_decode($data);
elseif ($p->encoding == 3)
    $data = base64_decode($data);

I checked this page to understand what I need to change to decode Windows-1252 emails, but it is not clear to me which value corresponds to Windows-1252, or how to decode the data and convert it to UTF-8. I would highly appreciate any hints, preferably with suggested code.

Thanks, W.

In your case, this line:

$data = quoted_printable_decode($data);

needs to be adapted like this:

$data = mb_convert_encoding(quoted_printable_decode($data), 'UTF-8', 'Windows-1252');
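
To illustrate why the extra conversion matters, here is a minimal sketch using a made-up sample string (not taken from the actual email). quoted_printable_decode() only reverses the transfer encoding, so its output is still Windows-1252 bytes, which are not valid UTF-8 until they are converted. It assumes the mbstring extension is loaded.

```php
<?php
// Illustrative sample only: "café" with é sent as the Windows-1252 byte 0xE9.
$raw = 'caf=E9';

// Step 1: undo the transfer encoding. The bytes are still Windows-1252.
$bytes = quoted_printable_decode($raw);
var_dump(mb_check_encoding($bytes, 'UTF-8'));   // bool(false)

// Step 2: convert the character set to UTF-8.
$utf8 = mb_convert_encoding($bytes, 'UTF-8', 'Windows-1252');
var_dump(mb_check_encoding($utf8, 'UTF-8'));    // bool(true)
```

This also explains why the values 3 and 4 in $p->encoding have no counterpart for Windows-1252: they describe the transfer encoding (base64, quoted-printable), not the character set.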

More generally, to cope with non-UTF-8 encodings, you may want to extract the charset of the body part and pass it to the conversion instead of hard-coding it.
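
One way this could look, as a minimal sketch rather than a definitive implementation: the helper names partCharset() and decodePartToUtf8() are hypothetical, $part stands for a part object as returned by imap_fetchstructure() (like $p in the question), and the fallbacks to US-ASCII and Windows-1252 are assumptions you may want to change. It assumes PHP 8 with the mbstring extension.

```php
<?php
// Hypothetical helper: return the charset declared on a body part.
// $part is a part object as returned by imap_fetchstructure().
function partCharset(object $part, string $default = 'US-ASCII'): string
{
    if (!empty($part->ifparameters)) {
        foreach ($part->parameters as $param) {
            // Attribute names are case-insensitive ("charset", "CHARSET", ...).
            if (strcasecmp($param->attribute, 'charset') === 0) {
                return $param->value;
            }
        }
    }
    return $default; // assumption: no declared charset means plain ASCII
}

// Hypothetical helper: undo the transfer encoding, then convert to UTF-8.
function decodePartToUtf8(string $data, object $part): string
{
    if ($part->encoding == 4) {          // quoted-printable
        $data = quoted_printable_decode($data);
    } elseif ($part->encoding == 3) {    // base64
        $data = base64_decode($data);
    }

    try {
        return mb_convert_encoding($data, 'UTF-8', partCharset($part));
    } catch (\ValueError $e) {
        // PHP 8 throws on a charset label mbstring does not know.
        // Assumption: treat such text as Windows-1252 as a last resort.
        return mb_convert_encoding($data, 'UTF-8', 'Windows-1252');
    }
}
```

In the code from the question, the two encoding checks would then collapse into a single call such as $data = decodePartToUtf8($data, $p);.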
