PHP imap: how to decode and convert Windows-1252 charset emails?
My PHP app processes incoming emails. The processing code usually works fine, but the app crashed recently with the following exception:
Unexpected encoding - UTF-8 or ASCII was expected (View: /home/customer/www/gonativeguide.com/gng2-core/vendor/laravel/framework/src/Illuminate/Mail/resources/views/html/panel.blade.php) {"exception":"[object] (Facade\\Ignition\\Exceptions\\ViewException(code: 0): Unexpected encoding - UTF-8 or ASCII was expected (View: /home/customer/www/gonativeguide.com/gng2-core/vendor/laravel/framework/src/Illuminate/Mail/resources/views/html/panel.blade.php) at /home/customer/www/gonativeguide.com/gng2-core/vendor/league/commonmark/src/Input/MarkdownInput.php:30)
It seems that the text of an incoming email was not properly decoded, and this made the app crash later on.
I realized that the email had a Windows-1252 encoding:
Content-Type: text/html; charset="Windows-1252"
Content-Transfer-Encoding: quoted-printable
The email decoding code currently looks like this:
// DECODE DATA
$data = ($partno)
    ? imap_fetchbody($mbox, $mid, $partno)  // multipart
    : imap_body($mbox, $mid);               // simple
// Any part may be encoded, even plain text messages, so check everything.
if ($p->encoding == 4)
    $data = quoted_printable_decode($data);
elseif ($p->encoding == 3)
    $data = base64_decode($data);
I checked this page to understand what I need to change to decode emails with Windows-1252, but it is not clear to me which value corresponds to Windows-1252 and how to decode and convert the data to UTF-8. I would highly appreciate any hints, preferably with suggested code.
Thanks, W.
In your case, this line:
$data = quoted_printable_decode($data);
needs to be adapted like this:
$data = mb_convert_encoding(quoted_printable_decode($data), 'UTF-8', 'Windows-1252');
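Note that $p->encoding only describes the transfer encoding (4 is quoted-printable, 3 is base64); the charset is a separate property, so the same conversion applies to base64 parts too. A minimal sketch of both steps together, assuming the charset name is one that mbstring recognizes; the helper name decodePartToUtf8 and its signature are hypothetical, not part of the original code:

```php
<?php
// Hypothetical helper: undo the transfer encoding, then convert the charset to UTF-8.
// Assumes $charset is a name mbstring knows (e.g. 'Windows-1252', 'ISO-8859-1').
function decodePartToUtf8(string $raw, int $encoding, string $charset = 'UTF-8'): string
{
    // Step 1: transfer encoding, using the values reported in $p->encoding.
    if ($encoding === 4) {            // quoted-printable (ENCQUOTEDPRINTABLE)
        $raw = quoted_printable_decode($raw);
    } elseif ($encoding === 3) {      // base64 (ENCBASE64)
        $raw = (string) base64_decode($raw);
    }

    // Step 2: character set. UTF-8 and US-ASCII need no conversion.
    if (strcasecmp($charset, 'UTF-8') !== 0 && strcasecmp($charset, 'US-ASCII') !== 0) {
        $raw = mb_convert_encoding($raw, 'UTF-8', $charset);
    }

    return $raw;
}
```

In the question's code this would be called roughly as decodePartToUtf8($data, $p->encoding, 'Windows-1252') in place of the if/elseif block.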
More generally, to cope with non-UTF-8 encodings, you may want to extract the charset of the body part, either from the body part structure, returned by imap_bodystruct(), or from the body part MIME headers, returned by imap_fetchmime().
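As a rough sketch of the first option: the structure object returned by imap_bodystruct() carries the Content-Type parameters as a list of attribute/value pairs, guarded by an ifparameters flag. The helper name partCharset and the US-ASCII fallback are assumptions made for illustration:

```php
<?php
// Hypothetical helper: read the declared charset from a body part structure.
// $part is assumed to have the shape returned by imap_bodystruct() or imap_fetchstructure().
function partCharset(object $part, string $default = 'US-ASCII'): string
{
    // ifparameters is truthy only when the parameters array is populated.
    if (!empty($part->ifparameters)) {
        foreach ($part->parameters as $param) {
            // Attribute names may arrive in any case (e.g. "CHARSET").
            if (strcasecmp($param->attribute, 'charset') === 0) {
                return $param->value;
            }
        }
    }
    return $default; // no charset declared on this part
}
```

With a live connection, the structure would come from imap_bodystruct($mbox, $mid, $partno), and the returned name can be passed as the third argument of mb_convert_encoding().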