How to really decode a 7Bit email message using PHP?
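I have code that reads emails from a server and then parses them before inserting the data into a database.
I am using the IMAP extension in PHP to help me with this.
Here is what I am doing to read the data: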
// read new messages
private function _getNewMessages(){
    // Checks the inbox
    if ($messages = imap_search($this->conn, 'ALL'))
    {
        // Sorts the message numbers in ascending order (oldest first)
        sort($messages);
        // Loops through the messages
        foreach ($messages as $id)
        {
            // Grabs the structure, headers and first body part
            //$overview = imap_fetch_overview($this->conn, $id, 0);
            $struct = imap_fetchstructure($this->conn, $id, 0);
            $header = imap_headerinfo($this->conn, $id);
            $message = imap_fetchbody($this->conn, $id, 1);
            // decode the message
            if (isset($struct->encoding)) {
                $message = $this->_decodeMessage($message, $struct->encoding);
            }
            $from = $header->from[0]->mailbox . '@' . $header->from[0]->host;
            $subject = $header->subject;
            echo $message;
        }
    }
    else
    {
        exit('No messages to process');
    }
}
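The problem I am seeing is that when a message is encoded with value 0 ("7bit"), it comes back blank. The decoding does not seem to decode the message correctly.
I am using this function to decode 7bit: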
// function to decode 7BIT encoded message
private function _decode7Bit($text) {
    // If there are no spaces on the first line, assume that the body is
    // actually base64-encoded, and decode it.
    $lines = explode("\r\n", $text);
    $first_line_words = explode(' ', $lines[0]);
    if ($first_line_words[0] == $lines[0]) {
        $text = base64_decode($text);
    }
    // Manually convert common encoded characters into their UTF-8 equivalents.
    $characters = array(
        '=20' => ' ', // space.
        '=E2=80=99' => "'", // single quote.
        '=0A' => "\r\n", // line break.
        '=A0' => ' ', // non-breaking space.
        '=C2=A0' => ' ', // non-breaking space.
        "=\r\n" => '', // joined line.
        '=E2=80=A6' => '…', // ellipsis.
        '=E2=80=A2' => '•', // bullet.
    );
    // Loop through the encoded characters and replace any that are found.
    foreach ($characters as $key => $value) {
        $text = str_replace($key, $value, $text);
    }
    return $text;
}
I have also tried this method to decode the message:
/**
 * decoding 7bit strings to ASCII
 * @param string $text
 * @return string
 */
function decode7bit($text){
    $ret = '';
    $data = str_split(pack('H*', $text));
    $mask = 0xFF;
    $shift = 0;
    $carry = 0;
    foreach ($data as $char) {
        if ($shift == 7) {
            $ret .= chr($carry);
            $carry = 0;
            $shift = 0;
        }
        $a = ($mask >> ($shift+1)) & 0xFF;
        $b = $a ^ 0xFF;
        $digit = ($carry) | ((ord($char) & $a) << ($shift)) & 0xFF;
        $carry = (ord($char) & $b) >> (7-$shift);
        $ret .= chr($digit);
        $shift++;
    }
    if ($carry) $ret .= chr($carry);
    return $ret;
}
But the message comes back blank.
What am I doing wrong here? What can I do to make sure the message is decoded correctly?
Thanks
I solved the problem. The way I was checking whether the message was base64-encoded was not good, so I changed the function to this:
private function _isEncodedBase64($data){
    // True only if decoding and re-encoding reproduces the input exactly.
    if (base64_encode(base64_decode($data)) === $data) {
        return true;
    }
    return false;
}
// function to decode 7BIT encoded message
private function _decode7Bit($text) {
    // If the body round-trips through base64 unchanged, assume that it is
    // actually base64-encoded, and decode it.
    if ($this->_isEncodedBase64($text)) {
        $text = base64_decode($text);
    }
    // Manually convert common encoded characters into their UTF-8 equivalents.
    $characters = array(
        '=20' => ' ', // space.
        '=E2=80=99' => "'", // single quote.
        '=0A' => "\r\n", // line break.
        '=A0' => ' ', // non-breaking space.
        '=C2=A0' => ' ', // non-breaking space.
        "=\r\n" => '', // joined line.
        '=E2=80=A6' => '…', // ellipsis.
        '=E2=80=A2' => '•', // bullet.
    );
    // Loop through the encoded characters and replace any that are found.
    foreach ($characters as $key => $value) {
        $text = str_replace($key, $value, $text);
    }
    return $text;
}
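One caveat with a plain round-trip check: base64 bodies in mail are normally wrapped across several lines, and the line breaks do not survive decode and re-encode, so a wrapped body would fail the comparison. Below is a minimal sketch of a more tolerant check. The function name looksLikeBase64 is hypothetical and not part of the class above.

```php
<?php
// Hypothetical helper: guesses whether a body is base64-encoded.
function looksLikeBase64(string $data): bool
{
    // Mail bodies are wrapped across lines, so drop all whitespace first.
    $compact = preg_replace('/\s+/', '', $data);
    // Padded base64 always has a length that is a multiple of 4.
    if ($compact === '' || strlen($compact) % 4 !== 0) {
        return false;
    }
    // Strict mode returns false on characters outside the base64 alphabet.
    $decoded = base64_decode($compact, true);
    return $decoded !== false && base64_encode($decoded) === $compact;
}
```

This is still only a guess: a short plain-text word such as "test" is also valid base64, so the encoding reported by the message structure is a more reliable signal than inspecting the body.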
Judging by the character sequences you describe, it looks like your text is quoted-printable encoded.
Try decoding it with:
echo quoted_printable_decode($text);
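As a small illustration, here is that call applied to a made-up sample string (an assumption, not taken from a real message) that contains some of the sequences from the replacement table in the question, including a soft line break:

```php
<?php
// Made-up sample: an encoded right single quote, a soft line break ("=" at
// the end of a line), and an encoded bullet.
$text = "It=E2=80=99s a soft=\r\n line break and a bullet =E2=80=A2";

// One call handles every =XX sequence and removes the soft line break.
$decoded = quoted_printable_decode($text);

echo $decoded, PHP_EOL;
```

Unlike the manual table, this yields the real UTF-8 characters (for example the typographic quote rather than an ASCII apostrophe) and covers every byte value, not only the handful listed.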
That code is unnecessary. Just use quoted_printable_decode in PHP.
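To put that in context, here is a rough sketch of choosing the decoder from the encoding number that imap_fetchstructure reports (0 = 7bit, 1 = 8bit, 2 = binary, 3 = base64, 4 = quoted-printable, 5 = other). The question calls a _decodeMessage method without showing it, so decodeBody and encodingForFirstPart below are hypothetical stand-ins, written as plain functions so they run without the IMAP extension. A body marked 7bit, 8bit or binary has had no transfer encoding applied, so it is returned unchanged.

```php
<?php
// Hypothetical stand-in for the _decodeMessage() method the question calls.
function decodeBody(string $body, int $encoding): string
{
    switch ($encoding) {
        case 3: // base64
            $decoded = base64_decode($body);
            return $decoded === false ? $body : $decoded;
        case 4: // quoted-printable
            return quoted_printable_decode($body);
        default: // 0 = 7bit, 1 = 8bit, 2 = binary, 5 = other: nothing to undo
            return $body;
    }
}

// Hypothetical helper: for a multipart message the encoding of body section 1
// is stored on the first part, not on the top-level structure.
function encodingForFirstPart(object $struct): int
{
    if (isset($struct->parts[0]->encoding)) {
        return (int) $struct->parts[0]->encoding;
    }
    return isset($struct->encoding) ? (int) $struct->encoding : 0;
}
```

With these, the loop in the question would call something like decodeBody($message, encodingForFirstPart($struct)). If the top-level structure of a multipart message says 7bit while its first part is actually quoted-printable or base64, reading only $struct->encoding would pick the wrong decoder, which may be what is happening here.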