C# email subject parsing

Question

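I am building a system that reads email in C#. I am having trouble parsing the subject, and I think the problem is related to encoding.

The subject I am reading looks like this: =?ISO-8859-1?Q?=E6=F8sd=E5f=F8sdf_sdfsdf?= and the original subject that was sent is æøsdåføsdf sdfsdf (it contains Norwegian characters).

Any ideas on how I can change the encoding or parse this correctly? So far I have tried using C# encoding conversion techniques to convert the subject to UTF-8, but without any luck.

Here is one of the solutions I tried: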

Encoding iso = Encoding.GetEncoding("iso-8859-1");
Encoding utf = Encoding.UTF8;
string decodedSubject =
    utf.GetString(Encoding.Convert(utf, iso,
                                   iso.GetBytes(m.Subject.Split('?')[3])));

Answer 1

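The encoding is called quoted-printable. More precisely, the subject is a MIME encoded-word (RFC 2047) that uses the "Q" encoding, which is closely modelled on quoted-printable.

See the answers to this question.

Adapted from the accepted answer: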

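// Requires: using System.Net.Mail;
public string DecodeQuotedPrintable(string value)
{
    // Pass the encoded value as the attachment name and read the name back.
    Attachment attachment = Attachment.CreateAttachmentFromString("", value);
    return attachment.Name;
}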

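When passed the string =?ISO-8859-1?Q?=E6=F8sd=E5f=F8sdf_sdfsdf?= this returns "æøsdåføsdf_sdfsdf". Note that in the Q encoding an underscore stands for a space (RFC 2047), so the underscore in this result may still need to be replaced to recover the original subject.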
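
To make it clearer what the encoded string actually contains, here is a minimal sketch that decodes a single Q-encoded word by hand. It assumes the input is exactly one well-formed encoded-word of the form =?charset?Q?text?= and nothing else. The names QDecodeSketch and DecodeQWord are made up for illustration and are not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QDecodeSketch
{
    // Hypothetical helper: decodes exactly one "=?charset?Q?text?=" word.
    public static string DecodeQWord(string word)
    {
        // Splitting on '?' yields: "=", charset, encoding, text, "="
        string[] parts = word.Split('?');
        if (parts.Length != 5 || parts[0] != "=" || parts[4] != "=" ||
            !parts[2].Equals("Q", StringComparison.OrdinalIgnoreCase))
        {
            throw new FormatException("Not a single Q-encoded word.");
        }

        string text = parts[3];
        var bytes = new List<byte>();
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] == '=' && i + 2 < text.Length &&
                Uri.IsHexDigit(text[i + 1]) && Uri.IsHexDigit(text[i + 2]))
            {
                // "=XX" is one raw byte written as two hex digits.
                bytes.Add(Convert.ToByte(text.Substring(i + 1, 2), 16));
                i += 2;
            }
            else if (text[i] == '_')
            {
                // In the Q encoding an underscore stands for a space.
                bytes.Add((byte)' ');
            }
            else
            {
                // Any other character is plain ASCII and is copied as-is.
                bytes.Add((byte)text[i]);
            }
        }

        // Interpret the collected bytes in the charset named by the word.
        return Encoding.GetEncoding(parts[1]).GetString(bytes.ToArray());
    }
}
```

Calling QDecodeSketch.DecodeQWord("=?ISO-8859-1?Q?=E6=F8sd=E5f=F8sdf_sdfsdf?=") returns "æøsdåføsdf sdfsdf". The bytes are collected first and converted in one step so that a multi-byte charset such as UTF-8 would also decode correctly.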

Answer 2

    // Requires: using System; using System.Text; using System.Text.RegularExpressions;
    public static string DecodeEncodedWordValue(string mimeString)
    {
        var regex = new Regex(@"=\?(?<charset>.*?)\?(?<encoding>[qQbB])\?(?<value>.*?)\?=");
        var encodedString = mimeString;
        var decodedString = string.Empty;

        while (encodedString.Length > 0)
        {
            var match = regex.Match(encodedString);
            if (match.Success)
            {
                // If the match isn't at the start of the string, copy the initial few chars to the output
                decodedString += encodedString.Substring(0, match.Index);

                var charset = match.Groups["charset"].Value;
                var encoding = match.Groups["encoding"].Value.ToUpper();
                var value = match.Groups["value"].Value;

                if (encoding.Equals("B"))
                {
                    // Encoded value is Base-64
                    var bytes = Convert.FromBase64String(value);
                    decodedString += Encoding.GetEncoding(charset).GetString(bytes);
                }
                else if (encoding.Equals("Q"))
                {
                    // Encoded value is Q-encoded (RFC 2047): '_' stands for a space and
                    // =XX is one byte in hexadecimal. Collect all bytes before decoding so
                    // that multi-byte charsets such as UTF-8 are handled correctly, and so
                    // that underscores outside the encoded word are left untouched.
                    var bytes = new System.Collections.Generic.List<byte>();
                    for (var i = 0; i < value.Length; i++)
                    {
                        if (value[i] == '=' && i + 2 < value.Length
                            && Uri.IsHexDigit(value[i + 1]) && Uri.IsHexDigit(value[i + 2]))
                        {
                            bytes.Add(Convert.ToByte(value.Substring(i + 1, 2), 16));
                            i += 2;
                        }
                        else
                        {
                            bytes.Add(value[i] == '_' ? (byte)' ' : (byte)value[i]);
                        }
                    }
                    decodedString += Encoding.GetEncoding(charset).GetString(bytes.ToArray());
                }
                else
                {
                    // Encoded value not known, return original string
                    // (Match should not be successful in this case, so this code may never get hit)
                    decodedString += encodedString;
                    break;
                }

                // Trim off up to and including the match, then we'll loop and try matching again.
                encodedString = encodedString.Substring(match.Index + match.Length);
            }
            else
            {
                // No match, not encoded, return original string
                decodedString += encodedString;
                break;
            }
        }
        return decodedString;
    }
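
The "B" branch in the function above covers subjects whose payload is Base64 rather than Q-encoded. As a rough illustration of that path on its own, the sketch below builds a B-encoded word at run time and decodes it again. It assumes a single well-formed encoded-word and ignores the line-length limit that RFC 2047 places on encoded-words. BWordSketch, EncodeBWord and DecodeBWord are hypothetical names used only for this example.

```csharp
using System;
using System.Text;

public static class BWordSketch
{
    // Hypothetical helper: wraps a subject as one "=?UTF-8?B?...?=" word.
    public static string EncodeBWord(string subject)
    {
        byte[] raw = Encoding.UTF8.GetBytes(subject);
        return "=?UTF-8?B?" + Convert.ToBase64String(raw) + "?=";
    }

    // Hypothetical helper: decodes exactly one "=?charset?B?text?=" word.
    public static string DecodeBWord(string word)
    {
        // Base64 never contains '?', so splitting on it is safe here.
        string[] parts = word.Split('?');
        if (parts.Length != 5 || parts[0] != "=" || parts[4] != "=" ||
            !parts[2].Equals("B", StringComparison.OrdinalIgnoreCase))
        {
            throw new FormatException("Not a single B-encoded word.");
        }

        // Base64 text -> raw bytes -> string in the declared charset.
        byte[] bytes = Convert.FromBase64String(parts[3]);
        return Encoding.GetEncoding(parts[1]).GetString(bytes);
    }
}
```

For example, BWordSketch.DecodeBWord(BWordSketch.EncodeBWord("æøsdåføsdf sdfsdf")) gives back the original subject.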

Answer 1: 6, accepted, 2010-11-05 14:50:23

Answer 2: 6, 2010-11-08 13:14:29
