
Character encoding problem with HttpClient.PostAsync

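We have a legacy web application that works when operated manually from a browser. When I try to consume the same web application from code with an HTTP POST, I get ? for some Turkish characters.

I have the following code to make the HTTP POST: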

var httpClient = new HttpClient(); //static readonly in real code

var content = new StringContent("id_6=some text with Turkish characters öçşığüÖÇŞİĞÜ", Encoding.GetEncoding("ISO-8859-9"), "application/x-www-form-urlencoded");
var response = httpClient.PostAsync(url, content).Result; //I know this is not a good way, I'll focus on it later
var responseInString = response.Content.ReadAsStringAsync().Result;
File.WriteAllText("c:\\temp\\a.htm", responseInString);

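The web application returns HTML containing some input values, including the value my code posted. Turkish characters are broken in the form value my code posted and in the form values calculated from it, while the hardcoded submit button, which also contains Turkish characters, looks fine.

The web application returns this HTML (truncated for simplicity) to my code: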

<!-- BELOW IS THE HARDCODED FORM FIELD WITH TURKISH CHARS OK! DISPLAYED AS: Programı Çağır -->
<input type="submit" value="Program&#305; &Ccedil;a&#287;&#305;r" name="j_id_jsp_262293626_16"/>

<!-- IRRELEVANT HTML REMOVED -->

<!-- BELOW IS THE OUTPUT FORM FIELD WITH CHAR ş BAD! DISPLAYED AS: some text with Turkish characters öç???üÖÇ???Ü -->
<input type="text" value="some text with Turkish characters &ouml;&ccedil;???&uuml;&Ouml;&Ccedil;???&Uuml;" id="id_2" name="id_2"/>

<!-- BELOW IS THE INPUT FORM FIELD WITH CHAR ş BAD! -->
<input type="text" value="some text with Turkish characters &ouml;&ccedil;???&uuml;&Ouml;&Ccedil;???&Uuml;" id="id_6" name="id_6" />

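The response headers look fine: [image: content headers from the debugger]

What is the problem?

Edit: similar code that posts to a sample form works fine: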

    static readonly HttpClient httpClient = new HttpClient();

    [TestMethod]
    public void TestHttpClientForTurkish()
    {
        var data = new Dictionary<string, string>()
        {
            {"fname", "öçşığü" },
            {"lname", "ÖÇŞİĞÜ" }
        };

        var content = new FormUrlEncodedContent(data);
        var response = httpClient.PostAsync("https://www.w3schools.com/action_page.php", content).Result;

        var responseInString = response.Content.ReadAsStringAsync().Result;
        Assert.IsTrue(responseInString.Contains("öçşığü") && responseInString.Contains("ÖÇŞİĞÜ"));
    }
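
A note on the ??? pattern in the returned HTML above: the six characters that come back as ? (ş, ı, ğ, Ş, İ, Ğ) are exactly the ones that ISO-8859-9 has and ISO-8859-1 lacks, while ö, ç, ü, Ö, Ç, Ü sit at the same byte values in both. That suggests, though nothing in the post confirms it, that the text passes through an ISO-8859-1 conversion somewhere between the client and the rendered page. The sketch below only reproduces the symptom locally; the class and method names are made up for illustration.

```csharp
using System;
using System.Text;

public static class Latin1RoundTripDemo
{
    // Encodes text to ISO-8859-1 with an explicit "?" replacement fallback,
    // then decodes it again. Characters missing from ISO-8859-1 become "?".
    public static string RoundTripThroughLatin1(string text)
    {
        var latin1 = Encoding.GetEncoding(
            "ISO-8859-1",
            new EncoderReplacementFallback("?"),
            new DecoderReplacementFallback("?"));
        byte[] bytes = latin1.GetBytes(text);
        return latin1.GetString(bytes);
    }

    public static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        // Same test string as in the question.
        Console.WriteLine(RoundTripThroughLatin1("öçşığüÖÇŞİĞÜ"));
    }
}
```

If the printed string shows the same pattern as the response (öç???üÖÇ???Ü), that supports the ISO-8859-1 reading, but it does not say which side performs the conversion.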

Try the code below:

    public static async Task SendRequestAsync()
    {
        var data = new Dictionary<string, byte[]>();
        var key1 = "fname";
        var val1 = Encoding.Unicode.GetBytes("öçşığü");
        data.Add(key1, val1);
        var key2 = "lname";
        var val2 = Encoding.Unicode.GetBytes("ÖÇŞİĞÜ");
        data.Add(key2, val2);

        MemoryStream fs = new MemoryStream();
        BinaryFormatter formatter = new BinaryFormatter();
        formatter.Serialize(fs, data);
        var barr = fs.ToArray();

        var client = new HttpClient { BaseAddress = new Uri("http://www.yourservicelocation.com") };
        client.DefaultRequestHeaders.Accept.Clear();
        client.DefaultRequestHeaders.Accept.Add(
            new MediaTypeWithQualityHeaderValue("application/bson"));

        var byteArrayContent = new ByteArrayContent(barr);
        byteArrayContent.Headers.ContentType = new MediaTypeHeaderValue("application/bson");

        var result = await client.PostAsync("api/SomeData/Incoming", byteArrayContent);
        result.EnsureSuccessStatusCode();
    }

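My findings:

  1. The FormUrlEncodedContent class has no Encoding parameter (so it did not work for the Turkish characters with this server), so I had to use StringContent.
  2. I had to encode the form values with HttpUtility.UrlEncode, using ISO-8859-9 as the encoding.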
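
To make the difference between the two findings concrete, here is a small sketch that prints what each approach produces for a single field. FormUrlEncodedContent percent-encodes the UTF-8 bytes of the value, whereas HttpUtility.UrlEncode with an explicit encoding percent-encodes the bytes of that encoding. The class and method names are hypothetical. The RegisterProvider call is an assumption for .NET Core / .NET 5+, where ISO-8859-9 is not available by default; on .NET Framework it can be dropped.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text;
using System.Web;

public static class FormEncodingComparison
{
    // What FormUrlEncodedContent produces for a single field (always UTF-8 based).
    public static string EncodeWithFormUrlEncodedContent(string name, string value)
    {
        using (var content = new FormUrlEncodedContent(
            new Dictionary<string, string> { { name, value } }))
        {
            // The content is already in memory, so blocking here is only for brevity.
            return content.ReadAsStringAsync().GetAwaiter().GetResult();
        }
    }

    // The same field, percent-encoded from its ISO-8859-9 bytes instead.
    public static string EncodeWithIso88599(string name, string value)
    {
        // Assumption: needed on .NET Core / .NET 5+ to make ISO-8859-9 available.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        var iso = Encoding.GetEncoding("ISO-8859-9");
        return HttpUtility.UrlEncode(name, iso) + "=" + HttpUtility.UrlEncode(value, iso);
    }

    public static void Main()
    {
        // ş is U+015F: bytes C5 9F in UTF-8, but the single byte FE in ISO-8859-9.
        Console.WriteLine(EncodeWithFormUrlEncodedContent("id_6", "ş"));
        Console.WriteLine(EncodeWithIso88599("id_6", "ş"));
    }
}
```

A server that decodes the form body as ISO-8859-9 would read the two-byte UTF-8 sequence as two unrelated characters, which is consistent with FormUrlEncodedContent working against the sample form but not against this application.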

Here is the final code, which has no problems with Turkish characters in the form fields:

var httpClient = new HttpClient(); //static readonly in real code
var iso = Encoding.GetEncoding("ISO-8859-9");

var content = new StringContent("id_6="+
    HttpUtility.UrlEncode("some text with Turkish characters öçşığüÖÇŞİĞÜ", iso), iso, 
    "application/x-www-form-urlencoded");
var response = httpClient.PostAsync(url, content).Result;//Using Result because I don't have a UI thread or the context is not ASP.NET
var responseInString = response.Content.ReadAsStringAsync().Result;
File.WriteAllText("c:\\temp\\a.htm", responseInString);
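
If more than one field has to be posted, the same approach generalizes. The following is a minimal sketch under stated assumptions, not a tested replacement for FormUrlEncodedContent: LegacyFormContent and its methods are hypothetical helpers, and it has only been reasoned about against the single-field case above. It also awaits the call instead of using .Result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using System.Web;

public static class LegacyFormContent
{
    // Builds an "a=1&b=2" body with names and values percent-encoded
    // from their bytes in the given encoding (for example ISO-8859-9).
    public static string BuildBody(
        IEnumerable<KeyValuePair<string, string>> fields, Encoding encoding)
    {
        return string.Join("&", fields.Select(f =>
            HttpUtility.UrlEncode(f.Key, encoding) + "=" +
            HttpUtility.UrlEncode(f.Value, encoding)));
    }

    // Wraps the body in a StringContent, as the final code above does.
    public static StringContent Create(
        IEnumerable<KeyValuePair<string, string>> fields, Encoding encoding)
    {
        return new StringContent(
            BuildBody(fields, encoding), encoding, "application/x-www-form-urlencoded");
    }

    // Posts the fields and returns the response body as a string.
    public static async Task<string> PostAsync(
        HttpClient client, string url,
        IEnumerable<KeyValuePair<string, string>> fields, Encoding encoding)
    {
        using (var content = Create(fields, encoding))
        using (var response = await client.PostAsync(url, content))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

On .NET Core / .NET 5+, Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) would need to run once before Encoding.GetEncoding("ISO-8859-9") is called; on .NET Framework the encoding is available without it.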
