
C# - WebClient.DownloadString does not detect response encoding

While working with the WebClient class, I noticed that a simple call like this

string downloadedString = new WebClient().DownloadString("http://whatever");

produced a string using an incorrect encoding, even though the response contained a proper Content-Type header: application/json; charset=utf-8.

When I looked at the source code, I found that DownloadString doesn't look at the response headers at all. Instead, it uses request.ContentType, and if no charset is present there, it falls back to the Encoding property (which has to be set beforehand; otherwise it is the system default).

It seems weird that we have to tell the WebClient object which encoding to use before sending the request (by adding a Content-Type header or setting Encoding directly). This makes DownloadString pointless: if we want the right encoding, we have to use DownloadData or plain old WebRequest and write code that parses the response headers manually to get the correct response string.
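The "set the encoding beforehand" workaround described above might look roughly like the sketch below. The class and method names (Utf8Download, CreateClient, Download) are hypothetical, and the sketch assumes the server is known in advance to respond with UTF-8, which is exactly the limitation being complained about.

```csharp
using System.Net;
using System.Text;

static class Utf8Download
{
    // Hypothetical helper: returns a WebClient whose Encoding is set up front.
    // Assumption: the server is known to respond with UTF-8.
    public static WebClient CreateClient()
    {
        return new WebClient { Encoding = Encoding.UTF8 };
    }

    public static string Download(string address)
    {
        using (WebClient client = CreateClient())
        {
            // As described above, with no charset in the request's Content-Type
            // header, DownloadString decodes the body using client.Encoding.
            return client.DownloadString(address);
        }
    }
}
```

A call such as Utf8Download.Download("http://whatever") then decodes the body as UTF-8 regardless of what the response header says, so it only helps when the response encoding is known ahead of time.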

Does anyone know the reason for this behavior? Is there a better way in .NET to download an HTTP string response properly than manually parsing the response Content-Type?

The WebClient source code seems to indicate that when you call DownloadString, it uses the request's content type to pick the encoding for the response, which is weird and probably a bug.

See this excellent answer to a similar question. It includes code that uses DownloadData to get the response, then converts it to a string using the correct encoding, as specified in the response's Content-Type header.
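For illustration, here is a minimal sketch of that approach. It is not the code from the linked answer; the names WebClientExtensions, EncodingFromContentType and DownloadStringUsingResponseEncoding are made up for this example. It assumes the charset can be read with System.Net.Mime.ContentType and falls back to the client's Encoding property when the header is missing, malformed, or names a charset the runtime does not know.

```csharp
using System;
using System.Net;
using System.Net.Mime;
using System.Text;

static class WebClientExtensions
{
    // Hypothetical helper: picks an Encoding from a Content-Type header value,
    // returning the supplied fallback when no usable charset is present.
    public static Encoding EncodingFromContentType(string contentType, Encoding fallback)
    {
        if (string.IsNullOrWhiteSpace(contentType))
            return fallback;
        try
        {
            string charset = new ContentType(contentType).CharSet;
            return string.IsNullOrEmpty(charset) ? fallback : Encoding.GetEncoding(charset);
        }
        catch (FormatException) { return fallback; }   // header could not be parsed
        catch (ArgumentException) { return fallback; } // charset name not recognized
    }

    // Downloads the raw bytes, then decodes them using the charset from the
    // response's Content-Type header instead of the request's.
    public static string DownloadStringUsingResponseEncoding(this WebClient client, string address)
    {
        byte[] data = client.DownloadData(address);
        string contentType = client.ResponseHeaders?[HttpResponseHeader.ContentType];
        return EncodingFromContentType(contentType, client.Encoding).GetString(data);
    }
}
```

With this in place, new WebClient().DownloadStringUsingResponseEncoding("http://whatever") would decode a response sent as application/json; charset=utf-8 as UTF-8 without setting anything on the client first.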

