WebClient DownloadString UTF-8 not displaying international characters

I am trying to save the HTML of a website in a string. The website has international characters (ę, ś, ć, ...), and they are not saved to the string correctly even though I set the encoding to UTF-8, which matches the website's charset.

Here is my code:

using System.Net;
using System.Text;

using (WebClient client = new WebClient())
{
    client.Encoding = Encoding.UTF8;
    // The URL has to be passed as a string literal.
    string htmlCode = client.DownloadString("http://www.filmweb.pl/Mroczne.Widmo");
}

When I print "htmlCode" to the console, the international characters are not shown correctly, even though they appear correctly in the original HTML.

Any help is appreciated.

I had the same problem. It seems that client.DownloadString does not decode the response as UTF-8. Using client.DownloadData and decoding the returned bytes with Encoding.UTF8.GetString solves the problem.

using (WebClient client = new WebClient())
{
    // Download the raw response bytes, then decode them explicitly as UTF-8.
    var htmlData = client.DownloadData("http://www.filmweb.pl/Mroczne.Widmo");
    var htmlCode = Encoding.UTF8.GetString(htmlData);
}
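
To see why the choice of decoder matters, here is a small sketch that needs no network access. It encodes the sample characters from the question as UTF-8 bytes and decodes them twice: once as UTF-8 and once as ISO-8859-1. ISO-8859-1 is only an assumed stand-in for "some other encoding"; which encoding DownloadString actually applied to the page in the question is not known. The class name EncodingDemo is made up for the illustration.

```csharp
using System;
using System.Text;

class EncodingDemo
{
    static void Main()
    {
        // Sample characters taken from the question.
        string original = "ę, ś, ć";

        // These bytes stand in for what DownloadData would return for a UTF-8 page.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(original);

        // Decoding with the matching encoding round-trips the text.
        string decodedAsUtf8 = Encoding.UTF8.GetString(utf8Bytes);

        // Decoding the same bytes with a different encoding (assumed here:
        // ISO-8859-1) turns each two-byte character into two wrong characters.
        string decodedAsLatin1 = Encoding.GetEncoding("ISO-8859-1").GetString(utf8Bytes);

        Console.WriteLine(decodedAsUtf8 == original);   // True
        Console.WriteLine(decodedAsLatin1 == original); // False
        Console.WriteLine(decodedAsLatin1.Length);      // 10, versus 7 for the original
    }
}
```

If the string is decoded correctly but the console still shows the wrong glyphs, the console's own output encoding or font may be the limiting factor. That is a separate issue from the download step.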
