
Decoding a multiply-encoded string

How do I decode this to get the result below?

/browse_ajax?action_continuation=1\u0026amp;continuation=4qmFsgJAEhhVQ2ZXdHFQeUJNR183aTMzT2VlTnNaWncaJEVnWjJhV1JsYjNNZ0FEZ0JZQUZxQUhvQk03Z0JBQSUzRCUzRA%253D%253D

/browse_ajax?action_continuation=1&continuation=4qmFsgJAEhhVQ2ZXdHFQeUJNR183aTMzT2VlTnNaWncaJEVnWjJhV1JsYjNNZ0FEZ0JZQUZxQUhvQk03Z0JBQSUzRCUzRA%253D%253D

I've tried the following, including calling them several times, since I read that strings may be encoded more than once.

System.Text.RegularExpressions.Regex.Unescape(string)
System.Uri.UnescapeDataString(string)
System.Net.WebUtility.UrlDecode(string)

Which is the right function here, or rather, in what order do I need to call them to get that result? Because the strings vary, other special characters may appear in them, so a workaround where I edit the string myself is too risky.

The string has to be decoded to work with new System.Net.WebClient().DownloadString(string).

EDIT: It turns out the statement above is wrong: I do not have to decode the string to use WebClient.DownloadString(string). However, the downloaded string suffers from similar encoding. Setting the WebClient's Encoding property to UTF8 before downloading does most of the job, but some characters still look corrupted. For example, double quotes and ampersands stay as \" and \u0026amp;.

I don't know how to turn \u0026 into &, so that I can then change &amp; to &.

That these strings are double (actually triple) encoded in this way is a sign that they are not being encoded correctly. If you own the code that encodes these strings, consider fixing the problem there, since that is the root of the issue.

That said, here are the calls you need to make to decode this. I do not recommend this solution, as it is definitely a workaround. Again, the problematic behavior is in the code doing the encoding.

string val = "/browse_ajax?action_continuation=1\u0026amp;continuation=4qmFsgJAEhhVQ2ZXdHFQeUJNR183aTMzT2VlTnNaWncaJEVnWjJhV1JsYjNNZ0FEZ0JZQUZxQUhvQk03Z0JBQSUzRCUzRA%253D%253D";
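// First pass: %253D becomes %3D
val = System.Uri.UnescapeDataString(val);
// Second pass: %3D becomes =
val = System.Uri.UnescapeDataString(val);
// Finally, turn the HTML entity &amp; into &
val = System.Web.HttpUtility.HtmlDecode(val);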

This will give you:

/browse_ajax?action_continuation=1&continuation=4qmFsgJAEhhVQ2ZXdHFQeUJNR183aTMzT2VlTnNaWncaJEVnWjJhV1JsYjNNZ0FEZ0JZQUZxQUhvQk03Z0JBQSUzRCUzRA==

If you really want to keep the equal signs encoded rather than decoding them to =, just call Uri.UnescapeDataString(string) once. The equal signs are then left as %3D instead of %253D, and %3D is their proper encoded value.
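To illustrate the difference between one and two unescape passes, here is a minimal sketch. The value in tail is a shortened sample taken from the end of the string above, and the variable names are only for illustration.

```csharp
using System;

class Program
{
    static void Main()
    {
        // Shortened sample: just the tail of the continuation value.
        string tail = "UzRA%253D%253D";

        // One pass decodes %25 to %, so %253D becomes %3D.
        string once = Uri.UnescapeDataString(tail);

        // A second pass decodes %3D to the equal sign.
        string twice = Uri.UnescapeDataString(once);

        Console.WriteLine(once);  // UzRA%3D%3D
        Console.WriteLine(twice); // UzRA==
    }
}
```

Each call performs a single decoding pass, so the number of calls decides how many layers of percent-encoding are removed.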

I thought the mystery was solved, but I ran into it again and did not find any built-in solution, since the built-in decoders seem to fail on a UTF-8 escape sequence when it is part of an HTML-escaped character.

However, since these strings only seem to use the ampersand, I had to use Replace(@"\u0026","&") to be able to HtmlDecode and get a proper string.
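The replace-then-decode approach can be sketched as follows. This assumes the downloaded text contains the literal characters \u0026 (modelled here with a verbatim string) and that the ampersand is the only escaped character. The sample value is shortened, and System.Net.WebUtility.HtmlDecode is used as a stand-in for whichever HtmlDecode was meant.

```csharp
using System;
using System.Net;

class Program
{
    static void Main()
    {
        // Verbatim string: the backslash-u sequence is literal text here,
        // as it would be in a downloaded response (shortened sample value).
        string raw = @"/browse_ajax?action_continuation=1\u0026amp;continuation=abc%253D%253D";

        // Step 1: replace the literal escape sequence with a real ampersand.
        string replaced = raw.Replace(@"\u0026", "&");

        // Step 2: decode the HTML entity &amp; into &.
        string decoded = WebUtility.HtmlDecode(replaced);

        Console.WriteLine(decoded);
        // /browse_ajax?action_continuation=1&continuation=abc%253D%253D
    }
}
```

Note that the percent-encoded part is left untouched, which matches the result asked for in the question.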
