
System.Text.Encoding.UTF8.GetString is returning junk

This is a tough one. I have a response filter set up to transform the HTML before spitting it back out to the browser ( http://aspnetresources.com/articles/HttpFilters ). This works fine on everyone's machine but mine. Actually, it was working on my machine until I had to do a hard reset because it locked up.

public override void Write(byte[] buffer, int offset, int count)
{
    string strBuffer = System.Text.Encoding.UTF8.GetString(buffer, offset, count);
    // ... transform strBuffer and write the result to the underlying stream ...
}

For everyone else (and mine previously) strBuffer contains HTML. Now, for whatever reason, it's returning junk characters for me. Any ideas? I'm pulling my hair out!!

Update

Turns out that "Enable dynamic content compression" is causing the issue. For some reason the response is getting gzipped before being passed into the filter.

Solution

Setting "dynamicCompressionBeforeCache" to false in the web.config fixed the issue.

<urlCompression doStaticCompression="true" doDynamicCompression="true" dynamicCompressionBeforeCache="false" />

Sounds like something went wrong. I too have had some strange behaviour after a lockup. What worked for me was to delete the temp files in C:\Windows\Microsoft.NET\Framework\v2.0.50727\Temporary ASP.NET Files.

You've specified these bytes: 31, 139, 8, 0, 0, 0, 0, 0, 4

That's not valid UTF-8. In particular, it would mean the Unicode character U+001F ("INFORMATION SEPARATOR ONE") followed by bytes 139 and 8... and 139 followed by 8 isn't a valid UTF-8 byte sequence. Even if those did form a valid sequence, you'd then have five U+0000 characters (NUL) followed by U+0004 (END OF TRANSMISSION). Hardly valid HTML.
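Incidentally, those bytes line up with the asker's update: 31 and 139 are 0x1F and 0x8B, the gzip magic number (and the following 8 is the deflate compression method per RFC 1952), so the filter was seeing a gzip stream rather than HTML. As a hedged sketch (not the original filter code; the names here are illustrative), one way to recognise and undo this is to check for the magic number and decompress before decoding:

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

class GzipCheck
{
    // Returns true when the buffer starts with the gzip magic number 0x1F 0x8B.
    public static bool LooksGzipped(byte[] buffer, int offset, int count) =>
        count >= 2 && buffer[offset] == 0x1F && buffer[offset + 1] == 0x8B;

    static void Main()
    {
        // Round-trip: gzip some HTML, then show the raw bytes are not UTF-8 text.
        byte[] html = Encoding.UTF8.GetBytes("<html><body>hello</body></html>");
        byte[] gzipped;
        using (var ms = new MemoryStream())
        {
            using (var gz = new GZipStream(ms, CompressionMode.Compress))
                gz.Write(html, 0, html.Length);
            gzipped = ms.ToArray();
        }

        Console.WriteLine(LooksGzipped(gzipped, 0, gzipped.Length)); // True

        // Decompress first, then decode as UTF-8.
        using (var input = new MemoryStream(gzipped))
        using (var gz = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gz.CopyTo(output);
            Console.WriteLine(Encoding.UTF8.GetString(output.ToArray())); // the original HTML
        }
    }
}
```

Of course, the better fix is the one in the accepted answer: stop IIS from compressing the response before the filter sees it at all.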

I don't know what you're actually filtering, but it isn't valid UTF-8 text. It doesn't look likely to be text at all, in fact. Is it possible that you're actually trying to apply a filter to binary data such as an image?

Note that you have another fundamental problem with your method of filtering: you're assuming that each buffer contains complete text. It's quite possible for you to receive one buffer which contains the first half of a character and then a second buffer containing the remainder of it. That's what the System.Text.Decoder interface is for - it's stateful, remembering partial characters.
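A minimal sketch of that stateful approach (the names here are illustrative, not from the original filter): a `Decoder` carries the trailing bytes of a split multi-byte character from one `Write` call to the next, whereas calling `Encoding.UTF8.GetString` on each buffer independently produces replacement characters at the seam.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class DecoderDemo
{
    // Decodes UTF-8 arriving in arbitrary chunks, carrying partial characters
    // between calls via a stateful Decoder.
    public static string DecodeChunks(IEnumerable<byte[]> chunks)
    {
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var sb = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            char[] chars = new char[decoder.GetCharCount(chunk, 0, chunk.Length)];
            decoder.GetChars(chunk, 0, chunk.Length, chars, 0);
            sb.Append(chars);
        }
        return sb.ToString();
    }

    static void Main()
    {
        // "é" is two bytes in UTF-8 (0xC3 0xA9); split it across two buffers.
        byte[] bytes = Encoding.UTF8.GetBytes("café");             // 5 bytes total
        byte[] first = { bytes[0], bytes[1], bytes[2], bytes[3] }; // "caf" + half of "é"
        byte[] second = { bytes[4] };                              // the other half

        // Decoding each buffer independently mangles the split character:
        Console.WriteLine(Encoding.UTF8.GetString(first) + Encoding.UTF8.GetString(second));

        // The stateful Decoder reassembles it correctly:
        Console.WriteLine(DecodeChunks(new[] { first, second })); // café
    }
}
```

In a real response filter, the `Decoder` would live as a field on the filter stream so its state persists across `Write` calls.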

