
Content-Length Occasionally Wrong on Simple C# HTTP Server

For some experimentation I was working with the Simple HTTP Server code here.

In one case I wanted it to serve some ANSI encoded text configuration files. I am aware there are more issues with this code, but the only one I'm currently concerned with is that Content-Length is wrong, and only for certain text files.

Example code:

Output stream initialisation:

outputStream = new StreamWriter(new BufferedStream(socket.GetStream()));

The handling of HTTP GET:

public override void handleGETRequest(HttpProcessor p)
{

    if (p.http_url.EndsWith(".pac"))
    {
        string filename = Path.Combine(Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().Location), p.http_url.Substring(1));
        Console.WriteLine(string.Format("HTTP request for : {0}", filename));
        if (File.Exists(filename))
        {
            FileInfo fi = new FileInfo(filename);
            DateTime lastWrite = fi.LastWriteTime;

            Stream fs = File.Open(filename, FileMode.Open, FileAccess.Read, FileShare.Read);
            StreamReader sr = new StreamReader(fs);
            string result = sr.ReadToEnd().Trim();
            Console.WriteLine(fi.Length);
            Console.WriteLine(result.Length);
            p.writeSuccess("application/x-javascript-config",result.Length,lastWrite);
            p.outputStream.Write(result);
            // fs.CopyTo(p.outputStream.BaseStream);
            p.outputStream.BaseStream.Flush();
            fs.Close();
        }
        else
        {
            Console.WriteLine("404 - FILE not found!");
            p.writeFailure();
        }
    }

}  

   public void writeSuccess(string content_type,long length,DateTime lastModified) {
            outputStream.Write("HTTP/1.0 200 OK\r\n");            
            outputStream.Write("Content-Type: " + content_type + "\r\n");
            outputStream.Write("Last-Modified: {0}\r\n", lastModified.ToUniversalTime().ToString("r"));
            outputStream.Write("Accept-Range: bytes\r\n");
            outputStream.Write("Server: FlakyHTTPServer/1.3\r\n");
            outputStream.Write("Date: {0}\r\n", DateTime.Now.ToUniversalTime().ToString("r"));
            outputStream.Write(string.Format("Content-Length: {0}\r\n\r\n", length));   
              }

For most files I've tested with, Content-Length is correct. However, when testing with the HTTP debugging tool Fiddler, a protocol violation is sometimes reported on Content-Length.

For example, Fiddler says:

Request Count: 1
Bytes Sent: 303 (headers:303; body:0)
Bytes Received: 29,847 (headers:224; body:29,623)

So Content-Length should be 29623. But the HTTP header generated is:

Content-Length: 29617

I saved the body of the HTTP content from Fiddler and visually compared the files, but couldn't notice any difference. I then loaded them into BeyondCompare's hex compare, which shows several differences like these:

Original File: 2D 2D 96       20 2A 2F
HTTP Content : 2D 2D EF BF BD 20 2A 2F

Original File: 27 3B 0D 0A 09 7D 0D 0A 0D 0A 09
HTTP Content : 27 3B    0A 09 7D    0A    0A 09

I suspect the problem is related to encoding, but I'm not exactly sure. I'm only serving ANSI encoded files, no Unicode.

I made the file serve correctly, with the right Content-Length, by modifying the offending byte sequences in the file. I made this change in 3 parts of the file:

2D 2D 96 (--–) to 2D 2D 2D (---)

Based on the bytes you pasted, it looks like there are a couple of things going wrong here. First, it seems that CRLF in your input file (0D 0A) is being converted to just LF (0A). Second, it looks like the character encoding is changing, either when reading the file into a string, or when writing the string to the HTTP client.
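A minimal sketch of how the first hex difference can arise, assuming the file is read and written with the default UTF-8 encoding of StreamReader and StreamWriter. The class and method names are illustrative, not from the original server code. The byte 96 is not valid UTF-8 on its own, so it decodes to the replacement character U+FFFD, which is then written out as EF BF BD.

```csharp
using System;
using System.Text;

static class EncodingRoundTripDemo
{
    // Decode the bytes as UTF-8 (StreamReader's default) and re-encode the
    // resulting string as UTF-8 (StreamWriter's default).
    public static byte[] RoundTripThroughUtf8(byte[] original)
    {
        string decoded = Encoding.UTF8.GetString(original);
        return Encoding.UTF8.GetBytes(decoded);
    }

    public static void Main()
    {
        // The "Original File" bytes from the first hex comparison.
        byte[] original = { 0x2D, 0x2D, 0x96, 0x20, 0x2A, 0x2F };

        string decoded = Encoding.UTF8.GetString(original);
        byte[] sent = RoundTripThroughUtf8(original);

        Console.WriteLine(decoded.Length);              // 6
        Console.WriteLine(BitConverter.ToString(sent)); // 2D-2D-EF-BF-BD-20-2A-2F
    }
}
```

Each such byte counts once in string.Length but takes three bytes on the wire. Three occurrences would therefore be consistent with the 6-byte gap between 29617 and 29623.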

The HTTP Content-Length represents the number of bytes in the body, whereas string.Length gives you the number of characters (UTF-16 code units) in the string. Unless your file uses only the 128 ASCII characters (which rules out non-English characters as well as special windows-1252 characters like the euro sign), string.Length will not equal the byte length of the string once it is encoded as UTF-8, and it is not guaranteed to match in other encodings either.
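To make the difference concrete, here is a small sketch comparing string.Length with Encoding.GetByteCount. The sample strings and the class name are made up for illustration.

```csharp
using System;
using System.Text;

static class LengthVersusByteCount
{
    // Number of bytes the string occupies once encoded as UTF-8.
    public static int Utf8ByteCount(string s)
    {
        return Encoding.UTF8.GetByteCount(s);
    }

    public static void Main()
    {
        string asciiOnly = "var x;";
        string withEuro = "price: \u20AC"; // ends with the euro sign

        // ASCII-only text: character count and UTF-8 byte count agree.
        Console.WriteLine(asciiOnly.Length);         // 6
        Console.WriteLine(Utf8ByteCount(asciiOnly)); // 6

        // The euro sign is one character but three bytes in UTF-8.
        Console.WriteLine(withEuro.Length);          // 8
        Console.WriteLine(Utf8ByteCount(withEuro));  // 10
    }
}
```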

If you convert the string to a byte[] before sending it to the client, you'll be able to get the "true" Content-Length. However, you'll still end up with mangled text if you didn't read the file using the proper encoding. (Whether you specify the encoding or not, a conversion is happening when reading the file into a string of Unicode characters.)
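One way this could look is sketched below. WriteResponse is a hypothetical helper, not part of the original server: it encodes the body once and uses the resulting byte count for the header. It writes to a plain Stream, so a MemoryStream stands in for the socket here.

```csharp
using System;
using System.IO;
using System.Text;

static class ByteLengthResponse
{
    // Hypothetical helper: encode the body first, then report the byte
    // count (not body.Length) as Content-Length.
    public static void WriteResponse(Stream output, string body,
                                     Encoding bodyEncoding, string contentType)
    {
        byte[] bodyBytes = bodyEncoding.GetBytes(body);
        string head =
            "HTTP/1.0 200 OK\r\n" +
            "Content-Type: " + contentType + "\r\n" +
            "Content-Length: " + bodyBytes.Length + "\r\n\r\n";
        byte[] headBytes = Encoding.ASCII.GetBytes(head);

        output.Write(headBytes, 0, headBytes.Length);
        output.Write(bodyBytes, 0, bodyBytes.Length);
        output.Flush();
    }

    public static void Main()
    {
        using (var sink = new MemoryStream())
        {
            // 6 characters, but the en dash takes 3 bytes in UTF-8,
            // so the header reports 8.
            WriteResponse(sink, "--\u2013 */", Encoding.UTF8,
                          "application/x-javascript-config");
            Console.WriteLine(Encoding.UTF8.GetString(sink.ToArray()));
        }
    }
}
```

In the original handler the text would have to be read with the file's actual encoding first, for example by passing an Encoding to the StreamReader constructor. Which encoding "ANSI" means depends on the machine that saved the file, so that part is left as an assumption.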

I highly recommend specifying the charset in the Content-Type header (e.g. application/x-javascript-config;charset=utf-8). It doesn't matter whether your charset is utf-8, utf-16, iso-8859-1, windows-1251, etc., as long as it's the same character encoding you use when converting your string into a byte[].
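A small sketch of keeping the two in sync: derive the charset label from the same Encoding object that produces the bytes. ContentTypeFor is a hypothetical helper name. Encoding.WebName is the .NET property that returns the registered name of the encoding.

```csharp
using System;
using System.Text;

static class CharsetHeader
{
    // Hypothetical helper: builds the Content-Type value from the same
    // Encoding instance that is used to encode the body.
    public static string ContentTypeFor(string mediaType, Encoding encoding)
    {
        return mediaType + "; charset=" + encoding.WebName;
    }

    public static void Main()
    {
        Encoding wire = Encoding.UTF8;
        byte[] body = wire.GetBytes("--\u2013 */");

        // prints "application/x-javascript-config; charset=utf-8"
        Console.WriteLine(ContentTypeFor("application/x-javascript-config", wire));
        Console.WriteLine(body.Length); // 8
    }
}
```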


 