
StreamReader possible encoding issues

I am having an issue reading files in C# with a StreamReader. I have a UTF-8 encoded file, simplified here to a single line for the example. That line has a newline in it. I am reading with UTF-8 encoding, but once the string is read, the newline does not seem to be treated properly. Here is the example:

using (StreamReader sr = new StreamReader(file, Encoding.UTF8))
{
    string line;

    while ((line = sr.ReadLine()) != null)
    {
        Debug.WriteLine("test1\ntest2" + " - " + "test1\ntest2".GetHashCode());
        Debug.WriteLine(line + " - " + line.GetHashCode());
    }
}

Here are the contents of the file:

test1\ntest2

Here is the output from this code

test1
test2 - -61586127
test1\ntest2 - -228288099

When the string literal is printed, the \n is treated as a newline. When the line read from the file is printed, it is not. You can also see that the hash code values are different.

The contents of your file are wrong. In a regular C# string literal, a character preceded by \ is treated as a special character (e.g. \n for a newline, \r for a carriage return, \t for a tab). This is called escaping, and \ is the escape character. The compiler turns the sequence of \ plus the following character into a single character in the resulting string.

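Your file, on the other hand, contains two actual characters ( \ and n ), and StreamReader reads them as those two characters, not as one special character. That is why the two strings print differently and have different hash codes. So you need to either put a real newline character in the file, or replace the two-character sequence \n with an actual newline character after reading the line from the stream.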
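As a rough sketch of the second option, the snippet below replaces the backslash-plus-n pair with a real line feed after reading. The class name EscapeDemo and the helper UnescapeNewlines are made up for illustration, and a verbatim literal stands in for the line read from the file.

```csharp
using System;

public static class EscapeDemo
{
    // Hypothetical helper (not part of the original post): turns the
    // two-character sequence backslash + 'n' into one line-feed character.
    public static string UnescapeNewlines(string text)
    {
        // "\\n" is the two characters \ and n; "\n" is a single line feed.
        return text.Replace("\\n", "\n");
    }

    public static void Main()
    {
        // Verbatim literal: backslashes are not escapes, so this matches
        // what StreamReader.ReadLine() returns for the file shown above.
        string fromFile = @"test1\ntest2";

        // Regular literal: the compiler converts \n into one character.
        string literal = "test1\ntest2";

        Console.WriteLine(fromFile.Length);                        // 12
        Console.WriteLine(literal.Length);                         // 11
        Console.WriteLine(UnescapeNewlines(fromFile) == literal);  // True
    }
}
```

If the file can contain other escape sequences such as \t or \r, System.Text.RegularExpressions.Regex.Unescape may be worth a look, though it also interprets regex-specific escapes, so check that it suits your data.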
