Can't read international characters from files

Question

I am trying to read Portuguese characters from files and keep running into problems.

I have the following C# code (for testing purposes):

var streamReader = new StreamReader("file.txt");

while (streamReader.Peek() >= 0)
{
  var buffer = new char[1];
  streamReader.Read(buffer, 0, buffer.Length);
  Console.Write(buffer[0]);
}

It reads each character in the file and then outputs it to the console. The file contains the following: "cãsa". The output in the console is: "c?sa".

What am I doing wrong?

Answer 1

You need to read the file using the correct encoding. By default, StreamReader reads the file as UTF-8; if the file was saved in a different encoding, any character outside the ASCII range will be decoded incorrectly, which is the issue you are seeing.
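To make the mismatch concrete, here is a minimal sketch. It assumes, purely for illustration, that the file was saved as ISO-8859-1 (Latin-1); the question does not say which encoding was actually used, and the class and method names are hypothetical. It decodes the same bytes twice, once as UTF-8 and once as Latin-1:

```csharp
using System;
using System.Text;

public static class EncodingMismatchDemo
{
    // Decodes raw bytes with the given encoding.
    public static string Decode(byte[] bytes, Encoding encoding)
    {
        return encoding.GetString(bytes);
    }

    public static void Main()
    {
        // "cãsa", written with an escape so the encoding of this source file does not matter.
        const string original = "c\u00E3sa";

        // Assumption: the file on disk is ISO-8859-1, where 'ã' is the single byte 0xE3.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        byte[] fileBytes = latin1.GetBytes(original);

        // 0xE3 followed by 's' is not a valid UTF-8 sequence, so the UTF-8 decoder
        // substitutes the replacement character U+FFFD for it.
        string wrong = Decode(fileBytes, Encoding.UTF8);
        string right = Decode(fileBytes, latin1);

        Console.WriteLine("UTF-8 decode matches original:   " + (wrong == original));
        Console.WriteLine("UTF-8 decode contains U+FFFD:    " + (wrong.IndexOf('\uFFFD') >= 0));
        Console.WriteLine("Latin-1 decode matches original: " + (right == original));
    }
}
```

A console that cannot display U+FFFD typically prints it as "?", which would be consistent with the "c?sa" output in the question.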

This example uses the constructor overload that takes an encoding, in this case Encoding.Unicode (a UnicodeEncoding instance), which is little-endian UTF-16:

using(var streamReader = new StreamReader("file.txt", Encoding.Unicode))
{
    while (streamReader.Peek() >= 0)
    {
      var buffer = new char[1];
      streamReader.Read(buffer, 0, buffer.Length);
      Console.Write(buffer[0]);
    }
}

This example uses code page 860, the DOS (OEM) code page for Portuguese:

using(var streamReader = new StreamReader("file.txt", Encoding.GetEncoding(860)))
{
    while (streamReader.Peek() >= 0)
    {
      var buffer = new char[1];
      streamReader.Read(buffer, 0, buffer.Length);
      Console.Write(buffer[0]);
    }
}
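One caveat, sketched below under the assumption that you are on .NET Core or .NET 5+ rather than the .NET Framework: there, legacy code pages such as 860 are only available after CodePagesEncodingProvider has been registered, otherwise Encoding.GetEncoding(860) throws. The class name, method name, and temp file are hypothetical and only serve the demonstration. The sketch also uses File.ReadAllText as a shorter alternative to the character-by-character loop:

```csharp
using System;
using System.IO;
using System.Text;

public static class CodePageReadDemo
{
    static CodePageReadDemo()
    {
        // Makes legacy code pages (860, 1252, ...) available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Reads a whole text file, decoding it with the given code page.
    public static string ReadWithCodePage(string path, int codePage)
    {
        Encoding encoding = Encoding.GetEncoding(codePage);
        return File.ReadAllText(path, encoding);
    }

    public static void Main()
    {
        // Hypothetical temp file, written in code page 860 for the round trip.
        string path = Path.Combine(Path.GetTempPath(), "codepage-demo.txt");
        File.WriteAllText(path, "c\u00E3sa", Encoding.GetEncoding(860));

        string text = ReadWithCodePage(path, 860);
        Console.WriteLine("Round trip preserved the text: " + (text == "c\u00E3sa"));

        File.Delete(path);
    }
}
```

If the file turns out to be in another encoding (Windows-1252 is a common one for Western European text), pass that code page instead.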

1 answer

solution1
2 ACCEPTED 2012-01-16 10:09:06
