Git messes up non-ASCII characters in a Linux container

Question

I have a .NET Core (C#) project with the following line in one of the classes:

var input = "£";

But when I do a git clone in a Docker container ( microsoft/dotnet:2.2-sdk ) it messes it up and displays it as (in bash using cat ).

And when I run it, its UTF-8 bytes are [239, 191, 189] = [EF, BF, BD], which seems to be the so-called Unicode replacement character.

The Windows editor I use is VS 2017, but the character is displayed properly on other Windows machines and parsed properly by the dotnet run/test commands, so I don't think the problem is that the character was saved incorrectly.

Any ideas why I am seeing such a mess and how to solve it?

Some details

I get bytes using Encoding.UTF8.GetBytes("£");
It works perfectly well on a Windows 10 machine
Linux version: Debian GNU/Linux 9 (stretch), from cat /etc/os-release
locale -a returns C C.UTF-8 POSIX
On Windows, Notepad++ reports the file as ANSI when it is opened and displays it correctly.

Running fgrep 'var input' file.cs | od -tx1 -c gives:

0000100  76  61  72  20  69  6e  70  75  74  20  3d  20  22  a3  22  3b
          v   a   r       i   n   p   u   t       =       " 243   "   ;

Answer 1

Your file contains a single byte a3 which corresponds to the Windows-1252 encoding for the character £ . Your Linux system displays because it is not a valid UTF-8 encoding.
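
The byte-level behaviour described above can be reproduced in a few lines. This is a minimal sketch in Python (not the C# from the question), assuming the source file really holds the single byte a3 as the od dump shows; the variable names are illustrative only.

```python
# The byte that od reported inside the string literal.
raw = b"\xa3"

# Decoded as Windows-1252, 0xA3 is the pound sign.
as_cp1252 = raw.decode("cp1252")

# Decoded as UTF-8, a lone 0xA3 is a continuation byte with no lead byte,
# so it is invalid; errors="replace" substitutes U+FFFD for it.
as_utf8 = raw.decode("utf-8", errors="replace")

# Re-encoding the replacement character gives the three bytes from the question.
replacement_bytes = list(as_utf8.encode("utf-8"))

# What a file saved as UTF-8 would contain for the same character.
correct_bytes = list("£".encode("utf-8"))

print(repr(as_cp1252), repr(as_utf8), replacement_bytes, correct_bytes)
```

The [239, 191, 189] result matches the bytes reported in the question, which suggests the character was already replaced when the file was read as UTF-8, not by Git during the clone.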

You should configure Visual Studio to use UTF-8 instead of Windows-1252.
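
Changing the editor setting affects how files are saved from then on; files already committed as Windows-1252 would still need to be re-saved. One possible way to do that outside the editor is sketched below in Python. The helper name reencode_to_utf8 is hypothetical, and the default source encoding of Windows-1252 is an assumption taken from the diagnosis above.

```python
from pathlib import Path


def reencode_to_utf8(path, source_encoding="cp1252"):
    """Rewrite a file as UTF-8 if it is not already valid UTF-8.

    Returns True if the file was rewritten, False if it was left alone.
    """
    data = Path(path).read_bytes()
    try:
        # Pure ASCII and already-converted files decode cleanly; skip them.
        data.decode("utf-8")
        return False
    except UnicodeDecodeError:
        pass
    # Assumption: the file was saved as Windows-1252. This raises
    # UnicodeDecodeError if it contains bytes undefined in that code page.
    text = data.decode(source_encoding)
    # Work on bytes so that line endings are preserved exactly.
    Path(path).write_bytes(text.encode("utf-8"))
    return True
```

Calling reencode_to_utf8("file.cs") a second time is a no-op, because the file then decodes as UTF-8. The converted file still has to be committed for the container to see the change.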
