I have a Python program whose source is UTF-8, as confirmed by PyCharm and Sublime Text. It writes the pound character, £ (0xC2 0xA3), to a reStructuredText file.
When I open the reStructuredText file in PyCharm or Sublime Text, it looks fine, and both report it as UTF-8.
The problem comes when I generate HTML from this file using rst2html5, with this command:
rst2html5 --input-encoding=utf-8 --output-encoding=utf-8 foo.rst > foo.html
The HTML claims to be UTF-8, by means of <meta charset="utf-8" />, but the pound characters, £, are now shown as garbled characters. Opening the file in Sublime Text as UTF-8 shows the same garbled characters instead of £. This is the actual data:
Any ideas what's going on or how to stop it? Does that look like UTF-8 at all?
The generated file starts with 0xFF 0xFE, which reminds me of the UTF-16 BOM, but setting the header to <meta charset="utf-16" /> does not solve the problem, and telling a text editor to open the file as UTF-16 still shows the non-ASCII character broken.
In case it is relevant, my active Windows code page is 437.
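A quick way to check what a file really starts with is to compare its first bytes against the known byte-order marks. The following is a minimal sketch using only the Python standard library; `sniff_bom` and `sniff_file` are hypothetical helper names, not part of rst2html5 or docutils.

```python
import codecs

# Known byte-order marks, longest first so that the UTF-32LE mark
# (FF FE 00 00) is not mistaken for the UTF-16LE mark (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def sniff_file(path):
    """Read only the first four bytes of a file and report its BOM, if any."""
    with open(path, "rb") as fh:
        return sniff_bom(fh.read(4))
```

If `sniff_file("foo.html")` reports `utf-16-le`, the file on disk is UTF-16 regardless of what its <meta charset> declares.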
The problem was being caused by PowerShell redirection and not by rst2html5 itself. Running it like this:
rst2html5 --input-encoding=utf-8 --output-encoding=utf-8 foo.rst foo.html
which has the same effect as the redirection (>), worked well, and using the redirection in CMD also worked well.
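Another way to keep the shell out of the picture, offered here as an untested assumption and not something tried in this thread, is to start the command from Python and hand it a file opened in binary mode, so the program's output bytes reach the file untouched. `run_to_file` is a hypothetical helper name, and the sketch assumes `rst2html5` is an executable that can be found on PATH.

```python
import subprocess


def run_to_file(cmd, out_path):
    """Run cmd and write its raw stdout bytes to out_path, with no shell involved."""
    with open(out_path, "wb") as fh:
        # check=True raises CalledProcessError if the command exits with an error.
        subprocess.run(cmd, stdout=fh, check=True)


# Example call (assumes rst2html5 is installed and foo.rst exists):
# run_to_file(
#     ["rst2html5", "--input-encoding=utf-8", "--output-encoding=utf-8", "foo.rst"],
#     "foo.html",
# )
```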
If someone has more information about why PowerShell is messing up the encoding, that'd be good to add here.
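One plausible explanation, stated as an assumption and not confirmed in this thread: Windows PowerShell decodes the output of a native program using the console's output encoding (code page 437 here), and its > operator then writes the resulting text as UTF-16LE with a BOM. Under that assumption the two UTF-8 bytes of £ are read as the two CP437 characters ┬ and ú, which would account for both the 0xFF 0xFE prefix and the broken characters. The sketch below reproduces that sequence in Python and reverses it; `simulate_redirect` and `repair` are hypothetical names.

```python
import codecs


def simulate_redirect(raw: bytes, console_cp: str = "cp437") -> bytes:
    """Mimic the assumed behaviour: decode with the console code page,
    then write the text as UTF-16LE preceded by a BOM."""
    text = raw.decode(console_cp)
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def repair(damaged: bytes, console_cp: str = "cp437") -> bytes:
    """Undo the damage, assuming every character round-trips through console_cp."""
    text = damaged.decode("utf-16")   # the BOM selects the byte order and is dropped
    return text.encode(console_cp)    # back to the bytes the program emitted


original = "\u00a3".encode("utf-8")       # pound sign: b'\xc2\xa3'
mojibake = original.decode("cp437")       # '\u252c\u00fa', two characters
damaged = simulate_redirect(original)     # hex: ff fe 2c 25 fa 00
restored = repair(damaged)                # b'\xc2\xa3' again
```

The repair only restores the characters; if the shell also rewrote line endings on the way through, those would not be restored by this sketch.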