I have some Japanese words I wish to convert to utf-8, as shown below:
jap_word1 = u'中山'
jap_word2 = u'小倉'
print jap_word1.encode('utf-8') # Doesn't work
print jap_word2.encode('utf-8') # Prints properly
Why is it that one word can be converted properly to utf-8 and printed to show the same characters but not the other?
(I am using Python 2.6 on Windows 7 Ultimate.)
Lots of things must align to print characters properly:
Is there a # coding: xxxx statement in your script, where xxxx matches the encoding the file is saved in?
Does your output terminal support that encoding? Check with: import sys; print sys.stdout.encoding
   a. If not, can you change the console encoding? (chcp command on Windows)
With the script saved in UTF-8, this works in both PythonWin and IDLE:
# coding: utf-8
jap_word1 = u'中山'
jap_word2 = u'小倉'
print jap_word1
print jap_word2
Interestingly, I got your results with .encode('utf-8') added to both prints in IDLE, but it worked correctly in PythonWin, whose default output window supports UTF-8.
IDLE is a strange beast. sys.stdout.encoding on my system produces 'cp1252', which doesn't support Asian characters, but it prints the first word wrong and the second one right when printing in UTF-8.
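As a rough sketch of the terminal check from the list above, the helper below tests whether a string can be represented in a given encoding. The name can_print is made up for illustration, and the code is written so that it should run on both Python 2.6 and Python 3. Being encodable is necessary but not sufficient: the console font also has to contain the glyphs.

```python
# -*- coding: utf-8 -*-
import sys


def can_print(text, encoding):
    """Return True if every character of text is representable in encoding.

    Hypothetical helper for illustration; not part of any library.
    """
    try:
        text.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        # UnicodeEncodeError: a character has no mapping in this encoding.
        # LookupError: Python does not know the encoding name at all.
        return False
    return True


# sys.stdout.encoding can be None (for example when output is piped),
# so fall back to ASCII, which is an assumption made for this sketch.
console_encoding = sys.stdout.encoding or 'ascii'

for word in (u'中山', u'小倉'):
    # Print an escaped form so the report itself cannot fail to print.
    escaped = word.encode('unicode_escape').decode('ascii')
    print('%s printable in %s: %s'
          % (escaped, console_encoding, can_print(word, console_encoding)))
```

If this reports False for your console encoding, printing the unicode object directly will raise UnicodeEncodeError, and printing UTF-8 bytes instead only hides the problem.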
Because your console is not in UTF-8. Run chcp 65001 before running the script.
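To illustrate what "not in UTF-8" means in practice, here is a minimal sketch that shows the UTF-8 bytes of each word and what they turn into when a program interprets those bytes with a different code page. The function names are invented for this example, and cp1252 is used only because it is the encoding reported earlier in this thread; your console may use another one. It does not attempt to explain why IDLE treats the two words differently.

```python
# -*- coding: utf-8 -*-


def utf8_hex(text):
    """Return the UTF-8 bytes of text as space-separated hex pairs."""
    return ' '.join('%02x' % b for b in bytearray(text.encode('utf-8')))


def misread(text, wrong_encoding):
    """Encode text as UTF-8, then decode the bytes as wrong_encoding.

    Bytes that are undefined in wrong_encoding become U+FFFD.
    """
    return text.encode('utf-8').decode(wrong_encoding, 'replace')


for word in (u'中山', u'小倉'):
    garbled = misread(word, 'cp1252')
    # Escape the result so this report prints on any console.
    print('%s -> %s' % (utf8_hex(word),
                        garbled.encode('unicode_escape').decode('ascii')))
```

Each character becomes three bytes in UTF-8, and a console that is not decoding UTF-8 shows those bytes as unrelated characters, or fails on bytes its code page does not define.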