Python: calling upper() on words containing non-Latin characters
I have a file with one or more words per line, e.g.:
А
б
Вв
Гг
(non-Latin letters), etc.
I want to get this:
А
Б
ВВ
ГГ
However, after the code runs I see no changes.
Here is the code:
f = open('sample.csv')
for line in f:
    for sampleword in line.split():
        print sampleword.upper()
The non-Latin characters are not converted to uppercase. What's the problem?
The solution for uppercasing non-Latin letters in Python 2 is to use unicode strings:
words = [u'łuk', u'ćma']
assert [w.upper() for w in words] == [u'ŁUK', u'ĆMA']
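To see why the loop in the question prints the words unchanged, here is a minimal sketch. It assumes the file is UTF-8 encoded (the question does not say which encoding it uses), and the variable names are only illustrative. In Python 2, a plain str read from a file is a byte string, and in the default locale upper() on it only changes ASCII letters; decoding to unicode first is what makes upper() aware of Cyrillic.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# The letter "б" as UTF-8 bytes (assumed encoding). This is what a
# Python 2 str holds after reading the word from the file.
raw = b'\xd0\xb1'

# upper() on the byte string leaves the non-ASCII bytes untouched.
unchanged = raw.upper()

# Decoding first gives a unicode string, where upper() handles Cyrillic.
decoded = raw.decode('utf-8')
uppercased = decoded.upper()

print(unchanged == raw)          # byte string was not modified
print(uppercased == u'\u0411')   # u'\u0411' is the capital letter "Б"
```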
To read unicode from a file, you may refer to the official Python manual:
Reading Unicode from a file is therefore simple:
import codecs
f = codecs.open('unicode.rst', encoding='utf-8')
for line in f:
    print repr(line)
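Putting the two pieces together for the file in the question, a possible sketch looks like this. It assumes sample.csv is UTF-8 encoded, and upper_words is a hypothetical helper name, not something from the question or the manual. It uses io.open rather than codecs.open only because io.open behaves the same way on Python 2.6+ and Python 3; codecs.open with the same encoding argument should work equally well on Python 2.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import io


def upper_words(path, encoding='utf-8'):
    """Return every whitespace-separated word in the file, uppercased.

    The default encoding is an assumption; pass the file's real encoding.
    """
    result = []
    # io.open decodes each line, so `line` is unicode on Python 2
    # (str on Python 3) and upper() handles non-Latin letters.
    with io.open(path, encoding=encoding) as f:
        for line in f:
            for word in line.split():
                result.append(word.upper())
    return result
```

Calling upper_words('sample.csv') and printing each item should then give the uppercased words from the question. If the file was saved in another encoding, such as cp1251, pass that name as the encoding argument instead.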