Python: calling upper() on words containing non-Latin characters

I have a file with words on each line, for example:

А
б
Вв
Гг

(non-Latin letters), and so on.

I want to get this:

А
Б
ВВ
ГГ

But after the code runs, I see no changes.

Here is the code:

f = open('sample.csv')
for line in f:
    for sampleword in line.split():
        print sampleword.upper()

Non-Latin characters are not converted to uppercase. What's the problem?

The solution for uppercasing non-Latin letters in Python 2 is to use unicode strings. A plain str in Python 2 is a byte string, so upper() operates on the encoded bytes rather than on the letters:

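# -*- coding: utf-8 -*-
# The coding line above lets Python 2 accept non-ASCII literals in the source.
words = [u'łuk', u'ćma']
# upper() on unicode strings handles non-ASCII letters.
assert [w.upper() for w in words] == [u'ŁUK', u'ĆMA']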
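
To see why the original loop printed the words unchanged, here is a minimal sketch comparing upper() on the raw bytes with upper() on the decoded text. It assumes the file content is UTF-8 encoded; the variable names are only illustrative.

```python
# -*- coding: utf-8 -*-
# Bytes as they would arrive from a file opened without an encoding
# (assumption: the file is UTF-8).
raw = u'Вв'.encode('utf-8')

# upper() on a byte string only maps ASCII letters (on Python 2 this holds
# in the default C locale), so the UTF-8 bytes pass through untouched.
unchanged = raw.upper()

# Decoding first yields a unicode string, whose upper() knows Cyrillic.
fixed = raw.decode('utf-8').upper()
```

Here `unchanged` is still equal to `raw`, while `fixed` holds the uppercased word.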

To read unicode from a file, you can refer to the official Python manual:

Reading Unicode from a file is therefore simple:

import codecs
f = codecs.open('unicode.rst', encoding='utf-8')
for line in f:
    print repr(line)
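
Putting the two pieces together for the loop in the question might look like the sketch below. It uses io.open instead of codecs.open because io.open behaves the same on Python 2 and Python 3; that swap, the helper name upper_words, and the UTF-8 encoding are assumptions, not part of the original answer.

```python
# -*- coding: utf-8 -*-
import io


def upper_words(path, encoding='utf-8'):
    """Return every whitespace-separated word in the file, uppercased.

    Hypothetical helper; the encoding default is an assumption about the file.
    """
    result = []
    # io.open decodes the bytes, so each line is a unicode string.
    with io.open(path, encoding=encoding) as f:
        for line in f:
            for word in line.split():
                result.append(word.upper())
    return result
```

Calling upper_words('sample.csv') would then return the uppercased words, provided the file really is UTF-8. If it was saved in another encoding such as cp1251, pass that name as the second argument.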
