In Python I have this string:
string = "Ľubomír Mezovský"
I need to get only its first character. But when I tried string[0]
it returned an unreadable character. When I tried string[:2]
it worked well. My question is: why? I need to run this on several strings, and when a string does not start with a character that has a diacritic, string[:2] returns a substring of two characters.
I am also using # encoding=utf8
and Python 2.7
You're dealing with a byte string (assuming you're using Python 2.x), so string[0] returns only the first byte of the two-byte UTF-8 encoding of Ľ.
Convert the byte string to a unicode string using str.decode, get the first character, then convert it back to a byte string using unicode.encode (optional, unless you need a byte string):
>>> string = "Ľubomír Mezovský"
>>> print(string.decode('utf-8')[0].encode('utf-8'))
Ľ
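To illustrate why string[0] fails while string[:2] works, here is a minimal sketch. It assumes the source file is saved as UTF-8; the variable names (name, raw, and so on) are made up for illustration. It builds the byte string explicitly with encode so that it behaves the same on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch: variable names are hypothetical, not from the original post.

name = u"Ľubomír Mezovský"
raw = name.encode("utf-8")  # the bytes a plain "..." literal holds in Python 2

# In UTF-8, "Ľ" takes two bytes, while a plain ASCII "L" takes one.
first_char_bytes = u"Ľ".encode("utf-8")
print(len(first_char_bytes))        # 2
print(len(u"L".encode("utf-8")))    # 1

# Taking only the first byte leaves an incomplete UTF-8 sequence.
try:
    raw[:1].decode("utf-8")
    decodes_cleanly = True
except UnicodeDecodeError:
    decodes_cleanly = False

# Taking the first two bytes happens to cover the whole character.
two_bytes_ok = raw[:2].decode("utf-8") == u"Ľ"
```

This is also why string[:2] returns two characters for names that start with an ASCII letter: each of those letters is a single byte.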
Try making the string a unicode literal, then encode the first character as "utf-8".
Example:
string = u"Ľubomír Mezovský"
print string[0].encode('utf-8')
Output:
Ľ
Tested in Python 2.7.
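Since the question mentions running this over several strings, both answers could be wrapped in a small helper. The sketch below is one possible approach, not something from either answer: the function name first_char and its encoding parameter are hypothetical, and it assumes the byte strings are UTF-8 encoded.

```python
# -*- coding: utf-8 -*-
# Hypothetical helper; assumes byte strings are UTF-8 unless told otherwise.

def first_char(value, encoding="utf-8"):
    """Return the first character of value, keeping the input's type."""
    if isinstance(value, bytes):  # in Python 2, bytes is an alias for str
        # Decode to unicode, take one character, encode back to bytes.
        return value.decode(encoding)[:1].encode(encoding)
    # Already a unicode string: slicing works per character.
    return value[:1]


names = [u"Ľubomír Mezovský".encode("utf-8"), b"Lubomir Mezovsky"]
initials = [first_char(n) for n in names]
```

Slicing with [:1] rather than indexing with [0] means an empty string yields an empty result instead of raising IndexError. Note that this works per code point, so a letter stored as a base letter plus a separate combining mark would return only the base letter.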