Python: difference between unicode + variable and u + constant?

Question

Can someone please tell me how to fix this?

This works:

    nOrd = (ord(u'ط'))

But this fails:

s="‎ط"   
s=unicode(s, 'utf-8')
nOrd = (ord((s)))

The error I get is:

TypeError: ord() expected a character, but string of length 2 found

Answer 1

Your second s is simply not the same text as in the first example:

>>> u'ط'
u'\u0637'
>>> u'ط'.encode('utf8')
'\xd8\xb7'
>>> s="‎ط"
>>> s
'\xe2\x80\x8e\xd8\xb7'
>>> s.decode('utf8')
u'\u200e\u0637'

You have a U+200E LEFT-TO-RIGHT MARK character in the second example. It is invisible, but it sits before the letter inside the quotes, which makes the string two characters long, not one.
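To check for hidden characters like this yourself, you can list the code point and Unicode name of every character in the string. The following is a minimal sketch using the standard `unicodedata` module; the helper name `describe` is made up for illustration, and the code is written to run on Python 3 as well as Python 2:

```python
# -*- coding: utf-8 -*-
import unicodedata


def describe(text):
    """Return a (code point, Unicode name) pair for each character in text."""
    return [
        ("U+%04X" % ord(ch), unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# The decoded string from the failing example, written with escapes so the
# invisible mark is visible in the source.
s = u"\u200e\u0637"

for code_point, name in describe(s):
    print("%s %s" % (code_point, name))
# U+200E LEFT-TO-RIGHT MARK
# U+0637 ARABIC LETTER TAH
```

Any entry you did not expect to see is a candidate for removal before calling ord().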

You could remove it from the decoded unicode string with lstrip() or with replace(); the first removes it only from the start, the second removes it from everywhere in the string:

s = s.lstrip(u'\u200e')
# or
s = s.replace(u'\u200e', u'')
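If the text may carry other invisible marks as well (for example U+200F RIGHT-TO-LEFT MARK), a more general option is to drop every character in the Unicode "Cf" (format) category before calling ord(). This is only a sketch, not part of the original answer: the helper name `strip_format_chars` is hypothetical, and the approach assumes you never need format characters such as ZERO WIDTH JOINER, which do carry meaning in some scripts.

```python
# -*- coding: utf-8 -*-
import unicodedata


def strip_format_chars(text):
    """Drop every character whose Unicode general category is Cf (format)."""
    return u"".join(ch for ch in text if unicodedata.category(ch) != "Cf")


s = u"\u200e\u0637"               # LEFT-TO-RIGHT MARK followed by the letter
cleaned = strip_format_chars(s)   # only the letter remains
n_ord = ord(cleaned)              # ord() now receives a single character

print(len(s), len(cleaned), hex(n_ord))
```

With the mark removed, `cleaned` has length 1 and ord() returns 0x637, the same value as ord(u'ط') in the working example.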

Answer 1: score 4, accepted, 2016-11-19 14:34:12
