How can I compare a unicode object with a str containing Chinese text in Python?

Question

I'm using Python 2.7. For example:

a = u'你好'
b = '你好'

I tried the following code, but it failed:

print a.encode('UTF-8') == b  # prints False

How can I make them compare as equal?

Answer 1

I hope you are using Python 3. There, both variables are strings (str), so you don't need to convert either of them. Simply compare them:

>>> a = u'你好'
>>> b = '你好'
>>> type(a)
<class 'str'>
>>> type(b)
<class 'str'>
>>> a == b
True

If you are using Python 2, your attempt will work, provided b actually holds UTF-8 bytes (for example, because the source file is saved as UTF-8).

Answer 2

Very likely your Python source file isn't encoded in UTF-8. The variable b contains whatever bytes are between those quotes, and those bytes depend on the file's encoding. For example:

# coding: utf-8
print repr("你好")

prints: '\xe4\xbd\xa0\xe5\xa5\xbd'

Now if we save our source file as GB2312 and update the declaration:

# coding: GB2312
print repr("你好")

prints: '\xc4\xe3\xba\xc3'

In any case, if you have a byte array with text, you also need to know the encoding of those bytes, otherwise you can't reliably interpret them.
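One way to act on this is to decode the bytes with their known encoding and compare as text. A minimal sketch, assuming you know which encoding the bytes use; same_text is a hypothetical helper name, and the byte values are written out explicitly so the result does not depend on how this file is saved:

```python
# -*- coding: utf-8 -*-

def same_text(text, raw_bytes, encoding):
    """Hypothetical helper: decode raw_bytes with a known encoding,
    then compare the result with a unicode string."""
    return text == raw_bytes.decode(encoding)

a = u'\u4f60\u597d'                      # u'你好'
utf8_b = b'\xe4\xbd\xa0\xe5\xa5\xbd'     # '你好' encoded as UTF-8
gb_b = b'\xc4\xe3\xba\xc3'               # '你好' encoded as GB2312

print(same_text(a, utf8_b, 'utf-8'))     # True
print(same_text(a, gb_b, 'gb2312'))      # True
```

Decoding with the wrong encoding (for instance, treating the GB2312 bytes as UTF-8) raises UnicodeDecodeError here, which is why the encoding has to be known rather than guessed.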

If you need UTF-8 bytes regardless of the source file's encoding, you can write u'你好'.encode('utf-8'), which will always return '\xe4\xbd\xa0\xe5\xa5\xbd'.
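To see why the comparison in the question can come out False, here is a small sketch that stands in for the two cases. The names b_utf8 and b_gb2312 are illustrative only; they hold the bytes that b would contain if the source file were saved as UTF-8 or as GB2312 respectively:

```python
# -*- coding: utf-8 -*-

a = u'\u4f60\u597d'                      # u'你好'
encoded = a.encode('utf-8')              # UTF-8 bytes of the text

b_utf8 = b'\xe4\xbd\xa0\xe5\xa5\xbd'     # b if the file is saved as UTF-8
b_gb2312 = b'\xc4\xe3\xba\xc3'           # b if the file is saved as GB2312

matches_utf8 = encoded == b_utf8         # same bytes, so equal
matches_gb2312 = encoded == b_gb2312     # different bytes, so not equal

print(matches_utf8)                      # True
print(matches_gb2312)                    # False
```

The text is the same in both cases; only the byte representation differs, so comparing bytes only works when both sides use the same encoding.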
