Will a UNICODE string just containing ASCII characters always be equal to the ASCII string?

Question

I noticed the following holds:

>>> u'abc' == 'abc'
True
>>> 'abc' == u'abc'
True

Will this always be true, or could it depend on the system locale? (It seems strings are Unicode in Python 3, but bytes in 2.x.)

Answer 1

Python 2 coerces between unicode and str using the ASCII codec when comparing the two types. So yes, this is always true.
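The coercion rule can be sketched in Python 3 terms, where the two types are spelled str and bytes. This is only an emulation of the behaviour described above, not Python 2's actual implementation, and the helper name py2_style_equal is made up for illustration:

```python
def py2_style_equal(text, raw, default_encoding="ascii"):
    """Emulate how Python 2 compared a unicode object with a byte str.

    Hypothetical helper: the byte string is decoded with the default
    encoding (ASCII unless someone changed it) and then compared.
    """
    try:
        decoded = raw.decode(default_encoding)
    except UnicodeDecodeError:
        # Python 2 issued a UnicodeWarning in this case and treated
        # the two values as unequal.
        return False
    return text == decoded


print(py2_style_equal(u"abc", b"abc"))        # True: pure ASCII decodes cleanly
print(py2_style_equal(u"\xe9", b"\xc3\xa9"))  # False: non-ASCII bytes fail to decode
```

Nothing in this path consults the system locale, so for ASCII-only content the comparison does not depend on it.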

That is to say, unless you mess up your Python installation and use sys.setdefaultencoding() to change that default. You cannot do that normally, because the sys.setdefaultencoding() function is deleted from the module at start-up time. However, there is a cargo cult going around where people use reload(sys) to reinstate that function and change the default encoding to something else, to try and fix implicit encoding and decoding problems. This is a dumb thing to do for precisely this reason.
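To see why a changed default matters, here is a rough Python 3 sketch that takes the default encoding as a parameter. The function equal_under_default is a hypothetical stand-in for the implicit coercion, and the codecs passed in are just examples, not a claim about which values sys.setdefaultencoding() accepted:

```python
def equal_under_default(text, raw, default_encoding):
    """Hypothetical stand-in for Python 2's implicit str -> unicode coercion."""
    try:
        return text == raw.decode(default_encoding)
    except UnicodeDecodeError:
        # Undecodable bytes are treated as "not equal".
        return False


# With the stock ASCII default, ASCII-only strings compare equal.
print(equal_under_default(u"abc", b"abc", "ascii"))        # True

# An ASCII-compatible replacement keeps that result, but silently changes
# the outcome for non-ASCII data.
print(equal_under_default(u"\xe9", b"\xc3\xa9", "ascii"))  # False
print(equal_under_default(u"\xe9", b"\xc3\xa9", "utf-8"))  # True

# A default that is not ASCII-compatible would break even the ASCII-only case.
print(equal_under_default(u"abc", b"abc", "utf-16-le"))    # False
```

The same pair of values compares differently depending on a process-wide setting, which is the kind of surprise the reload(sys) trick invites.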

1 answer

solution1
14 ACCEPTED 2015-02-20 11:17:37