Handling encode() when converting from python2 to python3

Question

I'm working on converting a large project from python2 to python3 (not requiring python2 backwards compatibility).

When testing the conversion, I found I was having an issue where certain strings were being converted to bytes objects, which was causing trouble. I traced it back to the following method, which gets called in a number of places:

def custom_format(val):
    return val.encode('utf8').strip().upper()

In python2:

custom_format(u'\xa0')
# '\xc2\xa0'
custom_format('bar')
# 'BAR'

In python3:

custom_format('\xa0')
# b'\xc2\xa0'
custom_format('bar')
# b'BAR'

The reason this is an issue is that at some points the output of custom_format is meant to be inserted into a SQL template string using format(), but 'foo = {}'.format(custom_format('bar')) == "foo = b'BAR'", which could break the SQL syntax.

Simply removing the encode('utf8') part would ensure that custom_format('bar') properly returns 'BAR', but now custom_format('\xa0') returns '\xa0' rather than the '\xc2\xa0' of the python2 version (though I don't know enough about unicode to know whether that's a bad thing or not).

Without messing with the SQL or format() parts of the code, how can I make sure the expected behavior from the python2 version is exhibited in the python3 version? Is it as simple as dropping encode('utf8') or will that cause unintended conflicts?

Answer 1

If your intent is to ensure that all incoming strings get converted into bytes, then you have to keep encode: Python 3 uses str (Unicode text) as its native string type, whereas Python 2 used bytes, and encode converts str into bytes.
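To make the distinction concrete, here is a minimal Python 3 sketch of what encode does and how format() renders each type. The variable names are illustrative only and do not come from the question.

```python
# Python 3: str is Unicode text, bytes is raw binary data.
text = 'bar'
raw = text.encode('utf8')        # str -> bytes

print(type(text).__name__)       # str
print(type(raw).__name__)        # bytes

# format() falls back to the repr of a bytes object, b'' wrapper included.
as_text = 'foo = {}'.format(text.upper())
as_bytes = 'foo = {}'.format(raw.upper())
print(as_text)                   # foo = BAR
print(as_bytes)                  # foo = b'BAR'

# decode() goes the other way: bytes -> str.
round_trip = raw.decode('utf8')
print(round_trip)                # bar
```

The b'' wrapper in the second result is what ends up inside the SQL template when custom_format returns bytes.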

If your intent is to ensure that the queries look right, then you can just remove encode and let Python 3 handle things for you.
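A minimal sketch of this second option, assuming Python 3 only. The bytes handling is an extra assumption for call sites that might still pass bytes; it is not something the question requires. One thing to be aware of: str.strip() in Python 3 treats the no-break space U+00A0 as whitespace, so a lone '\xa0' is stripped to an empty string rather than preserved.

```python
def custom_format(val):
    """Return val as stripped, upper-cased text (str), never bytes."""
    # Assumption: tolerate bytes input by decoding it as UTF-8.
    if isinstance(val, bytes):
        val = val.decode('utf8')
    return val.strip().upper()


query = 'foo = {}'.format(custom_format('bar'))
print(query)                          # foo = BAR
print(repr(custom_format('\xa0')))    # ''
print(custom_format(b' baz '))        # BAZ
```

If keeping the no-break space matters, as it did in the Python 2 byte-string version, passing an explicit character set such as val.strip(' \t\n\r\x0b\x0c') limits stripping to ASCII whitespace.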

solution1
0 2019-01-11 21:29:34