
Encoding utf-8 to base64 with accents

I have some data like this:

data1 = ['Agos', '30490349304']
data2 = ['Desir\xc3\xa9','9839483948']

I'm using an API that expects the data encoded in base64, so what I do is:

import base64

data = data1
string = base64.b64encode("Hi, %s! Your code is %s" % (data[0], data[0]))
myXMLRPCCall(string)

This works fine with data1. With data2 the encoding itself succeeds, but the XML-RPC call then returns an error because, according to the API docs, it accepts only ISO-8859-1 (Latin-1) characters.
My question is: how can I convert my string to Latin-1 so that the API accepts it?

First make sure you're not confused about encodings, etc.; it is worth reading an introduction to the topic before going further.
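
To make the difference between the two encodings concrete, here is a minimal sketch using the name from data2. It assumes the bytes in the question are UTF-8 (which the `\xc3\xa9` pair suggests), and it uses `b'...'` literals so that it runs on both Python 2 and Python 3. The variable names are only illustrative.

```python
# The name from data2, as the UTF-8 byte string shown in the question.
raw = b'Desir\xc3\xa9'

# Decode the bytes to Unicode text, then re-encode that text as Latin-1.
text = raw.decode('utf-8')
as_latin1 = text.encode('latin-1')

# UTF-8 needs two bytes for the accented letter, Latin-1 only one.
utf8_length = len(raw)
latin1_length = len(as_latin1)

# What a receiver sees if it reads the UTF-8 bytes as Latin-1:
# two separate characters (U+00C3, U+00A9) instead of one accented letter.
misread = raw.decode('latin-1')
```

If the API really does interpret the payload as Latin-1, the `misread` value is the kind of text it would end up with when handed the UTF-8 bytes unchanged.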

Then notice that the main problem isn't with the base64 encoding, but with the fact that you're trying to put a byte string (a normal string in Python 2.x) inside a Unicode string. I believe you can fix this by removing the "u" from the last string in your example code.

base64.b64encode("Hi, %s! Your code is %s" % (data[0].decode('utf8').encode('latin1'), data[0]))
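
The one-liner above can be spelled out as a small helper. This is only a sketch under a few assumptions: `build_payload` is a hypothetical name, the inputs are byte strings (UTF-8 for the name, ASCII for the code), and the second placeholder is assumed to be meant for the code in `data[1]` rather than the name. Working on byte strings keeps it runnable on both Python 2 and Python 3.

```python
import base64


def build_payload(name_utf8, code):
    """Hypothetical helper: return the base64 form of a Latin-1 message.

    name_utf8 is a UTF-8 byte string; code is an ASCII byte string.
    """
    # Decode both inputs to Unicode text first ...
    name = name_utf8.decode('utf-8')
    digits = code.decode('ascii')
    message = u"Hi, %s! Your code is %s" % (name, digits)
    # ... then encode the whole message once, as Latin-1, before base64.
    return base64.b64encode(message.encode('latin-1'))


data2 = [b'Desir\xc3\xa9', b'9839483948']
payload = build_payload(data2[0], data2[1])
```

Decoding and encoding at the edges like this keeps the formatting step in Unicode throughout, so byte strings and Unicode strings are never mixed. Note that `encode('latin-1')` raises `UnicodeEncodeError` for characters that Latin-1 cannot represent.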

This seems to work:

...

data = data2
base64.b64encode("Hi, %s! Your code is %s" % (data[0], data[0]))
# => 'SGksIERlc2lyw6khIFlvdXIgY29kZSBpcyBEZXNpcsOp'

# I can't test the XMLRPC parts, so this is just a hint ..
for_the_wire = base64.b64encode("Hi, %s! Your code is %s" % (data[0], data[0]))
latin_1_encoded = for_the_wire.encode('latin-1')

# send latin_1_encoded over the wire ..
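
As a quick sanity check on the hint above, the base64 string can be decoded again to see what the receiving side would get. This is a sketch, not a test against the real API: it only inspects the string shown above, and the variable names are illustrative.

```python
import base64

# The base64 string produced above for data2.
wire = 'SGksIERlc2lyw6khIFlvdXIgY29kZSBpcyBEZXNpcsOp'

# Base64 text is pure ASCII, so encoding it as Latin-1 changes no bytes.
same_bytes = wire.encode('latin-1') == wire.encode('ascii')

# Decoding shows what the receiving side would get back.
decoded = base64.b64decode(wire)

# The two-byte UTF-8 sequence for the accented letter is still present.
still_utf8 = b'\xc3\xa9' in decoded
```

Since the decoded payload still contains UTF-8 bytes, it seems likely that the text has to be converted to Latin-1 before the base64 step, not after it, if the API is to accept it.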

Some Python (2.x) Unicode readings:
