How to convert a Java byte[] to a Python string?

I know that Java and Python handle bytes differently, so I am a little confused about how to convert a Java byte[] to a Python string. I have this byte[] in Java:

{ 118, -86, -46, -63, 100, -69, -30, -102, -82, -44, -40, 92, 0, 98, 36, -94 }

I want to convert it to a Python string. Here is how I did it:

b=[118, -86, -46, -63, 100, -69, -30, -102, -82, -44, -40, 92, 0, 98, 36, -94]
str=""
for i in b:
    str=str+chr(abs(i))

But I am not really sure if this is the correct way to do it.

Repeated string concatenation in a loop is inefficient.

I'd recommend doing that with a list comprehension passed to str.join, using an empty separator:

s = "".join([chr(abs(x)) for x in b])

Edit: the abs call is odd. It does what was requested, but the result is not useful because a Java byte is signed: abs maps -86 to 86, not to the unsigned value 170. You need a two's complement conversion, as in Martijn's answer, which fixes the asker's underlying problem: data validity :)
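To make the difference concrete, here is a small sketch for Python 3 comparing abs with the two's complement conversion for a single value. The variable names are only illustrative and are not from the question.

```python
# Java's signed byte -86 has the bit pattern 0xAA, which is 170 when read as unsigned.
signed = -86

via_abs = abs(signed)      # drops the sign: 86
via_mod = signed % 256     # two's complement to unsigned: 170
via_mask = signed & 0xFF   # same result using a bit mask: 170

print(via_abs, via_mod, via_mask)  # 86 170 170

# abs() also loses information: two different bytes collapse to the same value.
print(abs(-86) == abs(86))  # True
```

So a string built with abs cannot be mapped back to the original byte array.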

The approach would be fine if you had a list of non-negative ASCII values. Dropping abs also lets us use map (it is so rare to get the chance that we should not pass it up):

items = [65, 66, 67, 68]
print("".join(map(chr,items)))

result:

ABCD

The Java byte type is a signed integer; its values range from -128 to 127. Python's chr does not accept negative values: in Python 2 it expects a value between 0 and 255, and in Python 3 a code point between 0 and 0x10FFFF. From the Primitive Data Types section of the Java tutorial:

byte : The byte data type is an 8-bit signed two's complement integer. It has a minimum value of -128 and a maximum value of 127 (inclusive).

You need to convert from two's complement to an unsigned integer:

def twoscomplement_to_unsigned(i):
    return i % 256

result = ''.join([chr(twoscomplement_to_unsigned(i)) for i in b])

However, if this is Python 3, you really want to use the bytes type:

result = bytes(map(twoscomplement_to_unsigned, b))
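As a cross-check, the sketch below (Python 3) shows that `i % 256`, the bit mask `i & 0xFF`, and `struct.pack` with the signed-char format `b` all produce the same bytes for the list from the question. The names `via_mod`, `via_mask` and `via_struct` are just illustrative.

```python
import struct

b = [118, -86, -46, -63, 100, -69, -30, -102, -82, -44, -40, 92, 0, 98, 36, -94]

via_mod = bytes(i % 256 for i in b)            # modulo, as in the function above
via_mask = bytes(i & 0xFF for i in b)          # equivalent bit mask
via_struct = struct.pack("%db" % len(b), *b)   # "b" packs signed 8-bit values directly

print(via_mod == via_mask == via_struct)  # True
print(via_mod.hex())  # 76aad2c164bbe29aaed4d85c006224a2
```

The struct variant has the side benefit of raising struct.error if a value falls outside -128 to 127, which can catch bad input early.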

Assuming you're using Python 3, bytes can already be initialized from a list. You'll need to convert the signed integers to unsigned bytes first.

items = [118, -86, -46, -63, 100, -69, -30, -102, -82, -44, -40, 92, 0, 98, 36, -94]
data = bytes(b % 256 for b in items)
print(data)  # b'v\xaa\xd2\xc1d\xbb\xe2\x9a\xae\xd4\xd8\\\x00b$\xa2'
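If you later need to hand the values back to Java, the conversion can be reversed. This is a minimal sketch, assuming Python 3; `signed_again` and `also_signed` are hypothetical names used only here.

```python
items = [118, -86, -46, -63, 100, -69, -30, -102, -82, -44, -40, 92, 0, 98, 36, -94]
data = bytes(b % 256 for b in items)

# Iterating over bytes yields ints 0..255; values above 127 are negative in Java.
signed_again = [x - 256 if x > 127 else x for x in data]
print(signed_again == items)  # True

# The same conversion using int.from_bytes on one-byte slices.
also_signed = [int.from_bytes(data[i:i + 1], "big", signed=True) for i in range(len(data))]
print(also_signed == items)  # True
```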

If the bytes represent text, decode them afterwards. In your example they are not valid UTF-8, so this would fail with a UnicodeDecodeError:

data = data.decode('utf8')
print(data)
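If you still need some str form of binary data like this, two common options are a hex string or a Latin-1 decode, which maps each byte to the code point with the same number. The sketch below assumes Python 3 and that the bytes are arbitrary binary data (for example a hash or a key), which the question does not actually state.

```python
items = [118, -86, -46, -63, 100, -69, -30, -102, -82, -44, -40, 92, 0, 98, 36, -94]
data = bytes(b % 256 for b in items)

# Decoding as UTF-8 fails because 0xAA cannot start a UTF-8 sequence.
try:
    text = data.decode("utf-8")
except UnicodeDecodeError as exc:
    text = None
    print("not valid UTF-8:", exc.reason)

# Option 1: a printable hex representation.
hex_text = data.hex()
print(hex_text)  # 76aad2c164bbe29aaed4d85c006224a2

# Option 2: Latin-1 maps every byte 0..255 to one character, so it is lossless.
latin1_text = data.decode("latin-1")
print(len(latin1_text))  # 16
print(latin1_text.encode("latin-1") == data)  # True
```

Which one fits depends on what the receiving side expects; the hex form is usually safer to log or transmit.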
