Java's getBytes() equivalent in Python

I am a newbie to Python. I have a Java method that accepts a string, converts it to a byte array, and returns the byte array. The method looks like this:

private static byte[] convert(String str) {
    // getBytes() with no argument uses the platform's default charset
    byte[] byteArray = str.getBytes();
    return byteArray;
}

convert("sr_shah") results in a byte array like this: 115 114 95 115 104 97 104. Using Charset.defaultCharset(), I found that my machine's default character set is windows-1252.

Now I need to create an exact equivalent of the above method in Python. The problem I am facing is converting a string to a byte array: I cannot find an equivalent of Java's getBytes() in Python. I searched the internet and drew on previous Stack Overflow posts about converting a string to a byte array, but unfortunately none of them worked for me.

The methods I tried are bytearray(), bytes(), and str.encode(), with encodings such as windows-1252, utf_16, utf_8, utf_16_le, utf_16_be, and iso-8859-1. Unfortunately, none of them gave the result I expected (that is, the byte array I got from Java's getBytes()). I don't understand what I am doing wrong. This is what I tried in Python:

>>> bytearray('sr_shah','windows-1252')
bytearray(b'sr_shah')
>>> bytearray('sr_shah','utf_8')
bytearray(b'sr_shah')
>>> bytearray('sr_ahah','utf_16')
bytearray(b'sr_ahah')
>>> bytearray('sr_shah','utf_16_le')
bytearray(b'sr_shah')
>>> name = 'sr_shah'
>>> name.encode('windows-1252')
'sr_shah'
>>> name.encode('utf_8')
'sr_shah'
>>> name.encode('latin_1')
'sr_shah'
>>> name.encode('iso-8859-1')
'sr_shah'
>>> name.encode('utf-8')
'sr_shah'
>>> name.encode('utf-16')
'\xff\xfes\x00r\x00_\x00s\x00h\x00a\x00h\x00'
>>> name.encode('utf-16-le')
's\x00r\x00_\x00s\x00h\x00a\x00h\x00'
>>> 

Please help me get the right conversion.

You can do this:

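text = 'sr_shah'  # avoid naming a variable "str", which shadows the built-in
# ord() maps each character to its integer code
b = [ord(ch) for ch in text]
print(b)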

**Output**

[115, 114, 95, 115, 104, 97, 104]

The ord() built-in function is the closest thing I know of to the getBytes() method you want, although it works on single characters, so you need to build the array yourself.
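One caveat is worth illustrating: ord() returns a character's code point, not its encoded bytes, so the two agree only when every character's code point equals its byte value in the target encoding (as with the ASCII text in the question). The sketch below assumes Python 3 and uses the euro sign purely as an illustrative test character; windows-1252 is the charset reported in the question.

```python
ascii_text = "sr_shah"
euro = "\u20ac"  # euro sign, chosen only as an illustration

# For ASCII text, code points and windows-1252 bytes coincide.
ascii_points = [ord(ch) for ch in ascii_text]
ascii_bytes = list(ascii_text.encode("windows-1252"))
print(ascii_points == ascii_bytes)  # True

# For the euro sign they differ: code point U+20AC versus byte 0x80.
code_points = [ord(ch) for ch in euro]
encoded = list(euro.encode("windows-1252"))
print(code_points)  # [8364]
print(encoded)      # [128]
```

For text that may contain non-ASCII characters, encoding with an explicit charset is therefore the safer match for getBytes().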

The bytearray you have created in Python already contains the bytes you want; the interpreter simply displays them as characters. To see their decimal representation, print the bytes one by one:

>>> for x in bytearray('sr_shah','windows-1252'): print(x)
...
115
114
95
115
104
97
104
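Putting this together, a Python counterpart of the Java convert() method could look like the following sketch. It assumes Python 3, where str is Unicode text, and it defaults to windows-1252 only because that was the Java default charset reported in the question. The names convert, encoding, and to_java_signed are illustrative, not part of any library. Because Java's byte type is signed, values above 127 appear as negative numbers in Java; the hypothetical helper to_java_signed reproduces that view.

```python
def convert(text, encoding="windows-1252"):
    """Encode text to bytes, roughly like Java's String.getBytes(charsetName)."""
    return text.encode(encoding)


def to_java_signed(data):
    """Map each byte (0..255) to Java's signed byte range (-128..127)."""
    return [b - 256 if b > 127 else b for b in data]


data = convert("sr_shah")
print(list(data))                          # [115, 114, 95, 115, 104, 97, 104]
print(to_java_signed(convert("\u00e9")))   # [-23]  (e-acute is 0xE9 in windows-1252)
```

Passing the encoding explicitly, rather than relying on a platform default, keeps the result the same across machines.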
