
Java bridge code error while converting Chinese characters: 'utf-8' codec can't decode byte 0xc0 in position 0: invalid start byte

We receive data in several different encodings, and we currently decode it using the Java encodings listed at https://docs.oracle.com/javase/1.5.0/docs/guide/intl/encoding.doc.html

We are moving to Python, so we are porting this encoding logic to Python. Because Python does not provide a codec for Chinese characters equivalent to the Java encoding Cp935, we are using javabridge, as shown below:

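import javabridge
import numpy

class String:
    # Wrap the java.lang.String(byte[] bytes, String charsetName) constructor
    new_fn = javabridge.make_new("java/lang/String", "([BLjava/lang/String;)V")

    def __init__(self, i, s):
        self.new_fn(i, s)

    toString = javabridge.make_method("toString", "()Ljava/lang/String;", "Retrieve the string value")

# fielddata (the raw field bytes) and encoding (the Java charset name) are supplied by the calling code
array = numpy.array(list(fielddata), numpy.uint16)
strobject = String(array, encoding)
convertedstr = strobject.toString()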

However, we get the following error:


'utf-8' codec can't decode byte 0xc0 in position 0: invalid start byte
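For context, this message is the one Python's own UTF-8 decoder produces: 0xC0 can never start a valid UTF-8 sequence, so decoding raw non-UTF-8 bytes (such as EBCDIC data) as UTF-8 fails in exactly this way. The sketch below only reproduces the message in pure Python. Where javabridge performs such a decode is not shown in the post, so this illustrates the error itself, not its location.

```python
# Reproduce the error message with a single raw byte.
# 0xC0 is never a valid UTF-8 start byte, so decoding fails at position 0.
raw = bytes([0xC0])

try:
    raw.decode("utf-8")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
```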


We are looking for help, or for an alternative way of doing this in Python.
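The premise that Python has no built-in Cp935 equivalent can be checked with the standard `codecs` module. The helper name `codec_available` below is hypothetical and introduced only for illustration. On a stock CPython installation `cp935` is expected to be reported as unavailable, although a third-party package could register it.

```python
import codecs


def codec_available(name: str) -> bool:
    """Return True if Python can find a codec registered under `name`."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


# Compare the Java charset from the question with some codecs Python ships.
for name in ("cp935", "cp500", "gbk"):
    print(name, codec_available(name))
```

If the lookup fails, the data cannot be decoded with `bytes.decode` alone, which is why the question falls back to Java's `String(byte[], String)` constructor.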

import javabridge

class JavaEncoder:
    # Wrap the java.lang.String(byte[] bytes, String charsetName) constructor
    new_fn = javabridge.make_new("java/lang/String", "([BLjava/lang/String;)V")

    def __init__(self, i, s):
        # Replace 0x00 bytes with 64 (0x40, the EBCDIC space) before decoding
        i[i == 0] = 64
        self.new_fn(i, s)

    # Expose Java's String.toString() to retrieve the decoded text
    toString = javabridge.make_method("toString", "()Ljava/lang/String;", "Retrieve the string value")

When converting data with javabridge, if a field has size 1 and its data is 00, numpy.uint8 stores it as the integer 0, and the conversion then fails with the encoding error. To avoid this, the code above replaces every 0 with 64, which as a uint8 is the space character in EBCDIC (hex 40; the ASCII space is hex 20).
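The substitution performed by `i[i == 0] = 64` can be sketched without numpy or javabridge. This is a minimal illustration under stated assumptions: the helper name `replace_nulls_with_ebcdic_space` is hypothetical, and `cp500` is used only as a stand-in single-byte EBCDIC codec from the standard library, since Cp935 itself is not available there.

```python
def replace_nulls_with_ebcdic_space(data: bytes) -> bytes:
    """Return a copy of `data` with every 0x00 byte replaced by 0x40."""
    return data.replace(b"\x00", b"\x40")


field = bytes([0x00])  # a one-byte field holding 0x00
cleaned = replace_nulls_with_ebcdic_space(field)

# Assumption: cp500 stands in for an EBCDIC code page; it is not Cp935.
text = cleaned.decode("cp500")
print(repr(text))
```

Unlike the in-place numpy assignment, this version returns a new bytes object and leaves the input unchanged, which may be preferable if the raw field is needed elsewhere.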


