
Golang XML Unmarshal byte array decoding as illegal character

I have been tasked with decoding an alarm event from a Hikvision ANPR camera. One of the XML fields is <plateCharBelieve>x,x,x,x,x,x,x</plateCharBelieve>, where each x is a byte in the range 0-100 representing the ANPR's confidence in the corresponding character of the licence plate. When the XML is unmarshalled, the decoder seems to treat each byte as an ASCII character, and every now and then one of those characters is illegal and causes an error: XML syntax error on line 19: illegal character code U+000C.

Is there a way to prevent the default behaviour, or perhaps implement a custom decoder so that I can decode each value as an int from 0 to 100? Failing that, how can I just drop or ignore that element, since I don't really need it? I just want to stop it throwing an error.

Appreciate any feedback.

Hex dump of the XML

Go's encoding/xml package does not support decoding non-standard XML.

The relevant code is in src/encoding/xml, at line 1142 and line 1154:

        if !isInCharacterRange(r) {
            d.err = d.syntaxError(fmt.Sprintf("illegal character code %U", r))
            return nil
        }
// Decide whether the given rune is in the XML Character Range, per
// the Char production of https://www.xml.com/axml/testaxml.htm,
// Section 2.2 Characters.
func isInCharacterRange(r rune) (inrange bool) {
    return r == 0x09 ||
        r == 0x0A ||
        r == 0x0D ||
        r >= 0x20 && r <= 0xD7FF ||
        r >= 0xE000 && r <= 0xFFFD ||
        r >= 0x10000 && r <= 0x10FFFF
}

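Section 2.2 (Characters) of the XML standard does not allow certain bytes. As the function above shows, every code point below 0x20 other than tab (0x09), line feed (0x0A), and carriage return (0x0D) is rejected, which includes U+000C.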
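To see the check in action, here is a minimal sketch that wraps a single byte in the element and passes it to an unmodified xml.Unmarshal. The helper name tryDecode and the trimmed-down Root struct are made up for this illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Root is a cut-down stand-in for the real event struct.
type Root struct {
	PlateCharBelieve string `xml:"plateCharBelieve"`
}

// tryDecode (hypothetical helper) puts one raw byte inside the element
// and returns whatever error encoding/xml reports for it.
func tryDecode(b byte) error {
	doc := append([]byte("<root><plateCharBelieve>"), b)
	doc = append(doc, "</plateCharBelieve></root>"...)
	var root Root
	return xml.Unmarshal(doc, &root)
}

func main() {
	// 0x09 (tab) and 0x41 ('A') are inside the allowed range; 0x0C is not.
	for _, b := range []byte{0x09, 0x0C, 0x41} {
		fmt.Printf("0x%02X: %v\n", b, tryDecode(b))
	}
}
```

With the standard library as shipped, the tab and 'A' should decode without error, while 0x0C should come back as an "illegal character code" syntax error like the one in the question.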

Here are some solutions:

To let the XML decoder accept any byte, remove the isInCharacterRange check (this means patching your own copy of the standard library):

src/encoding/xml line1142

        //if !isInCharacterRange(r) {
        //  d.err = d.syntaxError(fmt.Sprintf("illegal character code %U", r))
        //  return nil
        //}

Then define PlateCharBelieve as a string or another type:

Code:

package main

import (
    "encoding/xml"
    "fmt"
    "strings"
)

type Root struct {
    OtherInfo        string `xml:"otherInfo"`
    PlateCharBelieve string `xml:"plateCharBelieve"`
}

func main() {
    sb := strings.Builder{}
    sb.WriteString(`<root><otherInfo>otherInfo</otherInfo><plateCharBelieve>`)
    sb.Write([]byte{11, 22, 33, 44, 97, 98, 99})
    sb.WriteString(`</plateCharBelieve></root>`)
    fmt.Printf("%v\n", sb.String())
    var root Root

    err := xml.Unmarshal([]byte(sb.String()), &root)
    if err != nil {
        fmt.Printf("%v\n", err)
    }
    fmt.Printf("%v\n", root)

}

Output (the first two bytes, 11 and 22, are unprintable control characters): <root><otherInfo>otherInfo</otherInfo><plateCharBelieve>!,abc</plateCharBelieve></root>

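If you want to decode the bytes yourself, implement the xml.Unmarshaler interface, like this:

// Bytes receives the content of <plateCharBelieve>.
type Bytes struct {
}

// UnmarshalXML needs a pointer receiver so that it can store what it decodes.
func (b *Bytes) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
    fmt.Printf("%v\n", "UnmarshalXML")
    // ... (decoding logic elided; it must consume the whole element)
    return nil
}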

type Root struct {
    OtherInfo        string `xml:"otherInfo"`
    PlateCharBelieve Bytes  `xml:"plateCharBelieve"`
}
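As a rough sketch of what such an Unmarshaler could look like, the version below reads the element's text and keeps every other byte as a confidence value. It assumes the layout is value, comma, value, comma, and so on; the Values field and the decode helper are invented for this example. Note that a custom UnmarshalXML still reads through the same Decoder, so with an unpatched standard library the illegal-character check runs before this code ever sees the data.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Bytes holds the decoded confidence values (Values is a hypothetical field).
type Bytes struct {
	Values []int
}

// UnmarshalXML uses a pointer receiver so the result can be stored.
func (b *Bytes) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	var s string
	// DecodeElement consumes the whole element, as the interface requires.
	if err := d.DecodeElement(&s, &start); err != nil {
		return err
	}
	// ASSUMPTION: bytes at even offsets are values, odd offsets are commas.
	for i := 0; i < len(s); i += 2 {
		b.Values = append(b.Values, int(s[i]))
	}
	return nil
}

type Root struct {
	OtherInfo        string `xml:"otherInfo"`
	PlateCharBelieve Bytes  `xml:"plateCharBelieve"`
}

// decode (hypothetical helper) unmarshals one document into a Root.
func decode(doc string) (Root, error) {
	var root Root
	err := xml.Unmarshal([]byte(doc), &root)
	return root, err
}

func main() {
	// '!' is 33, 'a' is 97, 'd' is 100: all legal XML characters.
	root, err := decode(`<root><otherInfo>otherInfo</otherInfo><plateCharBelieve>!,a,d</plateCharBelieve></root>`)
	if err != nil {
		fmt.Println("unmarshal error:", err)
		return
	}
	fmt.Println(root.PlateCharBelieve.Values) // prints [33 97 100]
}
```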

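If you don't want to change the Go source code, you can preprocess the bytes before unmarshalling:

1. You can enclose the content in <![CDATA[]]> (note that the XML Char production also applies inside CDATA sections, so a strict parser may still reject control characters there).

2. You can use a regexp replace to delete the element.

3. You can use a regexp match to extract the element and decode it yourself.

Here is a demo: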

package main

import (
    "encoding/xml"
    "fmt"
    "regexp"
    "strings"
)

type Root struct {
    OtherInfo        string `xml:"otherInfo"`
    PlateCharBelieve string `xml:"plateCharBelieve"`
}

func main() {
    sb := strings.Builder{}
    sb.WriteString(`<root><otherInfo>otherInfo</otherInfo><plateCharBelieve>`)
    sb.Write([]byte{11, 22, 33, 44, 97, 98, 99})
    sb.WriteString(`</plateCharBelieve></root>`)
    fmt.Printf("%v\n", sb.String())
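    // (?s) lets "." match a newline too, since a confidence value of 10 is the byte '\n'.
    r := regexp.MustCompile(`(?s)<plateCharBelieve>.*?</plateCharBelieve>`)
    // To decode the values yourself instead of dropping them, use r.FindString.
    xmlstr := r.ReplaceAllString(sb.String(), "")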
    var root Root
    err := xml.Unmarshal([]byte(xmlstr), &root)
    if err != nil {
        fmt.Printf("%v\n", err)
    }
    fmt.Printf("%v\n", root)

}
