
Is there an easy way to substitute XML-like tags with specific hex values in Python 3?

I have a set of data that is structured like XML, but the data is hex values rather than ASCII.

For instance, the data could be

EX. A
<body>
    <entry1> 0x12 </entry1>
    <entry2> 0x01 </entry2>
</body>

and that could translate to

EX. B
<0x01>
    <0x02> 0x12 <0xff>
    <0x03> 0x01 <0xff>
<0xff>

In the example above (EX. B), <0x02> 0x12 <0xff> indicates that entry1 has a value of 0x12; <0x02> is the code for the entry1 tag and <0xff> closes it.

I am not a native Python programmer, so I may be going about this the long way (I would love it if there were an easier one). What I am trying to do is go from the human-readable structure (EX. A) to the hex version (EX. B).

My idea is to write the XML-like structure to a file using lxml, populating the necessary fields, then read the file back with Python and substitute the tags using string manipulation based on a code book/dictionary.

In the end, I am looking for a byte array that would look like

0x01 0x02 0x12 0xff 0x03 0x01 0xff 0xff

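My question is: is there an easier way?

A custom HTMLParser subclass (from the standard library's html.parser module) might suit your needs. It assigns each new tag the next code in order of first appearance and emits 0xff for every closing tag: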

from html.parser import HTMLParser

class MyHTMLParser(HTMLParser):
    """Replace each tag with a hex code; every closing tag becomes 0xff."""

    def __init__(self):
        super().__init__()
        self._tags = {}      # tag name -> hex code, assigned on first appearance
        self._counter = 1    # next unused tag code
        self._result = []    # output tokens in document order

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag not in self._tags:
            self._tags[tag] = '0x{:02x}'.format(self._counter)
            self._counter += 1
        self._result.append(self._tags[tag])

    def handle_endtag(self, tag):
        self._result.append('0xff')

    def handle_data(self, data):
        # Whitespace-only data strips to '' and is filtered out in `result`.
        self._result.append(data.strip())

    @property
    def result(self):
        return [v for v in self._result if v]

parser = MyHTMLParser()
parser.feed('''<body>
    <entry1> 0x12 </entry1>
    <entry2> 0x01 </entry2>
</body>''')

print(' '.join(parser.result))

Prints:

0x01 0x02 0x12 0xff 0x03 0x01 0xff 0xff
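
The question mentions a fixed code book and asks for a byte array rather than a string. As a rough sketch of that variant, the snippet below walks the tree with the standard library's xml.etree.ElementTree and looks each tag up in a dictionary. The CODEBOOK mapping, the END_TAG constant, and the encode function are hypothetical names; the byte values are assumptions chosen only to reproduce EX. B. It also assumes the input is well-formed XML and ignores any text that follows a closing tag.

```python
import xml.etree.ElementTree as ET

# Hypothetical code book: tag name -> opening byte. These values are
# assumptions chosen to match EX. B; substitute your real table.
CODEBOOK = {"body": 0x01, "entry1": 0x02, "entry2": 0x03}
END_TAG = 0xFF  # closing marker used in EX. B


def encode(element, codebook=CODEBOOK):
    """Encode an element as: tag byte, value bytes, children, end byte."""
    out = bytearray([codebook[element.tag]])
    # An element's text may hold zero or more whitespace-separated hex values.
    for token in (element.text or "").split():
        out.append(int(token, 16))  # base 16 accepts an optional "0x" prefix
    for child in element:
        out += encode(child, codebook)
    out.append(END_TAG)
    return bytes(out)


xml_text = """<body>
    <entry1> 0x12 </entry1>
    <entry2> 0x01 </entry2>
</body>"""

encoded = encode(ET.fromstring(xml_text))
print(" ".join(f"0x{b:02x}" for b in encoded))
```

An unknown tag raises KeyError here, which may be preferable to silently inventing a code. If you keep the HTMLParser version instead, its string tokens can be turned into bytes with bytes(int(t, 16) for t in parser.result).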
