
Parsing an .xml file using Python: search and copy related data

I want to copy some data from an .xml file based on a search value. In the XML file below, I want to search for 0xCCB7B836 and copy the data inside that block:

   4e564d2d52656648
   6173685374617274 
   1782af065966579e 
   899885d440d3ad67 
   d04b41b15e2b13c2

One more example: search for the value 0xECFBBA1A and return 0000

or

search for the value 0xA54E2B5A and return 30d4

 <MEM_DATA>
  <MEM_SECTOR>
     <MEM_SECTOR_NUMBER>0</MEM_SECTOR_NUMBER>
     <MEM_SECTOR_STATUS>ACTIVE</MEM_SECTOR_STATUS>
     <MEM_SECTOR_STARTADR>0x800000</MEM_SECTOR_STARTADR>
     <MEM_SECTOR_ENDADR>0x0</MEM_SECTOR_ENDADR>
     <MEM_SECTOR_COUNTER>0x1</MEM_SECTOR_COUNTER>
     <MEM_ERASED_MARKER>SET</MEM_ERASED_MARKER>
     <MEM_USED_MARKER>SET</MEM_USED_MARKER>
     <MEM_FULL_MARKER>NOT_SET</MEM_FULL_MARKER>
     <MEM_ERASE_MARKER>NOT_SET</MEM_ERASE_MARKER>
     <MEM_START_MARKER>SET</MEM_START_MARKER>
     <MEM_START_OFFSET>0x1</MEM_START_OFFSET>
     <MEM_CLONE_MARKER>NOT_SET</MEM_CLONE_MARKER>
       <MEM_BLOCK>
         <MEM_BLOCK_ID>0x101</MEM_BLOCK_ID>
         <MEM_BLOCK_NAME>UNKNOWN</MEM_BLOCK_NAME>
         <MEM_BLOCK_STATUS>VALID</MEM_BLOCK_STATUS>
         <MEM_BLOCK_FLAGS>0x0</MEM_BLOCK_FLAGS>
         <MEM_BLOCK_STORAGE>Emulation</MEM_BLOCK_STORAGE>
         <MEM_BLOCK_LEN>0x28</MEM_BLOCK_LEN>
         <MEM_BLOCK_VERSION>0x0</MEM_BLOCK_VERSION>
         <MEM_BLOCK_HEADER_CRC>0xE527</MEM_BLOCK_HEADER_CRC>
         <MEM_BLOCK_CRC>0xCCB7B836</MEM_BLOCK_CRC>
         <MEM_BLOCK_CRC2>None</MEM_BLOCK_CRC2>
         <MEM_BLOCK_DATA> 
           <MEM_PAGE_DATA>4e564d2d52656648</MEM_PAGE_DATA> 
           <MEM_PAGE_DATA>6173685374617274</MEM_PAGE_DATA> 
           <MEM_PAGE_DATA>1782af065966579e</MEM_PAGE_DATA> 
           <MEM_PAGE_DATA>899885d440d3ad67</MEM_PAGE_DATA> 
           <MEM_PAGE_DATA>d04b41b15e2b13c2</MEM_PAGE_DATA> 
         </MEM_BLOCK_DATA>
       </MEM_BLOCK>
       <MEM_BLOCK>
         <MEM_BLOCK_ID>0x20F</MEM_BLOCK_ID>
         <MEM_BLOCK_NAME>UNKNOWN</MEM_BLOCK_NAME>
         <MEM_BLOCK_STATUS>VALID</MEM_BLOCK_STATUS>
         <MEM_BLOCK_FLAGS>0x0</MEM_BLOCK_FLAGS>
         <MEM_BLOCK_STORAGE>Emulation</MEM_BLOCK_STORAGE>
         <MEM_BLOCK_LEN>0x2</MEM_BLOCK_LEN>
         <MEM_BLOCK_VERSION>0x0</MEM_BLOCK_VERSION>
         <MEM_BLOCK_HEADER_CRC>0xE0D2</MEM_BLOCK_HEADER_CRC>
         <MEM_BLOCK_CRC>0xECFBBA1A</MEM_BLOCK_CRC>
         <MEM_BLOCK_CRC2>None</MEM_BLOCK_CRC2>
         <MEM_BLOCK_DATA> 
           <MEM_PAGE_DATA>0000</MEM_PAGE_DATA> 
         </MEM_BLOCK_DATA>
       </MEM_BLOCK>
       <MEM_BLOCK>
         <MEM_BLOCK_ID>0x1F8</MEM_BLOCK_ID>
         <MEM_BLOCK_NAME>UNKNOWN</MEM_BLOCK_NAME>
         <MEM_BLOCK_STATUS>VALID</MEM_BLOCK_STATUS>
         <MEM_BLOCK_FLAGS>0x0</MEM_BLOCK_FLAGS>
         <MEM_BLOCK_STORAGE>Emulation</MEM_BLOCK_STORAGE>
         <MEM_BLOCK_LEN>0x2</MEM_BLOCK_LEN>
         <MEM_BLOCK_VERSION>0x0</MEM_BLOCK_VERSION>
         <MEM_BLOCK_HEADER_CRC>0x1DCC</MEM_BLOCK_HEADER_CRC>
         <MEM_BLOCK_CRC>0xA54E2B5A</MEM_BLOCK_CRC>
         <MEM_BLOCK_CRC2>None</MEM_BLOCK_CRC2>
         <MEM_BLOCK_DATA> 
           <MEM_PAGE_DATA>30d4</MEM_PAGE_DATA> 
         </MEM_BLOCK_DATA>
       </MEM_BLOCK>
  </MEM_SECTOR>
</MEM_DATA>

Assuming this XML data is stored in a file named test.xml, you can do something like this:

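import xml.etree.ElementTree as ET

# parse the file once and keep the root element (<MEM_DATA>)
tree = ET.parse('test.xml')
root = tree.getroot()

def search_and_copy(query):
    # walk every <MEM_BLOCK> under every <MEM_SECTOR>
    for child in root.findall("MEM_SECTOR/MEM_BLOCK"):
        # compare the block's CRC text with the search value
        if child.find("MEM_BLOCK_CRC").text == query:
            # collect the text of each child element of <MEM_BLOCK_DATA>
            return [item.text for item in child.findall("MEM_BLOCK_DATA/*")]
    return None  # no block with that CRC was found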

Let's try this search_and_copy() function out:

>>> search_and_copy("0xCCB7B836")
['4e564d2d52656648', '6173685374617274', '1782af065966579e', '899885d440d3ad67', 'd04b41b15e2b13c2']

>>> search_and_copy("0xA54E2B5A")
['30d4']
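
The function above compares the CRC text exactly and returns None when no block matches. A minimal sketch of a more forgiving variant follows, assuming you would rather get an empty list for a miss and ignore case and surrounding whitespace in the CRC. The name search_and_copy_safe and the trimmed-down SAMPLE string are hypothetical and only serve to keep the example self-contained:

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample with the same structure as the question's XML
# (illustrative only; in practice you would parse test.xml instead).
SAMPLE = """
<MEM_DATA>
  <MEM_SECTOR>
    <MEM_BLOCK>
      <MEM_BLOCK_CRC>0xCCB7B836</MEM_BLOCK_CRC>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>4e564d2d52656648</MEM_PAGE_DATA>
        <MEM_PAGE_DATA>6173685374617274</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
    <MEM_BLOCK>
      <MEM_BLOCK_CRC>0xECFBBA1A</MEM_BLOCK_CRC>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>0000</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
  </MEM_SECTOR>
</MEM_DATA>
"""

def search_and_copy_safe(root, query):
    """Return the page data of the first block whose CRC matches query, or []."""
    wanted = query.strip().lower()
    for block in root.findall("MEM_SECTOR/MEM_BLOCK"):
        # findtext returns the default if the block has no <MEM_BLOCK_CRC>
        crc = block.findtext("MEM_BLOCK_CRC", default="")
        if crc.strip().lower() == wanted:
            return [(page.text or "").strip()
                    for page in block.findall("MEM_BLOCK_DATA/MEM_PAGE_DATA")]
    return []  # nothing matched

sample_root = ET.fromstring(SAMPLE)
print(search_and_copy_safe(sample_root, "0xccb7b836"))  # lower-case query still matches
print(search_and_copy_safe(sample_root, "0xDEADBEEF"))  # unknown CRC gives []
```

Passing root as an argument instead of reading a module-level variable is a design choice made here so the function can be tried on any parsed document.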

We can also use XPath, with Python's xml.etree and the third-party elementpath package, to write a function that retrieves the data:

Breakdown of the XPath expression in the code below (inside elementpath.Selector):
1. The first line looks for elements whose text equals our search string.
2. The second line, .., steps back up to the parent element.
3. Proceeding from the parent element, the third line searches for MEM_PAGE_DATA elements within it. These elements hold the data we are actually interested in.
4. The rest of the code simply pulls the text from the matches.

import xml.etree.ElementTree as ET
import elementpath

# the XML shared in the question has been saved to test.xml
root = ET.parse('test.xml').getroot()

def find_data(search_string):          
    selector = elementpath.Selector(f""".//*[text()='{search_string}'] 
                                        //..
                                        //MEM_PAGE_DATA""")
    #pull text from the match
    result = [entry.text for entry in selector.select(root)]
    return result

Test on the strings provided:

find_data("0xCCB7B836")

['4e564d2d52656648',
 '6173685374617274',
 '1782af065966579e',
 '899885d440d3ad67',
 'd04b41b15e2b13c2']


find_data("0xECFBBA1A")

['0000']

find_data("0xA54E2B5A")

['30d4']
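
If installing elementpath is not an option, the same lookup can probably be expressed with the limited XPath subset built into xml.etree.ElementTree, which supports a [tag='text'] predicate on child elements. The following is a sketch under that assumption, not a drop-in replacement: find_data_stdlib and SAMPLE_XML are hypothetical names, the predicate requires an exact text match, and a search value containing a single quote would break the path string.

```python
import xml.etree.ElementTree as ET

# Small stand-in for test.xml with the same element names (illustrative only).
SAMPLE_XML = """
<MEM_DATA>
  <MEM_SECTOR>
    <MEM_BLOCK>
      <MEM_BLOCK_CRC>0xECFBBA1A</MEM_BLOCK_CRC>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>0000</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
    <MEM_BLOCK>
      <MEM_BLOCK_CRC>0xA54E2B5A</MEM_BLOCK_CRC>
      <MEM_BLOCK_DATA>
        <MEM_PAGE_DATA>30d4</MEM_PAGE_DATA>
      </MEM_BLOCK_DATA>
    </MEM_BLOCK>
  </MEM_SECTOR>
</MEM_DATA>
"""

def find_data_stdlib(root, crc):
    """Return the MEM_PAGE_DATA texts of every block whose CRC text equals crc."""
    # Select MEM_BLOCK elements that have a MEM_BLOCK_CRC child with this text,
    # then descend to the page data inside MEM_BLOCK_DATA.
    path = f".//MEM_BLOCK[MEM_BLOCK_CRC='{crc}']/MEM_BLOCK_DATA/MEM_PAGE_DATA"
    return [page.text for page in root.findall(path)]

stdlib_root = ET.fromstring(SAMPLE_XML)
print(find_data_stdlib(stdlib_root, "0xA54E2B5A"))
print(find_data_stdlib(stdlib_root, "0xECFBBA1A"))
```

Unlike the elementpath version, this only matches when the search string is the text of MEM_BLOCK_CRC specifically, so a value that happens to appear in another field (for example MEM_BLOCK_ID) is not picked up.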
