
python elementtree

I have been trying to parse data from an XML file for several days now and I can't get it to work. From the example below I need, per layer, the status and index, plus the type and filename from under foreground/producer. The problem is that the structure differs depending on the content: look at index 2, where the filename is under foreground/producer/fill/producer (I do not need the filename under foreground/producer/key/producer). I'm looking for a simple solution (I have been trying with etree.ElementTree, but parsing seems so difficult).

<?xml version="1.0" encoding="utf-8"?>
<channel>
   <video-mode>1080i5000</video-mode>
   <stage>
      <layers>
         <layer>
            <status>stopped</status>
            <auto_delta>-1</auto_delta>
            <frame-number>1829997</frame-number>
            <nb_frames>0</nb_frames>
            <frames-left>-1829996</frames-left>
            <foreground>
               <producer>
                  <type>empty-producer</type>
               </producer>
            </foreground>
            <background>
               <producer>
                  <type>transition-producer</type>
                  <source>
                     <producer>
                        <type>empty-producer</type>
                     </producer>
                  </source>
                  <destination>
                     <producer>
                        <type>ffmpeg-producer</type>
                        <filename>media\\MULTI\testfile2.mpg</filename>
                        <width>1920</width>
                        <height>1080</height>
                        <progressive>true</progressive>
                        <fps>25</fps>
                        <loop>false</loop>
                        <frame-number>0</frame-number>
                        <nb-frames>4396</nb-frames>
                        <file-frame-number>0</file-frame-number>
                        <file-nb-frames>4396</file-nb-frames>
                     </producer>
                  </destination>
               </producer>
            </background>
            <index>0</index>
         </layer>
         <layer>
            <status>playing</status>
            <auto_delta>-1</auto_delta>
            <frame-number>1830920</frame-number>
            <nb_frames>4294967295</nb_frames>
            <frames-left>4293136376</frames-left>
            <foreground>
               <producer>
                  <type>ffmpeg-producer</type>
                  <filename>media\AMB.mp4</filename>
                  <width>720</width>
                  <height>576</height>
                  <progressive>true</progressive>
                  <fps>25</fps>
                  <loop>true</loop>
                  <frame-number>1830920</frame-number>
                  <nb-frames>4294967295</nb-frames>
                  <file-frame-number>520</file-frame-number>
                  <file-nb-frames>1600</file-nb-frames>
               </producer>
            </foreground>
            <background>
               <producer>
                  <type>empty-producer</type>
               </producer>
            </background>
            <index>1</index>
         </layer>
         <layer>
            <status>playing</status>
            <auto_delta>-1</auto_delta>
            <frame-number>1830758</frame-number>
            <nb_frames>4294967295</nb_frames>
            <frames-left>4293136538</frames-left>
            <foreground>
               <producer>
                  <type>separated-producer</type>
                  <fill>
                     <producer>
                        <type>ffmpeg-producer</type>
                        <filename>media\action.mpg</filename>
                        <width>1920</width>
                        <height>1080</height>
                        <progressive>false</progressive>
                        <fps>25</fps>
                        <loop>true</loop>
                        <frame-number>1830758</frame-number>
                        <nb-frames>4294967295</nb-frames>
                        <file-frame-number>22</file-frame-number>
                        <file-nb-frames>247</file-nb-frames>
                     </producer>
                  </fill>
                  <key>
                     <producer>
                        <type>ffmpeg-producer</type>
                        <filename>media\action_a.mpg</filename>
                        <width>1920</width>
                        <height>1080</height>
                        <progressive>false</progressive>
                        <fps>25</fps>
                        <loop>true</loop>
                        <frame-number>1830758</frame-number>
                        <nb-frames>4294967295</nb-frames>
                        <file-frame-number>22</file-frame-number>
                        <file-nb-frames>247</file-nb-frames>
                     </producer>
                  </key>
               </producer>
            </foreground>
            <background>
               <producer>
                  <type>empty-producer</type>
               </producer>
            </background>
            <index>2</index>
         </layer>
      </layers>
   </stage>
   <mixer/>
   <output>
      <consumers>
         <consumer>
            <type>oal-consumer</type>
            <index>500</index>
         </consumer>
         <consumer>
            <type>ogl-consumer</type>
            <key-only>false</key-only>
            <windowed>true</windowed>
            <auto-deinterlace>true</auto-deinterlace>
            <index>600</index>
         </consumer>
      </consumers>
   </output>
   <index>0</index>
</channel>
import xml.etree.ElementTree as ET

tree = ET.parse('x.xml')
root = tree.getroot()

# Print each top-level tag and, marked with '>', its direct children.
for child in root:
    print(child.tag)
    for child2 in child:
        print('> ', child2.tag)

'''
====
output
====
video-mode
stage
>  layers
mixer
output
>  consumers
index

'''

Regarding the problem that "the structure is different depending on the content": an XML document is normally defined against a schema such as a DTD, so its structure cannot change arbitrarily, otherwise it would be ill-defined. If what you mean is that you want to parse parts of the tree differently depending on the nodes around them, you will have to write some if/else statements and helper functions, for example:

import xml.etree.ElementTree as ET

tree = ET.parse('x.xml')
root = tree.getroot()

def parseStageTag(element):
    # Descend from <stage> into its <layers> child.
    print('parsing Stage')
    for child in element:
        if child.tag == 'layers':
            parseLayersTag(child)

def parseOutputTag(element):
    # Placeholder: <output> is not processed here.
    pass

def parseLayersTag(element):
    # Print each <layer> element found under <layers>.
    print('parsing Layers')
    for child in element:
        print(child)


for child in root:
    if child.tag == 'stage':
        parseStageTag(child)

    for child2 in child:
        print('> ', child2.tag)
'''
output
parsing Stage
parsing Layers
<Element 'layer' at 0x1079e4250>
<Element 'layer' at 0x1079e4f10>
<Element 'layer' at 0x1079e6510>
>  layers
>  consumers
'''
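
The same branching idea can return values instead of printing them. The following is only a minimal sketch for Python 3. It assumes, based solely on the sample XML in the question, that a separated-producer keeps the wanted file under fill/producer. The function names describe_producer and layer_summary and the trimmed inline sample are illustrative, not part of the original answer.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document (assumption: the real file has the same shape).
# A raw string keeps the backslashes in the file names literal.
SAMPLE = r"""<channel><stage><layers>
  <layer>
    <status>stopped</status>
    <foreground><producer><type>empty-producer</type></producer></foreground>
    <index>0</index>
  </layer>
  <layer>
    <status>playing</status>
    <foreground><producer>
      <type>ffmpeg-producer</type><filename>media\AMB.mp4</filename>
    </producer></foreground>
    <index>1</index>
  </layer>
  <layer>
    <status>playing</status>
    <foreground><producer>
      <type>separated-producer</type>
      <fill><producer><type>ffmpeg-producer</type><filename>media\action.mpg</filename></producer></fill>
      <key><producer><type>ffmpeg-producer</type><filename>media\action_a.mpg</filename></producer></key>
    </producer></foreground>
    <index>2</index>
  </layer>
</layers></stage></channel>"""


def describe_producer(producer):
    """Return (type, filename) for a <producer>, branching on its type."""
    ptype = producer.findtext("type")
    if ptype == "separated-producer":
        # Take the fill file only; the key file is deliberately ignored.
        filename = producer.findtext("fill/producer/filename")
    else:
        # findtext returns None when there is no <filename> (e.g. empty-producer).
        filename = producer.findtext("filename")
    return ptype, filename


def layer_summary(layer):
    """Collect status, index and foreground producer details for one <layer>."""
    producer = layer.find("foreground/producer")
    if producer is not None:
        ptype, filename = describe_producer(producer)
    else:
        ptype, filename = None, None
    return {
        "index": layer.findtext("index"),
        "status": layer.findtext("status"),
        "type": ptype,
        "filename": filename,
    }


root = ET.fromstring(SAMPLE)
summaries = [layer_summary(layer) for layer in root.iter("layer")]
for summary in summaries:
    print(summary)
```

To run this against a file rather than the inline string, ET.parse('x.xml').getroot() can replace the ET.fromstring call.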

I ran into similar issues parsing XML files until I discovered ElementTree's support for (a limited subset of) XPath.

For example, the following code:

import os
import xml.etree.ElementTree

os.chdir('C:/temp/blah')

et = xml.etree.ElementTree.parse('file.xml')
layerTagList = et.findall("./stage/layers/layer")

for curLayerTag in layerTagList:
    indexTag = curLayerTag.find("./index")
    print("Layer[%s]" % indexTag.text)
    # ".//foreground//filename" matches every <filename> at any depth under <foreground>.
    fgFiles = curLayerTag.findall(".//foreground//filename")
    for fileTag in fgFiles:
        print("  FG - %s" % fileTag.text)
    # Same search, restricted to <background>.
    bgFiles = curLayerTag.findall(".//background//filename")
    for fileTag in bgFiles:
        print("  BG - %s" % fileTag.text)

gives the output:

Layer[0]
  BG - media\\MULTI\testfile2.mpg
Layer[1]
  FG - media\AMB.mp4
Layer[2]
  FG - media\action.mpg
  FG - media\action_a.mpg

