
Parsing an XML file in Python 2.7

I know this is a very common question, but the structure of my XML file and the data I need to extract from it are a little unusual, so I would appreciate any help on the steps to extract the required data with Python 2.7.

I have the following XML:

<?xml version="1.0" encoding="UTF-8"?>
<Package xmlns="http://soap.sforce.com/2006/04/metadata">
    <types>
        <members>Mango.XYZ_DIG_Team_ABCDEF_Mango_Review</members>
        <members>Mango.XYZ_DIG_Team_Reporting_Mango_Review</members>
        <members>Opportunity.A_T_Occupier_City_Job_List</members>
        <name>ListView</name>
    </types>
    <types>
        <members>Modify_All_Data_Permission</members>
        <members>Opportunity_Alerts_Implementation</members>
        <members>Process_Builder_Permission</members>
        <members>Regional_Business_Support</members>
        <members>Reports_Dashboards_Data_Export_for_Super_Users</members>
        <name>PermissionSet</name>
    </types>
    <types>
        <members>SolutionManager</members>
        <members>Standard</members>
        <name>Profile</name>
    </types>
    <types>
        <members>Mango.Set Verified Date and System Id</members>
        <members>Mango.Update Mango Site With Billing Street%2C City%2C Country</members>
        <members>Mango.Update Family Id on Mango when created</members>
        <members>Opportunity.Set Opportunity Name</members>
        <name>WorkflowRule</name>
    </types>
    <version>38.0</version>
</Package>

I am trying to extract only the members from the PermissionSet block, so that I eventually have a file containing only entries like:

    Modify_All_Data_Permission
    Opportunity_Alerts_Implementation
    Process_Builder_Permission
    Regional_Business_Support
    Reports_Dashboards_Data_Export_for_Super_Users

So far I have only been able to extract the 'name' tag, with:

from xml.dom import minidom

doc = minidom.parse("path_to_xmlFile")


t = doc.getElementsByTagName("types")
for n in t:
    name = n.getElementsByTagName("name")[0]
    print name.firstChild.data

How can I extract the members and save them to a file?

Note: the number of 'members' elements is not fixed; it varies from block to block. I can also try a different library if it serves the purpose.

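Probably easiest to use XPath. Because the file declares a default namespace, the tag name has to be qualified with the namespace URI:

import xml.etree.ElementTree as ET

# Default namespace declared on <Package>, written in {uri}tag form
NS = '{http://soap.sforce.com/2006/04/metadata}'

root = ET.parse('file.xml').getroot()
# Note: this matches the members of every <types> block, not only PermissionSet
for member in root.findall('.//' + NS + 'members'):
    print(member.text)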
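
To get only the PermissionSet members and save them to a file, one option is to pick the types block by its name child first. The following is a minimal sketch, not a tested solution for the real package file: the function names members_of and write_members, the output file name permission_sets.txt, and the trimmed inline sample are all assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Namespace URI copied from the <Package> element in the question.
NS = '{http://soap.sforce.com/2006/04/metadata}'

# Trimmed copy of the question's XML, inlined so the sketch is self-contained.
SAMPLE = """<Package xmlns="http://soap.sforce.com/2006/04/metadata">
    <types>
        <members>SolutionManager</members>
        <members>Standard</members>
        <name>Profile</name>
    </types>
    <types>
        <members>Modify_All_Data_Permission</members>
        <members>Process_Builder_Permission</members>
        <name>PermissionSet</name>
    </types>
    <version>38.0</version>
</Package>"""


def members_of(root, type_name):
    """Return the text of each <members> in the <types> block named type_name."""
    result = []
    for block in root.findall(NS + 'types'):
        if block.findtext(NS + 'name') == type_name:
            result.extend(m.text for m in block.findall(NS + 'members'))
    return result


def write_members(members, path):
    """Write one member per line (hypothetical helper, not from the question)."""
    with io.open(path, 'w', encoding='utf-8') as out:
        for member in members:
            out.write(u'%s\n' % member)


root = ET.fromstring(SAMPLE)  # for a real file: ET.parse('file.xml').getroot()
permission_sets = members_of(root, 'PermissionSet')
write_members(permission_sets, 'permission_sets.txt')
```

Since the block is selected by its name rather than by position, the number of members and the order of the types blocks should not matter.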

This may help you!

import xml.etree.ElementTree as ET

tree = ET.parse('file.xml')
root = tree.getroot()
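# root[1] is the second <types> block, which is PermissionSet in the sample file
for data in root[1]:
    # the block also contains <name>, so keep only the <members> children
    if data.tag.endswith('members'):
        print(data.text)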
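
The minidom attempt from the question can probably be extended in the same spirit, by reading the members inside each types block alongside its name. This is only a sketch under a few assumptions: members_by_type is a made-up helper name, the inline sample is a trimmed copy of the question's XML, and every name and members element is assumed to contain text (an empty element would have no firstChild).

```python
from xml.dom import minidom

# Trimmed copy of the question's XML, inlined so the sketch is self-contained.
SAMPLE = """<Package xmlns="http://soap.sforce.com/2006/04/metadata">
    <types>
        <members>Modify_All_Data_Permission</members>
        <members>Regional_Business_Support</members>
        <name>PermissionSet</name>
    </types>
    <types>
        <members>SolutionManager</members>
        <members>Standard</members>
        <name>Profile</name>
    </types>
    <version>38.0</version>
</Package>"""


def members_by_type(doc):
    """Map each <name> value to the list of <members> values in its <types> block."""
    grouped = {}
    for block in doc.getElementsByTagName('types'):
        name = block.getElementsByTagName('name')[0].firstChild.data
        grouped[name] = [m.firstChild.data
                         for m in block.getElementsByTagName('members')]
    return grouped


doc = minidom.parseString(SAMPLE)  # for a real file: minidom.parse("path_to_xmlFile")
grouped = members_by_type(doc)
for member in grouped.get('PermissionSet', []):
    print(member)
```

Grouping everything into a dictionary first means the same parse can serve other blocks, such as Profile, without another pass over the file.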
