
Working with xml and exporting names of nodes

I wrote this code below. In my XML file I have nodes:

Assembly_1, Detail_1, Detail_2, Assembly_2, Detail_3

What I am trying to do is to get the name of the assembly for each detail (Detail_1 and 2 would be in Assembly_1, etc.)

I have a lot of details, more than 200. The function below works, but it is slow because the XML file is parsed again on every call.

How can I make it run faster?

def CorrectAssembly(detail):

    from xml.dom import minidom

    xml_path = r"C:\Users\vblagoje\test_python_s2k\Load_Independent_Results\HSB53111-01-D_2008_v2-Final-Test-Cases_All_1.1.xml"
    mydoc=minidom.parse(xml_path)
    root = mydoc.getElementsByTagName("FEST2000")
    assembly=""

    for node in root:
        for childNodes in node.childNodes:
            if childNodes.nodeType == childNodes.TEXT_NODE: continue

            if childNodes.nodeName == "ASSEMBLY":
                assembly = childNodes.getAttribute("NAME")
            if childNodes.nodeName == "DETAIL":
                if detail == childNodes.getAttribute("NAME"):
                    break

    return assembly

One solution is to parse the XML file only once, before looking up any details, and pass the parsed result to the function.
Something along these lines:

from xml.dom import minidom


def CorrectAssembly(detail, root):

    assembly=""

    for node in root:
        for childNodes in node.childNodes:
            if childNodes.nodeType == childNodes.TEXT_NODE: continue

            if childNodes.nodeName == "ASSEMBLY":
                assembly = childNodes.getAttribute("NAME")
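            if childNodes.nodeName == "DETAIL":
                if detail == childNodes.getAttribute("NAME"):
                    # return right away; a bare break would only leave the inner loop
                    return assembly

    # detail not found in any FEST2000 node
    return ""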


xml_path = r"C:\Users\vblagoje\test_python_s2k\Load_Independent_Results\HSB53111-01-D_2008_v2-Final-Test-Cases_All_1.1.xml"
mydoc=minidom.parse(xml_path)
root = mydoc.getElementsByTagName("FEST2000")

aDetail = "myDetail"
assembly = CorrectAssembly(aDetail, root)
anotherDetail = "myDetail2"
assembly = CorrectAssembly(anotherDetail , root)
# and so on

You still walk through (part of) the loaded XML every time you call the function, though. It may be better to build a dictionary once that maps each detail name to its assembly name, and then simply look details up when you need them:

from xml.dom import minidom

# read the xml
xml_path = r"C:\Users\vblagoje\test_python_s2k\Load_Independent_Results\HSB53111-01-D_2008_v2-Final-Test-Cases_All_1.1.xml"
mydoc=minidom.parse(xml_path)
root = mydoc.getElementsByTagName("FEST2000")

detail_assembly_map = {}
assembly = ""  # avoids a NameError if a DETAIL appears before any ASSEMBLY

# fill the dictionary
for node in root:
    for childNodes in node.childNodes:
        if childNodes.nodeType == childNodes.TEXT_NODE: continue
        if childNodes.nodeName == "ASSEMBLY":
            assembly = childNodes.getAttribute("NAME")
        if childNodes.nodeName == "DETAIL":
            detail_assembly_map[childNodes.getAttribute("NAME")] = assembly

# use it
aDetail = "myDetail"
assembly = detail_assembly_map[aDetail]
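
To try the dictionary approach without the original file, here is a minimal self-contained sketch. The sample XML is an assumption: it supposes that ASSEMBLY and DETAIL elements are siblings directly under FEST2000 and that each detail follows its assembly, which is what the code above implies. The names SAMPLE_XML and build_detail_assembly_map are made up for illustration.

```python
from xml.dom import minidom

# Hypothetical sample; the real file's layout is not shown in the question.
SAMPLE_XML = """<FEST2000>
  <ASSEMBLY NAME="Assembly_1"/>
  <DETAIL NAME="Detail_1"/>
  <DETAIL NAME="Detail_2"/>
  <ASSEMBLY NAME="Assembly_2"/>
  <DETAIL NAME="Detail_3"/>
</FEST2000>"""


def build_detail_assembly_map(doc):
    """Map each DETAIL name to the name of the ASSEMBLY that precedes it."""
    mapping = {}
    for root in doc.getElementsByTagName("FEST2000"):
        assembly = ""
        for child in root.childNodes:
            # skip whitespace text nodes, comments, and other non-elements
            if child.nodeType != child.ELEMENT_NODE:
                continue
            if child.nodeName == "ASSEMBLY":
                assembly = child.getAttribute("NAME")
            elif child.nodeName == "DETAIL":
                mapping[child.getAttribute("NAME")] = assembly
    return mapping


doc = minidom.parseString(SAMPLE_XML)
detail_assembly_map = build_detail_assembly_map(doc)

# dict.get returns None for an unknown detail instead of raising KeyError
for name in ("Detail_1", "Detail_3", "Detail_99"):
    print(name, "->", detail_assembly_map.get(name))
```

The file is parsed and walked once; each of the 200+ lookups afterwards is a single dictionary access.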

Your post does not make the structure of the XML really clear. If the details are children of the assemblies, the mapping can be built differently: iterate over the assembly nodes first and, inside each one, over its detail children. Then you do not depend on the elements appearing in the right order.
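
A rough sketch of that nested variant follows. The layout in NESTED_XML is purely an assumption, since the real file is not shown, and the variable names are made up for illustration.

```python
from xml.dom import minidom

# Hypothetical nested layout: each DETAIL is a child of its ASSEMBLY.
NESTED_XML = """<FEST2000>
  <ASSEMBLY NAME="Assembly_1">
    <DETAIL NAME="Detail_1"/>
    <DETAIL NAME="Detail_2"/>
  </ASSEMBLY>
  <ASSEMBLY NAME="Assembly_2">
    <DETAIL NAME="Detail_3"/>
  </ASSEMBLY>
</FEST2000>"""

nested_doc = minidom.parseString(NESTED_XML)
nested_map = {}

for assembly_node in nested_doc.getElementsByTagName("ASSEMBLY"):
    assembly_name = assembly_node.getAttribute("NAME")
    # getElementsByTagName on an element searches all of its descendants
    for detail_node in assembly_node.getElementsByTagName("DETAIL"):
        nested_map[detail_node.getAttribute("NAME")] = assembly_name

print(nested_map["Detail_3"])
```

Here the parent/child relationship decides which assembly a detail belongs to, so the order of the elements in the file no longer matters.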

This post could be helpful too, depending on the structure of your XML-tree.
