Python XML Parsing the child tags in order

My XML file looks similar to the one below:

<suite name="regression_1">
    <test name="Login check" id="s1-t1">
        <keyword name="Valid Username and Password">
            <keyword name="Invalid Username or Password">
                <keyword name="Invalid password">
                    <message level="TRACE" >Return error</message>
                    <status status="PASS"/>
                </keyword>
                <message level="INFO">Return error</message>
                <status status="FAIL"/>
            </keyword>
            <message level="INFO">Return: None</message>
            <status status="PASS"/>
        </keyword>
        <status status="FAIL"/>
    </test>
    <test name="test-2" id="s1-t1">
        <keyword name="abc">
            <keyword name="def">
                <message level="INFO">Return error</message>
                <status status="FAIL"/>
            </keyword>
            <message level="INFO">Return: None</message>
            <status status="PASS"/>
        </keyword>
        <status status="FAIL"/>
    </test>
</suite>

The output should list, for each test, the chain of keywords that leads to any keyword whose status is "FAIL". A test can contain many keywords, and each keyword may or may not have child keywords.

**** Sample Output *******

Suite: regression_1

Test Name: Login check

Keyword failed: ["Valid Username and Password", "Invalid Username or Password"]

Failure test case message : Return error

Suite: regression_1

Test Name: test-2

Keyword failed: ["abc","def"]

Failure test case message : Return error


My code can dig down to the last child to collect the failing status, but it cannot produce the full keyword path that I need for analysis. I also think the loop does not run to completion: if the 3rd child is "PASS", it does not come back to the 2nd child to check its status.

import xml.dom.minidom
from xml.etree.ElementTree import Element, Comment


def getStatusForNode(tc):
    status_to_be_returned = []
    is_just_father = False


    for child in tc.childNodes:
        if child.nodeName == "keyword":
            is_just_father = True
            status_to_be_returned.append(getStatusForNode(child)[0])
            keyword_track.append(child.getAttribute("name"))

    if not is_just_father:
        status = tc.getElementsByTagName('status')
        return [(tc, status)]


    return  status_to_be_returned


DOMTree = xml.dom.minidom.parse("output.xml")
collection = DOMTree.documentElement
tc_entry = collection.getElementsByTagName("suite")

top = Element('tests')
comment = Comment("This xml is generated only for failing tests")
top.append(comment)


for tc in tc_entry:
    if tc.hasAttribute("name"):
        print("Suite name: {}".format(tc.getAttribute("name")))

    tests = tc.getElementsByTagName('test')
    for test in tests:
        keyword_track = []
        for child in test.childNodes:
            if child.nodeName == "keyword":
                children_status = getStatusForNode(child)
                for (tc_name, status) in children_status:
                    for state in status:
                        if state.getAttribute("status") != "PASS":
                            print("---")
                            print("Test name: {}".format(test.getAttribute("name")))
                            print("Keyword failed: {}".format(tc_name.getAttribute("name")))
                            print("Status: {}".format(state.getAttribute("status")))
                            messages = tc_name.getElementsByTagName('message')
                            print("Failure test case messages:")
                            for message in messages:
                                print(message.childNodes[0].data)
                            print ("")

Output received from this code:

Test name: ABC

Keyword name: keyword_1-2-3

Status: FAIL

Failure test case messages: Failed in level 3

Any suggested optimisations for the code?
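
One detail in the code above may explain the missing statuses: minidom's getElementsByTagName searches all descendants, not only direct children, so calling it on a keyword also returns the statuses of its nested keywords. A minimal sketch of the difference follows. The two-level SAMPLE snippet and the helper direct_children are made up for illustration and are not from the original post.

```python
import xml.dom.minidom

# Made-up two-level snippet: the inner keyword passes, the outer one fails.
SAMPLE = """<keyword name="outer">
    <keyword name="inner">
        <status status="PASS"/>
    </keyword>
    <status status="FAIL"/>
</keyword>"""

outer = xml.dom.minidom.parseString(SAMPLE).documentElement


def direct_children(node, tag):
    """Hypothetical helper: element children of `node` named `tag`, no recursion."""
    return [child for child in node.childNodes
            if child.nodeType == child.ELEMENT_NODE and child.tagName == tag]


# getElementsByTagName walks the whole subtree in document order.
all_statuses = [s.getAttribute("status")
                for s in outer.getElementsByTagName("status")]

# Filtering childNodes keeps only the status that belongs to `outer` itself.
own_statuses = [s.getAttribute("status")
                for s in direct_children(outer, "status")]

print(all_statuses)  # ['PASS', 'FAIL']
print(own_statuses)  # ['FAIL']
```

Checking only the direct <status> child of each keyword, at every level of the recursion, is one way to make sure a parent keyword is still examined after its children have been visited.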

Question: XML parsing the child tags in order

A solution using xml.etree.ElementTree, for instance:

Note: It still makes no sense to list the first <keyword> under "Keyword failed:", because in both tests its status is PASS. If you do want the first <keyword> in the output, uncomment the two commented-out lines.

from xml.etree import ElementTree as ET

with open('output.xml') as fh:
    suite = ET.fromstring(fh.read())

# Find all <test> elements
for test in suite.findall('./test'):
    keyword_failed = []
    # first_keyword = test.find('./keyword')
    # keyword_failed = [first_keyword.attrib['name']]
    message = None

    # Find every <keyword> below this <test> that has a direct <status status="FAIL"> child
    for keyword in test.findall('.//keyword/status[@status="FAIL"]/..'):
        keyword_failed.append(keyword.attrib['name'])
        message = keyword.find('./message')

        print('Suite: {}'.format(suite.attrib['name']))
        print('\tTest Name: {}'.format(test.attrib['name']))
        print('\tKeyword failed: {}'.format(keyword_failed))
        # A failing keyword may have no direct <message> child
        if message is not None:
            print('\tFailure test case message : level={} {}'.format(message.attrib['level'], message.text))

Output:
Suite: regression_1
    Test Name: Login check
    Keyword failed: ['Invalid Username or Password']
    Failure test case message : level=INFO Return error
Suite: regression_1
    Test Name: test-2
    Keyword failed: ['def']
    Failure test case message : level=INFO Return error

Tested with Python: 3.4.2
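
If the full chain of parent keywords is wanted, as in the sample output of the question, one possible approach is a small recursive walk that carries the path along. The following is only a sketch, assuming that each <keyword> holds its own <status> and <message> as direct children. The function failed_paths and the inlined SAMPLE string are illustrative names, not part of the answer above, and `yield from` needs Python 3.3 or later.

```python
from xml.etree import ElementTree as ET

# Inlined copy of the XML from the question, so the sketch runs on its own.
SAMPLE = """<suite name="regression_1">
  <test name="Login check" id="s1-t1">
    <keyword name="Valid Username and Password">
      <keyword name="Invalid Username or Password">
        <keyword name="Invalid password">
          <message level="TRACE">Return error</message>
          <status status="PASS"/>
        </keyword>
        <message level="INFO">Return error</message>
        <status status="FAIL"/>
      </keyword>
      <message level="INFO">Return: None</message>
      <status status="PASS"/>
    </keyword>
    <status status="FAIL"/>
  </test>
  <test name="test-2" id="s1-t1">
    <keyword name="abc">
      <keyword name="def">
        <message level="INFO">Return error</message>
        <status status="FAIL"/>
      </keyword>
      <message level="INFO">Return: None</message>
      <status status="PASS"/>
    </keyword>
    <status status="FAIL"/>
  </test>
</suite>"""


def failed_paths(node, path=()):
    """Yield (path, message) for each <keyword> below `node` whose own status is FAIL."""
    for keyword in node.findall('keyword'):          # direct children only
        current = path + (keyword.get('name'),)
        status = keyword.find('status')              # the keyword's own status
        if status is not None and status.get('status') == 'FAIL':
            yield list(current), keyword.findtext('message')
        # Descend regardless of this keyword's status, so no level is skipped.
        yield from failed_paths(keyword, current)


suite = ET.fromstring(SAMPLE)
results = []
for test in suite.findall('test'):
    for path, message in failed_paths(test):
        results.append((test.get('name'), path, message))
        print('Suite: {}'.format(suite.get('name')))
        print('Test Name: {}'.format(test.get('name')))
        print('Keyword failed: {}'.format(path))
        print('Failure test case message : {}'.format(message))
```

For the sample data this reports the paths ['Valid Username and Password', 'Invalid Username or Password'] and ['abc', 'def'], each with the message "Return error". Because the path is passed down as an immutable tuple, sibling keywords do not see each other's names.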
