XML not returning correct child tags/data in Python

Hello, I am making a requests call to return order data from an online store. My issue is that once I have parsed the response and obtained the root element, the iter method does not return the results I expect: the same tag names are displayed multiple times rather than once, and the data inside the tags is not shown.

I thought this was because the XML was not correctly formatted, so I pretty-printed it with lxml and saved it to a file, but that hasn't fixed the problem.

How do I fix this? - Thanks in advance

Code:

import requests, xml.etree.ElementTree as ET, lxml.etree as etree

url="http://publicapi.ekmpowershop24.com/v1.1/publicapi.asmx"
headers = {'content-type': 'application/soap+xml'}
body = """<?xml version="1.0" encoding="utf-8"?>
<soap12:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap12="http://www.w3.org/2003/05/soap-envelope">
  <soap12:Body>
    <GetOrders xmlns="http://publicapi.ekmpowershop.com/">
      <GetOrdersRequest>
        <APIKey>my_api_key</APIKey>
        <FromDate>01/07/2018</FromDate>
        <ToDate>04/07/2018</ToDate>
      </GetOrdersRequest>
    </GetOrders>
  </soap12:Body>
</soap12:Envelope>"""

#send request to ekm
r = requests.post(url,data=body,headers=headers)

#save output to file
file = open("C:/Users/Mark/Desktop/test.xml", "w")
file.write(r.text)
file.close()

#take the file and format the xml
x = etree.parse("C:/Users/Mark/Desktop/test.xml")
newString = etree.tostring(x, pretty_print=True)
file = open("C:/Users/Mark/Desktop/test.xml", "w")
file.write(newString.decode('utf-8'))
file.close()

#parse the file to get the roots
tree = ET.parse("C:/Users/Mark/Desktop/test.xml")
root = tree.getroot()

#access elements names in the data
for child in root.iter('*'):
    print(child.tag)

#show orders elements attributes
tree = ET.parse("C:/Users/Mark/Desktop/test.xml")
root = tree.getroot()
for order in root.iter('{http://publicapi.ekmpowershop.com/}Order'):
    out = {}
    for child in order:
        if child.tag in ('OrderID'):
            out[child.tag] = child.text
    print(out)

Elements output:

{http://publicapi.ekmpowershop.com/}Orders
{http://publicapi.ekmpowershop.com/}Order
{http://publicapi.ekmpowershop.com/}OrderID
{http://publicapi.ekmpowershop.com/}OrderNumber
{http://publicapi.ekmpowershop.com/}CustomerID
{http://publicapi.ekmpowershop.com/}CustomerUserID
{http://publicapi.ekmpowershop.com/}Order
{http://publicapi.ekmpowershop.com/}OrderID
{http://publicapi.ekmpowershop.com/}OrderNumber
{http://publicapi.ekmpowershop.com/}CustomerID
{http://publicapi.ekmpowershop.com/}CustomerUserID

Orders Output:

{http://publicapi.ekmpowershop.com/}Order {}
{http://publicapi.ekmpowershop.com/}Order {}

XML structure after formatting:

 <soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <soap:Body>
    <GetOrdersResponse xmlns="http://publicapi.ekmpowershop.com/">
      <GetOrdersResult>
        <Status>Success</Status>
        <Errors/>
        <Date>2018-07-10T13:47:00.1682029+01:00</Date>
        <TotalOrders>10</TotalOrders>
        <TotalCost>100</TotalCost>
        <Orders>
          <Order>
            <OrderID>100</OrderID>
            <OrderNumber>102/040718/67</OrderNumber>
            <CustomerID>6910</CustomerID>
            <CustomerUserID>204</CustomerUserID>
            <FirstName>TestFirst</FirstName>
            <LastName>TestLast</LastName>
            <CompanyName>Test Company</CompanyName>
            <EmailAddress>test@Test.com</EmailAddress>
            <OrderStatus>Dispatched</OrderStatus>
            <OrderStatusColour>#00CC00</OrderStatusColour>
            <TotalCost>85.8</TotalCost>
            <OrderDate>10/07/2018 14:30:43</OrderDate>
            <OrderDateISO>2018-07-10T14:30:43</OrderDateISO>
            <AbandonedOrder>false</AbandonedOrder>
            <EkmStatus>SUCCESS</EkmStatus>
          </Order>
        </Orders>
        <Currency>GBP</Currency>
      </GetOrdersResult>
    </GetOrdersResponse>
  </soap:Body>
</soap:Envelope>

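You need to take the namespace into account when checking for tags. ElementTree stores every tag as {namespace-uri}localname, so child.tag is {http://publicapi.ekmpowershop.com/}OrderID rather than OrderID, and the comparison in your loop never matches. That is why each dictionary comes out empty. Pretty-printing the file does not change this, because whitespace has no effect on tag names. Note also that ('OrderID') is a plain string, not a tuple, so in performs a substring test on it; a one-element tuple needs a trailing comma: ('OrderID',).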

>>> # Include the namespace part of the tag in the tag values that we check.
>>> tags = ('{http://publicapi.ekmpowershop.com/}OrderID', '{http://publicapi.ekmpowershop.com/}OrderNumber')
>>> for order in root.iter('{http://publicapi.ekmpowershop.com/}Order'):
...     out = {}
...     for child in order:
...         if child.tag in tags:
...             out[child.tag] = child.text
...     print(out)
... 
{'{http://publicapi.ekmpowershop.com/}OrderID': '100', '{http://publicapi.ekmpowershop.com/}OrderNumber': '102/040718/67'}
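
Spelling out the full namespace URI for every tag gets repetitive. As a possible alternative, the sketch below passes a prefix map to ElementTree's find-style methods. The prefix "ekm", the names SAMPLE, NS and orders, and the trimmed-down XML are assumptions made for illustration; only the namespace URI comes from the response in the question.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the response shown in the question.
SAMPLE = """<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Body>
    <GetOrdersResponse xmlns="http://publicapi.ekmpowershop.com/">
      <GetOrdersResult>
        <Orders>
          <Order>
            <OrderID>100</OrderID>
            <OrderNumber>102/040718/67</OrderNumber>
          </Order>
        </Orders>
      </GetOrdersResult>
    </GetOrdersResponse>
  </soap:Body>
</soap:Envelope>"""

# "ekm" is an arbitrary label chosen here; only the URI has to match the document.
NS = {"ekm": "http://publicapi.ekmpowershop.com/"}

root = ET.fromstring(SAMPLE)

orders = []
# ".//ekm:Order" matches every Order element in that namespace, at any depth.
for order in root.iterfind(".//ekm:Order", NS):
    orders.append({
        "OrderID": order.findtext("ekm:OrderID", namespaces=NS),
        "OrderNumber": order.findtext("ekm:OrderNumber", namespaces=NS),
    })

print(orders)  # [{'OrderID': '100', 'OrderNumber': '102/040718/67'}]
```

The prefix used in the map does not need to match any prefix in the XML itself. The document declares this namespace as the default (no prefix), and the lookup still works because matching is done on the URI.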

If you don't want the namespace part in the output keys, you can strip it by keeping only the part of the tag after the } character.

>>> for order in root.iter('{http://publicapi.ekmpowershop.com/}Order'):
...     out = {}
...     for child in order:
...         if child.tag in tags:
...             out[child.tag[child.tag.index('}')+1:]] = child.text
...     print(out)
... 
{'OrderID': '100', 'OrderNumber': '102/040718/67'}
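
The stripping step can also be pulled into a small helper, which lets the list of wanted tags be written without the namespace. This is only a sketch: local_name, wanted, results and the shortened SAMPLE document are hypothetical names introduced here, not part of the original code. Unlike index('}'), str.rpartition does not raise an error when a tag has no namespace.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Return the tag without its '{namespace}' part, if it has one."""
    # rpartition returns ('', '', tag) when '}' is absent, so [2] is always safe.
    return tag.rpartition("}")[2]

# Shortened stand-in for the Orders element from the question's response.
SAMPLE = (
    '<Orders xmlns="http://publicapi.ekmpowershop.com/">'
    "<Order>"
    "<OrderID>100</OrderID>"
    "<OrderNumber>102/040718/67</OrderNumber>"
    "<CustomerID>6910</CustomerID>"
    "</Order>"
    "</Orders>"
)

root = ET.fromstring(SAMPLE)
wanted = {"OrderID", "OrderNumber"}  # local names only, no namespace needed

results = []
for order in root.iter("{http://publicapi.ekmpowershop.com/}Order"):
    # Keep only the wanted children, keyed by their local name.
    results.append({
        local_name(child.tag): child.text
        for child in order
        if local_name(child.tag) in wanted
    })

print(results)  # [{'OrderID': '100', 'OrderNumber': '102/040718/67'}]
```

ET.fromstring parses directly from a string or bytes object, so in the script from the question it should be possible to call ET.fromstring(r.content) on the response and skip writing, pretty-printing and re-reading the file.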

 