XML not returning correct child tags/data in Python

Hello, I am making a requests call to return order data from an online store. My issue is that once I have passed my data to a root variable, the iter method does not return the correct results: it displays multiple tags of the same name rather than one, and it does not show the data within the tags.

I thought this was due to the XML not being correctly formatted, so I formatted it by saving it to a file using pretty_print, but that hasn't fixed the problem.

How do I fix this? Thanks in advance.

Code:

import requests, xml.etree.ElementTree as ET, lxml.etree as etree

url="http://publicapi.ekmpowershop24.com/v1.1/publicapi.asmx"
headers = {'content-type': 'application/soap+xml'}
body = """<?xml version="1.0" encoding="utf-8"?>
<soap12:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap12="http://www.w3.org/2003/05/soap-envelope">
  <soap12:Body>
    <GetOrders xmlns="http://publicapi.ekmpowershop.com/">
      <GetOrdersRequest>
        <APIKey>my_api_key</APIKey>
        <FromDate>01/07/2018</FromDate>
        <ToDate>04/07/2018</ToDate>
      </GetOrdersRequest>
    </GetOrders>
  </soap12:Body>
</soap12:Envelope>"""

#send request to ekm
r = requests.post(url,data=body,headers=headers)

#save output to file
file = open("C:/Users/Mark/Desktop/test.xml", "w")
file.write(r.text)
file.close()

#take the file and format the xml
x = etree.parse("C:/Users/Mark/Desktop/test.xml")
newString = etree.tostring(x, pretty_print=True)
file = open("C:/Users/Mark/Desktop/test.xml", "w")
file.write(newString.decode('utf-8'))
file.close()

#parse the file to get the roots
tree = ET.parse("C:/Users/Mark/Desktop/test.xml")
root = tree.getroot()

#access elements names in the data
for child in root.iter('*'):
    print(child.tag)

#show orders elements attributes
tree = ET.parse("C:/Users/Mark/Desktop/test.xml")
root = tree.getroot()
for order in root.iter('{http://publicapi.ekmpowershop.com/}Order'):
    out = {}
    for child in order:
        if child.tag in ('OrderID'):
            out[child.tag] = child.text
    print(out)

Elements output:

{http://publicapi.ekmpowershop.com/}Orders
{http://publicapi.ekmpowershop.com/}Order
{http://publicapi.ekmpowershop.com/}OrderID
{http://publicapi.ekmpowershop.com/}OrderNumber
{http://publicapi.ekmpowershop.com/}CustomerID
{http://publicapi.ekmpowershop.com/}CustomerUserID
{http://publicapi.ekmpowershop.com/}Order
{http://publicapi.ekmpowershop.com/}OrderID
{http://publicapi.ekmpowershop.com/}OrderNumber
{http://publicapi.ekmpowershop.com/}CustomerID
{http://publicapi.ekmpowershop.com/}CustomerUserID

Orders output:

{http://publicapi.ekmpowershop.com/}Order {}
{http://publicapi.ekmpowershop.com/}Order {}

XML structure after formatting:

 <soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <soap:Body>
    <GetOrdersResponse xmlns="http://publicapi.ekmpowershop.com/">
      <GetOrdersResult>
        <Status>Success</Status>
        <Errors/>
        <Date>2018-07-10T13:47:00.1682029+01:00</Date>
        <TotalOrders>10</TotalOrders>
        <TotalCost>100</TotalCost>
        <Orders>
          <Order>
            <OrderID>100</OrderID>
            <OrderNumber>102/040718/67</OrderNumber>
            <CustomerID>6910</CustomerID>
            <CustomerUserID>204</CustomerUserID>
            <FirstName>TestFirst</FirstName>
            <LastName>TestLast</LastName>
            <CompanyName>Test Company</CompanyName>
            <EmailAddress>test@Test.com</EmailAddress>
            <OrderStatus>Dispatched</OrderStatus>
            <OrderStatusColour>#00CC00</OrderStatusColour>
            <TotalCost>85.8</TotalCost>
            <OrderDate>10/07/2018 14:30:43</OrderDate>
            <OrderDateISO>2018-07-10T14:30:43</OrderDateISO>
            <AbandonedOrder>false</AbandonedOrder>
            <EkmStatus>SUCCESS</EkmStatus>
          </Order>
        </Orders>
        <Currency>GBP</Currency>
      </GetOrdersResult>
    </GetOrdersResponse>
  </soap:Body>
</soap:Envelope>

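You need to take the namespace into account when checking tags. ElementTree reports each tag as {namespace-URI}localname, so child.tag here is '{http://publicapi.ekmpowershop.com/}OrderID' rather than 'OrderID'. Note also that ('OrderID') is a plain string, not a one-element tuple, so the in test is a substring check.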

>>> # Include the namespace part of the tag in the tag values that we check.
>>> tags = ('{http://publicapi.ekmpowershop.com/}OrderID', '{http://publicapi.ekmpowershop.com/}OrderNumber')
>>> for order in root.iter('{http://publicapi.ekmpowershop.com/}Order'):
...     out = {}
...     for child in order:
...         if child.tag in tags:
...             out[child.tag] = child.text
...     print(out)
... 
{'{http://publicapi.ekmpowershop.com/}OrderID': '100', '{http://publicapi.ekmpowershop.com/}OrderNumber': '102/040718/67'}
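
As an alternative to spelling out the full namespace in every tag, ElementTree's find-style methods accept a prefix-to-URI mapping. The following is a minimal sketch, not the original poster's code: SAMPLE is a trimmed stand-in for the response shown in the question, and the "ekm" prefix, the NS mapping and the extract_orders helper are names assumed here for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the SOAP response shown in the question.
SAMPLE = """<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Body>
    <GetOrdersResponse xmlns="http://publicapi.ekmpowershop.com/">
      <GetOrdersResult>
        <Orders>
          <Order>
            <OrderID>100</OrderID>
            <OrderNumber>102/040718/67</OrderNumber>
          </Order>
        </Orders>
      </GetOrdersResult>
    </GetOrdersResponse>
  </soap:Body>
</soap:Envelope>"""

# "ekm" is an arbitrary prefix chosen for this sketch; only the URI has to
# match the xmlns declared on GetOrdersResponse.
NS = {"ekm": "http://publicapi.ekmpowershop.com/"}


def extract_orders(xml_text, fields=("OrderID", "OrderNumber")):
    """Return one dict per <Order>, keyed by the un-namespaced field names."""
    root = ET.fromstring(xml_text)
    orders = []
    for order in root.findall(".//ekm:Order", NS):
        # findtext resolves the "ekm:" prefix through the NS mapping.
        orders.append(
            {name: order.findtext("ekm:" + name, namespaces=NS) for name in fields}
        )
    return orders


print(extract_orders(SAMPLE))
```

With the real response, the same helper could presumably be fed r.text directly, which would avoid the save-and-reparse round trip through a file.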

If you don't want the namespace in the output keys, you can strip it by keeping only the part of the tag after the } character.

>>> for order in root.iter('{http://publicapi.ekmpowershop.com/}Order'):
...     out = {}
...     for child in order:
...         if child.tag in tags:
...             out[child.tag[child.tag.index('}')+1:]] = child.text
...     print(out)
... 
{'OrderID': '100', 'OrderNumber': '102/040718/67'}
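
The stripping step can also be pulled into a small helper. This is a sketch under assumptions: local_name and order_to_dict are hypothetical names, and the sample XML is a cut-down version of the structure in the question. Unlike index('}'), which raises ValueError when a tag has no namespace, rsplit leaves such tags unchanged.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return the part of a tag after '{uri}', or the tag itself if it has no namespace."""
    return tag.rsplit("}", 1)[-1]


def order_to_dict(order):
    """Map every direct child of an <Order> element to {local tag name: text}."""
    return {local_name(child.tag): child.text for child in order}


# Cut-down stand-in for the <Orders> part of the response in the question.
SAMPLE = """<Orders xmlns="http://publicapi.ekmpowershop.com/">
  <Order>
    <OrderID>100</OrderID>
    <OrderStatus>Dispatched</OrderStatus>
  </Order>
</Orders>"""

root = ET.fromstring(SAMPLE)
for order in root.iter("{http://publicapi.ekmpowershop.com/}Order"):
    print(order_to_dict(order))
```

Because the helper does not filter on a fixed list of tags, it collects every field of the order; a filter like the tags tuple above can be added back if only some fields are wanted.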
