Using Python to pull data from XML returned by a SOAP request and save it to CSV

I have zeep pulling data from a SOAP endpoint, which returns the following XML:

'''<?xml version="1.0" encoding="utf-8"?>
    <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
        <soap:Body>
            <GetLogCategoriesResponse xmlns="http://mynamespace.net/client">
                <GetLogCategoriesResult>
                    <IsSuccessful>true</IsSuccessful>
                    <Messages />
                    <Categories>
                        <LogCategory>
                            <Category>Client Call</Category>
                            <CategoryId>805</CategoryId>
                            <CategoryType>UDF</CategoryType>
                        </LogCategory>
                        <LogCategory>
                            <Category>Client Portal</Category>
                            <CategoryId>808</CategoryId>
                            <CategoryType>UDF</CategoryType>
                        </LogCategory>
                        <LogCategory>
                            <Category>Complaint Notes</Category>
                            <CategoryId>1255</CategoryId>
                            <CategoryType>UDF</CategoryType>
                        </LogCategory>
                    </Categories>
                </GetLogCategoriesResult>
            </GetLogCategoriesResponse>
        </soap:Body>
    </soap:Envelope>'''

I've tried pulling the data using ElementTree as below, without success:

'''from zeep import Client, Transport
import xml.etree.ElementTree as ET

client = Client('http://sandbox.mynamespace.net/Client.asmx?wsdl')
with client.settings(raw_response=True):
    soap_result = client.service.GetLogCategories(userName='user', password='pass')

namespaces = {
    'soap': 'http://schemas.xmlsoap.org/soap/envelope/',
    'a': 'http://mynamespace.net/client',
}

dom = ET.fromstring(soap_result.content)
print(dom)
names = dom.findall(
    './soap:Body'
    './a:GetLogCategoriesResponse'
    './a:GetLogCategoriesResult'
    './a:Categories'
    './a:LogCategory'
    './a:Category',
    namespaces,)
print(names)
for name in names:
    print('in For')
    print(name.text)'''

I do have a partially working version, but it only returns the first LogCategory group; I need to return all of them:

'''from zeep import Client, Transport
from bs4 import BeautifulSoup

client = Client('http://sandbox.mynamespace.net/2.18/Client.asmx?wsdl')
with client.settings(raw_response=True):
    soap_result = client.service.GetLogCategories(userName='uname', password='pass')

soup = BeautifulSoup(soap_result.text, 'html.parser')
searchTerms = ['Category','CategoryId','CategoryType']
for st in searchTerms:
    print(st+'\t',)
    print(soup.find(st.lower()).contents[0])'''

I am looking for any pointers or solutions that will work at this point. Thanks again.

Try this.

from simplified_scrapy.simplified_doc import SimplifiedDoc 
html = '''<?xml version="1.0" encoding="utf-8"?>
    <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
        <soap:Body>
            <GetLogCategoriesResponse xmlns="http://mynamespace.net/client">
                <GetLogCategoriesResult>
                    <IsSuccessful>true</IsSuccessful>
                    <Messages />
                    <Categories>
                        <LogCategory>
                            <Category>Client Call</Category>
                            <CategoryId>805</CategoryId>
                            <CategoryType>UDF</CategoryType>
                        </LogCategory>
                        <LogCategory>
                            <Category>Client Portal</Category>
                            <CategoryId>808</CategoryId>
                            <CategoryType>UDF</CategoryType>
                        </LogCategory>
                        <LogCategory>
                            <Category>Complaint Notes</Category>
                            <CategoryId>1255</CategoryId>
                            <CategoryType>UDF</CategoryType>
                        </LogCategory>
                    </Categories>
                </GetLogCategoriesResult>
            </GetLogCategoriesResponse>
        </soap:Body>
    </soap:Envelope>'''

doc = SimplifiedDoc(html)
Categories = doc.getElementsByTag('LogCategory')
# Build one (Category, CategoryId, CategoryType) tuple per LogCategory element
print([(c.Category.text, c.CategoryId.text, c.CategoryType.text) for c in Categories])

Result:

[('Client Call', '805', 'UDF'), ('Client Portal', '808', 'UDF'), ('Complaint Notes', '1255', 'UDF')]
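The question also asks about saving to CSV, which the snippet above does not cover. A minimal sketch using the standard-library `csv` module might look like the following. The `rows` list is copied from the result shown above, and the function name `write_categories_csv` and the file name `log_categories.csv` are assumptions made for illustration only.

```python
import csv
import io

# Rows in the same shape as the result printed above.
rows = [
    ("Client Call", "805", "UDF"),
    ("Client Portal", "808", "UDF"),
    ("Complaint Notes", "1255", "UDF"),
]


def write_categories_csv(rows, fileobj):
    """Write a header line followed by one CSV line per category tuple."""
    writer = csv.writer(fileobj)
    writer.writerow(["Category", "CategoryId", "CategoryType"])
    writer.writerows(rows)


# An in-memory buffer is used here so the output can be inspected directly.
buffer = io.StringIO()
write_categories_csv(rows, buffer)
print(buffer.getvalue())

# To write a real file instead (hypothetical file name):
# with open("log_categories.csv", "w", newline="", encoding="utf-8") as f:
#     write_categories_csv(rows, f)
```

Opening the file with `newline=""` is what the `csv` module documentation recommends, so that the writer controls line endings itself.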

More SimplifiedDoc examples are available here.
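If you would rather stay with the standard library, the ElementTree attempt in the question can probably be made to work with a small change. In that attempt, Python joins the adjacent string literals into a single path with a stray `.` between steps (`./soap:Body./a:GetLogCategoriesResponse...`), which is likely why nothing matched. The sketch below instead searches for every `LogCategory` element under the namespace prefix `a` and reads its three children. The sample XML is the response from the question; the function name `extract_categories` is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Sample response copied from the question, stored as bytes like soap_result.content.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <soap:Body>
    <GetLogCategoriesResponse xmlns="http://mynamespace.net/client">
      <GetLogCategoriesResult>
        <IsSuccessful>true</IsSuccessful>
        <Messages />
        <Categories>
          <LogCategory>
            <Category>Client Call</Category>
            <CategoryId>805</CategoryId>
            <CategoryType>UDF</CategoryType>
          </LogCategory>
          <LogCategory>
            <Category>Client Portal</Category>
            <CategoryId>808</CategoryId>
            <CategoryType>UDF</CategoryType>
          </LogCategory>
          <LogCategory>
            <Category>Complaint Notes</Category>
            <CategoryId>1255</CategoryId>
            <CategoryType>UDF</CategoryType>
          </LogCategory>
        </Categories>
      </GetLogCategoriesResult>
    </GetLogCategoriesResponse>
  </soap:Body>
</soap:Envelope>"""

# Prefix-to-URI map; the prefix "a" is arbitrary, only the URI has to match.
NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "a": "http://mynamespace.net/client",
}


def extract_categories(xml_bytes):
    """Return one (Category, CategoryId, CategoryType) tuple per LogCategory."""
    root = ET.fromstring(xml_bytes)
    rows = []
    # ".//" searches all descendants, so the full path need not be spelled out.
    for node in root.iterfind(".//a:LogCategory", NS):
        rows.append((
            node.findtext("a:Category", default="", namespaces=NS),
            node.findtext("a:CategoryId", default="", namespaces=NS),
            node.findtext("a:CategoryType", default="", namespaces=NS),
        ))
    return rows


categories = extract_categories(SAMPLE)
print(categories)
```

With the live service, `soap_result.content` from the question's zeep call would take the place of `SAMPLE`, assuming the response has the same shape as the sample.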


 