Using Python to pull data from XML returned by a SOAP request and save it to CSV
I have zeep pulling data from a SOAP endpoint, which returns:
'''<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<GetLogCategoriesResponse xmlns="http://mynamespace.net/client">
<GetLogCategoriesResult>
<IsSuccessful>true</IsSuccessful>
<Messages />
<Categories>
<LogCategory>
<Category>Client Call</Category>
<CategoryId>805</CategoryId>
<CategoryType>UDF</CategoryType>
</LogCategory>
<LogCategory>
<Category>Client Portal</Category>
<CategoryId>808</CategoryId>
<CategoryType>UDF</CategoryType>
</LogCategory>
<LogCategory>
<Category>Complaint Notes</Category>
<CategoryId>1255</CategoryId>
<CategoryType>UDF</CategoryType>
</LogCategory>
</Categories>
</GetLogCategoriesResult>
</GetLogCategoriesResponse>
</soap:Body>
</soap:Envelope>'''
I've tried pulling the data using ElementTree as below, without success:
'''from zeep import Client, Transport
import xml.etree.ElementTree as ET

client = Client('http://sandbox.mynamespace.net/Client.asmx?wsdl')
with client.settings(raw_response=True):
    soap_result = client.service.GetLogCategories(userName='user', password='pass')

namespaces = {
    'soap': 'http://schemas.xmlsoap.org/soap/envelope/',
    'a': 'http://mynamespace.net/client',
}
dom = ET.fromstring(soap_result.content)
print(dom)
names = dom.findall(
    './soap:Body'
    './a:GetLogCategoriesResponse'
    './a:GetLogCategoriesResult'
    './a:Categories'
    './a:LogCategory'
    './a:Category',
    namespaces,)
print(names)
for name in names:
    print('in For')
    print(name.text)'''
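One likely reason the attempt above finds nothing: Python joins adjacent string literals, so the six path pieces become one string with no `/` between the steps (`./soap:Body./a:GetLogCategoriesResponse...`). Below is a minimal sketch of the same lookup with a single slash-separated path. The inline XML is a trimmed copy of the response shown earlier and is only an assumption standing in for `soap_result.content`.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for soap_result.content (assumption for this sketch).
xml_bytes = b"""<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<GetLogCategoriesResponse xmlns="http://mynamespace.net/client">
<GetLogCategoriesResult>
<Categories>
<LogCategory><Category>Client Call</Category></LogCategory>
<LogCategory><Category>Client Portal</Category></LogCategory>
<LogCategory><Category>Complaint Notes</Category></LogCategory>
</Categories>
</GetLogCategoriesResult>
</GetLogCategoriesResponse>
</soap:Body>
</soap:Envelope>"""

namespaces = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "a": "http://mynamespace.net/client",
}

# The literals still concatenate, but each step now starts with "/"
# instead of "./", so the result is one valid path.
path = (
    "./soap:Body"
    "/a:GetLogCategoriesResponse"
    "/a:GetLogCategoriesResult"
    "/a:Categories"
    "/a:LogCategory"
    "/a:Category"
)

root = ET.fromstring(xml_bytes)
names = [element.text for element in root.findall(path, namespaces)]
print(names)
```

The child elements inherit the default namespace declared on `GetLogCategoriesResponse`, which is why every step below `soap:Body` needs the `a:` prefix.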
I do have a partially working version, but it only pulls back the first instance of the data group; I need to pull back all groups:
'''from zeep import Client, Transport
from bs4 import BeautifulSoup

client = Client('http://sandbox.mynamespace.net/2.18/Client.asmx?wsdl')
with client.settings(raw_response=True):
    soap_result = client.service.GetLogCategories(userName='uname', password='pass')

soup = BeautifulSoup(soap_result.text, 'html.parser')
searchTerms = ['Category','CategoryId','CategoryType']
for st in searchTerms:
    print(st+'\t',)
    print(soup.find(st.lower()).contents[0])'''
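The loop above calls `find`, which returns only the first matching tag, so only the first group is ever printed (in BeautifulSoup, `find_all` is the call that returns every match). Here is a rough sketch of walking every `LogCategory` group instead, written with the standard-library ElementTree rather than BeautifulSoup so it runs without extra packages. The XML fragment is a trimmed assumption standing in for the real response.

```python
import xml.etree.ElementTree as ET

NS = {"a": "http://mynamespace.net/client"}
FIELDS = ["Category", "CategoryId", "CategoryType"]

# Trimmed stand-in for the response body (assumption for this sketch).
xml_text = """<Categories xmlns="http://mynamespace.net/client">
<LogCategory><Category>Client Call</Category><CategoryId>805</CategoryId><CategoryType>UDF</CategoryType></LogCategory>
<LogCategory><Category>Client Portal</Category><CategoryId>808</CategoryId><CategoryType>UDF</CategoryType></LogCategory>
<LogCategory><Category>Complaint Notes</Category><CategoryId>1255</CategoryId><CategoryType>UDF</CategoryType></LogCategory>
</Categories>"""

root = ET.fromstring(xml_text)


def to_row(group):
    """Collect the text of each wanted child of one LogCategory element."""
    return tuple(group.findtext(f"a:{field}", namespaces=NS) for field in FIELDS)


# find() stops at the first LogCategory ...
first_row = to_row(root.find(".//a:LogCategory", NS))

# ... while findall() returns every LogCategory in document order.
rows = [to_row(group) for group in root.findall(".//a:LogCategory", NS)]

for row in rows:
    print("\t".join(row))
```

Looping over the groups first, and then reading the three fields inside each one, keeps the values of one category together on one row.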
I am looking for any pointers or solutions that will work at this point. Thanks again.
Try this.
from simplified_scrapy.simplified_doc import SimplifiedDoc
html = '''<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<GetLogCategoriesResponse xmlns="http://mynamespace.net/client">
<GetLogCategoriesResult>
<IsSuccessful>true</IsSuccessful>
<Messages />
<Categories>
<LogCategory>
<Category>Client Call</Category>
<CategoryId>805</CategoryId>
<CategoryType>UDF</CategoryType>
</LogCategory>
<LogCategory>
<Category>Client Portal</Category>
<CategoryId>808</CategoryId>
<CategoryType>UDF</CategoryType>
</LogCategory>
<LogCategory>
<Category>Complaint Notes</Category>
<CategoryId>1255</CategoryId>
<CategoryType>UDF</CategoryType>
</LogCategory>
</Categories>
</GetLogCategoriesResult>
</GetLogCategoriesResponse>
</soap:Body>
</soap:Envelope>'''
doc = SimplifiedDoc(html)
Categories = doc.getElementsByTag('LogCategory')
print ([(c.Category.text,c.CategoryId.text,c.CategoryType.text) for c in Categories])
Result:
[('Client Call', '805', 'UDF'), ('Client Portal', '808', 'UDF'), ('Complaint Notes', '1255', 'UDF')]
More examples of SimplifiedDoc are available here.
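The title also asks about saving to CSV, which the snippet above stops short of. This is a minimal sketch using the standard-library `csv` module, assuming the rows are the tuples shown in the result. The header names, the `write_categories` helper and the file name in the comment are made up for illustration.

```python
import csv
import io

# Rows as printed in the result above.
rows = [
    ("Client Call", "805", "UDF"),
    ("Client Portal", "808", "UDF"),
    ("Complaint Notes", "1255", "UDF"),
]
HEADER = ("Category", "CategoryId", "CategoryType")


def write_categories(rows, stream):
    """Write a header line followed by one CSV line per category."""
    writer = csv.writer(stream)
    writer.writerow(HEADER)
    writer.writerows(rows)


# An in-memory buffer keeps the sketch free of side effects.
buffer = io.StringIO()
write_categories(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)

# To write a real file instead (hypothetical file name):
# with open("log_categories.csv", "w", newline="", encoding="utf-8") as f:
#     write_categories(rows, f)
```

Passing `newline=""` to `open` is what the `csv` documentation recommends, so the writer controls the line endings itself.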