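Parsing a ZAP XML report (or how to get child nodes with XPath in Python)
I have a simple question (probably not a hard one for many developers).
I want to parse a ZAP XML report and get all the nodes of each alert.
Here is an excerpt of a ZAP report in XML format: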
<?xml version="1.0"?><OWASPZAPReport version="D-2020-10-20" generated="Wed, 18 Nov 2020 16:51:34">
<site name="http://webmail.example.com" host="webmail.example.com" port="80" ssl="false"><alerts>
<alertitem>
<pluginid>3</pluginid>
<alert>Session ID in URL Rewrite</alert>
<name>Session ID in URL Rewrite</name>
<instances>
<instance>
<uri>http://webmail.example.com/dyn/login.seam;jsessionid=c2e851a8c7f47dcd4dea016fd1e0?cid=47</uri>
<method>GET</method>
<evidence>jsessionid=c2e851a8c7f47dcd4dea016fd1e0</evidence>
</instance>
<instance>
<uri>http://webmail.example.com/dyn/portal/index.seam;jsessionid=c2e851a8c7f47dcd4dea016fd1e0?aloId=21152&amp;cid=47&amp;page=alo</uri>
<method>GET</method>
<evidence>jsessionid=c2e851a8c7f47dcd4dea016fd1e0</evidence>
</instance>
</instances>
<count>6</count>
<solution><p>For secure content, put session ID in a cookie. To be even more secure consider using a combination of cookie and URL rewrite.</p></solution>
<reference><p>http://seclists.org/lists/webappsec/2002/Oct-Dec/0111.html</p></reference>
<cweid>200</cweid>
<wascid>13</wascid>
<sourceid>3</sourceid>
</alertitem>
</alerts>
I want to get the content of the child nodes (such as uri/method/evidence).
At the moment I am using this code (in Python 3), and it does return all the alert items:
tree = etree.parse(report_file)
root = tree.getroot()
for site in tree.findall('site'):
    sitename = site.attrib['name']
    for alert in site.findall('.//alertitem'):
        name_alert = alert.find('name').text
        ...
But I also want to parse the child nodes
and get the content of uri, for example http://webmail.example.com/[...]
Can you help me?
Try this:
from lxml import etree
alerts = """[your xml above (make sure it's well formed)]"""
doc = etree.XML(alerts)
for instance in doc.xpath('//instance'):
    print(instance.xpath('./uri')[0].text)
    print(instance.xpath('./method')[0].text)
    print(instance.xpath('./evidence')[0].text)
Output:
http://webmail.example.com/dyn/login.seam;jsessionid=c2e851a8c7f47dcd4dea016fd1e0?cid=47
GET
jsessionid=c2e851a8c7f47dcd4dea016fd1e0
http://webmail.example.com/dyn/portal/index.seam;jsessionid=c2e851a8c7f47dcd4dea016fd1e0?aloId=21152&cid=47&page=alo
GET
jsessionid=c2e851a8c7f47dcd4dea016fd1e0
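The "make sure it's well formed" note in the answer is worth a quick illustration: URIs with query strings contain `&`, and a raw `&` in pasted XML makes the parser reject the document. The sketch below is only an illustration under a few assumptions: it uses the standard library's `xml.etree.ElementTree` instead of lxml so that it runs without extra packages, the shortened URI and the helper name `try_parse` are made up for the example, and the blanket `replace` is a naive fix that would double-escape text that already contains entities.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened instance with an unescaped '&' in the query string.
raw = "<instance><uri>http://webmail.example.com/index.seam?aloId=21152&cid=47</uri></instance>"

# Naive repair for illustration only: assumes no '&' is already escaped.
escaped = raw.replace("&", "&amp;")


def try_parse(text):
    """Return the text of <uri>, or None if the XML is not well formed."""
    try:
        return ET.fromstring(text).findtext("uri")
    except ET.ParseError:
        return None


print(try_parse(raw))      # the raw '&' is rejected by the parser
print(try_parse(escaped))  # the parser turns '&amp;' back into '&'
```

Once the entity is escaped in the input, the parsed text contains a plain `&` again, which is why the printed URIs in the output show `&` rather than `&amp;`.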
Thanks Jack! Final code (in case anyone needs it):
from lxml import etree

tree = etree.parse(report_file)
root = tree.getroot()
for site in tree.xpath('site'):
    sitename = site.attrib['name']
    # './/alertitem' stays inside the current site; '//alertitem' would
    # match the alert items of every site in the document
    for alert in site.xpath('.//alertitem'):
        nom = alert.find('name').text
        criticite = alert.find('riskdesc').text
        desc = alert.find('desc').text
        # findtext() returns the default when the child element is missing,
        # which replaces the bare try/except blocks
        otherinfo = alert.findtext('otherinfo', default='')
        url_trace = alert.findtext('uri', default='')
        method = alert.findtext('method', default='')
        parametre = alert.findtext('param', default='')
        for instance in alert.xpath('.//instance'):
            print(instance.xpath('./uri')[0].text)
            #print(instance.xpath('./method')[0].text)
            evidence = instance.findtext('evidence')
            if evidence is not None:
                print(evidence)
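If the values are needed for further processing rather than printing, the same site, alert, instance walk can be written as a generator that yields one flat record per instance. This is a rough sketch, not the asker's code: it uses the standard library's `xml.etree.ElementTree` in place of lxml, the two-site sample document and the second alert name are invented for the example, and `iter_instances` is a hypothetical helper name. Every path is relative to the current element, so each record stays attached to its own site and alert.

```python
import xml.etree.ElementTree as ET

# Invented two-site sample that follows the element layout of the excerpt.
SAMPLE = """<OWASPZAPReport>
  <site name="http://a.example.com">
    <alerts>
      <alertitem>
        <name>Session ID in URL Rewrite</name>
        <instances>
          <instance>
            <uri>http://a.example.com/login</uri>
            <method>GET</method>
            <evidence>jsessionid=abc</evidence>
          </instance>
          <instance>
            <uri>http://a.example.com/portal</uri>
            <method>GET</method>
          </instance>
        </instances>
      </alertitem>
    </alerts>
  </site>
  <site name="http://b.example.com">
    <alerts>
      <alertitem>
        <name>Example alert</name>
        <instances>
          <instance>
            <uri>http://b.example.com/</uri>
            <method>POST</method>
          </instance>
        </instances>
      </alertitem>
    </alerts>
  </site>
</OWASPZAPReport>"""


def iter_instances(root):
    """Yield one dict per <instance>, tagged with its site and alert name."""
    for site in root.findall("site"):
        # Relative paths keep the search inside the current site / alert.
        for alert in site.findall(".//alertitem"):
            for instance in alert.findall(".//instance"):
                yield {
                    "site": site.get("name"),
                    "alert": alert.findtext("name", default=""),
                    "uri": instance.findtext("uri", default=""),
                    "method": instance.findtext("method", default=""),
                    # Missing <evidence> becomes an empty string.
                    "evidence": instance.findtext("evidence", default=""),
                }


records = list(iter_instances(ET.fromstring(SAMPLE)))
for record in records:
    print(record["site"], record["alert"], record["method"], record["uri"])
```

With a real report, `ET.parse(report_file).getroot()` would take the place of `ET.fromstring(SAMPLE)`; `report_file` is the same variable the code above assumes.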