
Parsing a ZAP XML report (or how to get child nodes with XPath in Python)

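I have a simple question (probably an easy one for many developers).

I want to parse a ZAP XML report and get all the nodes of each alert.

Here is an excerpt of a ZAP report in XML format: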

<?xml version="1.0"?><OWASPZAPReport version="D-2020-10-20" generated="Wed, 18 Nov 2020 16:51:34">
<site name="http://webmail.example.com" host="webmail.example.com" port="80" ssl="false"><alerts>
<alertitem>
  <pluginid>3</pluginid>
  <alert>Session ID in URL Rewrite</alert>
  <name>Session ID in URL Rewrite</name>
  <instances>
  <instance>
  <uri>http://webmail.example.com/dyn/login.seam;jsessionid=c2e851a8c7f47dcd4dea016fd1e0?cid=47</uri>
  <method>GET</method>
  <evidence>jsessionid=c2e851a8c7f47dcd4dea016fd1e0</evidence>
  </instance>
  <instance>
  <uri>http://webmail.example.com/dyn/portal/index.seam;jsessionid=c2e851a8c7f47dcd4dea016fd1e0?aloId=21152&amp;cid=47&amp;page=alo</uri>
  <method>GET</method>
  <evidence>jsessionid=c2e851a8c7f47dcd4dea016fd1e0</evidence>
  </instance>
  </instances>
  <count>6</count>
  <solution>&lt;p&gt;For secure content, put session ID in a cookie. To be even more secure consider using a combination of cookie and URL rewrite.&lt;/p&gt;</solution>
  <reference>&lt;p&gt;http://seclists.org/lists/webappsec/2002/Oct-Dec/0111.html&lt;/p&gt;</reference>
  <cweid>200</cweid>
  <wascid>13</wascid>
  <sourceid>3</sourceid>
  </alertitem>
 </alerts>

I want to get the content of the child nodes (such as uri/method/evidence).

At the moment I am using this code (in Python 3), and I am able to get all the alert items:

from lxml import etree  # assumed; xml.etree.ElementTree also provides parse() and findall()

tree = etree.parse(report_file)
root = tree.getroot()
for site in root.findall('site'):
    sitename = site.attrib['name']
    for alert in site.findall('.//alertitem'):
        name_alert = alert.find('name').text
        ...

But I would like to parse the child nodes
and get the content of uri, for example http://webmail.example.com/[...]

Can you help me?

Try this:

from lxml import etree
alerts = """[your xml above (make sure it's well formed)]"""
doc = etree.XML(alerts)
for instance in doc.xpath('//instance'):
    print(instance.xpath('./uri')[0].text)
    print(instance.xpath('./method')[0].text)
    print(instance.xpath('./evidence')[0].text)

Output:

http://webmail.example.com/dyn/login.seam;jsessionid=c2e851a8c7f47dcd4dea016fd1e0?cid=47
GET
jsessionid=c2e851a8c7f47dcd4dea016fd1e0
http://webmail.example.com/dyn/portal/index.seam;jsessionid=c2e851a8c7f47dcd4dea016fd1e0?aloId=21152&cid=47&page=alo
GET
jsessionid=c2e851a8c7f47dcd4dea016fd1e0
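
The same idea can probably be expressed without lxml. The following is only a sketch, assuming the standard-library xml.etree.ElementTree module is acceptable; the REPORT string is a shortened, hypothetical stand-in for the excerpt above (the second instance has a simplified URI and deliberately no evidence element), and iter_instances is a name made up for illustration. It uses findtext() with a default so a missing child yields an empty string instead of an error.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed stand-in for the report excerpt (hypothetical sample data).
REPORT = """<?xml version="1.0"?>
<OWASPZAPReport version="D-2020-10-20">
<site name="http://webmail.example.com" host="webmail.example.com" port="80" ssl="false"><alerts>
<alertitem>
  <name>Session ID in URL Rewrite</name>
  <instances>
  <instance>
  <uri>http://webmail.example.com/dyn/login.seam;jsessionid=c2e851a8c7f47dcd4dea016fd1e0?cid=47</uri>
  <method>GET</method>
  <evidence>jsessionid=c2e851a8c7f47dcd4dea016fd1e0</evidence>
  </instance>
  <instance>
  <uri>http://webmail.example.com/dyn/portal/index.seam</uri>
  <method>GET</method>
  </instance>
  </instances>
</alertitem>
</alerts></site>
</OWASPZAPReport>"""


def iter_instances(xml_text):
    """Yield (uri, method, evidence) for every <instance> in the report."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so nesting depth does not matter here.
    for instance in root.iter("instance"):
        yield (
            instance.findtext("uri", default=""),
            instance.findtext("method", default=""),
            # Empty string when the instance has no <evidence> child.
            instance.findtext("evidence", default=""),
        )


rows = list(iter_instances(REPORT))
for uri, method, evidence in rows:
    print(uri, method, evidence)
```

For a report on disk, ET.parse(path).getroot() would take the place of ET.fromstring().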

Thanks Jack! Final code (in case anyone needs it):

from lxml import etree

tree = etree.parse(report_file)
root = tree.getroot()
for site in root.xpath('site'):
    sitename = site.attrib['name']
    # './/' keeps the search inside this site; '//' would match every alertitem in the document
    for alert in site.xpath('.//alertitem'):
        # findtext() returns the default instead of raising when a child is missing
        nom = alert.findtext('name', default='')
        criticite = alert.findtext('riskdesc', default='')
        otherinfo = alert.findtext('otherinfo', default='')
        desc = alert.findtext('desc', default='')
        # In the excerpt above, uri and method sit under <instance>, so these may stay empty
        url_trace = alert.findtext('uri', default='')
        method = alert.findtext('method', default='')
        parametre = alert.findtext('param', default='')
        for instance in alert.xpath('.//instance'):
            print(instance.xpath('./uri')[0].text)
            # print(instance.xpath('./method')[0].text)
            evidence = instance.findtext('evidence')
            if evidence is not None:  # not every instance carries evidence
                print(evidence)
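
One thing to watch when a report contains several sites: the alert lookup has to be relative to the current site (for example './/alertitem'), because an XPath that starts with '//' is evaluated from the document root and returns the alerts of every site. The sketch below illustrates the relative form with the standard-library xml.etree.ElementTree module rather than lxml; the two-site REPORT string, the collect function, and the dict keys are all hypothetical and only there to make the example self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-site report, reduced to the elements used below.
REPORT = """<OWASPZAPReport>
<site name="http://a.example.com"><alerts>
<alertitem><name>Alert A</name><instances>
<instance><uri>http://a.example.com/login</uri><method>GET</method></instance>
</instances></alertitem>
</alerts></site>
<site name="http://b.example.com"><alerts>
<alertitem><name>Alert B</name><instances>
<instance><uri>http://b.example.com/search</uri><method>POST</method></instance>
<instance><uri>http://b.example.com/item</uri><method>GET</method></instance>
</instances></alertitem>
</alerts></site>
</OWASPZAPReport>"""


def collect(xml_text):
    """Return one dict per instance, tagged with its own site and alert name."""
    root = ET.fromstring(xml_text)
    records = []
    for site in root.findall("site"):
        # './/alertitem' is relative: only alerts below this <site> match.
        for alert in site.findall(".//alertitem"):
            for instance in alert.findall("./instances/instance"):
                records.append({
                    "site": site.get("name"),
                    "alert": alert.findtext("name", default=""),
                    "uri": instance.findtext("uri", default=""),
                    "method": instance.findtext("method", default=""),
                })
    return records


records = collect(REPORT)
for record in records:
    print(record)
```

Collecting plain dicts instead of printing inside the loop makes it easier to export the results later, for instance to CSV or JSON.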

