Trying to parse through an XML file using Python 3

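Hoping you all can help me out. I'm relatively new to Python. I have what I need working in PowerShell, but accessing XML elements through PowerShell objects seems a lot easier than in Python. In PowerShell, I can simply do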

[xml]$test = Get-Content .\test.xml

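and then walk through the object to find the information I need. Full disclosure: although XML seems easy, I get tripped up on the jargon. Here is a small version of the XML file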

<?xml version="1.0" encoding="UTF-8"?>
<!--DISA STIG Viewer :: 2.9-->
<CHECKLIST>
    <ASSET>
        <ROLE>None</ROLE>
        <ASSET_TYPE>Computing</ASSET_TYPE>
        <HOST_NAME></HOST_NAME>
        <HOST_IP></HOST_IP>
        <HOST_MAC></HOST_MAC>
        <HOST_FQDN></HOST_FQDN>
        <TECH_AREA></TECH_AREA>
        <TARGET_KEY>2266</TARGET_KEY>
        <WEB_OR_DATABASE>false</WEB_OR_DATABASE>
        <WEB_DB_SITE></WEB_DB_SITE>
        <WEB_DB_INSTANCE></WEB_DB_INSTANCE>
    </ASSET>
    <STIGS>
        <iSTIG>
            <STIG_INFO>
                <SI_DATA>
                    <SID_NAME>version</SID_NAME>
                    <SID_DATA>5</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>classification</SID_NAME>
                    <SID_DATA>UNCLASSIFIED</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>customname</SID_NAME>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>stigid</SID_NAME>
                    <SID_DATA>McAfee_VirusScan88_Managed_Client</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>description</SID_NAME>
                    <SID_DATA>The McAfee VirusScan Managed Client STIG is published as a tool to improve the security of Department of Defense (DoD) information systems. The requirements are derived from the NIST 800-53 and related documents. Comments or proposed revisions to this document should be sent via e-mail to the following address: disa.stig_spt@mail.mil.</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>filename</SID_NAME>
                    <SID_DATA>U_McAfee_VirusScan88_Managed_Client_STIG_V5R21_Manual-xccdf.xml</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>releaseinfo</SID_NAME>
                    <SID_DATA>Release: 21 Benchmark Date: 25 Oct 2019</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>title</SID_NAME>
                    <SID_DATA>McAfee VirusScan 8.8 Managed Client STIG</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>uuid</SID_NAME>
                    <SID_DATA>1a441b95-b269-4423-8a40-a34f56441f5a</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>notice</SID_NAME>
                    <SID_DATA>terms-of-use</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>source</SID_NAME>
                </SI_DATA>
            </STIG_INFO>
            <VULN>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Vuln_Num</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>V-6453</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Severity</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>high</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Group_Title</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>DTAM001-McAfee VirusScan Control Panel </ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Rule_ID</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>SV-55134r1_rule</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Rule_Ver</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>DTAM001</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Rule_Title</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>McAfee VirusScan On-Access General Policies must be configured to enable on-access scanning at system startup.
</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Vuln_Discuss</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>For antivirus software to be effective, it must be running at all times, beginning from the point of the system's initial startup. Otherwise, the risk is greater for viruses, trojans, and other malware infecting the system during that startup phase.
</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>IA_Controls</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA></ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Check_Content</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>From the ePO server console System Tree, select the Systems tab, select the asset to be checked, select Actions, select Agent, and select Modify Policies on a Single System. From the product pull down list, select VirusScan Enterprise 8.8.0. Select from the Policy column the policy associated with the On-Access General Policies. Under the General tab, locate the "Enable on-access scanning:" label. Ensure the "Enable on-access scanning at system startup" option is selected.

Criteria:  If the "Enable on-access scanning at startup" option is selected, this is not a finding. 

On the client machine, use the Windows Registry Editor to navigate to the following key:
HKLM\Software\McAfee\ (32-bit)
HKLM\Software\Wow6432Node\McAfee\ (64-bit)
SystemCore\VSCore\On Access Scanner\McShield\Configuration

Criteria:  If the value of bStartDisabled is 0, this is not a finding. If the value is 1, this is a finding.</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Fix_Text</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>From the ePO server console System Tree, select the Systems tab, select the asset to be checked, select Actions, select Agent, and select Modify Policies on a Single System. From the product pull down list, select VirusScan Enterprise 8.8.0. Select from the Policy column the policy associated with the On-Access General Policies. Under the General tab, locate the "Enable on-access scanning:" label. Select the "Enable on-access scanning at system startup" option. Select Save.</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>False_Positives</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA></ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>False_Negatives</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA></ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Documentable</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>false</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Mitigations</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA></ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Potential_Impact</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA></ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Third_Party_Tools</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA></ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Mitigation_Control</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA></ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Responsibility</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>System Administrator</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Security_Override_Guidance</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA></ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Check_Content_Ref</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>M</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Weight</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>10.0</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Class</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>Unclass</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>STIGRef</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>McAfee VirusScan 8.8 Managed Client STIG :: Version 5, Release: 21 Benchmark Date: 25 Oct 2019</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>TargetKey</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>2266</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>CCI_REF</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>CCI-001242</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STATUS>Not_Reviewed</STATUS>
                <FINDING_DETAILS></FINDING_DETAILS>
                <COMMENTS></COMMENTS>
                <SEVERITY_OVERRIDE></SEVERITY_OVERRIDE>
                <SEVERITY_JUSTIFICATION></SEVERITY_JUSTIFICATION>
            </VULN>
        </iSTIG>
    </STIGS>
</CHECKLIST>

I know there are several different ways to do this, but I first tried minidom

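import xml.dom.minidom

doc = xml.dom.minidom.parse(r'C:\Temp\test.xml')
print(doc.nodeName)  # the Document node itself: '#document'
# documentElement is the root element; with this file, firstChild is the
# leading <!--DISA STIG Viewer--> comment node, which has no tagName
root = doc.documentElement.tagName
print(root)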

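This prints CHECKLIST, which is indeed the document root. Now in PowerShell I would do root.STIG.iSTIG.STIG_INFO.SI_DATA and start looping there, but I can't wrap my head around why this is so different in Python.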

I also tried starting with ElementTree, but didn't get very far

from xml.etree import ElementTree as ET
doc = ET.parse(r'C:\Temp\test.xml').getroot()

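Can anyone point me in the right direction here, without necessarily handing me written code as the answer? I have already transformed my XML using lxml and was able to output the file shown here, which is good, but I'm running into trouble with the next step.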

Thanks!

Since you're looking for a general direction, try the following and modify it to fit your needs:

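from lxml import etree

stig = """your xml above"""
parser = etree.XMLParser()

# lxml rejects str input that carries an encoding declaration, so pass bytes
tree = etree.fromstring(stig.encode("utf-8"), parser)
# select every SI_DATA element under iSTIG/STIG_INFO
items = tree.xpath('//iSTIG/STIG_INFO//SI_DATA')
for item in items:
    # string(...) yields '' when the child element is missing
    print(item.xpath('string(SID_NAME/text())'), " ", item.xpath('string(SID_DATA/text())'))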

Output:

version   5
classification   UNCLASSIFIED

and so on.

Obviously, instead of printing, you can add each item to a list or similar.
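As a rough illustration of collecting the values instead of printing them, here is a minimal sketch that uses the standard-library ElementTree (which the question also tried) rather than lxml. The sample document is a trimmed copy of the checklist from the question, and the helper name pairs_to_dict is hypothetical. It assumes real files follow the same CHECKLIST/STIGS/iSTIG layout.

```python
import xml.etree.ElementTree as ET

# Trimmed sample based on the checklist in the question (assumption: real
# files use the same CHECKLIST/STIGS/iSTIG layout).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!--DISA STIG Viewer :: 2.9-->
<CHECKLIST>
    <STIGS>
        <iSTIG>
            <STIG_INFO>
                <SI_DATA>
                    <SID_NAME>version</SID_NAME>
                    <SID_DATA>5</SID_DATA>
                </SI_DATA>
                <SI_DATA>
                    <SID_NAME>customname</SID_NAME>
                </SI_DATA>
            </STIG_INFO>
            <VULN>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Vuln_Num</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>V-6453</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STIG_DATA>
                    <VULN_ATTRIBUTE>Severity</VULN_ATTRIBUTE>
                    <ATTRIBUTE_DATA>high</ATTRIBUTE_DATA>
                </STIG_DATA>
                <STATUS>Not_Reviewed</STATUS>
            </VULN>
        </iSTIG>
    </STIGS>
</CHECKLIST>"""


def pairs_to_dict(parent, item_tag, name_tag, value_tag):
    """Map each item's name-child text to its value-child text ('' if absent).

    Hypothetical helper for illustration; not part of any library.
    """
    result = {}
    for item in parent.iter(item_tag):
        name = item.findtext(name_tag, default="")
        # A missing or empty value element is normalised to an empty string.
        value = item.findtext(value_tag, default="") or ""
        result[name] = value.strip()
    return result


# fromstring returns the root element directly (CHECKLIST); the leading
# comment is not part of the element tree.
root = ET.fromstring(SAMPLE)

# Rough equivalent of PowerShell's CHECKLIST.STIGS.iSTIG.STIG_INFO.SI_DATA
stig_info = pairs_to_dict(
    root.find("./STIGS/iSTIG/STIG_INFO"), "SI_DATA", "SID_NAME", "SID_DATA"
)

# One dict per VULN element, plus its STATUS child.
vulns = []
for vuln in root.iterfind("./STIGS/iSTIG/VULN"):
    record = pairs_to_dict(vuln, "STIG_DATA", "VULN_ATTRIBUTE", "ATTRIBUTE_DATA")
    record["STATUS"] = vuln.findtext("STATUS", default="")
    vulns.append(record)

print(stig_info)
print(vulns)
```

With the real file, ET.parse(r'C:\Temp\test.xml').getroot() would take the place of ET.fromstring(SAMPLE). Entries such as customname, which have no SID_DATA child, end up with an empty string rather than raising an error.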
