I'm trying to extract data from an XML file into a text file (mainly the body of each Text element), but the for loop never runs, so nothing is appended to the file. I know I'm missing something about the XML. I tried adding an outer for loop that calls findall on MAEC_Bundle before finding the behaviors (I think because it's the root?).
This is the XML file:
<MAEC_Bundle xmlns:ns1="http://xml/metadataSharing.xsd" xmlns="http://maec.mitre.org/XMLSchema/maec-core-1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maec.mitre.org/XMLSchema/maec-core-1 file:MAEC_v1.1.xsd" id="maec:thug:bnd:1" schema_version="1.100000">
<Analyses>
<Analysis start_datetime="2019-11-25 21:41:59.491211" id="maec:thug:ana:2" analysis_method="Dynamic">
<Tools_Used>
<Tool id="maec:thug:tol:1">
<Name>Thug</Name>
<Version>0.9.40</Version>
<Organization>The Honeynet Project</Organization>
</Tool>
</Tools_Used>
</Analysis>
</Analyses>
<Behaviors>
<Behavior id="maec:thug:bhv:4">
<Description>
<Text>[window open redirection] about:blank -> http://desbloquear.celularmovel.com/</Text>
</Description>
<Discovery_Method tool_id="maec:thug:tol:1" method="Dynamic Analysis"/>
</Behavior>
<Behavior id="maec:thug:bhv:5">
<Description>
<Text>[HTTP] URL: http://desbloquear.celularmovel.com/ (Status: 200, Referer: None)</Text>
</Description>
<Discovery_Method tool_id="maec:thug:tol:1" method="Dynamic Analysis"/>
</Behavior>
<Behavior id="maec:thug:bhv:6">
<Description>
<Text>[HTTP] URL: http://desbloquear.celularmovel.com/ (Content-type: text/html, MD5: f1fb042c62910c34be16ad91cbbd71fa)</Text>
</Description>
<Discovery_Method tool_id="maec:thug:tol:1" method="Dynamic Analysis"/>
</Behavior>
<Behavior id="maec:thug:bhv:7">
<Description>
<Text>[meta redirection] http://desbloquear.celularmovel.com/ -> http://desbloquear.celularmovel.com/cgi-sys/defaultwebpage.cgi</Text>
</Description>
<Discovery_Method tool_id="maec:thug:tol:1" method="Dynamic Analysis"/>
</Behavior>
<Behavior id="maec:thug:bhv:8">
<Description>
<Text>[HTTP] URL: http://desbloquear.celularmovel.com/cgi-sys/defaultwebpage.cgi (Status: 200, Referer: http://desbloquear.celularmovel.com/)</Text>
</Description>
<Discovery_Method tool_id="maec:thug:tol:1" method="Dynamic Analysis"/>
</Behavior>
<Behavior id="maec:thug:bhv:9">
<Description>
<Text>[HTTP] URL: http://desbloquear.celularmovel.com/cgi-sys/defaultwebpage.cgi (Content-type: text/html, MD5: a28fe921afb898e60cc334e06f71f46e)</Text>
</Description>
<Discovery_Method tool_id="maec:thug:tol:1" method="Dynamic Analysis"/>
</Behavior>
</Behaviors>
<Pools/>
</MAEC_Bundle>
This is the Python parsing code. It only writes "Operation" to the file and never enters the loop:
import xml.etree.ElementTree as ET

def logsParsing():
    tree = ET.parse(
        'analysis.xml')
    root = tree.getroot()
    with open('sample1.txt', 'w') as f:
        f.write('Operation\n')
    with open('sample1.txt', 'a') as f:
        for behavior in root.findall('Behaviors'):
            operation = behavior.find('Behavior').find('Description').find('Text').text
            line_to_write = operation + '\n'
            f.write(line_to_write)
    f.close()

logsParsing()
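The loop never runs because the root element declares a default namespace (xmlns="http://maec.mitre.org/XMLSchema/maec-core-1"), so every unprefixed tag in the file belongs to it and root.findall('Behaviors') matches nothing. See [Python 3.Docs]: xml.etree.ElementTree - The ElementTree XML API, in particular the sections "Parsing XML with Namespaces" and "XPath support".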
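To see why an unqualified tag name matches nothing, here is a minimal sketch. It assumes a trimmed-down stand-in for analysis.xml (the SAMPLE string is made up for illustration and only keeps the default namespace and one Behavior). ElementTree stores each tag as {namespace-uri}LocalName, so only the qualified name is found:

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for analysis.xml; it keeps the same
# default namespace declaration as the real file.
SAMPLE = """<MAEC_Bundle xmlns="http://maec.mitre.org/XMLSchema/maec-core-1">
  <Behaviors>
    <Behavior id="maec:thug:bhv:4">
      <Description><Text>first</Text></Description>
    </Behavior>
  </Behaviors>
</MAEC_Bundle>"""

root = ET.fromstring(SAMPLE)

# The stored tag carries the namespace URI in braces.
print(root.tag)

# Unqualified lookup: no element is literally named "Behaviors".
unqualified = root.findall("Behaviors")

# Qualified lookup: the namespace URI is part of the element name.
qualified = root.findall("{http://maec.mitre.org/XMLSchema/maec-core-1}Behaviors")

print(len(unqualified), len(qualified))
```

The first lookup mirrors the loop in the question, which is why its body is never executed.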
Here's a way of handling things.
code00.py :
#!/usr/bin/env python3

import sys
import xml.etree.ElementTree as ET


def main():
    tree = ET.parse("analysis.xml")
    root_node = tree.getroot()
    namespaces = {
        "xmlns": "http://maec.mitre.org/XMLSchema/maec-core-1",  # Namespace (default) from XML file (this is the only one we need, as tags that matter to us are not prefixed)
    }
    xpath = "./{0:s}:Behaviors/{0:s}:Behavior/{0:s}:Description/{0:s}:Text".format("xmlns")  # Compute each "Text" node full path
    print("Nodes to search: {0:s}".format(xpath))
    text_nodes = root_node.findall(xpath, namespaces)
    with open("sample1.txt", "w") as fout:  # Only open the out file once
        node_count = 0
        fout.write("Operation:\n")
        for text_node in text_nodes:
            fout.write(text_node.text + "\n")
            node_count += 1
    print("Wrote {0:d} nodes info.".format(node_count))


if __name__ == "__main__":
    print("Python {0:s} {1:d}bit on {2:s}\n".format(" ".join(item.strip() for item in sys.version.split("\n")), 64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    main()
    print("\nDone.")
Output :
[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q059057339]> "e:\Work\Dev\VEnvs\py_064_03.07.03_test0\Scripts\python.exe" code00.py
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)] 64bit on win32

Nodes to search: ./xmlns:Behaviors/xmlns:Behavior/xmlns:Description/xmlns:Text
Wrote 6 nodes info.

Done.

[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q059057339]> type sample1.txt
Operation:
[window open redirection] about:blank -> http://desbloquear.celularmovel.com/
[HTTP] URL: http://desbloquear.celularmovel.com/ (Status: 200, Referer: None)
[HTTP] URL: http://desbloquear.celularmovel.com/ (Content-type: text/html, MD5: f1fb042c62910c34be16ad91cbbd71fa)
[meta redirection] http://desbloquear.celularmovel.com/ -> http://desbloquear.celularmovel.com/cgi-sys/defaultwebpage.cgi
[HTTP] URL: http://desbloquear.celularmovel.com/cgi-sys/defaultwebpage.cgi (Status: 200, Referer: http://desbloquear.celularmovel.com/)
[HTTP] URL: http://desbloquear.celularmovel.com/cgi-sys/defaultwebpage.cgi (Content-type: text/html, MD5: a28fe921afb898e60cc334e06f71f46e)
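As a possible variation on the approach above (not part of the original script), the lookup can be wrapped in a small function that also tolerates empty Text elements, whose .text is None and would otherwise make the string concatenation fail. The names extract_texts, NS, the "maec" prefix, and the SAMPLE string are made up for illustration. The prefix used in the mapping is arbitrary; only the URI has to match the one in the file.

```python
import xml.etree.ElementTree as ET

# Prefix name is arbitrary (hypothetical choice); the URI comes from the XML file.
NS = {"maec": "http://maec.mitre.org/XMLSchema/maec-core-1"}

# Hypothetical, shortened stand-in for analysis.xml with one empty Text element.
SAMPLE = """<MAEC_Bundle xmlns="http://maec.mitre.org/XMLSchema/maec-core-1">
  <Behaviors>
    <Behavior id="maec:thug:bhv:4">
      <Description><Text>first</Text></Description>
    </Behavior>
    <Behavior id="maec:thug:bhv:5">
      <Description><Text/></Description>
    </Behavior>
  </Behaviors>
</MAEC_Bundle>"""


def extract_texts(xml_source):
    """Return the body of every Behavior/Description/Text element as a list of strings."""
    root = ET.fromstring(xml_source)
    path = "./maec:Behaviors/maec:Behavior/maec:Description/maec:Text"
    # An empty element has .text set to None, so fall back to "".
    return [(node.text or "").strip() for node in root.iterfind(path, NS)]


texts = extract_texts(SAMPLE)
print(texts)
```

For a file on disk, ET.parse("analysis.xml").getroot() could replace ET.fromstring, and the resulting list can be written out with "\n".join(texts).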