Include one XML file within another and parse it with Python

I want to include one XML file in another XML file and parse the result with Python. I am trying to achieve this with XInclude. There is a file1.xml which looks like this:

<?xml version="1.0"?>
<root>
  <document xmlns:xi="http://www.w3.org/2001/XInclude">
     <xi:include href="file2.xml" parse="xml" />
  </document>
  <test>some text</test>
</root>

and file2.xml which looks like

<para>This is a paragraph.</para>

In my Python code I tried to access it like this:

from xml.etree import ElementTree, ElementInclude

tree = ElementTree.parse("file1.xml")
root = tree.getroot()
# Iterate over the direct children of the root element
# (getchildren() was removed in Python 3.9).
for child in root:
    print(child.tag)

It prints the tags of all child elements of the root:

document
test

But when I try to print the child objects directly, like this:

print(root.document)
print(root.test)

it says that root doesn't have children named test or document. So how am I supposed to access the content of file2.xml?

I know that I can access XML elements from Python using a schema, like this:

    from lxml import etree, objectify

    # configSchema and xmlContents are strings holding the XSD and the XML document
    schema = etree.XMLSchema(objectify.fromstring(configSchema))
    xmlParser = objectify.makeparser(schema=schema)
    cfg = objectify.fromstring(xmlContents, xmlParser)
    print(cfg.elementName)  # access element

But since one XML file is included in another here, I am not sure how to write the schema. How can I solve this?

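The example below parses both documents separately and inserts the second one into the first, without using XInclude: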

import xml.etree.ElementTree as ET


xml1 = '''<?xml version="1.0"?>
<root>
  <test>some text</test>
</root>'''

xml2 = '''<para>This is a paragraph.</para>'''

root1 = ET.fromstring(xml1)
root2 = ET.fromstring(xml2)

root1.insert(0,root2)

para_value = root1.find('.//para').text
print(para_value)

output

This is a paragraph.
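
If the XInclude markup from the question should be kept, the standard library's ElementInclude module can expand it after parsing. The following is a minimal sketch, not a definitive implementation: the DOCUMENTS dictionary and the load_from_memory loader are hypothetical helpers introduced here so the example runs without files on disk. It also shows why root.document fails: Element objects do not expose children as attributes, so children are looked up with find() instead.

```python
from xml.etree import ElementTree, ElementInclude

# In-memory stand-ins for file1.xml and file2.xml from the question.
DOCUMENTS = {
    "file1.xml": """<?xml version="1.0"?>
<root>
  <document xmlns:xi="http://www.w3.org/2001/XInclude">
     <xi:include href="file2.xml" parse="xml" />
  </document>
  <test>some text</test>
</root>""",
    "file2.xml": "<para>This is a paragraph.</para>",
}


def load_from_memory(href, parse, encoding=None):
    """Hypothetical loader: resolve href against DOCUMENTS instead of disk."""
    text = DOCUMENTS[href]
    if parse == "xml":
        return ElementTree.fromstring(text)
    return text  # parse="text" includes expect a plain string


root = ElementTree.fromstring(DOCUMENTS["file1.xml"])

# Replace every xi:include element in the tree with the loaded content.
ElementInclude.include(root, loader=load_from_memory)

# Children are found by path, not by attribute access.
para = root.find("document/para")
print(para.text)               # This is a paragraph.
print(root.find("test").text)  # some text
```

With real files, calling ElementInclude.include(root) without a loader should fall back to the module's default loader, which opens the href as a file path, so file2.xml would need to be reachable from the current working directory.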

I am not sure why you want to use XInclude. Including one XML file in another is a basic mechanism of SGML and XML, and it can be achieved without XInclude by declaring an external entity, as simply as:

<!DOCTYPE root [
  <!ENTITY externaldoc SYSTEM "file2.xml">
]>
<root>
  <document>
    &externaldoc;
  </document>
  <test>some text</test>
</root>
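
One caveat when parsing this from Python: the external entity is only replaced if the parser is set up to resolve external entities. The sketch below, which assumes only the standard library, shows that xml.etree.ElementTree does not expand the entity and raises a ParseError instead. The variable names are illustrative. A parser that resolves external entities would be needed for this approach, for example lxml with a suitably configured XMLParser (an assumption not tested here).

```python
import xml.etree.ElementTree as ET

# The document from the answer above, with an external entity reference.
doc = '''<!DOCTYPE root [
  <!ENTITY externaldoc SYSTEM "file2.xml">
]>
<root>
  <document>
    &externaldoc;
  </document>
  <test>some text</test>
</root>'''

try:
    ET.fromstring(doc)
    expanded = True
except ET.ParseError as exc:
    # The stdlib parser refuses to resolve external entities.
    expanded = False
    print("Entity was not expanded:", exc)
```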
