Using Python to extract information from an XML file?

Can anyone help with using Python to extract information from an XML file? This is my example XML:

<root>
    <number index="2">
        <info>
            <info.RANDOM>Random Text</info.RANDOM>
        </info>
    </number>
</root>

What I want to print is the information between the root tags. However, I want it printed as is: all the tags, the text between the tags, and the attributes inside the tags (in this case, index="2" on number). I have tried itertext(), but that removes the tags and prints only the text between the root tags. So far, I have a makeshift solution that prints only element.tag and element.text, but it does not print the end tags or the attributes. Any help would be appreciated! :)

With s as your input:

s='''<root>
      <number index="2">
        <info>
            <info.RANDOM>Random Text</info.RANDOM>
        </info>
        </number>
</root>'''

Find all elements with the tag name number and convert each one to a string using ET.tostring():

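import xml.etree.ElementTree as ET

root = ET.fromstring(s)
# './/number' matches every number element at any depth below root
for node in root.findall('.//number'):
    # encoding='unicode' returns str; on Python 3 the default is bytes
    print(ET.tostring(node, encoding='unicode'))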

Output:

<number index="2">
        <info>
            <info.RANDOM>Random Text</info.RANDOM>
        </info>
        </number>
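
If the goal is everything between the root tags rather than one named element, one option is to serialize each child of root and join the results. This is a minimal sketch assuming Python 3; the helper name inner_xml is made up for illustration and is not part of ElementTree.

```python
import xml.etree.ElementTree as ET


def inner_xml(element):
    """Return the markup between an element's start and end tags as a str."""
    # element.text is the text before the first child (None if there is none).
    parts = [element.text or ""]
    for child in element:
        # encoding="unicode" makes tostring() return str instead of bytes;
        # the serialized child also includes its tail text.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


s = '''<root>
    <number index="2">
        <info>
            <info.RANDOM>Random Text</info.RANDOM>
        </info>
    </number>
</root>'''

root = ET.fromstring(s)
print(inner_xml(root))
```

Because the root element itself is never serialized, its own start and end tags are left out while the children keep their tags, attributes, and text.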
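
from bs4 import BeautifulSoup

# The "xml" parser requires the lxml package to be installed.
xml = '<root><number index="2"><info><info.RANDOM>Random Text</info.RANDOM></info></number></root>'
soup = BeautifulSoup(xml, "xml")

output = soup.prettify()
# Slice out everything between "<root>\n" and "</root>".
print(output[output.find("<root>") + 7:output.rfind("</root>")])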

The + 7 skips the opening tag and the newline after it: "<root>\n" is seven characters long.
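
The fixed offset only fits a tag named root followed by a single newline. The sketch below derives the offset from the tag name instead. The function between_tags is hypothetical, relies on plain string slicing, and assumes the opening tag carries no attributes; the pretty string is a hand-written stand-in for pretty-printed output, not actual prettify() output.

```python
def between_tags(text, tag):
    """Return the text between the first <tag> and the last </tag>."""
    open_tag = "<" + tag + ">"
    close_tag = "</" + tag + ">"
    start = text.find(open_tag)
    end = text.rfind(close_tag)
    if start == -1 or end == -1:
        raise ValueError("tag %r not found" % tag)
    # Skip the opening tag itself rather than a fixed number of characters.
    return text[start + len(open_tag):end]


# Hand-written stand-in for pretty-printed XML (an assumption for this demo).
pretty = '<root>\n <number index="2">\n  <info/>\n </number>\n</root>'
# strip("\n") drops the newlines that pretty-printing puts around the content.
print(between_tags(pretty, "root").strip("\n"))
```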
