Parsing XML: How can I get all the information from lines with same name but different text in XML file using Python?

I am trying to parse an ICD10 XML file, but I am having some trouble extracting the information.

<diag>
<name>A00</name>
<desc>Cholera</desc>
<diag>
  <name>A00.0</name>
  <desc>Cholera due to Vibrio cholerae 01, biovar cholerae</desc>
  <inclusionTerm>
    <note>Classical cholera</note>
    <note>Classical cholera again</note>
  </inclusionTerm>
</diag>
<diag>
  <name>A00.1</name>
  <desc>Cholera due to Vibrio cholerae 01, biovar eltor</desc>
  <inclusionTerm>
    <note>Cholera eltor</note>
  </inclusionTerm>
</diag>
<diag>
  <name>A00.9</name>
  <desc>Cholera, unspecified</desc>
</diag>
</diag>

Using this:

from xml.etree import ElementTree as ET
root = ET.parse('cut.xml')
diag = root.find(".//*[name='A00.0']")
inclusionTerm = diag.find('inclusionTerm')
if inclusionTerm is not None:
    print('Inclusion Term: '+diag.find('inclusionTerm').find('note').text)

This code only prints the first note inside "inclusionTerm" for the A00.0 ID. How can I write the code to get all of the "note" elements in "inclusionTerm"?

You can write an XPath expression that selects all of the note elements:

from xml.etree import ElementTree as ET

xml = '''<diag>
<name>A00</name>
<desc>Cholera</desc>
<diag>
  <name>A00.0</name>
  <desc>Cholera due to Vibrio cholerae 01, biovar cholerae</desc>
  <inclusionTerm>
    <note>Classical cholera</note>
    <note>Classical cholera again</note>
  </inclusionTerm>
</diag>
<diag>
  <name>A00.1</name>
  <desc>Cholera due to Vibrio cholerae 01, biovar eltor</desc>
  <inclusionTerm>
    <note>Cholera eltor</note>
  </inclusionTerm>
</diag>
<diag>
  <name>A00.9</name>
  <desc>Cholera, unspecified</desc>
</diag>
</diag>'''

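# Parse the XML string; fromstring() returns the root element directly.
root = ET.fromstring(xml)

# findall() returns every match, whereas find() returns only the first one.
notes = root.findall('.//diag[name="A00.0"]/inclusionTerm/note')

for note in notes:
    print(note.text)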
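
If the notes are needed for every code rather than just A00.0, the same idea can be applied per diag element. The following is only a sketch under a few assumptions: it uses a trimmed copy of the sample XML from the question, and notes_by_code is a hypothetical helper name that does not appear in the original answer.

```python
from xml.etree import ElementTree as ET

# Trimmed copy of the sample document from the question (desc elements omitted).
xml = '''<diag>
<name>A00</name>
<diag>
  <name>A00.0</name>
  <inclusionTerm>
    <note>Classical cholera</note>
    <note>Classical cholera again</note>
  </inclusionTerm>
</diag>
<diag>
  <name>A00.1</name>
  <inclusionTerm>
    <note>Cholera eltor</note>
  </inclusionTerm>
</diag>
<diag>
  <name>A00.9</name>
</diag>
</diag>'''


def notes_by_code(root):
    """Map each diag code to the note texts in its own inclusionTerm.

    Hypothetical helper, written for illustration only.
    """
    result = {}
    # iter() yields the root <diag> itself as well as every nested <diag>.
    for diag in root.iter('diag'):
        code = diag.findtext('name')
        # A relative path without './/' only matches an inclusionTerm that is
        # a direct child of this diag, so notes of nested diags are not mixed in.
        result[code] = [n.text for n in diag.findall('inclusionTerm/note')]
    return result


root = ET.fromstring(xml)
notes = notes_by_code(root)
print(notes['A00.0'])
```

Codes without an inclusionTerm, such as A00.9, map to an empty list. When reading from a file as in the question, ET.parse('cut.xml').getroot() should give an equivalent root element.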

