Parsing XML: How can I get all the information from lines with same name but different text in XML file using Python?
I am trying to parse an ICD10 XML file, but I am having trouble extracting some of the information.
<diag>
<name>A00</name>
<desc>Cholera</desc>
<diag>
<name>A00.0</name>
<desc>Cholera due to Vibrio cholerae 01, biovar cholerae</desc>
<inclusionTerm>
<note>Classical cholera</note>
<note>Classical cholera again</note>
</inclusionTerm>
</diag>
<diag>
<name>A00.1</name>
<desc>Cholera due to Vibrio cholerae 01, biovar eltor</desc>
<inclusionTerm>
<note>Cholera eltor</note>
</inclusionTerm>
</diag>
<diag>
<name>A00.9</name>
<desc>Cholera, unspecified</desc>
</diag>
</diag>
Using this:
from xml.etree import ElementTree as ET
root = ET.parse('cut.xml')
diag = root.find(".//*[name='A00.0']")
inclusionTerm = diag.find('inclusionTerm')
if inclusionTerm is not None:
    print('Inclusion Term: ' + inclusionTerm.find('note').text)
This code only prints the first note inside the inclusionTerm of the A00.0 entry. How can I write code to get all of the note elements inside inclusionTerm?
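You can write an XPath expression that selects all of the note elements, and pass it to findall() instead of find(), which returns only the first match: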
from xml.etree import ElementTree as ET
xml = '''<diag>
<name>A00</name>
<desc>Cholera</desc>
<diag>
<name>A00.0</name>
<desc>Cholera due to Vibrio cholerae 01, biovar cholerae</desc>
<inclusionTerm>
<note>Classical cholera</note>
<note>Classical cholera again</note>
</inclusionTerm>
</diag>
<diag>
<name>A00.1</name>
<desc>Cholera due to Vibrio cholerae 01, biovar eltor</desc>
<inclusionTerm>
<note>Cholera eltor</note>
</inclusionTerm>
</diag>
<diag>
<name>A00.9</name>
<desc>Cholera, unspecified</desc>
</diag>
</diag>'''
root = ET.fromstring(xml)
notes = root.findall('.//diag[name="A00.0"]/inclusionTerm/note')
for note in notes:
    print(note.text)
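If the notes are needed for every code rather than just A00.0, the same idea can be applied in a loop. The following is only a sketch under some assumptions: the helper name notes_by_code is made up for illustration, the sample document is a shortened copy of the one above, and every diag element is assumed to have a name child.

```python
from xml.etree import ElementTree as ET

# Shortened copy of the sample document from the question.
xml = '''<diag>
<name>A00</name>
<desc>Cholera</desc>
<diag>
<name>A00.0</name>
<inclusionTerm>
<note>Classical cholera</note>
<note>Classical cholera again</note>
</inclusionTerm>
</diag>
<diag>
<name>A00.9</name>
<desc>Cholera, unspecified</desc>
</diag>
</diag>'''


def notes_by_code(root):
    """Map each diag code to the text of its own inclusionTerm notes.

    Hypothetical helper, not part of ElementTree.
    """
    result = {}
    # iter() walks the element itself and all descendants, so nested
    # diag elements are visited as well as the outermost one.
    for diag in root.iter('diag'):
        code = diag.findtext('name')
        # The path is relative to this diag, so notes that belong to
        # nested diag elements are not mixed in.
        result[code] = [n.text for n in diag.findall('inclusionTerm/note')]
    return result


root = ET.fromstring(xml)
all_notes = notes_by_code(root)
print(all_notes['A00.0'])
```

Codes without an inclusionTerm, such as A00.9 here, end up with an empty list, so no None check is needed before looping over the result.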