Parsing an XML file with Python xml.dom.minidom

Question

I am trying to parse an XML file with Python for a school project.

To check whether the parsing works, I print the values under "lista_marfuri".

It fails with the following error: xml.parsers.expat.ExpatError: XML declaration not well-formed: line 1, column 35

The XML code is:

<?xml version="1.0" encoding="UTF-8 standalone="yes"?>

<fapte>
    <lista_marfuri>
        <marfa> 
            <id> 1 </id>
            <nume> grebla </nume>
            <categorie> gradinarit </gradinarit>
            <cantitate> 100 </cantitate>
            <pret> 20 </pret>
        </marfa>
        <marfa> 
            <id> 2 </id>
            <nume> sac 1kg ingrasamant </nume>
            <categorie> gradinarit </gradinarit>
            <cantitate> 300 </cantitate>
            <pret> 30 </pret>
        </marfa>
        <marfa> 
            <id> 3 </id>
            <nume> surubelnita </nume>
            <categorie> general </gradinarit>
            <cantitate> 200 </cantitate>
            <pret> 5 </pret>
        </marfa>
    </lista_marfuri>
    
    
    <lista_categorii>
        ...
    </lista_categorii>
    
    <lista_clienti>
        ...
    </lista_clienti>
    
    <lista_comenzi>
        ...
    </lista_comenzi>
    
</fapte>

And the Python code is:

import xml.dom.minidom

tree = xml.dom.minidom.parse('SBC.xml')

fapte = tree.documentElement

marfuri = fapte.getElementsByTagName('marfa')

for marfa in marfuri:
    print(f"-- Marfa {marfa.getAttribute('id')} --")

    nume = marfa.getElementByTagName('nume')[0].childNodes[0].nodeValue
    categorie = marfa.getElementByTagName('categorie')[0].childNodes[0].nodeValue
    cantitate = marfa.getElementByTagName('cantitate')[0].childNodes[0].nodeValue
    pret = marfa.getElementByTagName('pret')[0].childNodes[0].nodeValue

    print(f"Nume: {nume}")
    print(f"Categorie: {categorie}")
    print(f"Cantitate: {cantitate}")
    print(f"Pret: {pret}")

Answer 1

I think working with ElementTree will make your life easier.

import xml.etree.ElementTree as ET

xml = '''<fapte>
    <lista_marfuri>
        <marfa> 
            <id> 1 </id>
            <nume> grebla </nume>
            <categorie> gradinarit </categorie>
            <cantitate> 100 </cantitate>
            <pret> 20 </pret>
        </marfa>
        <marfa> 
            <id> 2 </id>
            <nume> sac 1kg ingrasamant </nume>
            <categorie> gradinarit </categorie>
            <cantitate> 300 </cantitate>
            <pret> 30 </pret>
        </marfa>
        <marfa> 
            <id> 3 </id>
            <nume> surubelnita </nume>
            <categorie> general </categorie>
            <cantitate> 200 </cantitate>
            <pret> 5 </pret>
        </marfa>
    </lista_marfuri>
</fapte>'''

root = ET.fromstring(xml)
for marfa in root.findall('.//marfa'):
    for entry in marfa:
        print(f'{entry.tag} : {entry.text.strip()}')
    print('------------------')

Output:

id : 1
nume : grebla
categorie : gradinarit
cantitate : 100
pret : 20
------------------
id : 2
nume : sac 1kg ingrasamant
categorie : gradinarit
cantitate : 300
pret : 30
------------------
id : 3
nume : surubelnita
categorie : general
cantitate : 200
pret : 5
------------------
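
If the values are needed for further processing rather than just printing, one possible extension of the loop above is to collect each marfa into a dictionary. This is only a sketch: the helper name marfa_to_dict and the INT_FIELDS set are hypothetical, and treating id, cantitate and pret as whole numbers is an assumption based on the sample data.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample data, using matching </categorie> closing tags.
XML_TEXT = """<fapte>
    <lista_marfuri>
        <marfa>
            <id> 1 </id>
            <nume> grebla </nume>
            <categorie> gradinarit </categorie>
            <cantitate> 100 </cantitate>
            <pret> 20 </pret>
        </marfa>
        <marfa>
            <id> 3 </id>
            <nume> surubelnita </nume>
            <categorie> general </categorie>
            <cantitate> 200 </cantitate>
            <pret> 5 </pret>
        </marfa>
    </lista_marfuri>
</fapte>"""

# Assumption: these fields hold whole numbers, as they do in the sample data.
INT_FIELDS = {"id", "cantitate", "pret"}


def marfa_to_dict(marfa):
    """Turn one <marfa> element into a dict of stripped, typed values."""
    record = {}
    for entry in marfa:
        text = (entry.text or "").strip()  # entry.text is None for empty tags
        record[entry.tag] = int(text) if entry.tag in INT_FIELDS else text
    return record


root = ET.fromstring(XML_TEXT)
marfuri = [marfa_to_dict(m) for m in root.findall(".//marfa")]
for record in marfuri:
    print(record)
```

To read from the file instead of a string, ET.parse('SBC.xml').getroot() can take the place of ET.fromstring(XML_TEXT).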

Answer 2

If the XML is valid, i.e. once the missing closing quote (") is added in the first line as @mzjn noted (this is also what your error message points to), then the shortest option is pandas read_xml():

import pandas as pd

df = pd.read_xml('yourFileName.xml', xpath='.//marfa')
print(df)

Output:


   id                 nume   categorie  cantitate  pret
0   1               grebla  gradinarit        100    20
1   2  sac 1kg ingrasamant  gradinarit        300    30
2   3          surubelnita     general        200     5

PS: This works only if all the values you are interested in are at the same level in the tree.
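
For anyone who prefers to stay with xml.dom.minidom, the following is a rough sketch of the question's approach after the declaration is repaired. Besides the missing quote after UTF-8, the posted material has two further problems that would surface next: the categorie elements are closed with </gradinarit>, and the method is called getElementsByTagName (plural), not getElementByTagName. Note also that id is a child element here, not an attribute, so getAttribute('id') would return an empty string. The helper child_text and the inline strings BROKEN and FIXED are illustrative names, not part of the original post.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError

# The declaration from the question: the closing quote after UTF-8 is missing.
BROKEN = '<?xml version="1.0" encoding="UTF-8 standalone="yes"?>\n<fapte/>'

try:
    xml.dom.minidom.parseString(BROKEN)
    broken_parsed = True
except ExpatError as exc:
    broken_parsed = False
    print(f"Parse failed: {exc}")

# Same declaration with the quote restored, plus matching </categorie> tags.
FIXED = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<fapte>
    <lista_marfuri>
        <marfa>
            <id> 1 </id>
            <nume> grebla </nume>
            <categorie> gradinarit </categorie>
            <cantitate> 100 </cantitate>
            <pret> 20 </pret>
        </marfa>
        <marfa>
            <id> 2 </id>
            <nume> sac 1kg ingrasamant </nume>
            <categorie> gradinarit </categorie>
            <cantitate> 300 </cantitate>
            <pret> 30 </pret>
        </marfa>
    </lista_marfuri>
</fapte>"""

FIELDS = ("id", "nume", "categorie", "cantitate", "pret")


def child_text(parent, tag):
    """Return the stripped text of the first <tag> element below parent."""
    node = parent.getElementsByTagName(tag)[0]
    return node.firstChild.nodeValue.strip()


tree = xml.dom.minidom.parseString(FIXED)
rows = []
for marfa in tree.documentElement.getElementsByTagName("marfa"):
    row = {tag: child_text(marfa, tag) for tag in FIELDS}
    rows.append(row)
    print(f"-- Marfa {row['id']} --")
    print(f"Nume: {row['nume']}")
    print(f"Categorie: {row['categorie']}")
    print(f"Cantitate: {row['cantitate']}")
    print(f"Pret: {row['pret']}")
```

With the real file, xml.dom.minidom.parse('SBC.xml') would replace the parseString call once the file itself has been corrected.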

Answer 1
1 2022-12-24 14:38:11

Answer 2
0 2022-12-27 05:59:00