Editing the XML texts from an XML file using Python

I have an XML file which contains the following data:

<?xml version="1.0" encoding="UTF-8" ?>
<ParameterData>
  <CreationInfo date="10/28/2009 03:05:14 PM" user="manoj" />
  <ParameterList count="85">
    <Parameter name="Spec 2 Included" type="boolean" mode="both">
      <Value>n/a</Value>
      <Result>n/a</Result>
    </Parameter>
    <Parameter name="Spec 2 Label" type="string" mode="both">
      <Value>n/a</Value>
      <Result>n/a</Result>
    </Parameter>
    <Parameter name="Spec 3 Included" type="boolean" mode="both">
      <Value>n/a</Value>
      <Result>n/a</Result>
    </Parameter>
    <Parameter name="Spec 3 Label" type="string" mode="both">
      <Value>n/a</Value>
      <Result>n/a</Result>
    </Parameter>
  </ParameterList>
</ParameterData>

I also have a text file with lines such as:

Spec 2 Included : TRUE
Spec 2 Label: 19-Flat2-HS3   
Spec 3 Included : FALSE
Spec 3 Label: 4-1-Bead1-HS3

Now I want to edit the XML text, i.e. replace each n/a field with the corresponding value from the text file, so that the file looks like this:

<?xml version="1.0" encoding="UTF-8" ?>
<ParameterData>
  <CreationInfo date="10/28/2009 03:05:14 PM" user="manoj" />
  <ParameterList count="85">
    <Parameter name="Spec 2 Included" type="boolean" mode="both">
      <Value>TRUE</Value>
      <Result>TRUE</Result>
    </Parameter>
    <Parameter name="Spec 2 Label" type="string" mode="both">
      <Value>19-Flat2-HS3</Value>
      <Result>19-Flat2-HS3</Result>
    </Parameter>
    <Parameter name="Spec 3 Included" type="boolean" mode="both">
      <Value>FALSE</Value>
      <Result>FALSE</Result>
    </Parameter>
    <Parameter name="Spec 3 Label" type="string" mode="both">
      <Value>4-1-Bead1-HS3</Value>
      <Result>4-1-Bead1-HS3</Result>
    </Parameter>
  </ParameterList>
</ParameterData>

I am new to Python XML coding and don't know how to edit the text fields in an XML file. I am trying to use the elementtree.ElementTree module, but I don't know which modules need to be imported to read the lines of the XML file and extract the attributes.

Please help.

Thanks and regards.

You can convert your text data into a Python dictionary with a regular expression:

data="""Spec 2 Included : TRUE
Spec 2 Label: 19-Flat2-HS3
Spec 3 Included : FALSE
Spec 3 Label: 4-1-Bead1-HS3"""

# data = open("data.txt").read()

import re

# Raw string so that \d and \s reach the regex engine unchanged
data = dict(re.findall(r'(Spec \d+ (?:Included|Label))\s*:\s*(\S+)', data))

data will then be as follows:

{'Spec 3 Included': 'FALSE', 'Spec 2 Included': 'TRUE', 'Spec 3 Label': '4-1-Bead1-HS3', 'Spec 2 Label': '19-Flat2-HS3'}

Then you can apply it with any XML parser you like; I will use minidom here.

from xml.dom import minidom

# xml_text holds the XML document from the question as a string
dom = minidom.parseString(xml_text)
params = dom.getElementsByTagName("Parameter")
for param in params:
    name = param.getAttribute("name")
    if name in data:
        # "*" matches every child element; use "Value" or "Result" to change only one
        for item in param.getElementsByTagName("*"):
            item.firstChild.replaceWholeText(data[name])

print(dom.toxml())

# write to file
with open("output.xml", "w", encoding="utf-8") as out_file:
    out_file.write(dom.toxml())

Results

<?xml version="1.0" ?><ParameterData>
  <CreationInfo date="10/28/2009 03:05:14 PM" user="manoj"/>
  <ParameterList count="85">
    <Parameter mode="both" name="Spec 2 Included" type="boolean">
      <Value>TRUE</Value>
      <Result>TRUE</Result>
    </Parameter>
    <Parameter mode="both" name="Spec 2 Label" type="string">
      <Value>19-Flat2-HS3</Value>
      <Result>19-Flat2-HS3</Result>
    </Parameter>
    <Parameter mode="both" name="Spec 3 Included" type="boolean">
      <Value>FALSE</Value>
      <Result>FALSE</Result>
    </Parameter>
    <Parameter mode="both" name="Spec 3 Label" type="string">
      <Value>4-1-Bead1-HS3</Value>
      <Result>4-1-Bead1-HS3</Result>
    </Parameter>
  </ParameterList>
</ParameterData>
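Note that the XML declaration in the result above no longer carries encoding="UTF-8". If that matters, one option is to pass an encoding to toxml(), which then returns bytes that start with a declaration naming that encoding. The following is only a sketch: the inline xml_text is a shortened stand-in for the real file, not the full document from the question.

```python
from xml.dom import minidom

# Shortened stand-in for the real document (assumption for illustration only).
xml_text = (
    '<ParameterData><Parameter name="Spec 2 Label">'
    '<Value>n/a</Value></Parameter></ParameterData>'
)

dom = minidom.parseString(xml_text)

# With an explicit encoding, toxml() returns bytes and the XML
# declaration at the start names that encoding.
xml_bytes = dom.toxml(encoding="UTF-8")
print(xml_bytes[:40])
```

Because the result is bytes, it should be written to a file opened in binary mode ("wb").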

Well, you could start with:

import xml.etree.ElementTree as ET
tree = ET.parse("blah.xml")

Then find the elements you want to modify.

To replace the contents of an element, just do:

element.text = "TRUE"

The import statement above works in Python 2.5 or later. If you have an older version of Python, you'll need to install ElementTree as an extension, and then the import statement is different: import elementtree.ElementTree as ET.
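Put together, the steps in this answer might look like the sketch below. It parses an inline sample trimmed from the question (SAMPLE and the hard-coded "TRUE" are placeholders for illustration), whereas a real script would parse the file and take values from the text file.

```python
import xml.etree.ElementTree as ET

# Inline sample trimmed from the question; a real script would call ET.parse(path).
SAMPLE = """<ParameterData>
  <ParameterList count="85">
    <Parameter name="Spec 2 Included" type="boolean" mode="both">
      <Value>n/a</Value>
      <Result>n/a</Result>
    </Parameter>
  </ParameterList>
</ParameterData>"""

root = ET.fromstring(SAMPLE)

# iter() walks the whole tree and yields every <Parameter> element.
for parameter in root.iter("Parameter"):
    if parameter.get("name") == "Spec 2 Included":
        # Assigning to .text replaces the element's text content.
        parameter.find("Value").text = "TRUE"
        parameter.find("Result").text = "TRUE"

updated = ET.tostring(root, encoding="unicode")
print(updated)
```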

Unfortunately, the XPath supported by ElementTree isn't complete. Since Python 2.6 includes an older version, finding elements by attribute (as stated here) does not work. So Python's own documentation should be your first stop: xml.etree.ElementTree

import xml.etree.ElementTree as ET

original = ET.parse("original.xml")
parameters = original.findall(".//Parameter")
changes = {}

# read changes
with open("changes.txt", "r") as in_file:
    for change in in_file:
        change = change.strip()                 # remove line endings
        if not change:
            continue                            # skip blank lines
        name, value = change.split(":", 1)      # split on the first colon only
        changes[name.strip()] = value.strip()   # remove whitespace

# find parameter elements and apply changes
for parameter in parameters:
    parameter_name = parameter.get("name")
    if parameter_name in changes:
        value = parameter.find("./Value")
        value.text = changes[parameter_name]
        result = parameter.find("./Result")
        result.text = changes[parameter_name]

original.write("new.xml")
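The attribute limitation mentioned above applies to the ElementTree bundled with Python 2.6. From Python 2.7 onward (and in Python 3), the [@attrib='value'] predicate is supported, so each parameter can be looked up directly by name. The sketch below assumes an inline sample and a hard-coded changes dictionary in place of the real files, and it assumes the names contain no single quotes.

```python
import xml.etree.ElementTree as ET

# Inline stand-in for original.xml (assumption for illustration only).
SAMPLE = """<ParameterList count="85">
  <Parameter name="Spec 2 Label" type="string" mode="both">
    <Value>n/a</Value>
    <Result>n/a</Result>
  </Parameter>
  <Parameter name="Spec 3 Label" type="string" mode="both">
    <Value>n/a</Value>
    <Result>n/a</Result>
  </Parameter>
</ParameterList>"""

root = ET.fromstring(SAMPLE)
changes = {"Spec 2 Label": "19-Flat2-HS3", "Spec 3 Label": "4-1-Bead1-HS3"}

for name, value in changes.items():
    # The predicate selects the <Parameter> whose name attribute matches.
    parameter = root.find(".//Parameter[@name='%s']" % name)
    if parameter is None:
        continue  # no such parameter in the document
    for tag in ("Value", "Result"):
        parameter.find(tag).text = value

print(ET.tostring(root, encoding="unicode"))
```

This looks up one element per change instead of scanning every parameter, which may read more naturally when only a few entries change.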

Here is how you could do it using Amara:

from amara import bindery

doc = bindery.parse(XML)

def cleanup_for_dict(key, value):
    return key.strip(), value.strip()

params = dict(( cleanup_for_dict(*line.split(':', 1))
                for line in TEXT.splitlines()))

for param in doc.ParameterData.ParameterList.Parameter:
    if param.name in params:
        param.Value = params[param.name]
        param.Result = params[param.name]

doc.xml_write()
