How to change attribute for XML element with named prefix

I'm having real trouble editing an XML attribute in a string that contains a single XML element whose name has a prefix.

I'm trying to use the code below:

import xml.etree.ElementTree as ET

def replace_xml_label(xml):
    element = ET.fromstring(xml)
    element.set('label', 'new_test_label')
    return ET.tostring(element).decode('ascii')

xml_1 = '<abc label="test label">test_value</abc>'
xml_2 = '<abc:option label="test label">test_value</abc:option>'

For xml_1, I get the expected output:

print(replace_xml_label(xml_1))
<abc label="new_test_label">test_value</abc>

However, the XML elements I need to work with have a name prefix, as in xml_2, which raises a ParseError:

print(replace_xml_label(xml_2))
Traceback (most recent call last):
 ... in XML parser.feed(text)
xml.etree.ElementTree.ParseError: unbound prefix: line 1, column 0

My expected output would be:

<abc:option label="new_test_label">test_value</abc:option>

I suspect the error is related to the lack of a defined namespace, but I have been unable to define one successfully (e.g. using ET.register_namespace('abc', 'my-ns')).

Trying to modify the string in-place to define a namespace:

# ...doesn't raise an exception, but the output isn't in the format I need
xml_3 = xml_2.replace('<abc:option', '<abc:option xmlns:abc="my-ns"')
print(replace_xml_label(xml_3))
<ns0:option xmlns:ns0="my-ns" label="new_test_label">test_value</ns0:option>

# replacing the output afterwards works, but by this point I may as well have used a regular expression!
print(replace_xml_label(xml_3).replace('ns0', 'abc').replace(' xmlns:abc="my-ns"',''))
<abc:option label="new_test_label">test_value</abc:option>

Am I doing something wrong, missing something obvious, or simply using the wrong tool?

I'd prefer to use what's available in the standard Python 3.4+ library.

Yes, the problem is the undeclared prefix. XML with namespaces requires every prefix in use to be declared; otherwise the document does not qualify as well-formed XML and normally cannot be parsed with an XML parser library. So the real fix belongs on the side that currently produces the XML-like document: make it emit well-formed XML.
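
To illustrate the point, here is a minimal sketch of how ElementTree treats declared and undeclared prefixes. It reuses the 'my-ns' URI from the question; the variable names are only for illustration.

```python
import xml.etree.ElementTree as ET

undeclared = '<abc:option label="test label">test_value</abc:option>'
declared = '<abc:option xmlns:abc="my-ns" label="test label">test_value</abc:option>'

# An undeclared prefix is rejected by the parser.
try:
    ET.fromstring(undeclared)
    parse_failed = False
except ET.ParseError:
    parse_failed = True

# With a declaration, the parser resolves the prefix to its URI and stores
# the tag in "{uri}localname" form; the prefix itself is not kept.
element = ET.fromstring(declared)
tag = element.tag

# Without a registered prefix, serialization invents one (ns0).
default_output = ET.tostring(element).decode('ascii')

# register_namespace() influences serialization only, not parsing.
ET.register_namespace('abc', 'my-ns')
registered_output = ET.tostring(element).decode('ascii')

print(parse_failed)
print(tag)
print(default_output)
print(registered_output)
```

This is why calling register_namespace on its own does not make xml_2 parseable, and why the prefix comes back as ns0 unless it is registered before serializing. Note that register_namespace changes a module-wide mapping.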

One possible workaround on the parsing side is to wrap the string in a parent element that declares the undeclared prefix, for example:

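xml_2 = '<abc:option label="test label">test_value</abc:option>'

# The wrapper declares the prefix; "bar" is just a placeholder namespace URI.
parent = '<foo xmlns:abc="bar">{}</foo>'
wellformed_xml = parent.format(xml_2)

# Caveat: replace_xml_label() edits the root element, which is now the
# <foo> wrapper, so the label is set on <foo> rather than on <abc:option>.
result = replace_xml_label(wellformed_xml)
print(result)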
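
The wrapper idea can be taken one step further so that the output matches the format asked for in the question. The following is only a sketch under stated assumptions: the helper name replace_label_wrapped, the DUMMY_URI constant, and the foo wrapper tag are hypothetical, and the input is assumed to be a single element with one undeclared prefix. It edits the wrapped child rather than the wrapper, registers the prefix so it survives serialization, and strips the temporary declaration afterwards.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption; any non-empty string would do).
DUMMY_URI = 'bar'


def replace_label_wrapped(xml, prefix, new_label):
    """Set `label` on a single prefixed element via a temporary wrapper."""
    # Keep the original prefix on output instead of an auto-generated ns0.
    # Note: this modifies a module-wide prefix mapping.
    ET.register_namespace(prefix, DUMMY_URI)
    declaration = ' xmlns:{}="{}"'.format(prefix, DUMMY_URI)

    # The wrapper declares the prefix so that the fragment becomes parseable.
    wrapped = '<foo{}>{}</foo>'.format(declaration, xml)
    root = ET.fromstring(wrapped)

    # root is the <foo> wrapper; the element to edit is its first child.
    element = root[0]
    element.set('label', new_label)

    # Serialize only the child, then drop the declaration ElementTree adds
    # to its start tag.
    output = ET.tostring(element).decode('ascii')
    return output.replace(declaration, '', 1)


xml_2 = '<abc:option label="test label">test_value</abc:option>'
result = replace_label_wrapped(xml_2, 'abc', 'new_test_label')
print(result)
```

The result is again not well-formed XML on its own, since the prefix is undeclared once the declaration is stripped; that is what the question asks for, but it only defers the problem to the next consumer.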

As suspected (and confirmed by har07), my main problem was the undeclared namespace prefix.

Since it's not possible to fix this at the source, and the output needs to be in the format I specified, the best workaround seems to be to temporarily convert to simple tags before processing and convert back to the original tags afterwards.

xml_2 = '<abc:option label="test label">test_value</abc:option>'

# Replace the first and last occurrences of the tag with an unprefixed one.
original_tag = 'abc:option'
temp_tag = 'abc'
valid_xml = temp_tag.join(
    xml_2.replace(original_tag, temp_tag, 1).rsplit(original_tag, 1)
)

# Replace label
modified_xml = replace_xml_label(valid_xml)

# Replace the first and last occurrences of the temporary tag to restore the prefix.
output_xml = original_tag.join(
    modified_xml.replace(temp_tag, original_tag, 1).rsplit(temp_tag, 1)
)
print(output_xml)
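
The same tag-swapping idea can be wrapped in a reusable function. This is a rough sketch, not a robust solution: the names swap_outer_tag and replace_label_via_temp_tag and the default temporary tag 'temp' are hypothetical, and it assumes the string holds exactly one element whose prefixed name appears first in the start tag and last in the end tag.

```python
import re
import xml.etree.ElementTree as ET


def swap_outer_tag(text, old, new):
    """Replace the first and the last occurrence of `old` with `new`."""
    return new.join(text.replace(old, new, 1).rsplit(old, 1))


def replace_label_via_temp_tag(xml, new_label, temp_tag='temp'):
    # Read the prefixed tag name from the start tag, e.g. 'abc:option'.
    match = re.match(r'<([A-Za-z_][\w.-]*:[A-Za-z_][\w.-]*)', xml)
    if match is None:
        raise ValueError('expected a single element with a prefixed name')
    original_tag = match.group(1)

    # Swap in a plain tag so the string parses, edit, then swap back.
    valid_xml = swap_outer_tag(xml, original_tag, temp_tag)
    element = ET.fromstring(valid_xml)
    element.set('label', new_label)
    modified_xml = ET.tostring(element).decode('ascii')
    return swap_outer_tag(modified_xml, temp_tag, original_tag)


xml_2 = '<abc:option label="test label">test_value</abc:option>'
output_xml = replace_label_via_temp_tag(xml_2, 'new_test_label')
print(output_xml)
```

Because the swap is plain string replacement, it can misfire if the tag name or the temporary tag also occurs in an attribute value or in text where the first or last occurrence is expected, so it is only suitable for simple, predictable input.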
