Python minidom: finding empty text nodes

Question

I am parsing an XML file with the minidom parser: I iterate over the XML and store specific text found between the tags in a dictionary.

Like this:

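from xml.dom.minidom import parseString

d_source = {}
dom = parseString(data)
macro = dom.getElementsByTagName('macro')
for node in macro:
    # Serialize the first <id> element and strip its tags to get the text
    id_name = node.getElementsByTagName('id')[0].toxml()
    id_data = id_name.replace('<id>', '').replace('</id>', '')
    print(id_data)
    # Same approach for the second <cl> element of this macro
    cl_name = node.getElementsByTagName('cl')[1].toxml()
    cl_data = cl_name.replace('<cl>', '').replace('</cl>', '')
    print(cl_data)
    d_source[id_data] = cl_data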

Now, my problem is that the data I'm looking for in cl_name=node.getElementsByTagName('cl')[1].toxml() is sometimes non-existent!

In this case the relevant part of the XML looks like this:

<cl>blabla</cl>
<cl></cl>

Because of this I receive an "index out of range" error. However, I really need this "nothing" in my dictionary. My dictionary should look like this:

d={blabla:'',xyz:'abc'}

I have to look for the empty text node, which I tried by doing this:

if node.getElementsByTagName('cl')[1].toxml is None:
    print ('')
else:
    cl_name=node.getElementsByTagName('cl')[1].toxml()
    cl_data=cl_name.replace('<cl>','').replace('</cl>','')
    print (cl_data)
    d_target[id_data]=(cl_data)
    print(d_target)

I still receive that indexing error... I also thought about inserting a white space into the original source file, but I am not sure whether this would solve the issue. Any ideas?

Answer 1

If minidom is not a hard requirement, I suggest you reconsider and use the standard xml.etree.ElementTree. It is much easier.
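A minimal sketch of what that could look like. The sample XML, the <root> wrapper and the function name collect are assumptions made for illustration, since the question does not show the whole file. With ElementTree, an empty element simply has .text set to None, so it can be mapped to an empty string:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real file layout is an assumption.
data = """<root>
  <macro><id>blabla</id><cl>first</cl><cl></cl></macro>
  <macro><id>xyz</id><cl>first</cl><cl>abc</cl></macro>
</root>"""


def collect(xml_text):
    """Map each macro's <id> text to the text of its second <cl> element."""
    result = {}
    root = ET.fromstring(xml_text)
    for macro in root.iter('macro'):
        key = macro.findtext('id', default='')
        cls = macro.findall('cl')
        # .text is None for an empty element; a missing second <cl>
        # is treated the same way and also yields ''.
        value = (cls[1].text or '') if len(cls) > 1 else ''
        result[key] = value
    return result


d = collect(data)
print(d)
```

Checking len(cls) before indexing also covers a macro that has only one <cl> element, which would otherwise raise an IndexError.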

Answer 2

I figured out that it works when I add a white space into the original source file. This looks a bit messy though. So if anyone has a better idea, I'm looking forward to it!
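One possible alternative that leaves the source file untouched, sketched under assumptions: in minidom an empty element such as <cl></cl> has no child text node at all, so code that indexes into its children fails. Reading the text through a small helper avoids that. The helper names text_of and collect_minidom, the <root> wrapper and the sample data are hypothetical and not part of the original post:

```python
from xml.dom.minidom import parseString

# Hypothetical sample data; the real file layout is an assumption.
data = """<root>
  <macro><id>blabla</id><cl>first</cl><cl></cl></macro>
  <macro><id>xyz</id><cl>first</cl><cl>abc</cl></macro>
</root>"""


def text_of(element):
    """Join the direct text children; an empty element has none, giving ''."""
    return ''.join(child.data for child in element.childNodes
                   if child.nodeType == child.TEXT_NODE)


def collect_minidom(xml_text):
    """Map each macro's <id> text to the text of its second <cl> element."""
    dom = parseString(xml_text)
    result = {}
    for node in dom.getElementsByTagName('macro'):
        key = text_of(node.getElementsByTagName('id')[0])
        cls = node.getElementsByTagName('cl')
        # Guard against a macro that has fewer than two <cl> elements.
        result[key] = text_of(cls[1]) if len(cls) > 1 else ''
    return result


d_target = collect_minidom(data)
print(d_target)
```

This also removes the need for the replace('<cl>', '') calls, which would not strip the self-closing form <cl/> that toxml() emits for an empty element.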

Answer 1
0 accepted 2012-07-17 07:23:01

Answer 2
0 2012-07-17 07:23:54