
How to find a tag with some value in Python and lxml

I have an XML file with the following structure:

<main_tag>
   <first>
     <tag1>val1</tag1>
     <conf>
       <tag2>val2</tag2>
       <tag3>val3</tag3>
       <tag4>val4</tag4>
     </conf>
   </first>
   <second>
     <tag1>val2</tag1>
     <conf>
       <tag2>val6</tag2>
       <tag3>val7</tag3>
       <tag4>val8</tag4>
     </conf>
   </second>
</main_tag>

I have to change the value of tag2. Possible values are stored in a dict:

{'tag2values': ['newvalue1', 'newvalue2']}

If the value of tag1 is val1, then we change the tag2 value to newvalue1. If the tag1 value is val2, then we change the tag2 value to newvalue2.

So the question is: is there a way to find an element in lxml by matching on a value near its parent? That is, can an element be found by the value of its parent's sibling?

The .xpath method lets you find tags by XPath 1.0 expressions:

>>> from lxml import etree
>>> from io import StringIO  # Python 3; the cStringIO module exists only in Python 2
>>> tag2values = ['newvalue1', 'newvalue2']
>>> example = StringIO("""\
... <main_tag>
...    <first>
...      <tag1>val1</tag1>
...      <conf>
...        <tag2>val2</tag2>
...        <tag3>val3</tag3>
...        <tag4>val4</tag4>
...      </conf>
...    </first>
...    <second>
...      <tag1>val2</tag1>
...      <conf>
...        <tag2>val6</tag2>
...        <tag3>val7</tag3>
...        <tag4>val8</tag4>
...      </conf>
...    </second>
... </main_tag>
... """)
>>> tree = etree.parse(example)
>>> value1selector = '*/conf/tag2[../../tag1/text() = "val1"]'
>>> value2selector = '*/conf/tag2[../../tag1/text() = "val2"]'
>>> for elem in tree.xpath(value1selector):
...     elem.text = tag2values[0]
... 
>>> for elem in tree.xpath(value2selector):
...     elem.text = tag2values[1]
... 
>>> print(etree.tostring(tree, pretty_print=True).decode())
<main_tag>
   <first>
     <tag1>val1</tag1>
     <conf>
       <tag2>newvalue1</tag2>
       <tag3>val3</tag3>
       <tag4>val4</tag4>
     </conf>
   </first>
   <second>
     <tag1>val2</tag1>
     <conf>
       <tag2>newvalue2</tag2>
       <tag3>val7</tag3>
       <tag4>val8</tag4>
     </conf>
   </second>
</main_tag>

In the example above, the XPath expression in value1selector selects every tag2 element that is a child of a conf element whose sibling tag1 has the text val1. The predicate [../../tag1/text() = "val1"] climbs from tag2 to conf and then to the enclosing element, and checks the text of that element's tag1 child. The matches are returned as Element instances, so replacing their text content is trivial.
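
If there are more than two tag1 values, writing one selector per value gets repetitive. As a rough sketch of an alternative (not part of the original answer), the same update can be done by walking each child of main_tag once and looking its tag1 text up in a dict. The `replacements` mapping and the shortened sample document are assumptions made for illustration. The sketch uses only the find/findall/findtext calls that lxml shares with the standard library's ElementTree, so it falls back to xml.etree.ElementTree if lxml is not installed.

```python
try:
    from lxml import etree  # preferred, as in the session above
except ImportError:
    # Standard-library fallback; it offers the same find/findall/findtext API.
    import xml.etree.ElementTree as etree

# Shortened version of the sample document (tag3/tag4 omitted for brevity).
XML = """\
<main_tag>
  <first>
    <tag1>val1</tag1>
    <conf><tag2>val2</tag2></conf>
  </first>
  <second>
    <tag1>val2</tag1>
    <conf><tag2>val6</tag2></conf>
  </second>
</main_tag>
"""

# Hypothetical mapping: text of tag1 -> new text for tag2.
replacements = {"val1": "newvalue1", "val2": "newvalue2"}

root = etree.fromstring(XML)

# Visit each direct child of main_tag (first, second, ...).
for section in root:
    new_text = replacements.get(section.findtext("tag1"))
    if new_text is None:
        continue  # no rule for this tag1 value; leave the section untouched
    # Update every tag2 directly under a conf child of this section.
    for tag2 in section.findall("conf/tag2"):
        tag2.text = new_text

result = [elem.text for elem in root.iter("tag2")]
print(result)  # ['newvalue1', 'newvalue2']
```

This keys the lookup on the tag1 text rather than on list positions, so adding another rule only means adding another dict entry.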
