
Can we search for multiple patterns using etree findall() in XML?

In my case, I have to find a few elements in an XML file and update their values through the text attribute. To do that, I have to search for the XML elements A, B and C. My project uses xml.etree and Python. Currently I am using:

self.get_root.findall('H/A/T')
self.get_root.findall('H/B/T')
self.get_root.findall('H/C/T')

The sample XML file:

<H><A><T>text-i-have-to-update</H></A></T>
<H><B><T>text-i-have-to-update</H></B></T>
<H><C><T>text-i-have-to-update</H></C></T>

As you can see, only the middle element in the path differs. Is there a way to simplify the code with something like self.get_root.findall(H|(A,B,C)|T)? Any guidance in the right direction would help. Thanks!

I went through a similar question, "XPath to select multiple tags", but it didn't work for my case.

Update: maybe a regular expression inside findall()?

The XML in your question is malformed (the closing tags are in the wrong order); assuming it's well-formed (like below), try this:

import xml.etree.ElementTree as ET

data = """<root>
<H><A><T>text-i-have-to-update</T></A></H>
<H><B><T>text-i-have-to-update</T></B></H>
<H><C><T>text-i-have-to-update</T></C></H>
</root>"""

doc = ET.fromstring(data)
for item in doc.findall('.//H//T'):
    item.text = "modified text"
print(ET.tostring(doc).decode())

Output:

<root>
<H><A><T>modified text</T></A></H>
<H><B><T>modified text</T></B></H>
<H><C><T>modified text</T></C></H>
</root>
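
Note that './/H//T' matches every T anywhere below an H, not only those under A, B or C. As far as I know, the XPath subset that ElementTree supports has no union operator (|) and no regular-expression matching, so if the middle element has to be restricted, one option is to use a wildcard for that step and filter on the tag in Python. The following is only a rough sketch under that assumption; the helper name update_text, the extra D element, and the sample text values are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Sample data; the <D> branch is an assumed extra element that must stay untouched.
data = """<root>
<H><A><T>old</T></A></H>
<H><B><T>old</T></B></H>
<H><C><T>old</T></C></H>
<H><D><T>keep</T></D></H>
</root>"""


def update_text(root, middle_tags, new_text):
    """Set the text of every H/<middle>/T whose middle tag is in middle_tags.

    Returns the number of T elements that were updated.
    """
    updated = 0
    # '*' matches any direct child of H; the tag check happens in Python.
    for middle in root.findall("./H/*"):
        if middle.tag in middle_tags:
            for t in middle.findall("T"):
                t.text = new_text
                updated += 1
    return updated


doc = ET.fromstring(data)
count = update_text(doc, {"A", "B", "C"}, "modified text")
print(count)
print(ET.tostring(doc).decode())
```

An alternative with the same effect is to keep one findall() call per tag but build the path in a loop, e.g. for tag in ("A", "B", "C"): root.findall(f"./H/{tag}/T"). If full XPath (including |) is needed, a third-party library such as lxml may be worth a look.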
