
Findall equivalent for XPath in lxml

I am extracting text from specific tags and need the results as a list grouped by p tag. I have this XPath expression:

find = etree.XPath("//w:p//.//*[local-name() = 'ins']//text()", namespaces={'w': "http://schemas.openxmlformats.org/wordprocessingml/2006/main"})

I want to use it in a findall expression. I tried:

inserted_list_1=[]
for p in lxml_tree.findall('.//{' + w + '}p'):
    inserted_list_1.append([t.text for t in p.findall('.//{' + w + '}ins')])

but all this returns is a list full of None values, whereas the XPath expression above works perfectly.
I think there's some intermediate path missing.

You cannot use that expression with findall(); the findall() method deliberately stays compatible with the limited XPath subset supported by the ElementTree API.
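As for the None values: a likely cause is that the w:ins elements carry no text of their own, because the characters sit in nested elements. The sketch below illustrates this with the standard library's ElementTree, whose path subset findall() mirrors. The sample markup and the names SAMPLE, own_text and nested_text are made up for illustration, and real documents will be more deeply nested.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical, heavily simplified WordprocessingML fragment.
SAMPLE = f"""
<w:body xmlns:w="{W}">
  <w:p><w:ins><w:r><w:t>Hello</w:t></w:r></w:ins></w:p>
  <w:p><w:r><w:t>unchanged</w:t></w:r></w:p>
</w:body>
"""

root = ET.fromstring(SAMPLE)

# Clark notation ({namespace}tag) is what find()/findall() understand.
ins = root.find(f".//{{{W}}}ins")

# The <w:ins> element itself has no direct text ...
own_text = ins.text
# ... because the characters belong to the nested <w:t> element.
nested_text = "".join(ins.itertext())
```

Here own_text is None while nested_text holds the inserted characters, which is why reading .text on each w:ins match gives nothing useful.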

Use the xpath() method instead:

for p in lxml_tree.xpath('.//w:p', namespaces={'w': w}):

and just use namespace prefixes for much more readable queries.

If you just wanted to extract all contained text, you can use:

[t for t in p.xpath('.//w:ins//text()', namespaces={'w': w})]
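Putting the pieces together, here is a minimal sketch of collecting the inserted text as one list per paragraph. It assumes lxml is installed. The sample document and the names SAMPLE, NS, inserted_text and inserted_list are invented for illustration. The query is relative to each paragraph (it starts with a dot), so only text inside that paragraph's w:ins elements is returned.

```python
from lxml import etree

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical, simplified document: three paragraphs, two with insertions.
SAMPLE = f"""
<w:body xmlns:w="{W}">
  <w:p>
    <w:ins><w:r><w:t>first</w:t></w:r></w:ins>
    <w:ins><w:r><w:t>second</w:t></w:r></w:ins>
  </w:p>
  <w:p><w:r><w:t>unchanged</w:t></w:r></w:p>
  <w:p><w:ins><w:r><w:t>third</w:t></w:r></w:ins></w:p>
</w:body>
"""

lxml_tree = etree.fromstring(SAMPLE)

# Compile the relative query once; it is evaluated against each paragraph.
inserted_text = etree.XPath(".//w:ins//text()", namespaces=NS)

# One inner list per <w:p>, holding the text nodes found under its <w:ins> elements.
inserted_list = [
    inserted_text(p) for p in lxml_tree.xpath(".//w:p", namespaces=NS)
]
```

Paragraphs without any w:ins element contribute an empty list, so the outer list stays aligned with the paragraph order.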


 