
How can I get elements by class name (getElementsByClassName) using Python xml.dom.minidom?

I want to obtain the body of every element that has a specific class.

Python's xml.dom.minidom has a method for getting an element by id, getElementById(), but I need to get all elements that have a specific class.

How do I obtain this?

Note: if this is not possible using minidom, please provide a simple alternative that would let me get the full content of the elements with this class. By full content I mean all the subnodes and the text below them as well, as a simple string.

I recommend using lxml instead of xml.dom.minidom.

Using lxml.html / cssselect:

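import lxml.html  # root.cssselect() also needs the separate cssselect package

root = lxml.html.fromstring(document_string)
# 'elem.class' is a CSS selector: tag name 'elem' with class 'class'
for elem in root.cssselect('elem.class'):
    print(elem.tag)
    print(elem.get('src'))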

Using lxml.etree / xpath:

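import lxml.etree

# lxml.etree.fromstring expects well-formed XML; use lxml.html for HTML
root = lxml.etree.fromstring(document_string)
# Pad with spaces so "class" matches only as a whole token, not as a substring
for elem in root.xpath('.//elem[contains(concat(" ", normalize-space(@class), " "), " class ")]'):
    print(elem.tag)
    print(elem.get('src'))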
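
If staying with xml.dom.minidom is preferred, the same result can be approximated by walking every element and filtering on the class attribute, since minidom has no class-based lookup of its own. This is a rough sketch under stated assumptions: the helper names get_elements_by_class_name and inner_xml, the sample document, and the class name "note" are all hypothetical and used only for illustration.

```python
from xml.dom import minidom

# Hypothetical sample document; "note" is an illustrative class name.
document_string = """<root>
  <item class="note first">Hello <b>bold</b> world</item>
  <item class="other">skip me</item>
  <entry class="note">Second</entry>
</root>"""


def get_elements_by_class_name(node, class_name):
    """Return descendant elements whose class attribute lists class_name."""
    return [
        el
        for el in node.getElementsByTagName("*")  # "*" matches every element
        if class_name in el.getAttribute("class").split()
    ]


def inner_xml(element):
    """Serialize all child nodes (subnodes and text) of element as one string."""
    return "".join(child.toxml() for child in element.childNodes)


dom = minidom.parseString(document_string)
found = get_elements_by_class_name(dom, "note")
contents = [inner_xml(el) for el in found]

for content in contents:
    print(content)
```

Calling toxml() on the element itself instead of on its children would also include the element's own opening and closing tags.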

