How do I getElementsByClassName with Python's xml.dom.minidom?

Question

I want to obtain the body of every element that has a specific class.

Python's xml.dom.minidom has a method for getting an element by id, getElementById(), but I need to get all elements that have a specific class.

How can I do this?

Note: if this is not possible with minidom, please suggest a simple alternative that would let me get the full content of the elements of this class. By full content I mean all the subnodes and the text below them, as a simple string.

Answer 1

I recommend using lxml instead of xml.dom.minidom.

Using lxml.html / cssselect:

import lxml.html

root = lxml.html.fromstring(document_string)
# 'elem.class' is a placeholder selector: substitute the tag name and class, e.g. 'div.note'
for elem in root.cssselect('elem.class'):
    print(elem.tag)
    print(elem.get('src'))

Using lxml.etree / xpath:

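import lxml.etree

root = lxml.etree.fromstring(document_string)
# Pad @class with spaces so "class" matches only as a whole token, not as a substring
# (a bare contains(@class, "class") would also match e.g. "subclass").
for elem in root.xpath('.//elem[contains(concat(" ", normalize-space(@class), " "), " class ")]'):
    print(elem.tag)
    print(elem.get('src'))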
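
If you would rather stay with the standard library, something along these lines might work with minidom itself. This is only a sketch: get_elements_by_class_name is a hypothetical helper (minidom has no such method), the sample document is made up for illustration, and it assumes the input is well-formed XML, since minidom will not parse sloppy HTML. It walks every descendant via getElementsByTagName("*"), keeps the elements whose class attribute contains the requested name as a whole token, and serializes each match, subnodes and text included, with toxml().

```python
from xml.dom import minidom


def get_elements_by_class_name(node, class_name):
    """Hypothetical helper: descendant elements whose class attribute has class_name as a whole token."""
    matches = []
    # "*" matches every element below the given node
    for elem in node.getElementsByTagName("*"):
        # getAttribute returns "" when the attribute is absent
        if class_name in elem.getAttribute("class").split():
            matches.append(elem)
    return matches


# Made-up sample document, for illustration only
document_string = (
    '<root>'
    '<div class="note big">Hello <b>world</b></div>'
    '<p class="footnote">skipped</p>'
    '<span class="note">Bye</span>'
    '</root>'
)

dom = minidom.parseString(document_string)
# toxml() serializes the element together with all of its subnodes and text
results = [elem.toxml() for elem in get_elements_by_class_name(dom, "note")]
for xml_string in results:
    print(xml_string)
```

Splitting the class attribute on whitespace avoids the substring problem, so "footnote" is not mistaken for "note".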

Answer 1
2 · Accepted · 2013-06-17 18:42:52