Extracting text after a tag in Python's ElementTree

Question

Here is a part of XML:

<item><img src="cat.jpg" /> Picture of a cat</item>

Extracting the tag is easy. Just do:

import xml.etree.ElementTree
et = xml.etree.ElementTree.fromstring(our_xml_string)
img = et.find('img')

But how do I get the text immediately after it (Picture of a cat)? Doing the following returns a blank string:

print(et.text)

Answer 1

Elements have a tail attribute, so instead of element.text, you're asking for element.tail.

>>> import lxml.etree
>>> root = lxml.etree.fromstring('''<root><foo>bar</foo>baz</root>''')
>>> root[0]
<Element foo at 0x145a3c0>
>>> root[0].tail
'baz'

Or, for your example:

>>> et = lxml.etree.fromstring('''<item><img src="cat.jpg" /> Picture of a cat</item>''')
>>> et.find('img').tail
' Picture of a cat'

This also works with plain ElementTree:

>>> import xml.etree.ElementTree
>>> xml.etree.ElementTree.fromstring(
...   '''<item><img src="cat.jpg" /> Picture of a cat</item>'''
... ).find('img').tail
' Picture of a cat'
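
To make the difference between text and tail concrete, here is a minimal sketch using plain ElementTree. The leading "Before " text is not in the question's XML; it is added here only as an illustration so that both attributes have something to show. The variable names are likewise just for this example.

```python
import xml.etree.ElementTree as ET

# "Before " is an assumption added for illustration; the original XML has no leading text.
item = ET.fromstring('<item>Before <img src="cat.jpg" /> Picture of a cat</item>')
img = item.find('img')

# text: the text inside an element, before its first child.
before = item.text

# tail: the text that follows an element's end, up to the next sibling
# or the end of the parent.
after = img.tail

# tail is None when nothing follows the element, so guard before stripping.
caption = (img.tail or '').strip()

# itertext() yields text and tail values in document order.
full_text = ''.join(item.itertext())

print(repr(before))   # 'Before '
print(repr(after))    # ' Picture of a cat'
print(repr(caption))  # 'Picture of a cat'
```

In the question's own XML there is no text before the img element, so et.text has nothing to return, while the caption sits in the img element's tail.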

Answer 1 (accepted, score 23, 2012-03-12 20:11:32)
