lxml.etree not returning proper xpath value
I have an XML string like this:
<description> asdasdasd <a> Item1 </a><a> Price </a></description>
I'm using lxml.etree as follows:
import lxml.etree as le
doc=le.fromstring("<description>asdasdasd <a>Item1</a> <a>Price</a> </description>")
desc = doc.xpath("//description")[0]
print desc.text
But desc.text is returning only asdasdasd. I was expecting asdasdasd Item1 Price. Is there any issue with my code?
Here's one way to do it:
print desc.text + ' '.join(child.text for child in desc)
prints:
asdasdasd Item1 Price
Another option is to use the descendant-or-self XPath axis:
desc = doc.xpath("//description/descendant-or-self::*")
print ' '.join(child.text for child in desc)
prints:
asdasdasd Item1 Price
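Both snippets above raise a TypeError if any element has no text, because .text is then None and join() only accepts strings. A minimal sketch of a None-safe variant follows, written in Python 3 syntax (unlike the Python 2 print statements above). It uses iter(), which visits the element itself and all of its descendants, the same nodes that descendant-or-self::* selects here. The empty <a/> element in the input and the names pieces and joined are assumptions added for illustration, and the import falls back to the standard library's ElementTree, which shares this part of the API.

```python
try:
    import lxml.etree as le
except ImportError:
    # Fallback for this sketch only; ElementTree offers the same calls used below.
    import xml.etree.ElementTree as le

# Illustrative input: the extra empty <a/> has .text == None.
doc = le.fromstring("<description>asdasdasd <a>Item1</a> <a/> <a>Price</a></description>")

# iter() walks the element itself, then its descendants, in document order.
pieces = [(node.text or "").strip() for node in doc.iter()]

# Drop the empty pieces so the missing text leaves no double space.
joined = " ".join(piece for piece in pieces if piece)
print(joined)
```

Stripping each piece also removes the trailing space in "asdasdasd ", so the words end up separated by exactly one space.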
No, you have to see that as a tree (that's why it is called lxml.etree).
An XML node can, by definition, have text, some attributes, and other nodes inside.
|--> description
     |--> a
     |--> a
Maybe this helps to understand:
import lxml.etree as le
doc=le.fromstring("<description>asdasdasd <a>Item1</a> <a>Price</a> </description>")
desc = doc.xpath("//description")[0]
print desc.text
for child in desc:
    print child.text
That outputs:
asdasdasd
Item1
Price
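To make the tree view more concrete, here is a small sketch (Python 3 syntax) of where each piece of text is stored. In the ElementTree data model that lxml follows, .text holds the text before an element's first child, and .tail holds the text that follows an element's closing tag. The itertext() call at the end is an addition that neither answer uses; it yields every text and tail piece in document order. The name full_text is illustrative.

```python
try:
    import lxml.etree as le
except ImportError:
    # Fallback for this sketch only; ElementTree offers the same calls used below.
    import xml.etree.ElementTree as le

doc = le.fromstring("<description>asdasdasd <a>Item1</a> <a>Price</a> </description>")

# Text of <description> up to its first child element.
print(repr(doc.text))

# Each child keeps its own text, plus the text that follows its closing tag.
for child in doc:
    print(child.tag, repr(child.text), repr(child.tail))

# itertext() walks all text and tail pieces inside the element in document order.
full_text = "".join(doc.itertext())
print(repr(full_text))
```

If only the words matter, " ".join(full_text.split()) collapses the whitespace and gives asdasdasd Item1 Price.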
The idea behind XML is to try to model instances (more or less). In your case, you have a description object with two a objects inside it (could be a list, for instance).
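As a rough illustration of that modelling idea, the sketch below (Python 3 syntax) turns the parsed element into a plain dictionary with the two a children as a list. The dictionary layout and the key names "text" and "items" are hypothetical choices for this example, not something lxml provides.

```python
try:
    import lxml.etree as le
except ImportError:
    # Fallback for this sketch only; ElementTree offers the same calls used below.
    import xml.etree.ElementTree as le

desc = le.fromstring("<description>asdasdasd <a>Item1</a> <a>Price</a> </description>")

# Hypothetical mapping: the element's own text plus a list of its children's text.
description = {
    "text": (desc.text or "").strip(),
    "items": [child.text for child in desc],
}
print(description)
```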