lxml.etree not returning proper xpath value

I have an XML string like this:

<description> asdasdasd <a> Item1 </a><a> Price </a></description>

I'm using lxml.etree as follows:

import lxml.etree as le

doc = le.fromstring("<description>asdasdasd <a>Item1</a> <a>Price</a> </description>")
# The root element is <description>, so the XPath returns a one-item list
desc = doc.xpath("//description")[0]
print(desc.text)

But desc.text returns only asdasdasd. I was expecting asdasdasd Item1 Price. Is there any issue with my code?

Here's one way to do it:

# desc.text is the text before the first child; each child contributes its own .text
print(desc.text + ' '.join(child.text for child in desc))

prints:

asdasdasd Item1 Price

Another option is to use the descendant-or-self XPath axis:

# Selects <description> itself plus every element nested inside it
desc = doc.xpath("//description/descendant-or-self::*")
print(' '.join(child.text for child in desc))

prints:

asdasdasd  Item1 Price
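
Both snippets above join only the .text of each element, so any text that follows a child's closing tag (which lxml stores in that child's .tail) would be left out. As a hedged alternative, not part of the original answer, here is a minimal sketch using itertext() and the XPath string() function, which both walk all text in document order. The variable names are only illustrative.

```python
import lxml.etree as le

doc = le.fromstring("<description>asdasdasd <a>Item1</a> <a>Price</a> </description>")

# itertext() yields every text and tail string under the element, in document order
full_text = "".join(doc.itertext())

# The XPath string() function returns the string value of the first matched node
xpath_text = doc.xpath("string(//description)")

# Collapse runs of whitespace and trim the ends for display
normalized = " ".join(full_text.split())
print(normalized)
```

The whitespace normalization is a design choice for display only; drop it if the original spacing matters.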

No. You have to see the document as a tree (hence the name lxml.etree).

An XML node can, by definition, have text, some attributes, and other nodes inside it (see this):

|--> description
      |--> a
      |--> a

Maybe this helps to understand:

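import lxml.etree as le

doc = le.fromstring("<description>asdasdasd <a>Item1</a> <a>Price</a> </description>")
desc = doc.xpath("//description")[0]
# Text that belongs directly to <description>, before its first child
print(desc.text)
# Iterating over an element yields its direct children
for child in desc:
    print(child.text)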

That outputs:

asdasdasd 
Item1
Price

The idea behind XML is to try to model instances (more or less). In your case, you have a description object with two a objects inside it (it could be a list, for instance).
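
To make the tree model concrete, the sketch below shows where lxml keeps each piece of text: .text holds the text before an element's first child, and .tail holds the text that follows an element's closing tag. The input string is a hypothetical variation of the question's sample, with extra words added after each a element so the tails are visible.

```python
import lxml.etree as le

# Hypothetical variation of the sample: " and " and " end" follow the <a> elements
doc = le.fromstring("<description>asdasdasd <a>Item1</a> and <a>Price</a> end</description>")

# Start with the text that precedes the first child
pieces = [doc.text or ""]
for child in doc:
    # Each child carries its own text plus the tail that follows its closing tag
    pieces.append(child.text or "")
    pieces.append(child.tail or "")

result = "".join(pieces)
print(result)
```

The `or ""` guards are there because .text and .tail are None when an element has no text in that position.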
