I have an XML string like this:
<description> asdasdasd <a> Item1 </a><a> Price </a></description>
I'm using lxml.etree as follows:
import lxml.etree as le
doc=le.fromstring("<description>asdasdasd <a>Item1</a> <a>Price</a> </description>")
desc = doc.xpath("//description")[0]
print desc.text
But desc.text returns only asdasdasd. I was expecting asdasdasd Item1 Price. Is there an issue with my code?
Here's one way to do it:
print desc.text + ' '.join(child.text for child in desc)
prints:
asdasdasd Item1 Price
Another option is to use the descendant-or-self XPath axis:
desc = doc.xpath("//description/descendant-or-self::*")
print ' '.join(child.text for child in desc)
prints:
asdasdasd Item1 Price
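Both snippets above read only each element's .text, so any text that follows a closing tag would be skipped. As a hedged alternative, here is a minimal sketch using itertext(), which walks every text chunk in document order. It assumes Python 3 and uses the standard library's xml.etree.ElementTree rather than lxml; lxml.etree provides an itertext() method with the same name, so the same approach should carry over.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<description>asdasdasd <a>Item1</a> <a>Price</a> </description>")

# itertext() yields the text of the element and of all descendants,
# including the text that follows each child's closing tag.
full_text = "".join(doc.itertext())

# Collapse runs of whitespace and trim the ends.
normalized = " ".join(full_text.split())
print(normalized)
```

Normalizing whitespace is a choice made for this sketch; drop that step if the exact spacing of the source matters.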
No, you have to see it as a tree (hence lxml.etree).
An XML node can, by definition, have text, attributes, and other nodes inside it.
|--> description
     |--> a
     |--> a
Maybe this helps understand:
import lxml.etree as le
doc=le.fromstring("<description>asdasdasd <a>Item1</a> <a>Price</a> </description>")
desc = doc.xpath("//description")[0]
print desc.text
for child in desc:
    print child.text
That outputs:
asdasdasd
Item1
Price
The idea behind XML is to model instances (more or less). In your case, you have a description object with two a objects inside it (they could be a list, for instance).
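To make the tree view concrete, the following sketch lists what each node actually holds. It assumes Python 3 and the standard library's xml.etree.ElementTree, whose .text and .tail attributes lxml.etree also exposes; the describe helper is a hypothetical name introduced only for illustration.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<description>asdasdasd <a>Item1</a> <a>Price</a> </description>")

def describe(element):
    """Return (tag, text, tail) for an element and all its descendants."""
    # .text is the text before the first child; .tail is the text that
    # follows the element's closing tag, up to the next sibling or parent end.
    return [(node.tag, node.text, node.tail) for node in element.iter()]

for tag, text, tail in describe(doc):
    print(tag, repr(text), repr(tail))
```

Each element owns only the text up to its first child, which is why desc.text stops at asdasdasd; the spaces after each closing a tag live in that child's .tail.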