Pleased to be a member of StackOverflow; I'm a long-time lurker here.
I need to extract the text between two tags, and so far I've found a wonderful tool called Xidel.
The text I need is the content of:
<div class="description"> Text. <tag>Also tags.</tag> More text. </div>
However, the text can include HTML tags, and I want them printed as raw markup. Using a command like:
xidel --xquery '//div[@class="description"]' file.html
Gets me:
Text. Also tags. More text.
And I need it to be exactly as it is, so:
Text. <tag>Also tags.</tag> More text.
How can I achieve this?
Regards, R
This can be done in several ways with Xidel, which is why I love it so much.
HTML-templating:
xidel -s file.html -e "<div class='description'>{inner-html()}</div>"
XPath:
xidel -s file.html -e "//div[@class='description']/inner-html()"
CSS:
xidel -s file.html -e "inner-html(css('div.description'))"
BTW, on Linux: swap the double quotes for single and vice versa.
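For reference, here is a rough sketch of the same three commands with the quotes swapped for a Linux shell. The sample file.html is just the snippet from the question wrapped in a minimal page, and the script assumes xidel is installed and on the PATH (it skips the calls otherwise).

```shell
#!/bin/sh
# Sample input built from the snippet in the question.
# Note: this overwrites ./file.html in the current directory.
cat > file.html <<'EOF'
<html><body><div class="description"> Text. <tag>Also tags.</tag> More text. </div></body></html>
EOF

# Assumption: xidel is installed and on the PATH; otherwise skip the calls.
if command -v xidel >/dev/null 2>&1; then
  # HTML templating
  xidel -s file.html -e '<div class="description">{inner-html()}</div>'
  # XPath
  xidel -s file.html -e '//div[@class="description"]/inner-html()'
  # CSS selector
  xidel -s file.html -e 'inner-html(css("div.description"))'
else
  echo "xidel not found; skipping" >&2
fi
```

Single quotes on the outside keep the shell from touching the expression, so the double quotes inside reach Xidel unchanged.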
You can display the tags by adding the --output-format=xml option:
xidel --xquery '//div[@class="description"]' --output-format=xml file.html