Xidel: extracting data inside tags (raw output)

Question

Pleased to be a member of StackOverflow, after a long time lurking here.

I need to parse the text between two tags, and so far I've found a wonderful tool called Xidel.

Specifically, I need the text inside an element like this:

 <div class="description"> Text. <tag>Also tags.</tag> More text. </div>

However, that text can include HTML tags, and I want them printed out in raw format. Using a command like:

xidel --xquery '//div[@class="description"]' file.html

gives me:

Text. Also tags. More text.

But I need it exactly as it appears in the source:

Text. <tag>Also tags.</tag> More text.

How can I achieve this?

Regards, R

Answer 1

This can be done in a couple of ways with Xidel, which is why I love it so much.

HTML-templating:

xidel -s file.html -e "<div class='description'>{inner-html()}</div>"

XPath:

xidel -s file.html -e "//div[@class='description']/inner-html()"

CSS:

xidel -s file.html -e "inner-html(css('div.description'))"

BTW, on Linux: swap the double quotes for single and vice versa.
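For reference, here is a rough sketch of what that quote swap could look like, assuming a POSIX shell such as bash. The file name `sample.html` is hypothetical and is built from the snippet in the question. The `command -v` guard is only there so the sketch does nothing when `xidel` is not installed.

```shell
# Build a small, hypothetical sample file from the snippet in the question.
cat > sample.html <<'EOF'
<html><body>
<div class="description"> Text. <tag>Also tags.</tag> More text. </div>
</body></html>
EOF

# Run the queries only if xidel is on PATH.
if command -v xidel >/dev/null 2>&1; then
  # HTML-templating: single quotes outside, double quotes inside.
  xidel -s sample.html -e '<div class="description">{inner-html()}</div>'

  # XPath
  xidel -s sample.html -e '//div[@class="description"]/inner-html()'

  # CSS
  xidel -s sample.html -e 'inner-html(css("div.description"))'
fi
```

Single quotes on the outside keep the shell from interpreting anything inside the expression, which is why the inner quotes become double quotes.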

Answer 2

You can display the tags by adding the --output-format=xml option.

xidel --xquery '//div[@class="description"]' --output-format=xml file.html

2 answers

Answer 1
2 2017-11-06 18:23:27

Answer 2
0 2020-10-30 01:52:28
