HTML to Text Conversion in Emacs
I have a bunch of org-mode files with snippets containing HTML code, and I would like to convert those to plain text.
I don't need a fancy, fully automated solution; I can just paste my HTML snippet into a scratch buffer if that's easier.
Here's a simple example of the desired behavior:
<div><div>First Line<br>Second Line</div></div>
First Line
Second Line
What options are available to Emacs users for such a task?
Emacs 24.4 (2014) added EWW, the Emacs Web Wowser, a built-in web browser. It renders HTML with the shr.el library, which you can also call directly, e.g.:
(with-temp-buffer
  (insert
   "<div><div>First Line<br>Second Line</div></div> ")
  (shr-render-region (point-min) (point-max))
  (buffer-substring-no-properties (point-min) (point-max)))
;; =>
"First Line
Second Line
"
shr-render-region uses libxml-parse-html-region, which requires that your Emacs be built with libxml2 support.
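If you do this often, the snippet can be wrapped into a small helper and an interactive command. The following is only a sketch: the names my/libxml-usable-p, my/html-to-text and my/html-region-to-text are made up for illustration and are not part of Emacs or shr.el. It assumes libxml-available-p exists on newer Emacs versions (27 and later, as far as I know) and falls back to an fboundp check otherwise.

```elisp
(require 'shr)

(defun my/libxml-usable-p ()
  "Return non-nil if HTML parsing through libxml2 appears to be usable.
Hypothetical helper, not part of Emacs."
  (if (fboundp 'libxml-available-p)
      ;; Newer Emacs: also detects a missing shared library at runtime.
      (libxml-available-p)
    ;; Older Emacs: the parser function is only defined in libxml2 builds.
    (fboundp 'libxml-parse-html-region)))

(defun my/html-to-text (html)
  "Return the string HTML rendered as plain text by shr.
Hypothetical helper, not part of Emacs."
  (unless (my/libxml-usable-p)
    (error "This Emacs has no usable libxml2 support"))
  (with-temp-buffer
    (insert html)
    ;; Replaces the HTML in the temp buffer with its rendering.
    (shr-render-region (point-min) (point-max))
    ;; Drop faces and other text properties added by shr.
    (buffer-substring-no-properties (point-min) (point-max))))

(defun my/html-region-to-text (begin end)
  "Replace the HTML between BEGIN and END with its plain-text rendering."
  (interactive "r")
  (let ((text (my/html-to-text (buffer-substring-no-properties begin end))))
    (delete-region begin end)
    (goto-char begin)
    (insert text)))
```

With this loaded, select an HTML snippet in an org or scratch buffer and run M-x my/html-region-to-text. Going through a temp buffer, rather than calling shr-render-region on the region itself, keeps shr's text properties out of the org file.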
The html2org package seems to get the job done. Its html2org function converts the HTML code and replaces it with text.