Convert Markdown to plain text

I am looking into converting some Markdown text to plain text. After reading existing questions, it's apparent that the easiest solution would be to convert Markdown to HTML with an existing converter, then HTML to plain text. However, I am still a little baffled, as I need to retain the href of the a tag that comes from the HTML.

E.g. this Markdown "some text [click here](https://somelink.com)" gets converted to this HTML:

<p>some text <a href="https://somelink.com">click here</a></p>

Then when I convert that HTML to plain text, it's "some text click here".

How can I convert the original Markdown to something like "some text https://somelink.com"?

Following on from the answer by Judah Gabriel Himango here, I made changes to the method that steps through the HTML elements.

I added a switch case for the a tag to get the href attribute's value, and also set a flag to stop the method iterating through the a tag's children, as it's the href that is important in my case.

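                case HtmlNodeType.Element:
                    // isATag is assumed to be a local bool that starts as false for each element
                    switch (node.Name)
                    {
                        case "p":
                            // treat paragraphs as crlf
                            outText.Write("\r\n");
                            break;
                        case "br":
                            outText.Write("\r\n");
                            break;
                        case "a":
                            // write the link target instead of the link text
                            outText.Write($"{node.Attributes.FirstOrDefault(x => x.Name == "href")?.Value}");
                            isATag = true;
                            break;
                    }

                    // skip the children of an a tag so its link text is not written
                    if (node.HasChildNodes && !isATag)
                    {
                        ConvertContentTo(node, outText);
                    }
                    break;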
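
For context, here is a minimal sketch of how that fragment might sit inside a complete converter. It assumes the HtmlAgilityPack NuGet package (suggested by the HtmlNodeType usage above). The class name HtmlToPlainText and the ToPlainText entry point are hypothetical names for illustration, and the handling of text and comment nodes is a guess at the surrounding method rather than the original code.

```csharp
using System;
using System.IO;
using HtmlAgilityPack; // assumption: NuGet package HtmlAgilityPack

public static class HtmlToPlainText // hypothetical name
{
    // Hypothetical entry point: parse the HTML and walk the node tree.
    public static string ToPlainText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        using var outText = new StringWriter();
        ConvertTo(doc.DocumentNode, outText);
        return outText.ToString().Trim();
    }

    // Visit every child of the given node.
    private static void ConvertContentTo(HtmlNode node, TextWriter outText)
    {
        foreach (HtmlNode child in node.ChildNodes)
        {
            ConvertTo(child, outText);
        }
    }

    private static void ConvertTo(HtmlNode node, TextWriter outText)
    {
        switch (node.NodeType)
        {
            case HtmlNodeType.Comment:
                // comments carry no visible text
                break;

            case HtmlNodeType.Document:
                ConvertContentTo(node, outText);
                break;

            case HtmlNodeType.Text:
            {
                // ignore script and style bodies, and whitespace-only text
                string parentName = node.ParentNode.Name;
                string text = ((HtmlTextNode)node).Text;
                if (parentName != "script" && parentName != "style" && text.Trim().Length > 0)
                {
                    outText.Write(HtmlEntity.DeEntitize(text));
                }
                break;
            }

            case HtmlNodeType.Element:
            {
                // reset for every element so one link does not suppress later content
                bool isATag = false;
                switch (node.Name)
                {
                    case "p":
                    case "br":
                        outText.Write("\r\n");
                        break;
                    case "a":
                        // write the href instead of the link text
                        outText.Write(node.GetAttributeValue("href", string.Empty));
                        isATag = true;
                        break;
                }

                if (node.HasChildNodes && !isATag)
                {
                    ConvertContentTo(node, outText);
                }
                break;
            }
        }
    }
}
```

Called with the HTML from the question, HtmlToPlainText.ToPlainText("<p>some text <a href=\"https://somelink.com\">click here</a></p>") should return "some text https://somelink.com". Declaring isATag inside the element case keeps the flag scoped to a single node, so text that follows a link is still written.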
