How to parse Markdown in PHP?

First, I know there is already a Markdown parser for PHP. I also took a look at this question, but it doesn't answer mine.

Obviously, even though the title mentions PHP, the question is language-agnostic, because I'd like to know which steps I have to go through to do this.

I've read about PEG, but I have to admit I didn't really understand the example provided with the PHP parser.

I've also read about CFG.

I've found Zend_Markup_Parser_Textile, which seems to construct a so-called "Token Tree" (what is that about?), but it is currently unusable. (By the way, Textile is not Markdown.)

So, concretely, how would you go about this?

Obviously I thought about using regex, but I'm afraid to, because Markdown supports several syntaxes for the same element (Setext and atx).

Could you give me some starting points?

You should have a look at Parsedown.

It parses Markdown text the way people do. First, it divides the text into lines. Then it looks at how these lines start and relate to each other. Finally, it looks for special characters to identify inline elements.
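The three steps described above can be sketched roughly as follows. This is only an illustration of the line-based idea, not Parsedown's actual code: the function names `render_blocks` and `render_inline` are hypothetical, and the sketch only handles atx headers, Setext headers, paragraphs, `*emphasis*` and `**strong**`. Horizontal rules, lists, code and everything else are ignored.

```php
<?php
// Hypothetical sketch of a line-based Markdown parser (not Parsedown's code).

// Step 3: look for special characters to identify inline elements.
function render_inline(string $text): string {
    $text = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    // Strong runs before emphasis so "**" is not read as two "*" markers.
    $text = preg_replace('/\*\*(.+?)\*\*/', '<strong>$1</strong>', $text);
    return preg_replace('/\*(.+?)\*/', '<em>$1</em>', $text);
}

function render_blocks(string $markdown): string {
    // Step 1: divide the text into lines.
    $lines = preg_split('/\R/', $markdown);
    $html = [];
    $paragraph = [];

    // Emit the pending paragraph lines, if any, as one <p> element.
    $flush = function () use (&$html, &$paragraph) {
        if ($paragraph) {
            $html[] = '<p>' . render_inline(implode(' ', $paragraph)) . '</p>';
            $paragraph = [];
        }
    };

    // Step 2: look at how each line starts and relates to its neighbours.
    foreach ($lines as $line) {
        if (trim($line) === '') {
            $flush();
        } elseif (preg_match('/^(#{1,6})\s+(.*)$/', $line, $m)) {
            // atx header: "## Title"
            $flush();
            $level = strlen($m[1]);
            $html[] = "<h$level>" . render_inline(trim($m[2])) . "</h$level>";
        } elseif ($paragraph && preg_match('/^(=+|-+)\s*$/', $line, $m)) {
            // Setext underline: the pending text becomes the header.
            $level = $m[1][0] === '=' ? 1 : 2;
            $text = implode(' ', $paragraph);
            $paragraph = [];
            $html[] = "<h$level>" . render_inline($text) . "</h$level>";
        } else {
            $paragraph[] = trim($line);
        }
    }
    $flush();
    return implode("\n", $html);
}

echo render_blocks("Title\n=====\n\n## Sub\n\nSome *inline* and **bold**\ntext.");
```

With this input the sketch emits an `<h1>` for the Setext header, an `<h2>` for the atx header, and one paragraph containing `<em>` and `<strong>`. Because headers are decided per line, the two header syntaxes end up as two branches of the same loop rather than two competing regexes over the whole document.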

PHP Markdown Extra seems to be popular; you could start by looking at its source.

There is also a faster, object-oriented Markdown implementation: markdown-oo-php.

Ciconia - A New Markdown Parser for PHP is a good one I found.

You only need to do three things:

1. Install Ciconia and parse the file according to the documentation.
2. Add a matching CSS theme to make it look nice, such as a GitHub Markdown style.
3. Add syntax-highlighting JavaScript, such as Google's JavaScript code prettifier.

Then everything will look pretty good.

If you want a complete example, here is my working demo for GitHub-style Markdown:

<?php
// Send the page as UTF-8 HTML.
header("Content-Type: text/html;charset=utf-8");
// Composer autoloader (assumes Ciconia was installed via Composer).
require 'vendor/autoload.php';
use Ciconia\Ciconia;
use Ciconia\Extension\Gfm;

// Enable the GitHub Flavored Markdown extensions.
$ciconia = new Ciconia();
$ciconia->addExtension(new Gfm\FencedCodeBlockExtension());
$ciconia->addExtension(new Gfm\TaskListExtension());
$ciconia->addExtension(new Gfm\InlineStyleExtension());
$ciconia->addExtension(new Gfm\WhiteSpaceExtension());
$ciconia->addExtension(new Gfm\TableExtension());
$ciconia->addExtension(new Gfm\UrlAutoLinkExtension());
// Read the Markdown source and convert it to HTML.
$contents = file_get_contents('Readme.md');
$html = $ciconia->render($contents);
?>
<!DOCTYPE html>
<html>
    <head>
        <title>Excel to Lua table - Readme</title>
        <script src="https://cdn.rawgit.com/google/code-prettify/master/loader/run_prettify.js"></script>
        <link rel="stylesheet" href="./github-markdown.css">
        <style>
            .markdown-body {
                box-sizing: border-box;
                min-width: 200px;
                max-width: 980px;
                margin: 0 auto;
                padding: 45px;
            }
        </style>
    </head>
    <body>
        <article class="markdown-body">
        <?php
            # Put HTML content in the document
            echo $html;
        ?>
        </article>
    </body>
</html>

Using regexes.

<?php

/**
 * Slimdown - A very basic regex-based Markdown parser. Supports the
 * following elements (and can be extended via Slimdown::add_rule()):
 *
 * - Headers
 * - Links
 * - Bold
 * - Emphasis
 * - Deletions
 * - Quotes
 * - Inline code
 * - Blockquotes
 * - Ordered/unordered lists
 * - Horizontal rules
 *
 * Author: Johnny Broadway <johnny@johnnybroadway.com>
 * Website: https://gist.github.com/jbroadway/2836900
 * License: MIT
 */
class Slimdown {
    public static $rules = array (
        '/(#+)(.*)/' => 'self::header',                           // headers
        '/\[([^\[]+)\]\(([^\)]+)\)/' => '<a href=\'\2\'>\1</a>',  // links
        '/(\*\*|__)(.*?)\1/' => '<strong>\2</strong>',            // bold
        '/(\*|_)(.*?)\1/' => '<em>\2</em>',                       // emphasis
        '/\~\~(.*?)\~\~/' => '<del>\1</del>',                     // del
        '/\:\"(.*?)\"\:/' => '<q>\1</q>',                         // quote
        '/`(.*?)`/' => '<code>\1</code>',                         // inline code
        '/\n\*(.*)/' => 'self::ul_list',                          // ul lists
        '/\n[0-9]+\.(.*)/' => 'self::ol_list',                    // ol lists
        '/\n(&gt;|\>)(.*)/' => 'self::blockquote',                // blockquotes
        '/\n-{5,}/' => "\n<hr />",                                // horizontal rule
        '/\n([^\n]+)\n/' => 'self::para',                         // add paragraphs
        '/<\/ul>\s?<ul>/' => '',                                  // fix extra ul
        '/<\/ol>\s?<ol>/' => '',                                  // fix extra ol
        '/<\/blockquote><blockquote>/' => "\n"                    // fix extra blockquote
    );

    private static function para ($regs) {
        $line = $regs[1];
        $trimmed = trim ($line);
        if (preg_match ('/^<\/?(ul|ol|li|h|p|bl)/', $trimmed)) {
            return "\n" . $line . "\n";
        }
        return sprintf ("\n<p>%s</p>\n", $trimmed);
    }

    private static function ul_list ($regs) {
        $item = $regs[1];
        return sprintf ("\n<ul>\n\t<li>%s</li>\n</ul>", trim ($item));
    }

    private static function ol_list ($regs) {
        $item = $regs[1];
        return sprintf ("\n<ol>\n\t<li>%s</li>\n</ol>", trim ($item));
    }

    private static function blockquote ($regs) {
        $item = $regs[2];
        return sprintf ("\n<blockquote>%s</blockquote>", trim ($item));
    }

    private static function header ($regs) {
        list ($tmp, $chars, $header) = $regs;
        $level = strlen ($chars);
        return sprintf ('<h%d>%s</h%d>', $level, trim ($header), $level);
    }

    /**
     * Add a rule.
     */
    public static function add_rule ($regex, $replacement) {
        self::$rules[$regex] = $replacement;
    }

    /**
     * Render some Markdown into HTML.
     */
    public static function render ($text) {
        $text = "\n" . $text . "\n";
        foreach (self::$rules as $regex => $replacement) {
            if (is_callable ( $replacement)) {
                $text = preg_replace_callback ($regex, $replacement, $text);
            } else {
                $text = preg_replace ($regex, $replacement, $text);
            }
        }
        return trim ($text);
    }
}


echo Slimdown::render ("# Title

And *now* [a link](http://www.google.com) to **follow** and [another](http://yahoo.com/).

* One
* Two
* Three

## Subhead

One **two** three **four** five.

One __two__ three _four_ five __six__ seven _eight_.

1. One
2. Two
3. Three

More text with `inline(\$code)` sample.

> A block quote
> across two lines.

More text...");

Origin: https://gist.github.com/jbroadway/2836900
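One thing worth noticing in the rule table is that the order of the rules matters: the bold rule is listed before the emphasis rule. The sketch below reuses those two regexes in isolation to show what happens when the order is swapped. The helper `apply_rules` is hypothetical and not part of Slimdown.

```php
<?php
// Hypothetical helper (not part of Slimdown): apply regex rules in array order.
function apply_rules(array $rules, string $text): string {
    foreach ($rules as $regex => $replacement) {
        $text = preg_replace($regex, $replacement, $text);
    }
    return $text;
}

// The same two patterns used in Slimdown::$rules.
$bold     = ['/(\*\*|__)(.*?)\1/' => '<strong>\2</strong>'];
$emphasis = ['/(\*|_)(.*?)\1/'    => '<em>\2</em>'];

// Bold first, as in Slimdown: "**two**" becomes a <strong> element.
echo apply_rules($bold + $emphasis, 'One **two** three'), "\n";

// Emphasis first: each "**" is consumed as an empty emphasis pair,
// so the bold rule never gets a chance to match.
echo apply_rules($emphasis + $bold, 'One **two** three'), "\n";
```

The first call prints `One <strong>two</strong> three`, the second prints `One <em></em>two<em></em> three`. This kind of interaction between rules is the main thing to keep in mind when adding patterns through `Slimdown::add_rule()`, since new rules are appended after the existing ones.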
