Regex: replace a hash-character list with ol tags

Question

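First of all, I'm not good with regular expressions. Really. For the past 4 days I have been trying to figure out how to replace the following format: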

 # Item number 1
 # Item number 2
 # Item number 3

and so on, with:

<ol>
   <li>Item number 1</li>
   <li>Item number 2</li>
   <li>Item number 3</li>
</ol>

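and so on. Initially I wanted to replace /^\s\d\.\s/mi with <li>, but I gave up on that because it turned out to be more complicated.

So I tried running a loop with preg_match_all to collect all the possible groups and replace them with HTML tags. But I'm doing something wrong, and I don't know what. Any help would be appreciated.

Here is my code (the $_POST request is handled via XHR):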

$innerhtml = htmlspecialchars(addslashes($_POST['innerhtml']));
$br_nums   = '<br>';
if (strstr($innerhtml, PHP_EOL)) {
    $innerhtml = preg_replace("/\r\n\r\n/", $br_nums, $innerhtml);
}

preg_match_all('/^\s[\#\.]\s.*/mi', $innerhtml, $outmatch);
if (isset($outmatch[0])) {
    $origin_outmatc = $outmatch[0];
    $outmatch       = implode('[\r\n]', $outmatch[0]);
    $original_match = $outmatch;
    $outmatch       = explode('<br>', $outmatch);

    foreach ($outmatch as $key => $match) {
        if (preg_match('/^\<br\>/i', $match) || preg_match('/^\<br\>\[\\r\\n\]/i', $match)) {
            $match = str_replace('<br>', '', preg_replace('/^\[\\r\\n\]/i', '', $match));
        }
    }

    $full_ol = '';
    foreach($outmatch as $ol) {
        $full_ol .= '<ol>';
        $ol       = preg_replace('/^\s[\#\.]\s/', '<li>', str_replace('[\r\n]', '</li>', $ol));
        $full_ol .= $ol;
        $full_ol .= '</ol>';
    }

    $full_ol = str_replace(' # ', '<li>', preg_replace('/(?:$|)\<(?!\/li\>)\/ol\>/i', '</li></ol>', $full_ol));
    $full_ol = preg_replace('/(?:|^)\<ol\>[\r\n]\<\/li\>/i', '<ol>', $full_ol);

    $full_ol = explode('<ol>', $full_ol);
    foreach ($full_ol as $key => $list) {
        if (empty($list)) {
            unset($full_ol[$key]);
            $full_ol = array_values($full_ol);
        }
    }

    foreach ($full_ol as $key => $list) {
        $full_ol[$key] = '<ol>' . $list;
    }

    $original_match = str_replace('<br>', '+SPLIT_HERE+<br>', str_replace('[\r\n]', "\r\n", $original_match));
    $original_match = explode('+SPLIT_HERE+', $original_match);

    foreach ($original_match as $key => $possible_match) {
        if (!preg_match('/^\s\#\s/mi', $possible_match)) {
            unset($original_match[$key]);
            $original_match = array_values($original_match);
        }
    }

    foreach ($full_ol as $key => $possible_match) {
        if (preg_match('/^\<ol\>\<\/li\>\<\/ol\>$/i', $possible_match)) {
            unset($full_ol[$key]);
            $full_ol = array_values($full_ol);
        }
    }

    // Preview
    var_dump($original_match, $full_ol);

    // Replace original with html version
    $innerhtml = str_replace($original_match, $full_ol, $innerhtml);
}

Please guide me: how can I do this better (or at least get it right)? I'm quite frustrated... Thanks.

Answer 1

This code should work. Tested with phptester.net.

It is commented; if you have any questions, feel free to leave a comment :)

<?php

$innerhtml = "
Hallo 123,

this is a little test

 # jeah
 # huhu
 # third

 line beetween?

 # okay lets do this again
 # second
 # final

 what about u?
";

$br_nums   = '<br>';

// With this block enabled the code does not work; a double line break normally means a paragraph (<p>) anyway.
// if (strstr($innerhtml, PHP_EOL)) {
//     $innerhtml = preg_replace("/\r\n\r\n/", $br_nums, $innerhtml);
// }

preg_match_all('/\s*#{1}\s*(.*)\n/', $innerhtml . "\n", $matches); // for last line matching

$olStarted = false;

if (!empty($matches[1])) {
    foreach($matches[1] as $x => $match) {
        $replace = '';

        // start the ol, if is not started already
        if (!$olStarted) {
            $replace .= '<ol>';
            $olStarted = true;
        }

        // build li 
        $replace .= '<li>' . $match . '</li>';

        // end the ol when one of these is true
        //
        // 1* there is no next list item
        // 2* there is a next list item, but its match starts with a line
        //    break, so a blank line or other text separates it from this one
        if(
            !isset($matches[0][$x + 1]) || // 1*
            preg_match('/^\R/', $matches[0][$x + 1]) === 1 // 2* (works for \n and \r\n)
        ) {
           $replace .= '</ol>';
           $olStarted = false;
        }

        // actually replace the line
        $innerhtml = str_replace($matches[0][$x], $replace, $innerhtml);
    }
}

var_dump($innerhtml);
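
The loop above can also be collapsed into a single pass. What follows is only a sketch of that idea, not the answer's own method: it assumes every list item sits on its own line as "# text", that a run of items ends at the first line that is not an item, and that the text has already been HTML-escaped. The function name hashListsToOl is made up for illustration.

```php
<?php
// Hypothetical helper (not part of the answer above): wraps each run of
// consecutive "# item" lines in one <ol>...</ol> block.
function hashListsToOl(string $text): string
{
    // One run = an item line followed by any number of directly adjacent
    // item lines. \R matches \n, \r\n or \r; [ \t] keeps the indentation
    // from swallowing line breaks, so a blank line ends the run.
    $runPattern = '/^[ \t]*#[ \t]+[^\r\n]*(?:\R[ \t]*#[ \t]+[^\r\n]*)*/m';

    return preg_replace_callback($runPattern, function (array $m): string {
        // Pull the text of every item out of this run.
        preg_match_all('/^[ \t]*#[ \t]+([^\r\n]*)/m', $m[0], $items);
        $lis = '';
        foreach ($items[1] as $item) {
            $lis .= '<li>' . rtrim($item) . '</li>';
        }
        return '<ol>' . $lis . '</ol>';
    }, $text);
}
```

Calling hashListsToOl($innerhtml) on the sample string above would leave the surrounding text and its line breaks untouched and replace only the two runs of hash lines.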

Answer 2

The regex to match # Item number 1 is:

\s*[#]\s+[iI][tT][eE][mM]\s+[nN][uU][mM][bB][eE][rR]\s+[0-9]+\s*

This means:

(0+ spaces)#(1+ spaces)Item(1+ spaces)number(1+ spaces)123(0+ spaces)

An example is: " # Item number 12 "

If the text matches this pattern, match it against the next pattern:

[iI][tT][eE][mM]\s+[nN][uU][mM][bB][eE][rR]\s+[0-9]+\s*

The match gives you the index at which the matched string starts. Taking the substring from Match.Index to the string's Length gives you this value:

Item number 1

P.S.

If "Item" can be any string, just use "\w+" instead of "[iI][tT][eE][mM]". The same goes for "number".
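
In PHP, the question's language, the same two patterns can be written more compactly. This is a sketch under a few assumptions: the i modifier stands in for the [iI][tT]... character classes, the ^ and $ anchors are an addition so that the whole line must match, and the constant names are invented for illustration.

```php
<?php
// Fixed words "Item" and "number", matched case-insensitively via the i modifier.
const FIXED_WORDS = '/^\s*#\s+item\s+number\s+[0-9]+\s*$/i';

// Any two words in place of "Item" and "number", as the P.S. suggests.
const ANY_WORDS = '/^\s*#\s+\w+\s+\w+\s+[0-9]+\s*$/';

// preg_match returns 1 when the pattern matches and 0 when it does not.
$isFixed = preg_match(FIXED_WORDS, ' # Item number 12 ') === 1;
$isAny   = preg_match(ANY_WORDS, ' # Entry no 7') === 1;
```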

A second way:

Match the main pattern:

\s*[#]\s+[iI][tT][eE][mM]\s+[nN][uU][mM][bB][eE][rR]\s+[0-9]+\s*

Once a match is found, match the next pattern:

\s*[#]\s+

Now take the substring of "# item number 2" that starts at Match.Value.Length and is Length - Match.Value.Length characters long. In this case Match.Value.Length is 2, so the result is the substring of "# item number 2" from index 2 onward.
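
The Match.Index and Match.Value names above come from .NET. A rough PHP equivalent of this second way might look like the sketch below, assuming preg_match with PREG_OFFSET_CAPTURE plays the role of Match.Index; the function name itemText is hypothetical.

```php
<?php
// Hypothetical helper: returns the text of a "# ..." line without its
// "# " prefix, or null when the line is not a list item.
function itemText(string $line): ?string
{
    // Match only the prefix. With PREG_OFFSET_CAPTURE, $m[0][0] is the
    // matched text and $m[0][1] is its byte offset within $line.
    if (preg_match('/^\s*#\s+/', $line, $m, PREG_OFFSET_CAPTURE) !== 1) {
        return null;
    }
    $prefixLength = $m[0][1] + strlen($m[0][0]);

    // Everything after the prefix is the item text; trailing spaces are dropped.
    return rtrim(substr($line, $prefixLength));
}
```

For "# item number 2" the prefix is two characters long, so the function returns the substring starting at index 2, matching the description above.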


Answer 1: score 0, accepted, posted 2017-10-26 20:40:57

Answer 2: score -1, posted 2017-10-26 19:38:36
