Regex to replace parts of a string that start with a marker and end with either of two characters

Question

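I'm trying to match substrings that start with #1-9. Note: that is a # followed by a single digit from 1 to 9, and the substring ends at the next #1-9 (or at the end of the string if there is none).

Full string: "#1Lorem Ipsum is simply dummy text#2printing and typesetting industry"

The idea:

Replace #1Lorem Ipsum is simply dummy text with <span class="one">Lorem Ipsum is simply dummy text</span>

and #2printing and typesetting industry with <span class="two">printing and typesetting industry</span>

So each #1-9 is replaced with an opening <span> tag, and a closing </span> tag is appended at the end of each part.

But:

Suppose the string contains only one part starting with #1-9, like this:

"#1Lorem Ipsum is simply dummy text". How do I put </span> at the end to close the tag?

My guess is that the final " of the string could be used: add the closing tag just before it, since there is no further #1-9 to stop at, but without losing or replacing that final ".

So it becomes: "<span class="one">Lorem Ipsum is simply dummy text</span>"

The regex I tried: (#[0-9])(.*?)(#|") but this only matches the first part, #1, and ignores the #2 part (see the full string).

I will use PHP to match and replace, probably with preg_replace. I just need to work out the regex part first.

How can I do this?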

Answer 1

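What you are looking for is a negative lookahead. It is very powerful: it succeeds only when the expression inside it does not match at the current position.

#([0-9])((?:(?!$|#[0-9]).)+)

This looks for #0-9 and stops when another #0-9 appears or at the end of the line. The negative lookahead part is (?!$|#[0-9]). It says: continue only if neither $ nor #0-9 can match here. The check has to be made for every character, so each time the lookahead does not match, the next character is consumed with . and all of those characters are collected in one capture group.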
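To see the lookahead at work, here is a minimal sketch that runs the pattern above through preg_match_all() on the full string from the question and, for comparison, runs the question's own attempt. The variable names and the ~ delimiter are choices made for this illustration, and the quoted variant of the input assumes the surrounding double quotes are part of the data.

```php
<?php
// Pattern from this answer, wrapped in "~" delimiters.
$pattern = '~#([0-9])((?:(?!$|#[0-9]).)+)~';
$input = '#1Lorem Ipsum is simply dummy text#2printing and typesetting industry';

// PREG_SET_ORDER groups the results per match: [full match, digit, text].
preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);

foreach ($matches as [$full, $digit, $text]) {
    echo $digit, ' => ', $text, PHP_EOL;
}
// 1 => Lorem Ipsum is simply dummy text
// 2 => printing and typesetting industry

// The attempt from the question: its third group consumes the "#" that
// starts "#2", so the second part can no longer be matched.
$attempt = '~(#[0-9])(.*?)(#|")~';
$quoted = '"' . $input . '"';
echo preg_match_all($attempt, $quoted), PHP_EOL; // 1
```

The lookahead only inspects the next marker without consuming it, which is why the second part is still available for the next match.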

(Railroad diagram of the pattern, generated with regexper.com.)

Answer 2

<?php
function convert($str) {
    static $numberNamesMap = [
        1 => 'one',
        2 => 'two',
        3 => 'three',
        4 => 'four',
        5 => 'five',
        6 => 'six',
        7 => 'seven',
        8 => 'eight',
        9 => 'nine',
    ];
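    return preg_replace_callback(
        // "#" plus a digit 1-9, then every character that does not start another "#1".."#9" marker
        '~#([1-9])(((?!#[1-9]).)*)~',
        function($matches) use ($numberNamesMap) {
            $class = $numberNamesMap[$matches[1]];
            // Escape the captured text before embedding it in HTML
            $htmlText = htmlentities($matches[2]);
            return "<span class=\"$class\">$htmlText</span>";
        },
        $str
    );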
}

Examples

echo convert('#1Lorem Ipsum is simply dummy text');

Output:

<span class="one">Lorem Ipsum is simply dummy text</span>

echo convert('#1Lorem Ipsum is simply dummy text#2printing and typesetting industry');

Output:

<span class="one">Lorem Ipsum is simply dummy text</span><span class="two">printing and typesetting industry</span>

echo convert('#1Lorem Ipsum is simply dummy text#0printing and typesetting industry');

Output:

<span class="one">Lorem Ipsum is simply dummy text#0printing and typesetting industry</span>

Answer 3

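preg_replace_callback() is the right tool for this job. To avoid declaring the number-mapping array by hand, you can use the NumberFormatter class (part of PHP's intl extension). Using sprintf() in the callback body helps keep the data separate from the HTML and makes maintenance easier.

Code: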

$string = '#1Lorem Ipsum is simply dummy text#2printing and typesetting industry#0nothing#35That\'s a big one!';

echo preg_replace_callback(
         '/#(\d+)((?:(?!#\d).)+)/',
         fn($m) => sprintf(
             '<span class="%s">%s</span>',
             (new NumberFormatter("en", NumberFormatter::SPELLOUT))->format($m[1]),
             htmlentities($m[2])
         ),
         $string
     );

Output:

<span class="one">Lorem Ipsum is simply dummy text</span><span class="two">printing and typesetting industry</span><span class="zero">nothing</span><span class="thirty-five">That&#039;s a big one!</span>

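Note that if the actual text after #[number] never contains a # symbol, you can significantly improve the regex's performance by using a greedy negated character class as the second capture group: #(\d+)([^#]+). This reduces the step count on the sample string from 283 steps to just 16.

Honestly, even a lazy pattern like #(\d+)(.+?(?=#\d|$)) processes the sample string in 213 steps. Performance may not be a factor here, so use whichever regex you find easiest to read.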
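The trade-off between the three patterns can be checked with a short sketch. It compares what each one captures on the sample string and on a second input that contains a stray # inside the text. The array labels, the capturedTexts() helper and the second input are hypothetical names and data made up for this illustration, and the sketch does not measure step counts.

```php
<?php
// The three patterns quoted in this answer, keyed by illustrative labels.
$patterns = [
    'tempered dot'  => '/#(\d+)((?:(?!#\d).)+)/',
    'negated class' => '/#(\d+)([^#]+)/',
    'lazy'          => '/#(\d+)(.+?(?=#\d|$))/',
];

// Hypothetical helper: returns the text captured by group 2 for every match.
function capturedTexts(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $m);
    return $m[2];
}

$sample = '#1Lorem Ipsum is simply dummy text#2printing and typesetting industry#0nothing#35That\'s a big one!';
$withHash = '#1Use C# daily#2done'; // made-up input with a "#" that is not a marker

foreach ($patterns as $label => $pattern) {
    echo $label, ': ', implode(' | ', capturedTexts($pattern, $withHash)), PHP_EOL;
}
// tempered dot: Use C# daily | done
// negated class: Use C | done
// lazy: Use C# daily | done
```

On the sample string all three patterns capture the same four pieces of text. The negated character class only diverges once a # that is not followed by a digit appears in the text, which is the condition stated above.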
