preg_replace_callback: including curly braces in a pattern: { is captured, } isn't

Question

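I have this function, which uses preg_replace_callback to split a sentence into a "chain" of blocks belonging to different categories (alphabetic, Han characters, and so on).

The function tries to treat the characters ', { and } as "alphabetic":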

function String_SplitSentence($string)
{
 $res = array();

 preg_replace_callback("~\b(?<han>\p{Han}+)\b|\b(?<alpha>[a-zA-Z0-9{}']+)\b|(?<other>[^\p{Han}A-Za-z0-9\s]+)~su",
 function($m) use (&$res) 
 {
 if (!empty($m["han"])) 
 {
  $t = array("type" => "han", "text" => $m["han"]);
  array_push($res,$t);
 }
 else if (!empty($m["alpha"])) 
 {
  $t = array("type" => "alpha", "text" => $m["alpha"]);
  array_push($res, $t);
 }
 else  if (!empty($m["other"])) 
 {
  $t = array("type" => "other", "text" => $m["other"]);
  array_push($res, $t);
 }
 },
 $string);

 return $res;
}

However, the curly braces seem to cause a problem.

print_r(String_SplitSentence("Many cats{1}, several rats{2}"));

As the output shows, the function treats { as an alphabetic character, as intended, but stops at } and classifies it as "other".

Array
(
    [0] => Array
        (
            [type] => alpha
            [text] => Many
        )

    [1] => Array
        (
            [type] => alpha
            [text] => cats{1
        )

    [2] => Array
        (
            [type] => other
            [text] => },
        )

    [3] => Array
        (
            [type] => alpha
            [text] => several
        )

    [4] => Array
        (
            [type] => alpha
            [text] => rats{2
        )

    [5] => Array
        (
            [type] => other
            [text] => }
        )

What am I doing wrong?

Answer 1

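I can't be completely sure, because your sample input doesn't contain any Han characters and I don't know which fringe cases you may be trying to handle, but this is how I would write the pattern: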

~(?<han>\p{Han}+)|(?<alpha>[a-z\d{}']+)|(?<other>\S+)~ui

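The problem with \b is that it depends on \w characters: it only matches at a position between a \w character and a non-\w character (or the start or end of the string). \w covers uppercase letters, lowercase letters, digits, and the underscore, so } does not count. Reference: https://stackoverflow.com/a/11874899/2943403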
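To see the effect of \b in isolation, here is a minimal sketch that runs the "alpha" character class from the question with and without word boundaries. The sample string and variable names are made up for illustration.

```php
<?php
// Illustrative input: each word ends in a closing brace.
$subject = "cats{1}, rats{2}";

// With \b on both sides, the match has to end between a \w and a non-\w
// character. "}" followed by "," (or by the end of the string) is not such a
// position, so the engine backtracks and gives up the closing brace.
preg_match_all("~\b[a-zA-Z0-9{}']+\b~", $subject, $withBoundary);

// Without \b, the character class alone decides where the match ends.
preg_match_all("~[a-zA-Z0-9{}']+~", $subject, $withoutBoundary);

print_r($withBoundary[0]);    // "cats{1" and "rats{2"
print_r($withoutBoundary[0]); // "cats{1}" and "rats{2}"
```

The opening brace survives in both cases because it sits in the middle of the match, where no boundary is required.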

Also, your pattern doesn't contain any dots (.), so you can drop the s pattern modifier.
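As a quick illustration of why the s modifier is irrelevant here: it only changes whether . matches a newline. The snippet below is a small sketch with made-up sample data, not part of the original code.

```php
<?php
// Illustrative input: two letters separated by a newline.
$text = "a\nb";

// Without s, the dot does not match a newline.
$withoutS = preg_match('~a.b~', $text);

// With s, the dot matches any character, including a newline.
$withS = preg_match('~a.b~s', $text);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)
```

A pattern with no dot behaves identically with or without the modifier.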

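Also, your function call seems to be misusing preg_replace_callback(). You aren't actually replacing anything, so it is not the appropriate call. Perhaps you could consider the following rewrite: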

function String_SplitSentence($string){
    if(!preg_match_all("~(?<han>\p{Han}+)|(?<alpha>[a-z\d{}']+)|(?<other>\S+)~ui",$string,$out)){
        return [];  // or $string or false
    }else{
        foreach($out as $group_key=>$group){
            if(!is_numeric($group_key)){  // disregard the indexed groups (which are unavoidably generated)
                foreach($group as $i=>$v){
                    if(strlen($v)){  // only store the value in the subarray that has a string length
                        $res[$i]=['type'=>$group_key,'text'=>$v];
                    }
                }
            }
        }
        ksort($res);
        return $res;
    }
}
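If the regrouping and ksort() step feels heavy, one possible alternative is the PREG_SET_ORDER flag of preg_match_all(), which returns one array per match in order of appearance. This is only a sketch: the function name splitSentenceSetOrder is hypothetical, and the pattern is the one suggested above.

```php
<?php
// Hypothetical variant of String_SplitSentence(); the name is an assumption.
function splitSentenceSetOrder(string $string): array
{
    $pattern = "~(?<han>\p{Han}+)|(?<alpha>[a-z\d{}']+)|(?<other>\S+)~ui";
    $res = [];

    // PREG_SET_ORDER yields one array per match, already in input order.
    if (!preg_match_all($pattern, $string, $matches, PREG_SET_ORDER)) {
        return $res;
    }

    foreach ($matches as $m) {
        foreach (['han', 'alpha', 'other'] as $type) {
            // A group that did not take part in the match is either absent
            // or an empty string, so test for a non-empty string.
            if (($m[$type] ?? '') !== '') {
                $res[] = ['type' => $type, 'text' => $m[$type]];
                break;
            }
        }
    }

    return $res;
}

$chain = splitSentenceSetOrder("Many cats{1}, several rats{2}");
print_r($chain);
```

With the sample sentence from the question, cats{1} and rats{2} each come back as a single "alpha" block, and the comma is the only "other" block.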

Here is a demo of your pattern: https://regex101.com/r/6EUaSM/1

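The \b after your character class is what breaks everything. } is not part of the \w class. The regex engine tries to do a good job for you: it matches greedily for as long as it can, then finds no word boundary after } and backs up to the last position where \b holds. That is why } is excluded.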

Solution 1
0 · Accepted · 2018-02-12 03:38:43
