
PHP Regex batch update

In short, here is my problem:

$text = 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industrys standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.';
$text = preg_replace('#(?<!((alt|src)="))Lorem(?!(.*("|<\/a>)))#i', '<a href="Lorem" title="Lorem" style="color: inherit;">\0</a>', $text);
$text = preg_replace('#(?<!((alt|src)="))Ipsum(?!(.*("|<\/a>)))#i', '<a href="Ipsum" title="Ipsum" style="color: inherit;">\0</a>', $text);
echo $text;

"Lorem" is replaced, but "Ipsum" is not.

Output of the PHP above:

 <a href="Lorem" title="Lorem" style="color: inherit;">Lorem</a> Ipsum is simply dummy text of the printing and typesetting industry. <a href="Lorem" title="Lorem" style="color: inherit;">Lorem</a> Ipsum has been the industrys standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing <a href="Lorem" title="Lorem" style="color: inherit;">Lorem</a> Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of <a href="Lorem" title="Lorem" style="color: inherit;">Lorem</a> <a href="Ipsum" title="Ipsum" style="color: inherit;">Ipsum</a>. 

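Why is "Ipsum" not replaced?

Edit:

If you comment out the first preg_replace line, the second preg_replace works as expected. PHP Fiddle 1 (press F9 to run)

Likewise, if you swap the two preg_replace calls, "Ipsum" is replaced and "Lorem" is not. PHP Fiddle 2

So, if the two words are not inside an anchor tag <a> to begin with, the lookbehind and lookahead conditions are unnecessary, or at least unnecessary in the second preg_replace. Otherwise the lookaround conditions fire: the first call inserts " characters and </a> tags, and the negative lookahead (?!(.*("|<\/a>))) in the second call then rejects every match that has one of them later on the same line. PHP Fiddle 3 (see note 1)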
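The effect can be checked by counting how many "Ipsum" occurrences the second pattern matches before and after the first replacement. This is a minimal sketch on a shortened sample string; the names `$sample`, `$linked`, `$before`, and `$after` are illustrative and do not appear in the original question.

```php
<?php
// Shortened sample; the original question uses a longer Lorem Ipsum paragraph.
$sample = 'Lorem Ipsum is dummy text. Lorem Ipsum again.';

$loremPattern = '#(?<!((alt|src)="))Lorem(?!(.*("|<\/a>)))#i';
$ipsumPattern = '#(?<!((alt|src)="))Ipsum(?!(.*("|<\/a>)))#i';

// Before any replacement the string contains no `"` and no `</a>`,
// so the negative lookahead never rejects a match.
$before = preg_match_all($ipsumPattern, $sample);

// The first replacement inserts attribute quotes and closing </a> tags.
$linked = preg_replace($loremPattern, '<a href="Lorem" title="Lorem">\0</a>', $sample);

// Now every "Ipsum" that has a `"` or `</a>` anywhere later on the same line
// is rejected by the lookahead; only an "Ipsum" after the last link survives.
$after = preg_match_all($ipsumPattern, $linked);

echo "matches before: $before, after: $after\n";
```

On this sample the count drops from 2 to 1, which mirrors the output in the question: only the last "Ipsum" is linked.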


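Update:

As mentioned in the OP's comments, the above works for the string as given, but not when $text already contains an <a> tag whose text includes the same words, for example: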

 <a href="">test Lorem test</a>

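In that case, IMHO, this cannot be done with REGEX alone. Instead we need to do the following:

  1. Check whether the string $text contains any anchor tags <a>.
  2. Use an array $tempArr as temporary storage for the link elements.
  3. Replace each link element with placeholder text in a distinct format that carries a number as a unique ID, so that the link elements become tempRep#0, tempRep#1, and so on.
  4. Run the REGEX statement (see note 2).
  5. Reverse step #3: replace tempRep#0, tempRep#1, etc. with their corresponding link elements, which were stored temporarily in $tempArr, matching the number in each unique ID to the array index (see note 3).

The algorithm above could be implemented in JavaScript, since it needs some Document Object Model inspection, but as the OP said JavaScript is not an option. We therefore use the PHP Document Object Model instead, loading the string $text as HTML and using the following PHP DOM members: getElementsByTagName(), getAttribute(), and textContent (or nodeValue).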

Finally, we have the following:

PHP Fiddle 4 [final]

$text = 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industrys standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of <a href="link1href" title="test1">test Ipsum Lorem test</a> Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of <a href="link2href" title="test2">test Lorem test</a> Lorem Ipsum.';

$dom = new DOMDocument;
$dom->loadHTML($text);
$tempArr = array();
$links = $dom->getElementsByTagName('a');
foreach ($links as $link) {  
    $href = $link->getAttribute('href');
    $title = $link->getAttribute('title');
    $textCont = $link->textContent; // Alternatively, $link->nodeValue could be used
    $linkElement = '<a href="' . $href . '" title="' . $title . '">' . $textCont . '</a>';
    $tempArr[] = $linkElement;
}

for($i=0; $i < count($tempArr); $i++){
    $text = str_replace($tempArr[$i], 'tempRep#' . $i, $text);
}

$text = preg_replace('#(?<!(alt|src)=")(Lorem|Ipsum)(?!(("|<\/a>)))#i', '<a href="\0" title="\0" style="color: inherit;">\0</a>', $text);

// Restore in reverse order so that 'tempRep#1' cannot match the start of 'tempRep#10'.
for ($i = count($tempArr) - 1; $i >= 0; $i--) {
    $text = str_replace('tempRep#' . $i, $tempArr[$i], $text);
}
echo $text;
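
The str_replace step above only finds a link if the string rebuilt from href, title, and the text content matches the original markup character for character; a link with other attributes or a different attribute order would be left unprotected. As a hedged alternative, and not the method used in this answer, the sketch below stashes each anchor verbatim with preg_replace_callback instead of rebuilding it. The function name `linkWords`, the `@@LINK…@@` placeholder format, and the shortened sample are assumptions made for illustration. It also assumes PHP 7.4 or later (arrow functions), anchors that are not nested, and search words that do not occur inside the placeholder text.

```php
<?php
// Hypothetical helper: wraps the given words in links, leaving existing <a>...</a> untouched.
function linkWords(string $text, array $words): string
{
    $stash = [];

    // 1. Replace every existing anchor with a numbered placeholder and keep the original markup.
    $text = preg_replace_callback(
        '#<a\b[^>]*>.*?</a>#is',
        function (array $m) use (&$stash): string {
            $stash[] = $m[0];
            return '@@LINK' . (count($stash) - 1) . '@@';
        },
        $text
    );

    // 2. Link the remaining occurrences, skipping alt="..." and src="..." values
    //    in the same way as the pattern in the answer.
    $alternation = implode('|', array_map(fn ($w) => preg_quote($w, '#'), $words));
    $text = preg_replace(
        '#(?<!(alt|src)=")(' . $alternation . ')(?!("|</a>))#i',
        '<a href="\0" title="\0" style="color: inherit;">\0</a>',
        $text
    );

    // 3. Put the stashed anchors back; the closing @@ keeps @@LINK1@@ distinct from @@LINK10@@.
    return preg_replace_callback(
        '#@@LINK(\d+)@@#',
        fn (array $m) => $stash[(int) $m[1]],
        $text
    );
}

$sample = 'Lorem <a title="t" href="x">test Ipsum</a> Ipsum';
echo linkWords($sample, ['Lorem', 'Ipsum']), "\n";
```

In the sample, the existing link has its attributes in a different order than the rebuilt string would have, yet it is restored unchanged while the two free-standing words are linked.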

-----------------------------

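Notes:

  1. I found that the lookahead condition in the second preg_replace call is what causes the problem. In PHP Fiddle 5 I kept the lookbehind and removed only the lookahead, and, oddly enough, it works correctly.
  2. I have merged the two REGEX statements into one:

     $text = preg_replace('#(?<!(alt|src)=")(Lorem|Ipsum)(?!(("|<\/a>)))#i', '<a href="\0" title="\0" style="color: inherit;">\0</a>', $text);
  3. That is why we use a unique ID for each replacement.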
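
To see what the lookbehind in the merged pattern protects against, here is a small hedged check on a made-up string containing an <img> tag. The sample markup and the names `$sample` and `$result` are assumptions, not taken from the original question.

```php
<?php
// Made-up sample: the first "Lorem" sits inside an alt="..." attribute value.
$sample = '<img alt="Lorem" src="pic.png"> Lorem';

$result = preg_replace(
    '#(?<!(alt|src)=")(Lorem|Ipsum)(?!(("|<\/a>)))#i',
    '<a href="\0" title="\0" style="color: inherit;">\0</a>',
    $sample
);

// The attribute value is skipped because it is directly preceded by alt=" ;
// only the free-standing word is wrapped in a link.
echo $result, "\n";
```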

