Remove everything from a string except image tags using a regular expression

Question

I have a string that contains all kinds of HTML elements, and I need to remove everything except the images.

Currently I am using this code:

$e->outertext = "<p class='images'>".str_replace(' ', ' ', str_replace('Â','',preg_replace('/#.*?(<img.+?>).*?#is', '',$e)))."</p>";

It serves my purpose, but it is very slow to execute. Any other way of doing the same thing would be appreciated.

Answer 1

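The code you posted does not appear to work, and the regular expression itself is malformed: it opens with / as the delimiter but never closes it, while the pattern actually ends with #. You should remove the leading slash /, like this: #.*?(<img.+?>).*?#is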
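
To make the delimiter problem concrete, here is a small sketch that compiles both versions of the pattern. The sample string and variable names are made up for illustration, and the @ operator is used only to keep the compile warning out of the output.

```php
<?php
// Pattern as posted: it opens with / but never closes it, so it cannot compile.
$broken = '/#.*?(<img.+?>).*?#is';
// Corrected pattern: # opens and closes the pattern, "is" are the modifiers.
$fixed = '#.*?(<img.+?>).*?#is';

// Hypothetical input, chosen only for this illustration.
$sample = 'text <img src="a.png"> more text';

// @ hides the "No ending delimiter" warning; preg_match() returns false on failure.
$brokenResult = @preg_match($broken, $sample);
// The corrected pattern compiles, and group 1 holds the image tag.
$fixedResult = preg_match($fixed, $sample, $m);

var_dump($brokenResult);
var_dump($fixedResult);
echo $m[1], PHP_EOL;
```

Note that even with the delimiters fixed, replacing each match with an empty string would also delete the image tags themselves, which is one more reason to capture the tags instead of deleting around them.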

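Your idea is to delete everything and keep only the image tags, which is not a good approach. A better one is to capture all the image tags and then build the output from the matches. First, let's capture the image tags. This regular expression does that: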

/<img.*>/Ug

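The U flag makes the quantifiers lazy instead of greedy, so each match ends at the first > it encounters. The trailing g is the "global" flag as written in other regex flavors; PHP has no g modifier, and preg_match_all already returns every match, which is why the PHP code below uses only U.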

Demo 1
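
The effect of the U flag is easiest to see side by side. The sketch below runs the same pattern with and without it on a made-up input string; the variable names are illustrative only.

```php
<?php
// Hypothetical sample input, chosen only for illustration.
$html = '<p>a</p><img src="a.png"><span>b</span><img src="b.png"><p>c</p>';

// Greedy (default): .* runs on to the LAST > in the string.
preg_match_all('/<img.*>/', $html, $greedy);

// Ungreedy via the U modifier: .* stops at the FIRST > after <img.
preg_match_all('/<img.*>/U', $html, $lazy);

// Same result without U: make only this quantifier lazy by appending ?.
preg_match_all('/<img.*?>/', $html, $lazyQuantifier);

print_r($greedy[0]); // one long match that swallows the markup between the images
print_r($lazy[0]);   // one match per image tag
```

Writing .*? instead of adding U keeps the laziness local to one quantifier, which can be easier to read in longer patterns.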

Now, to build the output, let's use the preg_match_all function and put the results into a string. This can be done with the following code:

<?php
// defining the input
$e = 
'<div class="topbar-links"><div class="gravatar-wrapper-24">
<img src="https://www.gravatar.com/avatar" alt="" width="24" height="24"     class="avatar-me js-avatar-me">
</div>
</div> <img test2> <img test3> <img test4>';
// defining the regex
$re = "/<img.*>/U";
// put all matches into $matches
preg_match_all($re, $e, $matches);
// start creating the result
$result = "<p class='images'>";
// loop to get all the images
for($i=0; $i<count($matches[0]); $i++) {
    $result .= $matches[0][$i];
}
// print the final result
echo $result."</p>";

Demo 2

Another way to improve this code is to use functional programming (for example array_reduce), but I will leave that as homework.
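
For readers who want to check their homework, here is one possible sketch of the array_reduce variant. The function name wrapImages is hypothetical, and the arrow function syntax assumes PHP 7.4 or later.

```php
<?php
// Hypothetical helper name; not part of the original answer.
function wrapImages(string $html): string
{
    // Collect every <img ...> tag; $matches[0] holds the full matches.
    preg_match_all('/<img.*>/U', $html, $matches);

    // Fold the list of tags into one string, starting from an empty string.
    $images = array_reduce(
        $matches[0],
        fn (string $carry, string $tag): string => $carry . $tag,
        ''
    );

    return "<p class='images'>" . $images . "</p>";
}

echo wrapImages('<div><img test2> some text <img test3></div>'), PHP_EOL;
```

For plain concatenation, implode('', $matches[0]) gives the same string with less ceremony; array_reduce starts to pay off when each tag needs extra processing while it is folded in.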

Note: there is another way to do this, which is to parse the HTML document and use XPath to find the elements. See this answer for more information.
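
A minimal sketch of that parser-based approach, assuming PHP's bundled DOM extension is enabled. The function name extractImagesWithXPath is hypothetical, and the exact serialization of each tag is whatever DOMDocument::saveHTML produces, which may differ slightly from the input markup.

```php
<?php
// Hypothetical helper name; a sketch, not the method from the linked answer.
function extractImagesWithXPath(string $html): string
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html); // note: PHP 8 throws a ValueError for an empty string
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Select every <img> element anywhere in the document.
    $xpath = new DOMXPath($doc);
    $result = "<p class='images'>";
    foreach ($xpath->query('//img') as $img) {
        // Serialize each node back to HTML and append it.
        $result .= $doc->saveHTML($img);
    }

    return $result . "</p>";
}

echo extractImagesWithXPath('<div><img src="a.png" alt="x"><span>t</span><img src="b.png"></div>'), PHP_EOL;
```

Unlike the regex, this handles cases such as a > character inside an attribute value, at the cost of parsing the whole document.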

Solution 1
0 · Accepted · 2015-09-09 14:00:10
