PHP regex to remove everything inside a tag

I have a string containing anchor tags. Each anchor tag holds some HTML and text, like this:

<a class="content-title-link" title="Blog" href="https://example.com/my-blog" target="_blank">
 <img id="my_main_pic" class="content-title-main-pic" src="https://example.com/xyz.jpg" width="30px" height="30px" alt="Main Profile Picture">
 My HTML Link 
 <label>Click here to view 
  <cite class="glyphicon glyphicon-new-window" title="Blog"></cite>
 </label>
</a>

The full string looks like this:

<p>Hello there,</p>
<p><a class="content-title-link" title="Blog" href="https://example.com/my-blog" target="_blank">
     <img id="my_main_pic" class="content-title-main-pic" src="https://example.com/xyz.jpg" width="30px" height="30px" alt="Main Profile Picture">
     My HTML Link 
     <label>Click here to view 
      <cite class="glyphicon glyphicon-new-window" title="Blog"></cite>
     </label>
    </a>
    what's up.
    </p>
<p>
Click here <a class="content-title-link" title="Blog" href="https://example.com/my-blog" target="_blank">
     <img id="my_main_pic" class="content-title-main-pic" src="https://example.com/xyz.jpg" width="30px" height="30px" alt="Main Profile Picture">
     My HTML Link 
     <label>Click here to view 
      <cite class="glyphicon glyphicon-new-window" title="Blog"></cite>
     </label>
    </a> to view my pic.
</p>

I need to replace each anchor tag with its href, so that the string becomes:

<p>Hello there,</p>
<p>https://example.com/my-blog
    what's up.
    </p>
<p>
Click here https://example.com/my-blog to view my pic.
</p>

I have tried the code below, but it does not replace each a tag with its href:

$dom = new DomDocument();
$dom->loadHTML( $text );
$matches = array();
foreach ( $dom->getElementsByTagName('a') as $item ) {
   $matches[] = array (
      'a_tag' => $dom->saveHTML($item),
      'href' => $item->getAttribute('href'),
      'anchor_text' => $item->nodeValue
   );
}

foreach( $matches as $match )
{
  // Replace a tag by its href
  $text = str_replace( $match['a_tag'], $match['href'], $text );
}

return $text;

Does anyone know whether this is possible, and how to do it?

One option is a regex. Match the following pattern and replace each match with its first capture group:

<a.*?href="([^"]*)".*?>.*?<\/a>

preg_replace applies the pattern to every match in the string, replacing each anchor tag with the href URL captured from it.

$result = preg_replace('/<a.*?href="([^"]*)".*?>.*?<\/a>/s', '$1', $string);

Note the s flag at the end of /pattern/s. It runs the match in DOTALL mode, meaning the dot also matches newlines (i.e. a match can span several lines, which is what you want here).
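
As a minimal runnable sketch of the replacement above: the function name replaceAnchorsWithHref is hypothetical, and the pattern is a slightly stricter variant of the one shown (an assumption on my part, not the original answer's pattern). It adds \b so that tags such as <abbr> are not matched, and uses [^>]* so the search for href cannot run past the end of the opening tag. It still assumes double-quoted href values and no ">" inside attribute values.

```php
<?php
// Hypothetical helper: replaces every <a ... href="...">...</a> with its href value.
function replaceAnchorsWithHref(string $html): string
{
    // <a\b      : the tag name must be exactly "a" (not <abbr>, <article>, ...)
    // [^>]*\s   : any other attributes, staying inside the opening tag
    // href="(…)": capture the double-quoted URL
    // .*?<\/a>  : lazily consume the anchor body; the s flag lets . match newlines
    $pattern = '/<a\b[^>]*\shref="([^"]*)"[^>]*>.*?<\/a>/s';
    $result = preg_replace($pattern, '$1', $html);

    // preg_replace returns null on error; fall back to the unchanged input.
    return $result ?? $html;
}

$text = <<<'HTML'
<p>Click here <a class="content-title-link" title="Blog" href="https://example.com/my-blog" target="_blank">
  <img src="https://example.com/xyz.jpg" alt="Main Profile Picture">
  My HTML Link
</a> to view my pic.</p>
HTML;

echo replaceAnchorsWithHref($text), PHP_EOL;
// prints: <p>Click here https://example.com/my-blog to view my pic.</p>
```

Regex-based HTML handling is fragile in general, so this is best treated as a sketch for markup whose shape you control.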

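Alternatively, search for this regex. Note that it matches only the opening <a ...> tag, so the anchor's inner content and the closing </a> are left in place: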

<a.*?href="([^"]*)"[^>]*>

and replace it with

$1
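
The DomDocument attempt from the question can probably be made to work as well. A likely reason it fails is that saveHTML re-serializes the parsed tree, and that output need not match the original source text exactly, so str_replace finds nothing to replace. The sketch below edits the tree directly instead: each a element is swapped for a text node holding its href, and the body is then serialized. The function name replaceAnchorsViaDom is hypothetical, and the code assumes the input is a UTF-8 HTML fragment.

```php
<?php
// Hypothetical helper: swaps each <a> element for a text node containing its href.
function replaceAnchorsViaDom(string $html): string
{
    $dom = new DOMDocument();

    // Collect parser warnings internally instead of emitting them for imperfect markup.
    $previous = libxml_use_internal_errors(true);
    // The XML encoding hint is a common way to make libxml read the input as UTF-8.
    $dom->loadHTML('<?xml encoding="utf-8"?>' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // getElementsByTagName returns a live list, so copy it before changing the tree.
    $anchors = iterator_to_array($dom->getElementsByTagName('a'));
    foreach ($anchors as $anchor) {
        $replacement = $dom->createTextNode($anchor->getAttribute('href'));
        $anchor->parentNode->replaceChild($replacement, $anchor);
    }

    // loadHTML wraps the fragment in <html><body>; serialize only the body's children.
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $html;
    }
    $output = '';
    foreach ($body->childNodes as $child) {
        $output .= $dom->saveHTML($child);
    }
    return $output;
}

$sample = '<p>Click here <a href="https://example.com/my-blog"><img src="x.jpg"> My HTML Link</a> to view my pic.</p>';
echo replaceAnchorsViaDom($sample), PHP_EOL;
```

Because the result is a re-serialization of the parsed document, whitespace and entity encoding may differ from the input even outside the anchors.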
