
RegEx - Remove HTML hyperlinks based on the link text

I have some text that contains HTML hyperlinks. I want to remove the hyperlinks, but only specific ones.

For example, I start with this:

This is text <a href="link/to/somewhere">Link to Remove</a> and more text with another link <a href="/link/to/somewhere/else">Keep this link</a>

I want to have:

This is text and more text with another link <a href="/link/to/somewhere/else">Keep this link</a> 

I have this regular expression:

<a\s[^>]*>.*?</a>

... but it matches ALL of the links.

What do I need to add to that expression to match only the links with the link-text 'Remove' (for example) in it?

Thanks in advance.

You'll probably get a lot of feedback not to use regular expressions on HTML... but if you do decide to use one, try this:

 <a\s[^>]*>.*?Remove.*?</a>

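This matches any link where "Remove" appears somewhere in the link text. Be aware that .*? can run past a closing </a>, so if a link without "Remove" comes before one that has it on the same line, the match starts at the earlier link and swallows both.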
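
As a rough illustration of the pattern above, here is a minimal Python sketch. The names SIMPLE, TEMPERED and remove_links are made up for this example, and the TEMPERED variant is an assumption of mine rather than part of the answer: it uses a negative lookahead so the link text cannot extend past a closing </a>.

```python
import re

# The pattern from the answer: any <a ...>...</a> whose text contains "Remove".
SIMPLE = re.compile(r"<a\s[^>]*>.*?Remove.*?</a>")

# Hypothetical variant (not from the answer): each character of the link text
# is only accepted if it does not start a closing </a>, so a match cannot
# begin in one link and end in another.
TEMPERED = re.compile(r"<a\s[^>]*>(?:(?!</a>).)*?Remove(?:(?!</a>).)*?</a>")


def remove_links(html: str, pattern: re.Pattern = TEMPERED) -> str:
    """Delete every link matched by `pattern`, including its link text."""
    return pattern.sub("", html)


text = ('This is text <a href="link/to/somewhere">Link to Remove</a> and more '
        'text with another link <a href="/link/to/somewhere/else">Keep this link</a>')

# Same links in the opposite order, to show where the two patterns differ.
reversed_order = ('<a href="/keep">Keep this link</a> and '
                  '<a href="/x">Link to Remove</a>')

print(remove_links(text))
print(repr(SIMPLE.sub("", reversed_order)))
print(repr(remove_links(reversed_order)))
```

On the sample text both patterns leave only the "Keep this link" anchor. On reversed_order the simple pattern deletes everything, while the tempered one keeps the first link. Note that this removes the link text along with the tags, and that neither pattern handles link text spanning several lines unless re.DOTALL is added.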

# Tailored to the sample text: drops the first link without checking its link text.
print "$1$2" if $str =~ /(.*)<a.*<\/a>([a-z ]+ <a.*<\/a>)/;

(.*?)<a.*[Rr]emove.*?a>(.*)

Then rebuild the string from the two captured groups: $1$2
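
For readers who want to try the capture-group approach outside Perl, here is a small Python sketch. It assumes Python's re module, where the replacement \1\2 plays the role of $1$2; PATTERN and drop_link are names invented for this example.

```python
import re

# Pattern from the answer: group 1 is the text before the link,
# group 2 is the text after it.
PATTERN = re.compile(r"(.*?)<a.*[Rr]emove.*?a>(.*)")


def drop_link(html: str) -> str:
    """Rebuild the string from the two captured groups, omitting the link."""
    return PATTERN.sub(r"\1\2", html)


text = ('This is text <a href="link/to/somewhere">Link to Remove</a> and more '
        'text with another link <a href="/link/to/somewhere/else">Keep this link</a>')

print(drop_link(text))
print(drop_link("no links here"))
```

Because the greedy .* after <a backtracks from the end of the line, a line with several "Remove" links loses everything from the first <a up to the end of the last such link in one go, so this form is best suited to lines with a single link to drop.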
