
Remove contents of href attribute with RegEx

For example, I have this HTML snippet:

<a href="/sites/all/themes/code.php">some text</a>

The question is: how can I cut the text /sites/all/themes/code.php out of the href with preg_replace()? Which pattern could I use?

I would strongly recommend against using regular expressions to parse any SGML derivative.

For HTML, use a DOM parser. In PHP specifically, there is DOMDocument.
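
As a rough illustration of the DOM approach, here is a minimal sketch that empties the href with DOMDocument. The variable names and the choice to set the attribute to an empty string (rather than remove it) are my assumptions; adapt them to what you actually need.

```php
<?php
// Sketch: empty the href attribute with DOMDocument instead of a regex.
$html = '<a href="/sites/all/themes/code.php">some text</a>';

$doc = new DOMDocument();
// These libxml flags stop loadHTML() from adding a doctype and <html><body> wrappers.
$doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

foreach ($doc->getElementsByTagName('a') as $link) {
    // Assumption: "cut the text" means keep the attribute but empty it.
    // Use $link->removeAttribute('href') to drop the attribute entirely.
    $link->setAttribute('href', '');
}

$result = trim($doc->saveHTML());
echo $result, PHP_EOL;
```

Unlike a pattern, this keeps working if the tag has other attributes, single-quoted values, or extra whitespace.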

pattern:

(<a .*?href=")([^"]*)

replacement: $1
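
Plugged into preg_replace(), that pattern and replacement might look like the sketch below. The ~ delimiter and the variable names are my own choices, not part of the answer above.

```php
<?php
// Sketch: keep group 1 (everything up to the opening quote) and drop group 2 (the URL).
$html = '<a href="/sites/all/themes/code.php">some text</a>';

// "~" is used as the delimiter so slashes need no escaping.
$pattern = '~(<a .*?href=")([^"]*)~';

$emptied = preg_replace($pattern, '$1', $html);
echo $emptied, PHP_EOL; // <a href="">some text</a>
```

Note that this assumes the attribute value is double-quoted; a single-quoted href would not match.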

You don't have to capture and re-insert anything. This pattern matches only the href value itself:

(?<=<a href=")[^"]*(?=">)

It gives you the text you want directly.

Test with grep:

kent$  echo '<a href="/sites/all/themes/code.php">some text</a>'|grep -oP '(?<=<a href=")[^"]*(?=">)'
/sites/all/themes/code.php
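
The same lookaround pattern should also work in PHP, since preg_* functions use PCRE. A minimal sketch, with variable names of my own choosing: preg_match() extracts the value, and preg_replace() with an empty replacement removes it.

```php
<?php
// Sketch: the lookbehind/lookahead match only the href value, not the surrounding markup.
$html = '<a href="/sites/all/themes/code.php">some text</a>';
$pattern = '~(?<=<a href=")[^"]*(?=">)~';

// Extract the value.
$value = null;
if (preg_match($pattern, $html, $m)) {
    $value = $m[0];
}
echo $value, PHP_EOL; // /sites/all/themes/code.php

// Remove the value: only the matched text is replaced.
$emptied = preg_replace($pattern, '', $html);
echo $emptied, PHP_EOL; // <a href="">some text</a>
```

Because the lookbehind is the literal text <a href=", this only matches when href is the first and only attribute of the tag.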
