
What's the quickest way to strip a specific tag from a string?

I have HTML in a string. I want to strip the <head> part of it. I use:

$html = preg_replace("/<head[^>]*?>.*?<\/head>/s", "", $html);

But in terms of performance, this can be a bit heavy. Is there a better alternative?

I know that I can use strip_tags() and list all allowed tags in the second argument, but there are too many to list.

Your current regex takes 6720 steps when tested against part of this SO page.

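This regex <head[^>]*?>(?:[^<]*<??)*</head> takes only 376 steps on the same input, and it should match the same text. That is roughly 18x fewer steps (6720 / 376 ≈ 17.9), so it should be noticeably faster than your regex. In PHP, either escape the slash in </head> or use a delimiter other than /.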

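It works by greedily consuming every character that is not < with [^<]*, so each run stops at the next <.

Then, because <?? is lazy, the engine first skips it and immediately tries to match </head>. If that fails, it backtracks, lets <?? consume the single <, and the group repeats from there.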
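For reference, here is a minimal sketch of how the pattern above could be dropped into the original preg_replace() call. The function name strip_head() and the sample markup are made up for illustration. The sketch uses ~ as the delimiter so the slash in </head> needs no escaping, and it drops the /s flag because the pattern contains no dot ([^<] already matches newlines).

```php
<?php
// Hypothetical helper name; not part of the original question or answer.
function strip_head(string $html): string
{
    // '~' as delimiter, so the '/' in </head> does not need escaping.
    $pattern = '~<head[^>]*?>(?:[^<]*<??)*</head>~';

    $result = preg_replace($pattern, '', $html);

    // preg_replace() returns null on error (for example, when a PCRE
    // limit is hit); in that case fall back to the unmodified input.
    return $result ?? $html;
}

// Made-up sample input for illustration.
$html = "<html><head>\n<title>Demo</title>\n</head><body><p>Hi</p></body></html>";
echo strip_head($html), "\n";
// prints: <html><body><p>Hi</p></body></html>
```

Like the original regex, this is case-sensitive, and <head[^>]*?> would also match a tag such as <header>, so it is worth checking against your real input before relying on it.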

