
How can I get rid of all JavaScript from an HTML page?

I can use a regex to strip the <script> tags from the HTML, like this:

$html = preg_replace('#<script(.*?)>(.*?)</script>#is','', $html);

That works fine, but what about inline JavaScript? I figured out I could remove it this way:

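$dom = new DOMDocument();
@$dom->loadHTML($html); // @ silences warnings from malformed markup
$nodes = $dom->getElementsByTagName('*');
foreach ($nodes as $node) {
    // Remove the inline onload handler if the element has one
    if ($node->hasAttribute('onload')) {
        $node->removeAttribute('onload');
    }
}
$html = $dom->saveHTML();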

The problem is that I'd have to find every event attribute and keep adding if statements. I've also seen libraries, but I want to keep things small. Is there a quicker way? Failing that, is there a good list of inline event attributes I could work from?

I would say don't reinvent the wheel: use a library such as http://htmlpurifier.org/ for this task.
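
If a library really is not an option, the attribute loop from the question can be generalized so that no list is needed: every inline event handler attribute name starts with "on", so the prefix can be matched instead. The following is only a rough sketch under that assumption. The function name strip_javascript is made up for illustration, the javascript: URL check is deliberately naive, and this is not a substitute for a real sanitizer when the HTML is untrusted.

```php
<?php
// Hypothetical helper (not from the original post): removes <script> elements,
// on* event-handler attributes, and obvious javascript: URLs via PHP's DOM extension.
function strip_javascript(string $html): string
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The node list is live, so copy it to an array before removing nodes.
    foreach (iterator_to_array($dom->getElementsByTagName('script'), false) as $script) {
        $script->parentNode->removeChild($script);
    }

    // Attributes that can carry a URL; this short list is an assumption, not exhaustive.
    $urlAttributes = ['href', 'src', 'action', 'formaction'];

    foreach ($dom->getElementsByTagName('*') as $node) {
        // Copy the attribute map as well, since it shrinks as attributes are removed.
        foreach (iterator_to_array($node->attributes, false) as $attr) {
            $name  = strtolower($attr->nodeName);
            $value = strtolower(trim($attr->nodeValue));

            $isHandler = strpos($name, 'on') === 0;
            $isJsUrl   = in_array($name, $urlAttributes, true)
                && strpos($value, 'javascript:') === 0;

            if ($isHandler || $isJsUrl) {
                $node->removeAttribute($attr->nodeName);
            }
        }
    }

    // saveHTML() returns a full document, including doctype, <html>, and <body>.
    return $dom->saveHTML();
}
```

Calling strip_javascript($html) returns the cleaned markup as a complete document. Matching on the "on" prefix also removes any non-event attribute that happens to start with "on", which is usually acceptable for this purpose.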

