
How to reject specific HTML tags using a CSS or XPath selector

I want to remove the style and script tags, along with their contents, using a CSS or XPath selector.

This is an example HTML document:

<html>
  <head>
    <title>test</title>
    <style>
      // style
    </style>
    <script>
      /* some script */
    </script>
  </head>
  <body>
    <p>text</p>
    <script>
      /* some script */
    </script>
    <div>foo</div>
  </body>
</html>

I want to get HTML like this:

<html>
  <head>
    <title>test</title>
  </head>
  <body>
    <p>text</p>
    <div>foo</div>
  </body>
</html>

I thought I could get HTML that doesn't include the <script> tags with this code, but somehow the code only duplicates the HTML.

doc = Nokogiri::HTML(open("foo.text"))
doc.css(":not(script)").to_html

How can I get the behavior I want?

Try these lines:

doc.search('.//style').remove
doc.search('.//script').remove

Even simpler:

doc.search('style,script').remove
