I have an HTML email message that I parse using Jsoup:
Jsoup.parse(bizmsg.getMessageBody()).text()
But this does not remove the script tags:
<script>
document.write("Bazinga!")
</script>
I have been using a regex like this:
String(v).replace(/(?:<script.*?>)((\n|\r|.)*?)(?:<\/script>)/ig, "");
to successfully remove scripts. But I came across this question: JSoup to parse <script> tag
How do I use Rhino to parse scripts? A code sample would be very helpful, thanks.
You don't need Rhino to remove <script> tags. Use a simple CSS selector in Jsoup and remove the matched nodes. Here is a minimal example on www.google.com:
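import java.io.IOException;
import java.net.URL;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class RemoveScripts {
    public static void main(String[] args) throws IOException {
        // Fetch and parse the page with a 5-second timeout.
        Document doc = Jsoup.parse(new URL("http://www.google.com"), 5000);
        // Select every <script> element and detach it from the document.
        doc.select("script").remove();
        System.out.println(doc);
    }
}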
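For the email case in the question, the same idea should work on an HTML string instead of a URL. This is a minimal sketch, assuming the message body is already available as a String; the class name ScriptStripper and the method toPlainText are hypothetical names used only for illustration, and the sample HTML is made up.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class ScriptStripper {
    // Hypothetical helper: parse an HTML string, drop all <script>
    // elements, and return the remaining text.
    static String toPlainText(String html) {
        Document doc = Jsoup.parse(html);
        // Same selector-based removal as in the answer above.
        doc.select("script").remove();
        return doc.text();
    }

    public static void main(String[] args) {
        // Made-up sample standing in for bizmsg.getMessageBody().
        String html = "<p>Hello</p>"
                + "<script>document.write(\"Bazinga!\")</script>"
                + "<p>World</p>";
        System.out.println(toPlainText(html));
    }
}
```

In the question's code this would replace the direct call, e.g. ScriptStripper.toPlainText(bizmsg.getMessageBody()).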