I've been looking into Puppeteer and am able to get the innerHTML of a page. However, the result can also contain <script> elements, which I would like removed.
How do I achieve this?
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://www.example.com');
  console.log(await page.evaluate(() => document.body.innerHTML));
  await browser.close();
})();
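Something like this? It removes every <script> element from the body inside page.evaluate, then returns the remaining innerHTML: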
const innerHTML = await page.evaluate(() => {
  // Note: this removes the script elements from the live page.
  for (const script of document.body.querySelectorAll('script')) {
    script.remove();
  }
  return document.body.innerHTML;
});
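If the page should stay intact after the HTML is read, one possible variation is to strip the scripts from a clone of the body instead of the live DOM. The sketch below is only an illustration under that assumption: the names bodyHtmlWithoutScripts and getBodyHtmlWithoutScripts are hypothetical helpers, not part of Puppeteer, and `page` is assumed to be a Puppeteer Page that has already navigated.

```javascript
// Runs inside the browser page. It works on a deep clone of <body>,
// so the live DOM is left untouched.
function bodyHtmlWithoutScripts() {
  const clone = document.body.cloneNode(true);
  for (const script of clone.querySelectorAll('script')) {
    script.remove();
  }
  return clone.innerHTML;
}

// Hypothetical helper (not a Puppeteer API): `page` is assumed to be a
// Puppeteer Page that has already loaded the target URL.
async function getBodyHtmlWithoutScripts(page) {
  // page.evaluate serializes the function and runs it in the page context.
  return page.evaluate(bodyHtmlWithoutScripts);
}
```

After `await page.goto(...)`, this could be called as `const html = await getBodyHtmlWithoutScripts(page);`.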