How to scrape <html>...</html> inside another <html>...</html> with Puppeteer

Original post 2019-12-08 18:03:24 5 1 javascript/ html/ node.js/ puppeteer

OK, so the page I'm trying to scrape with Node.js and Puppeteer is structured like this:

    <html lang = "en">
    ....
       <html xmlns="https://www.w3.org/1999/xhtml" lang="en">
            <a href = "link I'm trying to go to">Go to link</a>
       </html>
    </html>

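I tried clicking the link via both a CSS selector and an XPath. Neither worked, and I triple-checked that both were correct. I suspect it has something to do with the nested html, but I don't know how to handle it. Can anyone help?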

1 answer

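Other comments pointed out that content inside an iframe cannot be accessed from the parent document. I checked the code again and found that it is actually structured like this: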

    <html lang = "en">
    ....
       <iframe src = "url">
           <html xmlns="https://www.w3.org/1999/xhtml" lang="en">
               <a href = "link I'm trying to go to">Go to link</a>
           </html>
       </iframe>
    </html>

So all I had to do was call page.goto(url) with the iframe's src URL, and then I could scrape it normally. Thanks, everyone!
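
For reference, here is a minimal sketch of that fix, not the poster's actual code. The helper names (openIframeDirectly, clickInsideIframe, main), the default 'iframe' selector, and the PARENT_URL environment variable are all assumptions made for illustration. The first helper reads the iframe's src and loads it as its own page, as described above. The second shows an alternative that may also work: staying on the parent page and using Puppeteer's elementHandle.contentFrame() to run selectors inside the frame.

```javascript
// Approach from the answer: read the iframe's src, then load that URL as its own page.
async function openIframeDirectly(page, parentUrl, iframeSelector = 'iframe') {
  await page.goto(parentUrl);
  // el.src is the absolute URL, resolved by the browser against the parent page.
  const frameUrl = await page.$eval(iframeSelector, (el) => el.src);
  await page.goto(frameUrl);
  return frameUrl;
}

// Alternative: stay on the parent page and work inside the iframe's own Frame.
async function clickInsideIframe(page, iframeSelector, linkSelector) {
  const iframeHandle = await page.waitForSelector(iframeSelector);
  const frame = await iframeHandle.contentFrame();
  if (!frame) throw new Error(`No frame behind ${iframeSelector}`);
  await frame.waitForSelector(linkSelector);
  await frame.click(linkSelector);
}

// Demo entry point. PARENT_URL is a hypothetical environment variable
// holding the address of the outer page; nothing runs if it is unset.
async function main(parentUrl) {
  const puppeteer = require('puppeteer'); // assumes `npm install puppeteer`
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    const frameUrl = await openIframeDirectly(page, parentUrl);
    console.log('Now scraping', frameUrl);
    // The inner document is now the top-level page, so ordinary selectors match.
    const links = await page.$$eval('a', (anchors) => anchors.map((a) => a.href));
    console.log(links);
  } finally {
    await browser.close();
  }
}

if (process.env.PARENT_URL) {
  main(process.env.PARENT_URL).catch((err) => {
    console.error(err);
    process.exitCode = 1;
  });
}
```

Saved as, say, scrape.js, this could be run with PARENT_URL=https://example.com/outer node scrape.js. Navigating straight to the src is the simpler route when only the inner document matters. The contentFrame() route is worth considering when the inner page depends on being embedded, for example if it checks its parent or relies on cookies set by the outer page.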
