Can't Scrape Input Value from dolartoday.com with Puppeteer
I want to scrape the value of the #result element using the following:
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://dolartoday.com');
  await console.log(page.evaluate(() => document.getElementById('result')));
  await browser.close();
})();
But it keeps logging the following error:
(node:74908) UnhandledPromiseRejectionWarning: Error: Navigation Timeout Exceeded: 30000ms exceeded
at Promise.then (/Volumes/DATOS/Dropbox/workspaces/dolar-today/server/node_modules/puppeteer/lib/NavigatorWatcher.js:71:21)
at <anonymous>
(node:74908) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:74908) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
Any ideas on how to fix this?
First, you are applying the await operator to console.log() (a synchronous function) instead of to page.evaluate() (an asynchronous function).
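To make the difference concrete, here is a minimal sketch in plain Node.js. fakeEvaluate is a hypothetical stand-in for page.evaluate(); it is not part of Puppeteer and only mimics the fact that an async function returns a Promise.

```javascript
// Hypothetical stand-in for page.evaluate(): any async function returns a Promise.
async function fakeEvaluate() {
  return '42';
}

async function demo() {
  // await applied to console.log(): the Promise object itself is logged, not its value.
  const notAwaited = fakeEvaluate();
  await console.log(notAwaited);

  // await applied to the async call: the resolved value is logged.
  const awaited = await fakeEvaluate();
  console.log(awaited);

  return { notAwaited, awaited };
}

demo();
```

Awaiting console.log() is harmless but does nothing useful, because console.log() returns undefined rather than a Promise.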
You are also trying to return a page DOM element to the Node.js environment, which will not work because page.evaluate() expects a serializable return value.
If you want to return the value of the #result element on the web page, rewrite your logic as follows:
console.log(await page.evaluate(() => document.getElementById('result').value));
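As an alternative sketch, the same read can be expressed with page.$eval(), which looks up the selector and passes the matching element to the callback. readInputValue is a hypothetical helper name, and the sketch assumes that page is a Puppeteer Page and that the element exists (page.$eval() rejects when nothing matches the selector).

```javascript
// Hypothetical helper: reads the `value` property of the first element
// matching `selector`. `page` is expected to be a Puppeteer Page.
async function readInputValue(page, selector) {
  // The callback runs in the browser; only the string it returns is
  // serialized and sent back to Node.js.
  return page.$eval(selector, element => element.value);
}

// Usage inside the async function from the question:
//   console.log(await readInputValue(page, '#result'));
```

Either form returns a plain string, which is serializable, instead of the DOM element itself.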
Additionally, the navigation took longer than 30000 ms (the default maximum). You can extend the maximum navigation time with the timeout option of page.goto():
await page.goto('https://dolartoday.com', {
  timeout: 60000,
});
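A longer timeout can still be exceeded, and an uncaught rejection is what produced the UnhandledPromiseRejectionWarning in the question. The sketch below wraps the navigation in try/catch. safeGoto is a hypothetical helper, and the waitUntil: 'domcontentloaded' option is an assumption that the page does not need every resource to finish loading; if #result is filled in by a late script, that option may fire too early and should be dropped.

```javascript
// Hypothetical helper: navigates and reports failure instead of rejecting.
async function safeGoto(page, url, timeoutMs = 60000) {
  try {
    await page.goto(url, {
      timeout: timeoutMs,
      // Assumption: the parsed DOM is enough, so do not wait for the load event.
      waitUntil: 'domcontentloaded',
    });
    return true;
  } catch (error) {
    // Navigation timeouts and network failures both end up here.
    console.error(`Navigation to ${url} failed: ${error.message}`);
    return false;
  }
}

// Usage inside the async function:
//   if (!(await safeGoto(page, 'https://dolartoday.com'))) { /* retry or exit */ }
```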
You can also use page.setRequestInterception() and page.on('request') to block unnecessary resources from loading on the page. This will make the page load faster:
await page.setRequestInterception(true);

page.on('request', request => {
  if (['image', 'stylesheet', 'font'].indexOf(request.resourceType()) !== -1) {
    request.abort();
  } else {
    request.continue();
  }
});
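The filtering decision can also be pulled out into a small named function, which makes the blocked list easier to adjust and to check in isolation. This is only a sketch: BLOCKED_RESOURCE_TYPES, shouldBlock, and handleRequest are hypothetical names, and the handler assumes a request object with the resourceType(), abort(), and continue() methods used above.

```javascript
// Resource types that are not needed to read a form value.
const BLOCKED_RESOURCE_TYPES = new Set(['image', 'stylesheet', 'font']);

// Returns true when a request of this type should be aborted.
function shouldBlock(resourceType) {
  return BLOCKED_RESOURCE_TYPES.has(resourceType);
}

// Handler intended for page.on('request', handleRequest).
function handleRequest(request) {
  if (shouldBlock(request.resourceType())) {
    request.abort();
  } else {
    request.continue();
  }
}

// Usage inside the async function:
//   await page.setRequestInterception(true);
//   page.on('request', handleRequest);
```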
Your final program should look like this:
'use strict';

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.setRequestInterception(true);

  page.on('request', request => {
    if (['image', 'stylesheet', 'font'].indexOf(request.resourceType()) !== -1) {
      request.abort();
    } else {
      request.continue();
    }
  });

  await page.goto('https://dolartoday.com', {
    timeout: 60000,
  });

  console.log(await page.evaluate(() => document.getElementById('result').value));

  await browser.close();
})();
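If navigation still fails, the program above rejects before browser.close() runs and leaves the browser process open. One possible variation, sketched here under the assumption that launcher is an object with a Puppeteer-like launch() method, closes the browser in a finally block and leaves error handling to the caller. scrapeResult is a hypothetical name.

```javascript
// Hypothetical wrapper: `launcher` is anything with a Puppeteer-like launch().
async function scrapeResult(launcher, url) {
  const browser = await launcher.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { timeout: 60000 });
    // Return the string value, not the DOM element.
    return await page.evaluate(() => document.getElementById('result').value);
  } finally {
    // Runs whether the steps above succeeded or threw.
    await browser.close();
  }
}

// Usage:
//   scrapeResult(require('puppeteer'), 'https://dolartoday.com')
//     .then(value => console.log(value))
//     .catch(error => { console.error(error); process.exitCode = 1; });
```

The explicit .catch() in the usage comment is what prevents the unhandled rejection warning shown in the question.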