
Can't Scrape Input Value from dolartoday.com with Puppeteer

I want to scrape the value of the #result element using the following:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://dolartoday.com');
  await console.log(page.evaluate(() => document.getElementById('result')));

  await browser.close();
})();

But it keeps logging the following error:

(node:74908) UnhandledPromiseRejectionWarning: Error: Navigation Timeout Exceeded: 30000ms exceeded
at Promise.then (/Volumes/DATOS/Dropbox/workspaces/dolar-today/server/node_modules/puppeteer/lib/NavigatorWatcher.js:71:21)
at <anonymous>
(node:74908) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:74908) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

Any ideas on how to fix this?

First, you are applying the await operator to console.log() (a synchronous function) instead of page.evaluate() (an asynchronous function).
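The difference can be seen without a browser. The following is a minimal sketch in which `fakeEvaluate` is a hypothetical stand-in for page.evaluate(); the only property it shares with the real method is that it returns a promise:

```javascript
// Hypothetical stand-in for page.evaluate(): it only returns a promise.
const fakeEvaluate = () => Promise.resolve('sample value');

async function compare() {
  // Mirrors the question: the promise itself is passed along, so
  // console.log() would print a Promise object rather than the value.
  const notAwaited = fakeEvaluate();

  // Mirrors the fix: await the asynchronous call, then use its result.
  const awaited = await fakeEvaluate();

  return { notAwaited, awaited };
}

compare().then(({ notAwaited, awaited }) => {
  console.log(notAwaited instanceof Promise); // true
  console.log(awaited); // sample value
});
```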

You are also trying to return a page DOM element to the Node.js environment, which will not work because page.evaluate() expects a serializable return value.

If you want to return the value of the #result element on the page, rewrite the logic as follows:

console.log(await page.evaluate(() => document.getElementById('result').value));
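If the element might be absent, the one-liner above throws inside the page. A slightly more defensive variant could look like the sketch below. `readInputValue` is a hypothetical helper name, and `page` is assumed to be a Puppeteer Page; the element id is passed as an extra argument to page.evaluate() so that only serializable values cross the boundary in either direction:

```javascript
// Hypothetical helper: reads the `value` of the input with the given id,
// or returns null if no such element exists.
// `page` is assumed to be a Puppeteer Page.
async function readInputValue(page, elementId) {
  // The callback runs inside the page. Only its return value (a string or
  // null here, both serializable) is sent back to Node.js.
  return page.evaluate((id) => {
    const element = document.getElementById(id);
    return element ? element.value : null;
  }, elementId);
}
```

It would be called as `console.log(await readInputValue(page, 'result'));`.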

In addition, the navigation took longer than 30000 ms (the default maximum). You can extend the maximum navigation time with the timeout option of page.goto():

await page.goto('https://dolartoday.com', {
  timeout: 60000,
});
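A longer timeout can still be exceeded, and the warning in the question came from a rejection that nothing caught. One way to handle that is sketched below. `tryGoto` is a hypothetical helper name, `page` is assumed to be a Puppeteer Page, and the `waitUntil: 'domcontentloaded'` setting is an assumption that only helps if #result is already filled in before the full load event fires:

```javascript
// Hypothetical helper: navigates and reports failure instead of letting the
// rejection go unhandled. `page` is assumed to be a Puppeteer Page.
async function tryGoto(page, url, timeoutMs = 60000) {
  try {
    await page.goto(url, {
      timeout: timeoutMs,
      // Assumption: resolving once the DOM is parsed is enough for this page.
      // Remove this option to wait for the full `load` event instead.
      waitUntil: 'domcontentloaded',
    });
    return true;
  } catch (error) {
    console.error(`Navigation to ${url} failed: ${error.message}`);
    return false;
  }
}
```

The caller can then decide whether to retry or give up, for example `if (!(await tryGoto(page, 'https://dolartoday.com'))) { /* skip scraping */ }`.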

You can also use page.setRequestInterception() and page.on('request') to stop unnecessary resources from being loaded into the page. This will make the page load faster:

await page.setRequestInterception(true);

page.on('request', request => {
  if (['image', 'stylesheet', 'font'].indexOf(request.resourceType()) !== -1) {
    request.abort();
  } else {
    request.continue();
  }
});
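The same logic can be wrapped in a reusable function so the list of blocked types is easy to adjust. This is only a sketch: `blockResourceTypes` and `BLOCKED_TYPES` are hypothetical names, and `page` is assumed to be a Puppeteer Page:

```javascript
// Resource types to skip; extend or trim this set as needed.
const BLOCKED_TYPES = new Set(['image', 'stylesheet', 'font']);

// Hypothetical helper: enables interception on a Puppeteer Page and aborts
// every request whose resource type is in `blockedTypes`.
async function blockResourceTypes(page, blockedTypes = BLOCKED_TYPES) {
  await page.setRequestInterception(true);
  page.on('request', (request) => {
    if (blockedTypes.has(request.resourceType())) {
      request.abort();
    } else {
      request.continue();
    }
  });
}
```

It should be called before page.goto(), for example `await blockResourceTypes(page);`, so that the handler is in place when the first requests are issued.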

Your final program should look like this:

'use strict';

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.setRequestInterception(true);

  page.on('request', request => {
    if (['image', 'stylesheet', 'font'].indexOf(request.resourceType()) !== -1) {
      request.abort();
    } else {
      request.continue();
    }
  });

  await page.goto('https://dolartoday.com', {
    timeout: 60000,
  });

  console.log(await page.evaluate(() => document.getElementById('result').value));

  await browser.close();
})();
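If navigation or evaluation throws in the program above, browser.close() is never reached and the rejection is again unhandled. A possible way to guard against that is a try/finally wrapper, sketched here. `withBrowser` is a hypothetical helper name, and `launch` is assumed to be any function that resolves to an object with a close() method, such as `() => puppeteer.launch()`:

```javascript
// Hypothetical helper: `launch` resolves to a browser, `task` receives it.
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    // Runs whether `task` resolved or threw, so the browser process
    // is not left running after an error.
    await browser.close();
  }
}
```

The body of the final program would then go inside the callback, for example `withBrowser(() => puppeteer.launch(), async (browser) => { /* scraping steps */ }).catch(console.error);`, with the trailing catch() handling whatever error the steps raise.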

