
TypeError: Cannot read property 'match' of undefined in Node.js and puppeteer

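I'm trying to filter an array that contains a bunch of URLs. I need to return only the URLs that contain the word "media-release". Right now the script just throws an error. I've tried deleting my package-lock.json, but it still doesn't work.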

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://www.cbp.gov/newsroom/media-releases/all');
  const data = await page.evaluate(() => {
    const nodeList = document.getElementsByClassName('survey-processed');
    const urls = [];

    for (i=0; i<nodeList.length; i++) {
      urls.push(document.getElementsByClassName('survey-processed')[i].href);
    }
    const regex = new RegExp('/media-release\\b', 'g');
    const links = urls.filter(element => element.match(regex));
    return links;
  });
  console.log(data);
  await browser.close();
})();

Error:

(node:10208) UnhandledPromiseRejectionWarning: Error: Evaluation failed: TypeError: Cannot read property 'match' of undefined
    at __puppeteer_evaluation_script__:11:50
    at Array.filter (<anonymous>)
    at __puppeteer_evaluation_script__:11:24
    at ExecutionContext._evaluateInternal (C:\\Users\\Documents\\\\node_modules\\puppeteer\\lib\\cjs\\puppeteer\\common\\ExecutionContext.js:217:19)
    at processTicksAndRejections (internal/process/task_queues.js:86:5)

After inspecting the page, I found that some elements with the class survey-processed are not a elements. There are two forms: form#search-block-form.survey-processed and form#views-exposed-form-newsroom-page.survey-processed.

form elements have no href attribute, so reading href on them gives undefined, and calling match on undefined is what causes the error.
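
The failure can be reproduced outside the browser. Here is a minimal sketch that uses plain objects as stand-ins for DOM elements (the fakeElements array and its URLs are made up for illustration); the object without an href plays the role of a form element.

```javascript
// Plain objects standing in for DOM elements (hypothetical data, not taken from the real page).
const fakeElements = [
  { tagName: "A", href: "https://example.com/newsroom/media-release/story-1" },
  { tagName: "FORM" }, // a form has no href, so reading it gives undefined
];

// Same idea as the loop in the question: collect every href, defined or not.
const urls = fakeElements.map((element) => element.href);

function filterWithMatch(list) {
  // Same filter as in the question: throws as soon as an entry is undefined.
  return list.filter((url) => url.match(/\/media-release\b/));
}

let errorName = null;
try {
  filterWithMatch(urls);
} catch (err) {
  errorName = err.name;
}

console.log(urls[1]);   // undefined
console.log(errorName); // TypeError
```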

To fix this, you have to be more specific about which elements you select. Use querySelectorAll with the selector "a.survey-processed", like so:

const data = await page.evaluate(() => {
    const nodeList = document.querySelectorAll("a.survey-processed");  // get only <a> elements that have the classname 'survey-processed'
    const urls = [];

    for (let i = 0; i < nodeList.length; i++) {                        // for each one of those
        if(/\/media-release\b/.test(nodeList[i].href)) {               // if the 'href' attribute matches the regex (use 'test' here rather than 'match')
            urls.push(nodeList[i].href);                               // push the 'href' attribute to the array
        }
    }

    return urls;
});
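
A short note on the regex itself, as a sketch with made-up URLs rather than values from the real page. The \b means the pattern does not match "/media-releases". Separately, if a regex carrying the g flag (like the one built in the question) were reused with test, it would keep a lastIndex between calls, so the literal without g is the safer choice for test.

```javascript
const pattern = /\/media-release\b/;        // no 'g' flag: stateless between calls
const globalPattern = /\/media-release\b/g; // 'g' flag: remembers lastIndex

// Hypothetical URLs for illustration only.
const article = "https://example.com/newsroom/media-release/story-1";
const listing = "https://example.com/newsroom/media-releases/all";

const articleMatches = pattern.test(article); // true: '/' after "release" is a word boundary
const listingMatches = pattern.test(listing); // false: "releases" continues the word

// With 'g', the second call resumes after the previous match and finds nothing.
const firstCall = globalPattern.test(article);  // true
const secondCall = globalPattern.test(article); // false

console.log(articleMatches, listingMatches, firstCall, secondCall);
```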

Also, if you are only looking for URLs that contain the phrase "/media-release", you can shorten the code further with the CSS attribute-contains selector [attribute*=value], like so:

const data = await page.evaluate(() => {
    const nodeList = document.querySelectorAll('a.survey-processed[href*="/media-release"]');  // get only <a> elements that have the classname 'survey-processed' and whose 'href' attribute contains the phrase "/media-release"
    return Array.from(nodeList).map(element => element.href);  // convert the NodeList into an array and use 'map' to get the 'href' attributes
});
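
One thing to keep in mind: [href*="/media-release"] is a plain substring test, so unlike the \b regex it also accepts "/media-releases". The sketch below emulates the selector roughly with String.prototype.includes on made-up href values; the real selector compares the attribute text as written in the HTML.

```javascript
// Hypothetical href attribute values, as they might appear in the markup.
const hrefs = [
  "/newsroom/media-release/story-1",
  "/newsroom/media-releases/all",
  "/newsroom/speeches/story-2",
];

// Rough stand-in for the CSS selector [href*="/media-release"]: substring match.
const bySubstring = hrefs.filter((href) => href.includes("/media-release"));

// The regex from the earlier snippet, with a word boundary after "release".
const byRegex = hrefs.filter((href) => /\/media-release\b/.test(href));

console.log(bySubstring.length); // 2 (the first two entries)
console.log(byRegex.length);     // 1 (only the first entry)
```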

You can actually return the filtered result directly and use .includes() to check whether each URL contains "media-release":

const puppeteer = require("puppeteer");

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto("https://www.cbp.gov/newsroom/media-releases/all");
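  const data = await page.evaluate(() => {
    // DOM elements do not survive serialization out of page.evaluate,
    // so map them to their href strings before filtering.
    return [...document.querySelectorAll(".survey-processed")]
      .map(({ href }) => href)
      .filter((href) => href?.includes("media-release"));
  });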
  console.log(data);
  await browser.close();
})();
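
The ?. in href?.includes(...) is what keeps this version from throwing on elements without an href: the expression short-circuits to undefined, which filter treats as false. A small sketch with plain objects in place of DOM elements (the candidates array is hypothetical data):

```javascript
// Stand-ins for the elements matched by ".survey-processed" (made-up values).
const candidates = [
  { href: "https://example.com/newsroom/media-release/story-1" },
  {}, // e.g. a form: href is undefined
  { href: "https://example.com/newsroom/speeches/story-2" },
];

// undefined?.includes(...) evaluates to undefined instead of throwing.
const guarded = candidates[1].href?.includes("media-release");

// Keep only entries whose href contains the phrase, then extract the strings.
const kept = candidates
  .filter(({ href }) => href?.includes("media-release"))
  .map(({ href }) => href);

console.log(guarded);     // undefined
console.log(kept.length); // 1
```

If you try this in Node.js rather than in the browser, optional chaining needs Node 14 or later.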

