
Does the Chrome console execute things the same way as Visual Studio Code?

I want to crawl the web for discount links, so I wrote the code below. When I test it in the Chrome console, data2 (the array of links I collect) is what I expect: it lists only the links that have a discount.

(Image: the code as written in the Chrome console)

But when I run it in the VS Code console, I get a different result: data2 now contains all of the links, not only the discounted ones.

(Image: the code as written in Visual Studio Code)

Can you tell me what the difference is? I suspect that my "if" condition does not behave the same way in VS Code, which leads to this.

Note: THE CODE I RUN ON CHROME IS COPIED FROM THE VSCODE, THE LOGIC IS EXACTLY THE SAME.

There are a few issues with your code.

First of all, the page you are attempting to visit loads its content dynamically, so you may want to wait for the elements matched by the selector string to be added to the DOM using page.waitForSelector():

await page.waitForSelector('#app > div > div.container > div.now-list-restautants > div > div > a > div.info-restaurant > p > i');
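If the selector never appears (for example, because the page layout changes), page.waitForSelector() rejects once its timeout expires. Here is a minimal sketch of handling that case. The helper name waitForDiscountIcons and the 10-second default are assumptions made for illustration, not part of the original code:

```javascript
// Hypothetical helper: resolves to true once the selector is in the DOM,
// or false if the wait fails (for example, on timeout).
// `page` is expected to be a Puppeteer Page.
async function waitForDiscountIcons(page, selector, timeoutMs = 10000) {
  try {
    // waitForSelector rejects if nothing matches within `timeout` ms.
    await page.waitForSelector(selector, { timeout: timeoutMs });
    return true;
  } catch (err) {
    console.warn(`Gave up waiting for "${selector}": ${err.message}`);
    return false;
  }
}
```

The caller can then decide whether to return an empty list or retry instead of letting the whole script crash.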

Additionally, inside page.evaluate(), the variable jq is not defined:

document.getElementsByTagName('head')[0].appendChild(jq); // jq is not defined
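This is also the main difference from the Chrome console: there, all of the code runs in the page, whereas with Puppeteer only the function given to page.evaluate() runs in the page, and it is sent there as source text without its surrounding Node.js scope. The sketch below simulates that in plain Node.js. It assumes jq was declared somewhere outside page.evaluate(), which the question does not confirm, and runLikePageEvaluate is a made-up stand-in, not a Puppeteer API:

```javascript
// Stand-in for page.evaluate(): rebuilds the function from its source text,
// so variables from the enclosing scope are no longer reachable.
function runLikePageEvaluate(fn, ...args) {
  const rebuilt = new Function(`return (${fn.toString()});`)();
  return rebuilt(...args);
}

function demo() {
  const jq = 'defined in Node.js only'; // assumption: jq lives outside evaluate()
  let closureError = null;
  try {
    runLikePageEvaluate(() => jq); // relies on the closure
  } catch (err) {
    closureError = err.name;
  }
  // Passing the value as an argument works, as it does with page.evaluate().
  const passed = runLikePageEvaluate((value) => value, jq);
  return { closureError, passed };
}

console.log(demo());
```

With real Puppeteer the equivalent fix is to pass serializable values as extra arguments, as in page.evaluate((value) => ..., value).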

Furthermore, you are unnecessarily awaiting result twice. You can just return result:

return result;

Finally, make sure you call browser.close() after you have finished scraping the links:

await browser.close();
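If scraping throws before that line is reached, the browser process is left running. One way to guard against this is a try/finally wrapper. This is a sketch only; withBrowser is a hypothetical helper name, and launch stands for any function that returns a browser, such as () => puppeteer.launch():

```javascript
// Hypothetical helper: runs task(browser) and always closes the browser,
// even when the task throws.
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    await browser.close();
  }
}
```

The scraping logic would then go in the task callback, and the error (if any) still propagates to the caller after the browser has been closed.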

The rest of the issues listed below are related to style and not necessarily functionality.

You should use let and const instead of var whenever possible:

const arr = ... // good
var arr = ...   // bad

If you are going to convert an iterable object to an array, use the spread syntax instead of Array.from():

[...elements]        // good
Array.from(elements) // bad
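One caveat worth knowing: spread only works on iterables, while Array.from() also accepts array-like objects that have a length but no iterator. The .children collection used in this answer is iterable in current browsers, so either form works there. A small illustration in plain Node.js, with a made-up arrayLike object:

```javascript
// An array-like object: has indices and a length, but no iterator.
const arrayLike = { 0: 'a', 1: 'b', length: 2 };

const viaFrom = Array.from(arrayLike); // works on array-likes

let spreadError = null;
try {
  [...arrayLike]; // spread requires an iterable
} catch (err) {
  spreadError = err.name;
}

// A Set is iterable, so spread works on it.
const viaSpread = [...new Set(['a', 'b'])];

console.log(viaFrom, spreadError, viaSpread);
```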

You can obtain the href of an element using its href property, so you do not need to use getAttribute('href'):

element.querySelector('.item-content').href                 // good
element.querySelector('.item-content').getAttribute('href') // bad
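The two are not always identical: the href property returns the fully resolved URL, while getAttribute('href') returns the attribute text exactly as written in the markup. The resolution is roughly what the URL constructor does, which can be tried in Node.js. Both values below are hypothetical and not taken from the actual page:

```javascript
// Mimics what the href property does: resolve the raw attribute value
// against the URL of the page. Both values are made up for illustration.
const pageUrl = 'https://www.now.vn/ho-chi-minh/food/';
const rawAttribute = '/ho-chi-minh/some-restaurant';

const resolved = new URL(rawAttribute, pageUrl).href;
console.log(resolved); // https://www.now.vn/ho-chi-minh/some-restaurant
```

For a scraper that will visit the links later, the resolved form is usually the more convenient one.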

Here is a full working example:

'use strict';

const puppeteer = require('puppeteer');

const scrape = async () => {
  const browser = await puppeteer.launch({headless: false});
  const page = await browser.newPage();

  await page.goto('https://www.now.vn/ho-chi-minh/food/danh-sach-dia-diem-phuc-vu-ca-phe,nuoc-ep-sinh-to,16,70-giao-tan-noi');

  await page.waitForSelector('#app > div > div.container > div.now-list-restautants > div > div > a > div.info-restaurant > p > i');

  const result = await page.evaluate(() => {
    // Everything in this callback runs inside the page, not in Node.js.
    const data2 = [];
    const elements = document.querySelector('#app > div > div.container > div.now-list-restautants > div').children;
    const arr = [...elements];

    // Skip the first child, as the original code does.
    arr.shift();

    arr.forEach((element, index) => {
      // index + 2 accounts for the removed first child and for nth-child being 1-based.
      const tagi = document.querySelector('#app > div > div.container > div.now-list-restautants > div > div:nth-child(' + (index + 2) + ') > a > div.info-restaurant > p > i');

      // Keep only the entries that contain the <i> element.
      if (element.contains(tagi)) {
        data2.push(element.querySelector('.item-content').href);
      }
    });

    return data2;
  });

  await browser.close();

  return result;
};

scrape().then(value => {
  console.log(value);
});
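The filtering step can also be written without rebuilding the long nth-child selector, by searching inside each element directly. The sketch below is an approximation of the same check, not a drop-in replacement. It assumes the `<i>` matched by 'a > div.info-restaurant > p > i' is what marks a discounted entry; collectDiscountLinks is a made-up name, and it accepts any array of objects exposing querySelector() so it can be tried outside a browser:

```javascript
// Hypothetical helper: returns the links of entries that contain the <i> marker.
// `entries` is an array of elements (or stand-ins with a querySelector method).
function collectDiscountLinks(entries) {
  return entries
    .filter((entry) => entry.querySelector('a > div.info-restaurant > p > i') !== null)
    .map((entry) => entry.querySelector('.item-content').href);
}
```

To use it with Puppeteer, the function would have to be defined inside the page.evaluate() callback itself, since code from the Node.js script is not visible in the page.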
