Scraping multiple websites that use the same theme

Question

const PORT = 5000;
import express from "express";
import axios from "axios";
import cheerio from "cheerio";

const app = express();

const tomsHardware = "https://www.tomshardware.com/best-picks/best-gaming-mouse";
const pcGamer = "https://www.pcgamer.com/the-best-gaming-mouse/";

const requestOne = axios.get(tomsHardware);
const requestTwo = axios.get(pcGamer);

const mice = []

app.get('/', (req, res) => {
    res.json('Welcome to my climate change API!');
});

app.get('/mouse', (req, res) => {
    axios.all([requestOne, requestTwo])
        .then((response) => {
            const html = response.data;
            const $ = cheerio.load(html);

            $('.product__title').each(function (index, elem) {
                const title = $(this).text();
                mice.push({
                    title
                });
            });
            res.json(mice)
        }).catch((err) => console.log(err));
});

I am trying to scrape both of these websites, and I am getting "object is not iterable". I am also not sure how to scrape both of them together, since they appear to use the same theme and the same class name.

Answer 1

Your response is actually an array of two responses, so you'll need to loop over that array and parse each response's HTML separately:

app.get('/mouse', (req, res) => {
  axios.all([requestOne, requestTwo])
    .then(responses => {
      for (const response of responses) {
        const html = response.data;
        const $ = cheerio.load(html);
  
        $('.product__title').each(function () {
          mice.push({title: $(this).text()});
        });
      }

      res.json(mice);
    })
    .catch(err => console.log(err));
});

Note that const mice = [] is declared outside the handler, so the array persists between requests and keeps growing with repeated elements. You might want to move it into the request handler closure so that it is rebuilt on every request.
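As a rough sketch of that last point, the collecting logic could live in a function that creates its own array on every call. The helper name titlesFromResponses, the extractTitles callback, and the stand-in data below are all hypothetical and only illustrate the shape of the fix. In the real handler, the callback would presumably wrap the cheerio selector logic shown above.

```javascript
// Hypothetical helper (not part of the original answer): builds a fresh
// result array from an array of axios-style responses.
// `extractTitles` is an assumed callback that takes a response body and
// returns an array of title strings (for example, a cheerio-based function).
function titlesFromResponses(responses, extractTitles) {
  const mice = []; // local, so nothing carries over between calls
  for (const response of responses) {
    for (const title of extractTitles(response.data)) {
      mice.push({ title });
    }
  }
  return mice;
}

// Stand-in data and a deliberately naive extractor, for illustration only.
const fakeResponses = [
  { data: "Mouse A\nMouse B" },
  { data: "Mouse C" },
];
const splitLines = (body) => body.split("\n");

console.log(titlesFromResponses(fakeResponses, splitLines));
// The length stays at 3 no matter how often the function is called.
console.log(titlesFromResponses(fakeResponses, splitLines).length);
```

Inside the route, the same idea would amount to calling such a helper from the .then callback and passing its return value to res.json, so that no state is shared between requests.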

Solution 1
0 2023-01-04 19:59:40