Promises and adding elements from an array into objects within another array

The assignment is a command-line Node application that scrapes some data from a specific site and saves that data to a CSV file.

I'm using scrape-it to scrape the data and am successfully getting everything I need, but I'm struggling to figure out how to add each URL (stored in urls) to its corresponding object in shirts, which is an array of objects.

Here is what I have so far.

const scrapeIt = require("scrape-it");

const mainURL = "http://shirts4mike.com/";

scrapeIt(`${mainURL}shirts.php`, {
  pages: {
    listItem: ".products li",
    name: "pages",
    data: {
      url: {
        selector: "a",
        attr: "href"
      }
    }
  }
})
  .then(({ data }) => {
    const urls = data.pages.map(page => `${mainURL}${page.url}`);
    console.log(urls);
    const shirtCalls = urls.map(url =>
      scrapeIt(url, {
        name: {
          selector: ".shirt-picture img",
          attr: "alt"
        },
        image: {
          selector: ".shirt-picture img",
          attr: "src"
        },
        price: {
          selector: "span.price"
        }
      })
    );
    return Promise.all(shirtCalls);
  })
  .then(shirtResults => {
    const shirts = shirtResults.map(shirtResult => shirtResult.data);
    console.log(shirts);
  });

So the output that "shirts" gives me is:

[ { name: 'Logo Shirt, Red',
    image: 'img/shirts/shirt-101.jpg',
    price: '$18' },
  { name: 'Mike the Frog Shirt, Black',
    image: 'img/shirts/shirt-102.jpg',
    price: '$20' },
  { name: 'Mike the Frog Shirt, Blue',
    image: 'img/shirts/shirt-103.jpg',
    price: '$20' },
  { name: 'Logo Shirt, Green',
    image: 'img/shirts/shirt-104.jpg',
    price: '$18' },
  { name: 'Mike the Frog Shirt, Yellow',
    image: 'img/shirts/shirt-105.jpg',
    price: '$25' },
  { name: 'Logo Shirt, Gray',
    image: 'img/shirts/shirt-106.jpg',
    price: '$20' },
  { name: 'Logo Shirt, Teal',
    image: 'img/shirts/shirt-107.jpg',
    price: '$20' },
  { name: 'Mike the Frog Shirt, Orange',
    image: 'img/shirts/shirt-108.jpg',
    price: '$25' } ]

But what I want the final result to look like is:

[ { name: 'Logo Shirt, Red',
    image: 'img/shirts/shirt-101.jpg',
    price: '$18',
    url: 'http://shirts4mike.com/shirt.php?id=101' //which is at urls[0]
  },
  { name: 'Mike the Frog Shirt, Black',
    image: 'img/shirts/shirt-102.jpg',
    price: '$20',
    url: 'http://shirts4mike.com/shirt.php?id=102' //urls[1]
  }, //...etc etc

Hopefully that all makes sense. I'm still very new to promises (and Node), so I'm feeling a bit out of my depth. Thank you in advance!

Try something like this:

const scrapeIt = require("scrape-it");

const mainURL = "http://shirts4mike.com/";

scrapeIt(`${mainURL}shirts.php`, {
  pages: {
    listItem: ".products li",
    name: "pages",
    data: {
      url: {
        selector: "a",
        attr: "href"
      }
    }
  }
})
  .then(({ data }) => {
    const urls = data.pages.map(page => `${mainURL}${page.url}`);
    console.log(urls);
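    // Promise.all waits for every page; without it the next .then()
    // would receive an array of pending promises.
    return Promise.all(urls.map(async (url) => {
      const urlObj = await scrapeIt(url, {
        name: {
          selector: ".shirt-picture img",
          attr: "alt"
        },
        image: {
          selector: ".shirt-picture img",
          attr: "src"
        },
        price: {
          selector: "span.price"
        }
      });

      // Merge the scraped fields with the URL they came from.
      return { ...urlObj.data, url };
    }));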
  })
  .then(shirtResults => {
    console.log(shirtResults);
  });
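
Another way to attach each URL is to rely on the fact that Promise.all resolves to an array in the same order as its input, so result i always belongs to urls[i] no matter which request finishes first. Below is a minimal sketch of that idea. fakeScrape and attachUrls are hypothetical names, and fakeScrape only stands in for scrapeIt so the snippet runs without network access.

```javascript
const urls = [
  "http://shirts4mike.com/shirt.php?id=101",
  "http://shirts4mike.com/shirt.php?id=102",
  "http://shirts4mike.com/shirt.php?id=103"
];

// Hypothetical stand-in for scrapeIt(url, ...): resolves to { data }.
// Later URLs get shorter delays, so they settle before earlier ones.
const fakeScrape = url =>
  new Promise(resolve => {
    const delayMs = 30 - 10 * urls.indexOf(url);
    setTimeout(() => resolve({ data: { name: `Shirt ${url.slice(-3)}` } }), delayMs);
  });

// Promise.all keeps input order, so results[i] pairs with urls[i].
const attachUrls = (urls, scrape) =>
  Promise.all(urls.map(url => scrape(url))).then(results =>
    results.map((result, i) => ({ ...result.data, url: urls[i] }))
  );

const shirtsPromise = attachUrls(urls, fakeScrape);
shirtsPromise.then(shirts => console.log(shirts));
```

In the real script you would pass a function that calls scrapeIt with the selector options from the question in place of fakeScrape.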

So I actually managed to get it working, thanks to a suggestion from another user (though I think they deleted their comment?). In the final .then(), I iterated over shirts, took the pageID from the image property, interpolated mainURL, the path, and the pageID into a template literal, and added that key/value pair to each object. I also used this as an opportunity to store the full image URL in the image property.

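  .then(shirtResults => {
    const shirts = shirtResults.map(shirtResult => shirtResult.data);
    // forEach rather than map: each shirt is updated in place.
    shirts.forEach(shirt => {
      // Strip every non-digit: "img/shirts/shirt-101.jpg" -> "101".
      const pageID = shirt.image.replace(/\D/g, "");
      shirt.url = `${mainURL}shirt.php?id=${pageID}`;
      // Store the full image URL instead of the relative path.
      shirt.image = `${mainURL}${shirt.image}`;
    });
    console.log(shirts);
  });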
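
If it helps, here is a variation on the same idea that avoids mutating the scraped objects. It is only a sketch: withUrls is a hypothetical helper name, and the regular expression assumes image paths always end in shirt-<digits>.<extension>, as they do in the output shown in the question. Matching just that part means digits elsewhere in the path would not end up in the ID, which stripping every non-digit would allow.

```javascript
const mainURL = "http://shirts4mike.com/";

// Hypothetical helper: returns a new shirt object with url and a full image URL.
const withUrls = (shirt, baseURL) => {
  // Capture only the digits between "shirt-" and the file extension.
  const match = shirt.image.match(/shirt-(\d+)\.\w+$/);
  // Assumption: if the path does not match, return an unchanged copy.
  if (!match) return { ...shirt };
  return {
    ...shirt,
    // The URL constructor resolves a relative path against the base URL.
    url: new URL(`shirt.php?id=${match[1]}`, baseURL).href,
    image: new URL(shirt.image, baseURL).href
  };
};

const shirts = [
  { name: "Logo Shirt, Red", image: "img/shirts/shirt-101.jpg", price: "$18" }
];
const result = shirts.map(shirt => withUrls(shirt, mainURL));
console.log(result);
```

Because withUrls returns a new object, map is the natural fit here, and the original shirts array stays as scraped.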

Thanks for the help!
