Promise doesn't wait for a function's promise to be resolved

I've been working on a scraper project.

I've implemented most of it, but I'm stuck on one thing.

First, the workflow: the scrapers are called in the scraping-service module, where I wait for the promises returned by the called functions to resolve. The scrapers fetch the data and pass it to the data_functions object, where it is merged, validated, and inserted into the DB.

Here is the code:

scraping-service

const olxScraper = require('./scrapers/olx-scraper');
const santScraper = require('./scrapers/sant-scraper');
//Calling scraper from where we want to get data about apartments
const data_functions = require('./data-functions/dataF');

let count = 1;

Promise.all([
  olxScraper.olxScraper(count),
  santScraper.santScraper(count),
]).then(() => data_functions.validateData(data_functions.mergedApartments));

Here I'm waiting for the promises of these two functions, and then passing the merged data to the validateData method in data_functions.

Here is the scraper:

const axios = require('axios'); //npm package - promise based http client
const cheerio = require('cheerio'); //npm package - used for web-scraping in server-side implementations
const data_functions = require('../data-functions/dataF');

//olxScraper takes count as a parameter, which is passed in from the scraping-service file.
exports.olxScraper = async (count) => {
  const url = `https://www.olx.ba/pretraga?vrsta=samoprodaja&kategorija=23&sort_order=desc&kanton=9&sacijenom=sacijenom&stranica=${count}`;
  //url where data is located at.
  const olxScrapedData = [];
  try {
    await load_url(url, olxScrapedData); //passing the url and an empty array
  } catch (error) {
    console.log(error);
  }
};

//Function that does loading URL part of the scraper, and starting of process for fetching raw data.
const load_url = async (url, olxScrapedData) => {
  await axios.get(url).then((response) => {
    const $ = cheerio.load(response.data);
    fetch_raw_html($).each((index, element) => {
      process_single_article($, index, element, olxScrapedData);
    });

    process_fetching_squaremeters(olxScrapedData);
    // if I place data_functions.mergeData(olxScrapedData); here, it will work
  });
};

//Part where raw html data is fetched but in div that we want.
const fetch_raw_html = ($) => {
  return $('div[id="rezultatipretrage"] > div')
    .not('div[class="listitem artikal obicniArtikal  i index"]')
    .not('div[class="obicniArtikal"]');
};

//Here is all logic for getting data that we want, from the raw html.
const process_single_article = ($, index, element, olxScrapedData) => {
  $('span[class="prekrizenacijena"]').remove();
  const getLink = $(element).find('div[class="naslov"] > a').attr('href');
  const getDescription = $(element).find('div[class="naslov"] > a > p').text();
  const getPrice = $(element)
    .find('div[class="datum"] > span')
    .text()
    .replace(/\.| ?KM$/g, '')
    .replace(' ', '');
  const getPicture = $(element).find('div[class="slika"] > img').attr('src');
  //making array of objects with data that is scraped.
  olxScrapedData[index] = {
    id: getLink.substring(27, 35),
    link: getLink,
    description: getDescription,
    price: parseFloat(getPrice),
    picture: getPicture,
  };
};

//Square meters need to be fetched for every single article.
//This function loads every link in the olxScrapedData array and updates each object with the square meters value for that apartment.
const process_fetching_squaremeters = (olxScrapedData) => {
  const fetchSquaremeters = Promise.all(
    olxScrapedData.map((item) => {
      return axios.get(item.link).then((response) => {
        const $ = cheerio.load(response.data);
        const getSquaremeters = $('div[class="df2  "]')
          .first()
          .text()
          .replace('m2', '')
          .replace(',', '.')
          .split('-')[0];
        item.squaremeters = Math.round(getSquaremeters);
        item.pricepersquaremeter = Math.round(
          parseFloat(item.price) / parseFloat(getSquaremeters)
        );
      });
    })
  );

  fetchSquaremeters.then(() => {
    data_functions.mergeData(olxScrapedData); //Sending final array to mergeData function.
    return olxScrapedData;
  });
};

Now if I console.log(olxScrapedData) inside fetchSquaremeters.then, it outputs the scraped apartments, but data_functions.mergeData(olxScrapedData) doesn't seem to get called. If I move that call into load_url, the function is triggered and the data is merged, but without the square meters, and I really need that data.

So my question is: how do I make this work? Do I need to call the function somewhere else?

All I want is for this final olxScrapedData to be sent to mergeData, so that the arrays from my different scrapers are merged into one.

Thanks!

Edit: here is what the other scraper looks like: https://jsfiddle.net/oh03mp8t/ . Note that there are no promises in that scraper.

Try making the function async: const process_fetching_squaremeters = async (olxScrapedData)... and then await fetchSquaremeters.then(..) inside it. The call in load_url then has to be awaited (or returned) as well, otherwise the caller still won't wait for it.
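A minimal sketch of that suggestion, assuming the rest of the scraper stays as in the question. To keep it runnable without network access, fetchSquaremeters, merged, and mergeData below are hypothetical stand-ins for the axios/cheerio request and for data_functions; they are not the asker's real implementations.

```javascript
// Hypothetical stand-in for axios.get(item.link) plus the cheerio parsing:
// resolves with a square-meter value after a short delay.
const fetchSquaremeters = (item) =>
  new Promise((resolve) => setTimeout(() => resolve(item.price / 1000), 10));

// Hypothetical stand-ins for data_functions.mergedApartments / mergeData.
const merged = [];
const mergeData = (apartments) => merged.push(...apartments);

// async + await: this function's promise settles only after every
// per-article request has finished and mergeData has run.
const process_fetching_squaremeters = async (olxScrapedData) => {
  await Promise.all(
    olxScrapedData.map(async (item) => {
      const squaremeters = await fetchSquaremeters(item);
      item.squaremeters = Math.round(squaremeters);
      item.pricepersquaremeter = Math.round(item.price / squaremeters);
    })
  );
  mergeData(olxScrapedData);
  return olxScrapedData;
};

// The caller must await it too, otherwise the chain breaks one level up.
const load_url = async (olxScrapedData) => {
  // ... axios.get(url) and process_single_article(...) would run here ...
  await process_fetching_squaremeters(olxScrapedData);
};

const olxScraper = async () => {
  // Sample data in place of the scraped articles.
  const olxScrapedData = [{ price: 50000 }, { price: 80000 }];
  await load_url(olxScrapedData);
  return olxScrapedData;
};
```

With every level awaited like this, a Promise.all([olxScraper(), ...]).then(...) in scraping-service should only reach its then callback after mergeData has received the data with the square meters filled in.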

James already told you in the previous answer what is happening. You must wait for this promise to be resolved for everything to execute in the correct order. If you don't have experience with async/await and promises, I suggest you take a course on them to really understand what is happening here.
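One way to make the waiting explicit, sketched here as an alternative design rather than the asker's actual code, is to have each scraper return its array and do the merging where Promise.all resolves. The scraper bodies, runScrapers, and validateData below are hypothetical placeholders. Promise.all also accepts plain (non-promise) values, so a scraper without any promises can sit in the same array.

```javascript
// Hypothetical stand-in for the OLX scraper: asynchronous, returns its data.
const olxScraper = async (count) => {
  await new Promise((resolve) => setTimeout(resolve, 10)); // simulated requests
  return [{ id: 'olx-1', page: count }];
};

// Hypothetical stand-in for the second scraper: synchronous, no promises.
const santScraper = (count) => [{ id: 'sant-1', page: count }];

// Placeholder validation: keeps only entries that have an id.
const validateData = (apartments) => apartments.filter((a) => a.id);

const runScrapers = async (count) => {
  // results holds one array per scraper, in the order they were listed.
  const results = await Promise.all([olxScraper(count), santScraper(count)]);
  const mergedApartments = results.flat();
  return validateData(mergedApartments);
};

runScrapers(1).then((apartments) => console.log(apartments.length));
```

Because the merged array is built from the resolved values, there is no shared state that could be read before a scraper has finished writing to it.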

Are you missing return/await statements inside your promise/async functions, especially where the last statement is itself a promise?

Without them, you are only asking for the promise to be executed at some later time, rather than returning it so that Promise.all() can wait for it.
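The difference can be seen in a small, self-contained sketch. The names delay, withoutReturn, withReturn, and demo are made up for illustration and do not come from the question.

```javascript
// Hypothetical helper: resolves after ms milliseconds.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const log = [];

// No return: the inner promise is started, but the caller gets undefined
// and has nothing to wait for.
const withoutReturn = () => {
  delay(20).then(() => log.push('withoutReturn finished'));
};

// With return: the caller receives the promise and can wait for it.
const withReturn = () => {
  return delay(20).then(() => log.push('withReturn finished'));
};

const demo = async () => {
  await Promise.all([withoutReturn()]); // resolves right away, the value is undefined
  log.push('first Promise.all resolved');
  await Promise.all([withReturn()]); // resolves only after the delay
  log.push('second Promise.all resolved');
  return log;
};
```

Calling demo() records 'first Promise.all resolved' before 'withoutReturn finished', while 'withReturn finished' comes before 'second Promise.all resolved'. The code in the question matches the first pattern: process_fetching_squaremeters does not return fetchSquaremeters, and load_url does not return or await the call to it.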

