
Web scraping different websites and pushing values to object

I'm trying to loop through different websites, scrape their values, and push them to a global variable. I've tried several approaches, but I can't seem to push val to dat. My goal is to have an object with the stock values of DAL and AAL.

var request = require("request"),
cheerio = require("cheerio");

var ticker = ["DAL", "AAL"];
var dat = []

for (var i = 0; i < ticker.length; i++) {
  var url = "https://finance.yahoo.com/quote/"+ ticker[i] + "?p=" + ticker[i];
    request(url, function (error, response, body) {
      if (!error) {
        var $ = cheerio.load(body);
        var val = {
          Ticker : ticker[i],
          "Date" : new Date(),
          PreviousClose : $("span[data-reactid='98']").text().toString(),
          Open : $("span[data-reactid='103']").text().toString(),
          Bid : $("span[data-reactid='108']").text().toString(),
          Ask : $("span[data-reactid='113']").text().toString(),
          DayRange : $("td[data-reactid='117']").text().toString(),
          WeekRange_52 : $("td[data-reactid='121']").text().toString(),
          Volume : $("span[data-reactid='126']").text().toString(),
          AverageVolume : $("span[data-reactid='131']").text().toString(),
          MarketCap : $("span[data-reactid='139']").text().toString(),
          Beta5Months : $("span[data-reactid='144']").text().toString(),
          PEratio : $("span[data-reactid='149']").text().toString(),
          "EPS" : $("span[data-reactid='154']").text().toString()
        };
      } else {
        return console.error(error);
      }
    });
    dat.push(val);
}

console.log(dat);

EDIT:

var request = require("request"),
cheerio = require("cheerio");

var ticker = ["DAL", "AAL"];
var dat = []

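for (let i = 0; i < ticker.length; i++) { // let gives each callback its own i; with var, ticker[i] is undefined by the time the callback runs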
  var url = "https://finance.yahoo.com/quote/"+ ticker[i] + "?p=" + ticker[i];
    request(url, function (error, response, body) {
      if (!error) {
        var $ = cheerio.load(body);
        var val = {
          Ticker : ticker[i],
          "Date" : new Date(),
          PreviousClose : $("span[data-reactid='98']").text().toString(),
          Open : $("span[data-reactid='103']").text().toString(),
          Bid : $("span[data-reactid='108']").text().toString(),
          Ask : $("span[data-reactid='113']").text().toString(),
          DayRange : $("td[data-reactid='117']").text().toString(),
          WeekRange_52 : $("td[data-reactid='121']").text().toString(),
          Volume : $("span[data-reactid='126']").text().toString(),
          AverageVolume : $("span[data-reactid='131']").text().toString(),
          MarketCap : $("span[data-reactid='139']").text().toString(),
          Beta5Months : $("span[data-reactid='144']").text().toString(),
          PEratio : $("span[data-reactid='149']").text().toString(),
          "EPS" : $("span[data-reactid='154']").text().toString()
        }; 
        dat.push(val); // moved the push call inside the callback, where val is defined
      } else {
        console.error(error);
      }
    });
}

console.log(dat); // note: this runs before the request callbacks fire, so dat is still empty here
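
One thing to watch for with the code above: request calls its callback later, so a console.log placed after the loop runs before anything has been pushed. Here is a minimal sketch of that ordering. It assumes nothing about Yahoo's pages: fakeRequest is a hypothetical stand-in built on setTimeout, and the example.invalid URL is a placeholder, so no network is involved.

```javascript
// fakeRequest is a hypothetical stand-in for request(): it calls back later.
function fakeRequest(url, callback) {
  setTimeout(function () {
    callback(null, { statusCode: 200 }, "<html>" + url + "</html>");
  }, 0);
}

var tickers = ["DAL", "AAL"];
var collected = [];
var lengthSeenAfterLoop;

// Resolves once every callback has fired, so the final state can be inspected.
var allDone = new Promise(function (resolve) {
  tickers.forEach(function (symbol) {
    fakeRequest("https://example.invalid/" + symbol, function (error, response, body) {
      if (error) { return console.error(error); }
      collected.push({ Ticker: symbol, bodyLength: body.length });
      if (collected.length === tickers.length) {
        resolve(collected);
      }
    });
  });
});

// This runs before any callback has fired.
lengthSeenAfterLoop = collected.length;
console.log(lengthSeenAfterLoop); // 0

allDone.then(function (result) {
  console.log(result.length); // 2
});
```

Using forEach here also gives each callback its own symbol, which sidesteps the question of what a shared loop counter holds when the callback finally runs.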

ORIGINAL:

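The problem is that val is declared with var inside the request callback, so it is scoped to that callback function (var is function-scoped; it is not limited to the if block). The dat.push(val) call sits outside the callback and cannot see it.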

My original suggestion was to add var val; above the request call. That only works if request is a synchronous function; it is asynchronous, so val would still be undefined when dat.push(val) runs.

So it'd look something like this:

// .... keep the code up here
  var url = "https://finance.yahoo.com/quote/"+ ticker[i] + "?p=" + ticker[i];
    var val;
    request(url, function (error, response, body) {
      if (!error) {
        var $ = cheerio.load(body);
        val = {
// ... keep your code from down here

Basically, the val variable is local to the callback function, so a dat.push call outside the callback doesn't have access to it.
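
To make the scoping point concrete, here is a small sketch. The function names are made up for illustration and have nothing to do with request or cheerio. It shows that a var declared inside an if block is still visible in the rest of the enclosing function, but not outside that function.

```javascript
// `var` is function-scoped: an `if` block does not create a new scope for it.
function insideIf() {
  if (true) {
    var val = "set inside if";
  }
  return val; // still visible here, after the if block
}

// A variable declared inside a function is not visible outside that function.
function callback() {
  var hidden = "local to callback";
  return hidden;
}

function readOutside() {
  callback();
  try {
    return hidden; // not declared in this scope
  } catch (e) {
    return e.name;
  }
}

console.log(insideIf());    // "set inside if"
console.log(readOutside()); // "ReferenceError"
```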

The code could be improved by using promises to coordinate the concurrent requests, but to answer the question as asked: push val inside the callback where it is defined (where the OP placed that line, val is not in scope, so it throws a ReferenceError at runtime). Similarly, there's no point in logging the accumulated data until the last callback has run.

for (let i = 0; i < ticker.length; i++) { // let, so each callback sees its own i
  var url = "https://finance.yahoo.com/quote/"+ ticker[i] + "?p=" + ticker[i];
    request(url, function (error, response, body) {
      if (!error) {
        var $ = cheerio.load(body);
        var val = {
          Ticker : ticker[i],
          "Date" : new Date(),
          PreviousClose : $("span[data-reactid='98']").text().toString(),
          Open : $("span[data-reactid='103']").text().toString(),
          Bid : $("span[data-reactid='108']").text().toString(),
          Ask : $("span[data-reactid='113']").text().toString(),
          DayRange : $("td[data-reactid='117']").text().toString(),
          WeekRange_52 : $("td[data-reactid='121']").text().toString(),
          Volume : $("span[data-reactid='126']").text().toString(),
          AverageVolume : $("span[data-reactid='131']").text().toString(),
          MarketCap : $("span[data-reactid='139']").text().toString(),
          Beta5Months : $("span[data-reactid='144']").text().toString(),
          PEratio : $("span[data-reactid='149']").text().toString(),
          "EPS" : $("span[data-reactid='154']").text().toString()
        };
        // relocated OP accumulating and logging code here:
        dat.push(val);
        // callbacks can finish in any order, so log once every ticker has been collected
        if (dat.length === ticker.length) {
          console.log(dat);
        }
      } else {
        return console.error(error);
      }
    });
}
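
The promise-based approach mentioned above could look roughly like the sketch below. It is an illustration under assumptions, not a drop-in replacement: fakeRequest is a hypothetical stand-in for request so the example runs without network access, fetchQuote is a made-up helper name, and the cheerio parsing is left out.

```javascript
// Hypothetical stand-in for request(url, callback); swap in the real call.
function fakeRequest(url, callback) {
  // DAL answers more slowly here, to show that completion order can differ.
  var delay = url.indexOf("DAL") !== -1 ? 20 : 5;
  setTimeout(function () {
    callback(null, { statusCode: 200 }, "page for " + url);
  }, delay);
}

// Wrap the callback-style call in a promise.
function fetchQuote(symbol) {
  var url = "https://finance.yahoo.com/quote/" + symbol + "?p=" + symbol;
  return new Promise(function (resolve, reject) {
    fakeRequest(url, function (error, response, body) {
      if (error) { return reject(error); }
      // Parsing with cheerio would go here; this sketch only records the body.
      resolve({ Ticker: symbol, Date: new Date(), body: body });
    });
  });
}

var ticker = ["DAL", "AAL"];

// Promise.all resolves once every request has finished, keeping input order.
var quotes = Promise.all(ticker.map(fetchQuote));

quotes.then(function (dat) {
  console.log(dat.map(function (q) { return q.Ticker; })); // [ 'DAL', 'AAL' ]
}).catch(function (error) {
  console.error(error);
});
```

Because Promise.all keeps results in input order, dat lines up with ticker even when the responses arrive in a different order, and a single catch handles a failure in any request.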

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 