Node.js - Looping through array of URLs one at a time
I am a beginner at Node.js and I'm trying to write a web scraping script. I got permission from the site admin to scrape their products if I make fewer than 15 requests a minute. When I started out, the script requested all the URLs at once. After some tinkering, I was able to go through each item in the array one at a time, but the script doesn't stop when there are no more items in the array. I'm not really happy with my result and feel like there is a better way to do this.
    var express = require('express');
    var fs = require('fs');
    var request = require('request');
    var cheerio = require('cheerio');
    var app = express();
    var async = require('async');

    app.get('/scrape', function(req, res){
        productListing = ['ohio-precious-metals-1-ounce-silver-bar','morgan-1-ounce-silver-bar']
        var i = 0;
        async.eachLimit(productListing, 1, function (product, callback) {
            var getProducts = function () {
                var url = 'http://cbmint.com/' + productListing[i];
                request(url, function(error, response, html) {
                    if(!error){
                        var $ = cheerio.load(html);
                        var title;
                        var json = { title : ""};
                        $('.product-name').filter(function(){
                            var data = $(this);
                            title = data.children().children().first().text();
                            json.title = title;
                        })
                    }
                    var theTime = new Date().getTime();
                    console.log(i);
                    console.log(json.title);
                    console.log(theTime);
                    i++;
                });
            }
            setInterval(getProducts,10000);
        })
        res.send('Check your console!')
    })

    app.listen('8081')
    console.log('Magic happens on port 8081');
    exports = module.exports = app;
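For reference, the script above keeps running for two reasons: the timer created by setInterval is never cleared, and the callback passed by async.eachLimit is never called, so the iteration never advances or finishes. Once i passes the end of the array, productListing[i] is undefined and the timer keeps requesting a nonexistent page. One possible way to visit the items strictly one at a time and then stop is a plain loop with a pause between items. The following is only a minimal sketch under assumptions: sleep, processSequentially and fakeFetchTitle are hypothetical names that do not appear in the original script, and fakeFetchTitle merely stands in for the real request + cheerio step so the example runs without network access.

```javascript
// Hypothetical helper: resolves after `ms` milliseconds.
function sleep(ms) {
  return new Promise(function (resolve) {
    setTimeout(resolve, ms);
  });
}

// Hypothetical helper: calls `worker` on each item in order and waits
// `delayMs` between consecutive items. It returns once the array is
// exhausted, so no timer is left running afterwards.
async function processSequentially(items, delayMs, worker) {
  var results = [];
  for (var i = 0; i < items.length; i++) {
    // Wait for the current item to finish before moving on.
    results.push(await worker(items[i], i));
    // Pause only when another item is still waiting.
    if (i < items.length - 1) {
      await sleep(delayMs);
    }
  }
  return results;
}

// Assumption: stand-in for the real scraping step. In the actual script
// this would wrap request(url, ...) and the cheerio lookup in a Promise.
function fakeFetchTitle(slug) {
  return Promise.resolve('title for ' + slug);
}
```

With the original list this could be called as processSequentially(productListing, 10000, fakeFetchTitle). Staying under 15 requests a minute means leaving a little more than 4 seconds between requests, so the 10000 ms delay from the original script is comfortably within that limit.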