Node.js - Looping through array of URLs one at a time

I am a beginner at Node.js and I'm trying to write a web scraping script. I got permission from the site admin to scrape their products if I make fewer than 15 requests a minute. When I started out, the script requested all the URLs at once. After some tinkering I was able to go through each item in the array, but the script doesn't stop when there are no more items in the array. I'm not really happy with my result and feel like there is a better way to do this.

    var express = require('express');
    var fs = require('fs');
    var request = require('request');
    var cheerio = require('cheerio');
    var app     = express();
    var async = require('async');

    app.get('/scrape', function(req, res){
        productListing = ['ohio-precious-metals-1-ounce-silver-bar','morgan-1-ounce-silver-bar']
        var i = 0;
        async.eachLimit(productListing, 1, function (product, callback) {
            var getProducts = function () {
                var url = 'http://cbmint.com/' + productListing[i];
                request(url, function(error, response, html) {
                    if(!error){
                        var $ = cheerio.load(html);

                        var title;
                        var json = { title : ""};

                        $('.product-name').filter(function(){
                            var data = $(this);
                            title = data.children().children().first().text();

                            json.title = title;
                        })
                    }
                    var theTime = new Date().getTime();
                    console.log(i);
                    console.log(json.title);
                    console.log(theTime);
                    i++;
                });
            }
            setInterval(getProducts,10000);
        })
        res.send('Check your console!')
    })

    app.listen('8081')
    console.log('Magic happens on port 8081');
    exports = module.exports = app;

You aren't calling callback inside the iterator function. Take a look at the docs for eachLimit.
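To make that concrete, here is a minimal sketch of an iterator that does call callback once per item, pausing before it does so to space out the requests. It is not the asker's exact script: `eachOneAtATime` is a small stand-in written here so the example runs without installing the async package, and `scrapeAll`, `fetchTitle`, and `delayMs` are hypothetical names. `fetchTitle(url, cb)` is assumed to wrap the request and cheerio logic from the question and to call `cb(error, title)`. Note that the sketch uses a one-shot setTimeout; the setInterval in the question is never cleared, so it keeps firing after the array is exhausted.

```javascript
// Minimal stand-in for async.eachLimit(items, 1, iterator, done), included
// only so this sketch is self-contained. It runs the iterator on one item at
// a time and moves on only when the iterator calls its callback.
function eachOneAtATime(items, iterator, done) {
  var index = 0;
  function next(err) {
    if (err || index >= items.length) {
      return done(err || null); // stop on the first error or when the items run out
    }
    var item = items[index++];
    iterator(item, next);
  }
  next();
}

// Hypothetical wrapper: `fetchTitle(url, cb)` is an assumed function that
// fetches the page and calls cb(error, title).
function scrapeAll(productListing, fetchTitle, delayMs, done) {
  var titles = [];
  eachOneAtATime(productListing, function (product, callback) {
    // Use the `product` argument instead of a hand-maintained counter.
    var url = 'http://cbmint.com/' + product;
    fetchTitle(url, function (error, title) {
      if (error) {
        return callback(error);
      }
      titles.push(title);
      // Calling callback tells the loop this item is finished. Delaying the
      // call spaces the requests out; a one-shot setTimeout does not repeat.
      setTimeout(callback, delayMs);
    });
  }, function (err) {
    done(err, titles);
  });
}
```

With the real library, the same iterator and final callback should be usable as `async.eachLimit(productListing, 1, iterator, done)`. To stay under 15 requests a minute the delay needs to be more than 4000 ms (60 s / 15), so the 10000 ms used in the question leaves a comfortable margin.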
