
Code stalling at collection.find

I am new to JS and am stuck at this point. I am trying to clear my DBs before starting a new query, and it keeps stalling at the collection.find command. If I remove the code that clears the DB, everything works fine.

router.get('/parse', function(req, res) {
    collection.remove({})
    collectionweb.remove({})
    collectionhit.remove({})
    //Converter Class
    var Converter = require("csvtojson").core.Converter;
    var fs = require("fs");
    var csvFileName = "./download/company.csv";
    var fileStream = fs.createReadStream(csvFileName);
    //new converter instance
    var param = {};
    var csvConverter = new Converter(param);
    request({
        uri: "http://online.barrons.com/news/articles/SB50001424053111904537004580085820431503044?mod=BOL_twm_fs",
    }, function(error, response, body) {
        collectionweb.insert({
            "website": body
        });
    });
    //end_parsed will be emitted once parsing finished
    csvConverter.on("end_parsed", function(jsonObj) {
        //Insert into DB
        collection.insert(jsonObj);
    });
    fileStream.pipe(csvConverter);
    collectionweb.find({}, function(e, docs1) {
        for (var j in docs1) {
            var body = docs1[j]
            var webs = body.website
            console.log(1)
            collection.find({}, function(e, docs) {
                for (var i in docs) {
                    console.log(2)
                    var companies = docs[i]
                    var Score = 0;
                    $words = webs.search(companies.Symbol);
                    console.log(3)
                    if ($words > 0) {
                        StockScore++
                        console.log(Score)
                        collectionhit.insert(companies)
                        collectionhit.update({
                            "Name": companies.Name
                        }, {
                            '$set': {
                                "score": Score
                            }
                        })
                    } else {};
                };
            });
        };
    });
});

There are a handful of problems, but they share one common denominator: you don't yet understand that Node.js is asynchronous. Just google "node.js asynchronous" and you'll get a handful of resources, or simply look it up here on SO (e.g. How do I get started with Node.js?).

The gist of it is to wait for a callback or an event, for example:

var eiot = new EventedIOThing('paaarammm');

// use once, unless you for sure need to listen for the event multiple times
eiot.once('open',function onEIOTOpen() {
    console.log('opened the thing.');
}).once('error',function onEIOTError(err) {
    console.warn('there were problemzzz');
}).once('end',function onEIOTEnd() {
    // successfully finished evented IO thing...
    someAction(this.dep,'anotherparam',function callMeWhenActionIsDone(err,result) {
        if ( err ) {
            console.warn('someAction had a problem!',err);
            return; // exit early if we didn't get an optimal result
        }
        anotherDependentAction(result,function callMeWhenSecondActionIsDone(err,result) {
            if ( err ) { // this 'err' is local to this function's scope
                console.warn('anotherDependentAction had a problem!',err);
                return; // exit early again
            }
            console.log('All done... what do you want to do next?');
        });
    });
});

The above code is pretty self-explanatory given the variable/function names and comments, but pay close attention to the way methods are invoked, and most notably when they are invoked. Things don't happen in succession; instead, code is on "stand-by" until the things it depends on finish with successful results, and only then does the program flow continue.
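
To make the "when" part concrete, here is a minimal sketch that runs with plain Node.js and no database. fakeInsert is a hypothetical stand-in for a call like collectionweb.insert (it is not a real driver function); it simply finishes on a later turn of the event loop, the way a network round trip would:

```javascript
// Hypothetical stand-in for a database insert: it completes on a later
// turn of the event loop instead of immediately.
function fakeInsert(store, doc, callback) {
    setImmediate(function onInserted() {
        store.push(doc);
        callback(null, doc);
    });
}

var order = [];
var store = [];

order.push('before insert');
fakeInsert(store, { website: '<html>...</html>' }, function cbInsert(err, doc) {
    order.push('insert finished');
    // Only inside the callback is the written document guaranteed to exist.
    console.log('docs visible inside the callback:', store.length);
    console.log(order.join(' -> '));
});
order.push('after insert');

// This line runs BEFORE the insert has completed, so the store is still empty.
console.log('docs visible right after calling insert:', store.length);
```

The line after the call reports 0 documents and the callback reports 1. Anything that needs the inserted data has to be started from inside the callback (or from something the callback triggers), not from the next line of the script.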

The downside to the above coding style is that you'll eventually end up many nested functions deep. This is where libs like async come into play. It allows for a shallow program flow: you specify an array of functions, and async decides internally when each callback should be invoked; you just have to worry about the sequence.

Now, taking the code you currently have, applying what we learned from the first example, and one-upping it with the async module, it may be rewritten as follows:
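
If it helps to see what such a helper does under the hood, below is a rough hand-rolled sketch of a series runner. It is not the async library's code, just an illustration of the idea; series and makeTask are names made up for this example, and the tasks are placeholders for things like remove, insert and find:

```javascript
// Minimal stand-in for the idea behind async.series: start each task only
// after the previous one has called back, and stop at the first error.
function series(tasks, done) {
    var results = [];
    function next(index) {
        if (index === tasks.length) {
            return done(null, results);
        }
        tasks[index](function onTaskDone(err, result) {
            if (err) {
                return done(err, results);
            }
            results.push(result);
            next(index + 1);
        });
    }
    next(0);
}

// Hypothetical task factory: each task finishes asynchronously and records
// its name so the execution order can be inspected afterwards.
function makeTask(name, log) {
    return function task(callback) {
        setImmediate(function onTaskFinished() {
            log.push(name);
            callback(null, name + ' ok');
        });
    };
}

var log = [];
series([
    makeTask('clear collections', log),
    makeTask('insert csv rows', log),
    makeTask('query', log)
], function seriesComplete(err, results) {
    if (err) {
        console.warn('a task failed:', err);
        return;
    }
    console.log(log.join(' -> '));
});
```

The code stays flat: the list of tasks reads top to bottom, and the nesting lives inside the helper instead of in your route handler.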

var async = require('async'), // https://github.com/caolan/async
    fs = require('fs'),
    Converter = require('csvtojson').core.Converter;

router.get('/parse',function cbGetParse(req, res) {
    async.series([
        collection.remove.bind(collection),
        collectionweb.remove.bind(collectionweb),
        collectionhit.remove.bind(collectionhit),
        function convertCsv(callback) {
            var cbCalled = false; // i don't trust csvtojson to handle errors properly
            fs.createReadStream('./download/company.csv')
                .once('error',function(err) {
                    if ( !cbCalled ) {
                        cbCalled = true;
                        callback(err,null);
                    }
                })
                .pipe(new Converter({})) // pipe returns an instance of the Converter object
                .once('end_parsed',function onConverterEnd(jsonObj) {
                    collection.insert(jsonObj,function cbCollInsert(err,result) {
                        if ( !cbCalled ) {
                            cbCalled = true;
                            callback(err,result);
                        }
                    });
                });
        },
        function barronsHttpRequest(callback) {
            request({
                uri: 'http://online.barrons.com/news/articles/SB50001424053111904537004580085820431503044?mod=BOL_twm_fs',
            },function cbRequest(err,response,body) {
                if ( err ) {
                    callback(err,null);
                    return; // if err, exit early
                }
                collectionweb.insert({'website':body},function cbCollWebInsert(err,result) {
                    callback(err,result);
                });
            });
        },
        function lastChunkOfCode(callback) {
            // not going to rewrite this, same principle applies as above
            collectionweb.find({}, function(e, docs1) {
                for (var j in docs1) {
                    var body = docs1[j]
                    var webs = body.website
                    console.log(1)
                    collection.find({}, function(e, docs) {
                        for (var i in docs) {
                            console.log(2)
                            var companies = docs[i]
                            var Score = 0;
                            $words = webs.search(companies.Symbol);
                            console.log(3)
                            if ($words > 0) {
                                Score++ // the original read "StockScore++", but only Score is declared
                                console.log(Score)
                                collectionhit.insert(companies)
                                collectionhit.update({
                                    "Name": companies.Name
                                }, {
                                    '$set': {
                                        "score": Score
                                    }
                                })
                            } else {};
                        };
                    });
                };
                callback();
            });
        }
    ],function asyncComplete(err,result) {
        // you don't specify how to respond to the request so...
        if ( err ) {
            console.warn('Problem with /parse:',err);
        }
        res.end();
    });
});

I made a crap-ton of assumptions about how your script should work, so it may not be 100% what you want, but the asynchronous concept has been applied. Also, I didn't test this code. You need to determine what can be run in parallel vs. in series, what your control flow should look like, and how you want to handle errors (errors do happen).

Do note that I did not implement async behavior in the last chunk of your script, since I couldn't figure out what your collection relationships were -- and I'm not going to do all the work for you. I did notice that it could be optimized a bit: I don't see any reason to select all documents from both collections. You should offload the selector/query processing onto the database; it shouldn't live in your application if you can help it.

Some take-aways:
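
As one possible way to push the matching onto the database, the sketch below builds a MongoDB selector from the page text instead of looping over every company in application code. Everything here is an assumption on my part: buildSymbolSelector is a made-up helper, the regular expression assumes ticker symbols are 1 to 5 uppercase letters standing alone as words, and it matches whole tokens rather than substrings, so it is not an exact equivalent of the original webs.search(...) check:

```javascript
// Hypothetical helper: collect candidate ticker symbols from the page text
// and return a selector that lets MongoDB do the matching via $in.
function buildSymbolSelector(pageText) {
    var seen = Object.create(null);
    var tokens = [];
    // Assumption: a symbol is 1-5 uppercase letters forming a whole word.
    var matches = pageText.match(/\b[A-Z]{1,5}\b/g) || [];
    matches.forEach(function collectUnique(token) {
        if (!seen[token]) {
            seen[token] = true;
            tokens.push(token);
        }
    });
    return { Symbol: { $in: tokens } };
}

var sample = 'Shares of AAPL and IBM rose; AAPL led.';
console.log(JSON.stringify(buildSymbolSelector(sample)));
// prints {"Symbol":{"$in":["AAPL","IBM"]}}

// Intended use (not run here, needs a live collection):
// collection.find(buildSymbolSelector(webs), function cbFind(err, docs) { ... });
```

With a selector like this, only the matching companies come back from the database, so the inner loop over all documents goes away.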

  • Collection.remove(doc) accepts a callback, use it.
  • Same with Collection.insert(doc) -- even though the docs say the callback is optional, it should only be omitted in very rare cases. (I don't care if you're using write-concern or not.)
  • Mind your for loop:
      • A for...in should never be used, especially with an array; use a normal for or Array.forEach.
      • When using any type of for loop with any async calls -- especially ones related to sockets, e.g. MongoDB -- you need to exercise patience and wait for callbacks, otherwise you'll flood the socket (like a denial-of-service attack). I recommend using async.eachSeries or async.eachLimit.
  • I like to name my lambdas (oxymoron?); it helps with dissecting a stack trace.
  • Please use either " or '; they do the exact same thing, so don't mix them.
  • Develop in smaller chunks. Get one part working, then the next part, then the next one. Work small and abstract.
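
To illustrate the for-loop advice above, here is a hand-rolled sketch of what an eachSeries-style helper does: it starts the next async call only once the previous one has called back. This is not the async library's implementation; eachSeries and makeFakeCollection below are written for this example only, and the fake update merely counts how many calls are in flight at once:

```javascript
// Minimal stand-in for the idea behind async.eachSeries: process items one
// at a time, waiting for each callback before starting the next item.
function eachSeries(items, iterator, done) {
    var index = 0;
    function next(err) {
        if (err || index === items.length) {
            return done(err || null);
        }
        var item = items[index];
        index += 1;
        iterator(item, next);
    }
    next();
}

// Hypothetical collection whose update() completes asynchronously and
// tracks how many updates were pending at the same time.
function makeFakeCollection() {
    var stats = { inFlight: 0, maxInFlight: 0 };
    var saved = [];
    return {
        stats: stats,
        saved: saved,
        update: function fakeUpdate(company, callback) {
            stats.inFlight += 1;
            stats.maxInFlight = Math.max(stats.maxInFlight, stats.inFlight);
            setImmediate(function onUpdated() {
                stats.inFlight -= 1;
                saved.push(company.Name);
                callback(null);
            });
        }
    };
}

var hits = makeFakeCollection();
var companies = [{ Name: 'Acme' }, { Name: 'Globex' }, { Name: 'Initech' }];

eachSeries(companies, hits.update, function allSaved(err) {
    if (err) {
        console.warn('update failed:', err);
        return;
    }
    console.log('saved:', hits.saved.join(', '));
    console.log('max updates in flight:', hits.stats.maxInFlight);
});
```

A plain for loop over the same array would fire all three updates before any of them finished; with the series helper there is never more than one pending call.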
