Extremely slow Lithium query, fast in MongoDB

So, I've been trying out the PHP framework Lithium, and it seems like a really good framework, but I have a slight problem. A query I run on a collection with only 6k+ documents is amazingly slow from PHP, but blazingly fast when I run it from the terminal.

One document in the collection may look like this:

{
    "_id" : ObjectId("504c9a3b6070d8b7ea61938e"),
    "startDate" : "Jan 2011",
    "episodes" : [
        {
            "title" : "Series 1, Episode 1",
            "airdate" : ISODate("2011-01-20T00:00:00Z"),
            "epnum" : "1",
            "prodnum" : null,
            "seasonnum" : "01",
            "link" : "http://www.tvrage.com/10_OClock_Live/episodes/1065007783"
        },
        {and maybe 20 more},
    ],
    "runTime" : "60 min",
    "endDate" : "Apr 2012",
    "network" : "Channel 4",
    "numberOfEpisodes" : "25 eps",
    "title" : "10 O'Clock Live",
    "directory" : "10OClockLive",
    "country" : "UK",
    "tvrage" : "27363"
}

I want to get all episodes that exist in the current month. So in the terminal (here with fake values and a range of more than a month) I use the following query:

db.series.find({'episodes.airdate': {$gt: ISODate('2012-09-07 00:00:00'), $lt: ISODate('2012-11-01')}})

And wham, it just goes very fast. Even if I run explain() on the query, it tells me it's fast:

{
    "cursor" : "BtreeCursor episodes.airdate_1",
    "isMultiKey" : true,
    "n" : 382,
    "nscannedObjects" : 1620,
    "nscanned" : 1620,
    "nscannedObjectsAllPlans" : 1620,
    "nscannedAllPlans" : 1620,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    **"millis" : 181**,
    "indexBounds" : {
        "episodes.airdate" : [
            [
                ISODate("2012-09-07T00:00:00Z"),
                ISODate("292278995-01--2147483647T07:12:56.808Z")
            ]
        ]
    },
    "server" : "example:27017"
}

But when I use the query inside PHP and Lithium, man, it takes ages:

$series = Series::find('all', array(
    'fields' => array('title', 'episodes.title', 'episodes.airdate'),
    'conditions' => array('episodes.airdate' => array(
        '$gt' => new MongoDate(strtotime(date('Y-m-01'))),
        '$lt' => new MongoDate(strtotime(date('Y-m-t')))
    ))
));

And if I even try to loop through it, it gets even worse, going well past the 30-second execution time limit. On top of that, I think I have a memory leak, since I had to add ini_set('memory_limit', '-1'); to avoid hitting the maximum memory usage error.

Could anyone provide me with an answer as to why this is happening? Is there any way to improve the speed of the query? I have no idea why it is so slow, and I would be very glad if anyone could point me in the right direction.

The issue is that Lithium boxes all the data in objects, which for large queries can be very memory-intensive, hence slow. If you don't need any ActiveRecord features for that particular query, there's an option you can pass to find() which gets passed to MongoDb::read() (so check the docs for MongoDb::read()) that allows you to get back either a raw array, or the actual database cursor, which you can iterate over manually.
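
As an illustration of the "raw data" route, here is a minimal sketch that skips the model layer entirely for this one query and hits the collection through the legacy PHP Mongo driver (the same driver Lithium sits on top of). The connection settings and the database name 'tv' are assumptions; adjust them to your own setup:

$mongo = new MongoClient(); // on older driver versions this class is called Mongo
$collection = $mongo->selectDB('tv')->series;

$conditions = array('episodes.airdate' => array(
    '$gt' => new MongoDate(strtotime(date('Y-m-01'))),
    '$lt' => new MongoDate(strtotime(date('Y-m-t')))
));
$fields = array('title' => true, 'episodes.title' => true, 'episodes.airdate' => true);

// find() returns a MongoCursor; each document comes back as a plain array,
// so there is no per-document object boxing.
$cursor = $collection->find($conditions, $fields);
foreach ($cursor as $doc) {
    // ... work with $doc as a plain array ...
}

If this runs as fast as the shell query, the time in your original code is going into hydrating Document objects rather than into MongoDB itself.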

The other option is to wait till I implement streaming iteration, which will solve the memory problem. :-)

I'm not sure why this is slow for you. I have a gist here with a class that will log insert, read, and update Mongo commands issued from Lithium. You could probably add some kind of timer to that to measure the length of each query. Then you would at least know whether the problem is waiting on Mongo or on other parts of the code.
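
Even without the logging class, a rough sketch with microtime() will tell you where the time goes, since the query itself and the object hydration happen at different points:

$start = microtime(true);
$series = Series::find('all', array(
    'fields' => array('title', 'episodes.title', 'episodes.airdate'),
    'conditions' => array('episodes.airdate' => array(
        '$gt' => new MongoDate(strtotime(date('Y-m-01'))),
        '$lt' => new MongoDate(strtotime(date('Y-m-t')))
    ))
));
$afterFind = microtime(true);

foreach ($series as $doc) {
    // touch each document so the cursor is actually consumed
}
$afterLoop = microtime(true);

// A small first number and a huge second one means the time is spent
// hydrating objects while iterating, not running the query in MongoDB.
printf("find: %.3fs, iterate: %.3fs\n", $afterFind - $start, $afterLoop - $afterFind);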

Here is some code for iterating over a DocumentSet while discarding each document retrieved from the MongoCursor as you loop.

$docs = SomeModel::all();
while ($docs->valid()) {
    $key = $docs->key();
    $doc = $docs->current();

    // Drop the current document from the set so it can be garbage-collected,
    // then rewind so the iterator points at the next remaining document.
    unset($docs[$key]);
    $docs->rewind();

    if (!$docs->valid()) {
        // Nothing left in the local cache; pull the next document from the cursor.
        $docs->next();
    }

    // ... do stuff with $doc here ...
}

I just solved an issue where a page was taking more than 65 seconds to load. It turned out that the user record for this particular user had an array with 152 items, and each array item was very big, so this account was probably bumping into MongoDB's document size limit. When I deleted the large array from the user account, the page suddenly loaded in 4.5 seconds.

The thing is, the content on the page being loaded was unrelated to this user record, so we had been tuning the queries for that content to try to speed it up. Then we found out the bug was completely unrelated to all of that; it was due to this other issue.

So make sure your records don't get too big.
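
If you suspect something similar, one way to spot an oversized document is to measure its BSON size. The sketch below assumes the legacy Mongo extension's bson_encode() helper is available, and the database/collection names are placeholders:

$mongo = new MongoClient();
$users = $mongo->selectDB('tv')->users; // placeholder database and collection names

// Grab one suspect document and check how big its BSON representation is.
$user = $users->findOne();
echo strlen(bson_encode($user)) . " bytes\n"; // MongoDB caps a single document at 16 MB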
