Find documents based on Year in MongoDB

There is a model called Movie that holds information about a movie.

Movie model

var mongoose = require('mongoose');
var movieSchema = new mongoose.Schema({
    m_tmdb_id: {
        type: Number,
        unique: true,
        index: true
    },
    m_backdrop_path: {
        type: String,
    },
    m_budget: {
        type: Number,
    },
    m_homepage: {
        type: String
    },
    m_imdb_id: {
        type: String,
    },
    m_original_language: {
        type: String
    },
    m_original_title: {
        type: String
    },
    m_poster_path: {
        type: String
    },
    m_poster_key: {
        type: String
    },
    m_release_date: {
        type: Date
    },
    m_revenue: {
        type: Number
    },
    m_runtime: {
        type: Number
    },
    m_title: {
        type: String
    },
    m_genres: {
        type: Array
    },
    created_at: {
        type: Date
    },
    updated_at: {
        type: Date,
        default: Date.now
    }
});
var MovieModel = mongoose.model('Movie', movieSchema);
module.exports = {
    movie: MovieModel
}

I need to select 10 items per query (pagination) from the Movie collection under different conditions. I have added three conditions in my API, based on genre name, release date, and language.

JS code

router.post('/movies', function(req, res, next) {
    var perPage = parseInt(req.query.limit);
    var page = req.query.page;
    var datefrom = new Date();
    var dateto = new Date();
    var generNames = req.body.generNames;
    dateto.setMonth(dateto.getMonth() - 2);
    var queryOptions = {
        $and: [{
            'm_release_date': {
                $lte: datefrom,
                $gte: dateto

            }
        }, {
            "m_genres.name": {
                $in: generNames
            }
        }, {
            'm_original_language': 'en'
        }, ]
    };
    Movie
        .find(queryOptions)
        .select('_id m_tmdb_id m_poster_path m_original_title')
        .sort('-m_release_date')
        .limit(perPage)
        .skip(perPage * page)
        .exec(function(err, movies) {
            if (movies) {
                return res.status(200).json(movies);
            }
        }).catch(function(error) {
            return res.status(500).json(error);
        });
});

I need to add one more condition: select items from the Movie collection whose release date [m_release_date] falls in one of a set of years [e.g. 2003, 2004, 2010]. How can I do this?

Example:

Movie collection

[   
    {
        "_id": "59420dff3d729440f200bccc",
        "m_tmdb_id": 453651,
        "m_original_title": "PIETRO",
        "m_poster_path": "/3sTFUZorLGOU06A7P3XxjLVKKGD.jpg",
        "m_release_date": "2017-07-14T00:00:00.000Z",
        "m_runtime": 8,
        "m_genres": [{
            "id": 18,
            "name": "Drama"
        }]
    },
    {
        "_id": "594602610772b119e788edab",
        "m_tmdb_id": 425136,
        "m_original_title": "Bad Dads",
        "m_poster_path": null,
        "m_release_date": "2017-07-14T00:00:00.000Z",
        "m_runtime": 0,
        "m_credits_cast": [],
        "m_genres": [{
            "id": 35,
            "name": "Comedy"
        }]
    },
    {
        "_id": "59587747d282843883df755e",
        "m_tmdb_id": 364733,
        "m_original_title": "Blind",
        "m_poster_path": "/cXyObe5aB63ueOndEXxXabgAvIi.jpg",
        "m_release_date": "2017-07-14T00:00:00.000Z",
        "m_runtime": 105,
        "m_genres": [{
            "id": 18,
            "name": "Drama"
        }]
    },
    {
        "_id": "595d93f9c69ab66c4f48254f",
        "m_tmdb_id": 308149,
        "m_original_title": "The Beautiful Ones",
        "m_poster_path": "/kjy1obH5Oy1IsjTViYVJDQufeZP.jpg",
        "m_release_date": "2017-07-14T00:00:00.000Z",
        "m_runtime": 94,

        "m_genres": [{
            "id": 18,
            "name": "Drama"
        }]
    },
    {
        "_id": "59420de63d729440f200bcc7",
        "m_tmdb_id": 460006,
        "m_original_title": "Черная вода",
        "m_poster_path": "/kpiLwx8MGGWgZMMHUnvydZkya0H.jpg",
        "m_release_date": "2017-07-13T00:00:00.000Z",
        "m_runtime": 0,

        "m_genres": []
    },
    {
        "_id": "594602390772b119e788eda3",
        "m_tmdb_id": 281338,
        "m_original_title": "War for the Planet of the Apes",
        "m_poster_path": "/y52mjaCLoJJzxfcDDlksKDngiDx.jpg",
        "m_release_date": "2017-07-13T00:00:00.000Z",
        "m_runtime": 142,
        "m_genres": [{
                "id": 28,
                "name": "Action"
            }

        ]
    }
]

API request

[Screenshot of the API request; image not available]

Fix Your Data for the Most Efficiency

Honestly, the most performant way to do this is to create a new field in your data for m_release_year. Then it becomes a simple matter of supplying an $in condition to the query in place of the date range, and this can of course use an index.

With such a field in place, the code to initiate the query becomes:

// Just to simulate the request
const req = {
  body: {
    "generNames": ["Action"],
    "selectedYear": ["2003,2004,2005,2017"]
  }
}

// Your selectedYear input looks wrong. So correcting from a single string
// to an actual array of integers, sorted numerically
function fixYearSelection(input) {
  return [].concat.apply([], input.map(e => e.split(",")))
    .map(e => parseInt(e, 10))
    .sort((a, b) => a - b);
}

// Outputs like this - [ 2003, 2004, 2005, 2017 ]
let yearSelection = fixYearSelection(req.body.selectedYear);

Movie.find({
   "m_release_year": { "$in": yearSelection },
   "m_genres.name": { "$in": req.body.generNames },
   "m_original_language": "en"
})
.select('_id m_tmdb_id m_poster_path m_original_title')
.sort('-m_release_date')
.limit(perPage)
.skip(perPage * page)
.exec(function(err, movies) {
  // handle err or send movies in the response here
});

Placing the new field in the data is a simple matter of running this in the mongo shell:

let ops = [];
db.movies.find({ "m_release_year": { "$exists": false } }).forEach( doc => {
  ops.push({
    "updateOne": {
      "filter": { "_id": doc._id },
      "update": { "$set": { "m_release_year": doc.m_release_date.getUTCFullYear() } }
    }
  });

  if ( ops.length >= 1000 ) {
    db.movies.bulkWrite(ops);
    ops = [];
  }
});

if ( ops.length > 0 ) {
  db.movies.bulkWrite(ops);
  ops = [];
}

This iterates over all items in the collection, "extracts" the year information, and writes it to the new field. It would be wise to then create an index that matches the fields used in the query selection.
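The shell script only back-fills existing documents, so anything written afterwards needs the same field. A minimal sketch of how the year could be derived at write time, assuming m_release_date is a Date or an ISO date string; withReleaseYear is a hypothetical helper name, not part of Mongoose or the original code:

```javascript
// Hypothetical helper: returns a copy of a movie document with m_release_year
// derived from m_release_date in UTC. If the date is missing or invalid,
// the copy is returned without the extra field.
function withReleaseYear(doc) {
  const date = new Date(doc.m_release_date);
  if (doc.m_release_date == null || Number.isNaN(date.getTime())) {
    return Object.assign({}, doc);
  }
  return Object.assign({}, doc, { m_release_year: date.getUTCFullYear() });
}

const movie = withReleaseYear({
  m_original_title: "Blind",
  m_release_date: "2017-07-14T00:00:00.000Z"
});
console.log(movie.m_release_year); // 2017
```

Using getUTCFullYear() here matches the back-fill script, so both paths agree on the year for dates near midnight on January 1.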

Forcing Calculation

Without that field you are basically "forcing a calculation", and no database can do that efficiently. The two methods in MongoDB are $where and $redact. The latter should always be preferred, since $redact at least uses natively coded operators for the comparison, as opposed to the JavaScript evaluation of $where, which runs much slower.

// Just to simulate the request
const req = {
  body: {
    "generNames": ["Action"],
    "selectedYear": ["2003,2004,2005,2017"]
  }
}

// Your selectedYear input looks wrong. So correcting from a single string
// to an actual array of integers, sorted numerically
function fixYearSelection(input) {
  return [].concat.apply([], input.map(e => e.split(",")))
    .map(e => parseInt(e, 10))
    .sort((a, b) => a - b);
}

// Outputs like this - [ 2003, 2004, 2005, 2017 ]
let yearSelection = fixYearSelection(req.body.selectedYear);

/* 
 * The year is not stored, so we "guesstimate" a reasonable range to at
 * least give some query condition on the date and not search everything
 */

// Jan 1 (UTC) of the first selected year up to, but excluding,
// Jan 1 (UTC) of the year after the last selected year
var startDate = new Date(Date.UTC(yearSelection[0], 0, 1)),
    endDate = new Date(Date.UTC(yearSelection.slice(-1)[0] + 1, 0, 1));

// Helper to switch our $redact "if" based on supported MongoDB
const version = "3.4";
function makeIfCondition() {
  return ( version === "3.4" )
    ? { "$in": [ { "$year": "$m_release_date" }, yearSelection ] }
    : { "$or": yearSelection.map(y =>
        ({ "$eq": [{ "$year": "$m_release_date" }, y] })
      ) };
}

Then either using $redact:

Movie.aggregate(
  [
    { "$match": {
      "m_release_date": {
        "$gte": startDate, "$lt": endDate
      },
      "m_genres.name": { "$in": req.body.generNames },
      "m_original_language": "en"
    }},
    { "$redact": {
      "$cond": {
        "if": makeIfCondition(),
        "then": "$$KEEP",
        "else": "$$PRUNE"
      }
    }},
    { "$sort": { "m_release_date": -1 } },
    { "$project": {
      "m_tmdb_id": 1,
      "m_poster_path": 1,
      "m_original_title": 1
    }},
    { "$skip": perPage * page },
    { "$limit": perPage }
  ],
  (err,movies) => {

  }
)

Or via $where:

Movie.find({
   "m_release_date": {
     "$gte": startDate, "$lt": endDate
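   },
   "m_genres.name": { "$in": req.body.generNames },
   "m_original_language": "en",
   // $where is evaluated on the server, which cannot see the client-side
   // yearSelection variable, so the years are embedded in the expression
   "$where": "[" + yearSelection.join(",") + "].indexOf(this.m_release_date.getUTCFullYear()) !== -1"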
})
.select('_id m_tmdb_id m_poster_path m_original_title')
.sort('-m_release_date')
.limit(perPage)
.skip(perPage * page)
.exec(function(err, movies) {
  // handle err or send movies in the response here
});

The basic logic is to extract the year from the m_release_date field with $year or .getUTCFullYear() and compare it against the yearSelection list, so that only matching documents are returned.

With $redact, the comparison is most effectively done via $in on recent releases (3.4 and upwards). On older releases, use $or instead, where we .map() the years onto an array of $eq conditions rather than passing the array directly as an argument.


Conclusion

The general recommendation here is to include the actual data within your collection if you intend to query on it regularly. With actual values in place, you can put an index on the field, and regular query operators can use those values as well as take advantage of the index.

Without the "year" values in the collection, the "calculation" has to be applied to every candidate document in order to determine which ones match. So it's not as efficient.

Even in this example, we try to "gain back" some efficiency by at least supplying the "possible range" of dates based on the given entries, presumed sorted from smallest to largest. There are of course "unused years" within that range, but it is better than providing nothing and selecting on the calculation alone.
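To make the "possible range" and its "unused years" concrete, here is a small sketch that computes both from a selection given in any order. yearBounds is a hypothetical helper name introduced only for illustration, and it assumes the input is an array of integers:

```javascript
// Hypothetical helper: given selected years in any order, return the UTC date
// range that covers them and the years inside that range that were not selected.
function yearBounds(years) {
  if (years.length === 0) {
    throw new Error("at least one year is required");
  }
  const sorted = Array.from(new Set(years)).sort((a, b) => a - b);
  const first = sorted[0];
  const last = sorted[sorted.length - 1];
  const unused = [];
  for (let y = first; y <= last; y++) {
    if (!sorted.includes(y)) unused.push(y);
  }
  return {
    startDate: new Date(Date.UTC(first, 0, 1)),  // inclusive lower bound
    endDate: new Date(Date.UTC(last + 1, 0, 1)), // exclusive upper bound
    unused
  };
}

const bounds = yearBounds([2017, 2003, 2005, 2004]);
console.log(bounds.startDate.toISOString()); // 2003-01-01T00:00:00.000Z
console.log(bounds.endDate.toISOString());   // 2018-01-01T00:00:00.000Z
console.log(bounds.unused.length);           // 11 (2006 through 2016)
```

The wider the gap between the selected years, the more documents the range lets through to the per-document year check, which is the cost the stored m_release_year field avoids.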

I can suggest using the $where operator.

The main idea here is to construct a function that fits the number of your arguments and their values. Not a precise solution, but a close one:

const year1 = 2005;
const year2 = 2007;    
const yearFinder = new Function('',`return new Date(this.m_release_date).getFullYear() === ${year1} || new Date(this.m_release_date).getFullYear() === ${year2}`);

Movie
    .find(queryOptions)
    .$where(yearFinder)
    .select('_id m_tmdb_id m_poster_path m_original_title')
    .sort('-m_release_date')
    .limit(perPage)
    .skip(perPage * page)
    .exec(function(err, movies) {
        if (movies) {
            return res.status(200).json(movies);
        }
    }).catch(function(error) {
        return res.status(500).json(error);
    });
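The snippet above hard-codes two years. A sketch of how the same idea might be generalized to any number of years follows. buildYearWhere is a hypothetical helper name, and the sketch uses getUTCFullYear() rather than getFullYear() on the assumption that release dates are stored at midnight UTC, as in the sample data:

```javascript
// Hypothetical helper: builds a $where expression string for any list of years.
// Only integers are accepted, so nothing else can end up inside the
// JavaScript that is evaluated on the server.
function buildYearWhere(years) {
  if (!Array.isArray(years) || years.length === 0 || !years.every(Number.isInteger)) {
    throw new TypeError("years must be a non-empty array of integers");
  }
  return "[" + years.join(",") + "].indexOf(new Date(this.m_release_date).getUTCFullYear()) !== -1";
}

const yearFinder = buildYearWhere([2005, 2007]);
console.log(yearFinder);
// [2005,2007].indexOf(new Date(this.m_release_date).getUTCFullYear()) !== -1
```

The resulting string could be passed to .$where(yearFinder) in place of the Function object, since Mongoose accepts either a string or a function there. The integer check matters because the years originate from the request body.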
