
How can I store and search through large documents with MongoDB?

Well. Here's the DB schema/architecture problem.

Currently in our project we use MongoDB. We have one DB with one collection. Overall there are almost 4 billion documents in that collection (the count is constant). Each document has a unique specific ID, and there is a lot of different information related to this ID (that's why MongoDB was chosen - the data is totally different, so schemaless is perfect).

{
    "_id": ObjectID("5c619e81aeeb3aa0163acf02"),
    "our_id": 1552322211,
    "field_1": "Here is some information",
    "field_a": 133,
    "field_с": 561232,
    "field_b": {
            "field_0": 1,
            "field_z": [45, 11, 36]
    }
}

The purpose of that collection is to store a lot of data that is easy to update (some data is updated every day, some once a month) and to search over different fields to retrieve the ID. We also store the "history" of each field (and we should be able to search over the history as well). So when these over-time updates were turned on, we hit the MongoDB 16 MB maximum document size limit.
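For context, here is a minimal sketch of the kind of per-field history update described above, assuming the history is kept as an array per field inside the same document (the collection name, field names, and history layout are illustrative, not the actual schema):

db.collection.updateOne(
    { "our_id": 1552322211 },
    {
        // overwrite the current value of the field
        "$set": { "field_a": 134 },
        // append the previous value to that field's history array,
        // which is what makes documents grow toward the 16 MB limit
        "$push": { "history.field_a": { "value": 133, "updated_at": new Date() } }
    }
)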

We've tried several workarounds (like splitting the document), but all of them involve either a $group or a $lookup stage in the aggregation (grouping back by id, see the example documents below and the pipeline sketches after them), and neither of those stages can use indexes, which makes searching over several fields EXTREMELY slow.

{
    "_id": ObjectID("5c619e81aeeb3aa0163acd12"),
    "our_id": 1552322211,
    "field_1": "Here is some information",
    "field_a": 133
}


{
    "_id": ObjectID("5c619e81aeeb3aa0163acd11"),
    "our_id": 1552322211,
    "field_с": 561232,
    "field_b": {
            "field_0": 1,
            "field_z": [45, 11, 36]
    }
}
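For illustration, a rough sketch of what such a regrouping pipeline could look like with $group over the partial documents above (assuming $mergeObjects to fold them back together; the real pipeline may differ):

db.collection.aggregate([
    {
        "$group": {
            // collect all partial documents that share the same our_id
            "_id": "$our_id",
            "merged": { "$mergeObjects": "$$ROOT" }
        }
    },
    // promote the merged object back to the document root
    { "$replaceRoot": { "newRoot": "$merged" } }
])

Since $group has to consume the whole collection before producing merged documents, any filter on the merged fields can only run afterwards.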

Also, we can't use a $match stage before those, because the search can include logical operators (like field_1 = 'a' && field_c != 320 , where field_1 is from one document and field_c is from another, so the search must be done after the documents have been grouped/joined together), and the logical expression can be VERY complex.
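To make that concrete, a hedged sketch of the $lookup variant, with the compound predicate applied only after the self-join (the collection name and predicate values are placeholders):

db.collection.aggregate([
    {
        "$lookup": {
            "from": "collection",            // self-join to pull in the other partial documents
            "localField": "our_id",
            "foreignField": "our_id",
            "as": "parts"
        }
    },
    // the predicate spans fields stored in different partial documents,
    // so it can only be evaluated here, after the join, where no index applies
    { "$match": { "field_1": "a", "parts.field_c": { "$ne": 320 } } }
])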

So are there any tricky workarounds? If not, what other DBs would you suggest moving to?

Kind regards.

OK, so after spending some time testing different approaches, I ended up going with Elasticsearch, since there was no way to perform the required searches through MongoDB in acceptable time.
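For reference, the kind of predicate described in the question maps naturally onto an Elasticsearch bool query; a minimal sketch, assuming the documents are indexed with field_1 and field_c as keyword/numeric fields (index mapping and field names are assumptions, not taken from the answer):

{
    "query": {
        "bool": {
            "must":     [ { "term": { "field_1": "a" } } ],
            "must_not": [ { "term": { "field_c": 320 } } ]
        }
    }
}

Each clause is resolved against the inverted index independently, which is why arbitrary boolean combinations of fields stay fast.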
