
Handling Incredibly large JSON Document in CouchDB

I'm new to NoSQL databases and I'm having a hard time figuring out how to handle a very large JSON document that could amount to over 20MB on my local drive. This structure will definitely grow over time, and I worry about query speed and about having to search deep through the nested JSON object that comes back just to get a single string out. My JSON is deeply nested, for example:

{
"exams": {
    "exam1": {
        "year": {
            "math": {
                "questions": [
                    {
                        "question_text": "first question",
                        "options": [
                            "option1",
                            "option2",
                            "option3",
                            "option4",
                            "option5"
                        ],
                        "answer": 1,
                        "explaination": "explain the answer"
                    },
                    {
                        "question_text": "second question",
                        "options": [
                            "option1",
                            "option2",
                            "option3",
                            "option4",
                            "option5"
                        ],
                        "answer": 1,
                        "explaination": "explain the answer"
                    },
                    {
                        "question_text": "third question",
                        "options": [
                            "option1",
                            "option2",
                            "option3",
                            "option4",
                            "option5"
                        ],
                        "answer": 1,
                        "explaination": "explain the answer"
                    }
                ]
            },
            "english": {same structure as above}
        },
        "1961": {}
    },
    "exam2": {},
    "exam3": {},
    "exam4": {}
}
}

In the main application, question objects are created and appended based on the type of exam, year, and subject, making the JSON document huge over time. How can I re-model this so as to avoid slow queries in the future?

Dominic is right. You need to start dividing the document up and storing the pieces as separate documents.

The next question is how to recompose the document after it's been split.

Considering you're using Couch, I would recommend doing this at the application layer. A good starting point would be to create exam documents and store them in their own database. Then have a document (exams) in another database that has pointers to the exam documents.
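
As a rough illustration of that split, here is a minimal Python sketch that takes the big nested document from the question and produces one document per exam plus a small exams document holding pointers to them. The `_id` scheme (`exam:<name>`), the `type` values, and the `exam_ids` field are assumptions made up for this example, not anything CouchDB requires; the function only builds the documents and leaves saving them to your client code.

```python
import json


def split_exams(big_doc):
    """Split one large nested document into per-exam documents plus an index.

    Returns (exam_docs, index_doc). The "_id" naming scheme and the
    "exam_ids" field are illustrative assumptions, not CouchDB requirements.
    """
    exam_docs = []
    for exam_name, years in big_doc["exams"].items():
        exam_docs.append({
            "_id": f"exam:{exam_name}",   # hypothetical id scheme
            "type": "exam",
            "name": exam_name,
            "years": years,               # the nested year/subject/questions data
        })

    # The index document only stores pointers (ids), so it stays small.
    index_doc = {
        "_id": "exams",
        "type": "exam_index",
        "exam_ids": [doc["_id"] for doc in exam_docs],
    }
    return exam_docs, index_doc


if __name__ == "__main__":
    big_doc = {
        "exams": {
            "exam1": {"1961": {"math": {"questions": []}}},
            "exam2": {},
        }
    }
    exam_docs, index_doc = split_exams(big_doc)
    print(json.dumps(index_doc, indent=2))
```

If individual exams still grow too large, the same idea could be applied one level further down, for example one document per exam, year, and subject.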

You can retrieve the exams document and get exams one by one as needed. This could be especially useful with paging since most people will only want to see the most recent exams.
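
The paging idea could look something like the sketch below. It assumes the exams document keeps its pointers in a list field called `exam_ids`, ordered oldest to newest; both the field name and the ordering are assumptions for illustration. `fetch_doc` is a hypothetical callable you would supply, for instance a wrapper around an HTTP `GET /{db}/{docid}` request to CouchDB; here a plain dictionary stands in for the database so the example runs on its own.

```python
def page_of_ids(index_doc, page, page_size):
    """Return the exam ids for a zero-based page, newest exams first.

    Assumes index_doc["exam_ids"] is ordered oldest to newest.
    """
    newest_first = list(reversed(index_doc["exam_ids"]))
    start = page * page_size
    return newest_first[start:start + page_size]


def load_exams(fetch_doc, exam_ids):
    """Fetch each exam document by id using the supplied fetch_doc callable."""
    return [fetch_doc(exam_id) for exam_id in exam_ids]


if __name__ == "__main__":
    # In-memory stand-in for the exam database (hypothetical data).
    store = {
        "exam:exam1": {"_id": "exam:exam1", "name": "exam1"},
        "exam:exam2": {"_id": "exam:exam2", "name": "exam2"},
        "exam:exam3": {"_id": "exam:exam3", "name": "exam3"},
    }
    index_doc = {"_id": "exams", "exam_ids": list(store)}

    first_page = page_of_ids(index_doc, page=0, page_size=2)
    for exam in load_exams(store.__getitem__, first_page):
        print(exam["name"])
```

Only the small exams document and the exams on the requested page are read, so the cost of a page view no longer depends on the total size of the data.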
