MongoDB Document Size Limitations

I have a collection of novels that looks as follows:

[Image not preserved: "enter image description here"]

The Words array contains every word of the novel along with additional linguistic information about each word. When I try to add longer texts (100k+ words), I get the error:

RangeError: attempt to write outside buffer bounds

As far as I can tell, this means that the BSON document is larger than 16 MB and therefore above the limit.

I'm assuming this is a relatively common situation, and I'm now considering how to work around the limit. For example, I could split each novel into chunks of 10k words. Or does this mean each text should get its own collection (i.e. one new collection per uploaded text)? That makes the least sense to me.

Is there a standard/suggested approach to designing a MongoDB database in this case?

Also, is it possible to check the size of the BSON before inserting a document in JS/Node?

Do you absolutely need to store the contents of the books in MongoDB? If you're simply serving the contents to users or processing them in bulk, I suggest storing them on disk or in an AWS S3 bucket or similar.

If you need the book contents to live in the database, try MongoDB's GridFS:

GridFS is a specification for storing and retrieving files that exceed the BSON-document size limit of 16 MB.

Instead of storing a file in a single document, GridFS divides the file into parts, or chunks, and stores each chunk as a separate document.

When you query GridFS for a file, the driver will reassemble the chunks as needed. You can perform range queries on files stored through GridFS. You can also access information from arbitrary sections of files, such as to “skip” to the middle of a video or audio file.
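
For structured data like the Words array, the same one-chunk-per-document idea can also be applied by hand, which keeps individual words queryable. Below is a minimal sketch, not GridFS itself. The helper name `chunkWords`, the field names `novelId`, `n` and `words`, and the placeholder data are all assumptions made for illustration; the 10,000-word chunk size is the figure from the question.

```javascript
// Split a novel's word list into chunk documents, one per `chunkSize` words.
// Field names (novelId, n, words) are illustrative, not a MongoDB convention.
function chunkWords(novelId, words, chunkSize = 10000) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  const chunks = [];
  for (let start = 0; start < words.length; start += chunkSize) {
    chunks.push({
      novelId,                 // links the chunk back to its novel
      n: chunks.length,        // zero-based chunk order, used for reassembly
      words: words.slice(start, start + chunkSize),
    });
  }
  return chunks;
}

// Example with 25,000 placeholder words (hypothetical data).
const sample = Array.from({ length: 25000 }, (_, i) => ({ text: `w${i}` }));
const docs = chunkWords('novel-1', sample);
console.log(docs.length, docs[2].words.length);
```

Each chunk can then be inserted into its own collection (for example with `insertMany`), and a novel is read back by querying on `novelId` and sorting by `n`.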

Read more here: https://docs.mongodb.com/manual/core/gridfs/
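
As for checking the size before inserting: the `bson` npm package exports `calculateObjectSize`, which returns the serialized size of an object in bytes. Here is a minimal sketch, assuming a hypothetical `fitsInDocument` helper that takes the size function as a parameter so it can be tried without installing anything. The stand-in `roughSize` is only an estimate based on JSON length, not the real BSON size.

```javascript
const MAX_BSON_BYTES = 16 * 1024 * 1024; // MongoDB's 16 MB document limit

// Returns true if `doc` is within the limit according to `sizeOf`.
// In a real project, pass calculateObjectSize from the `bson` package:
//   const { calculateObjectSize } = require('bson');
//   fitsInDocument(doc, calculateObjectSize);
function fitsInDocument(doc, sizeOf, limit = MAX_BSON_BYTES) {
  return sizeOf(doc) <= limit;
}

// Stand-in size function (assumption: rough estimate only, NOT exact BSON size).
const roughSize = (obj) => Buffer.byteLength(JSON.stringify(obj), 'utf8');

// Hypothetical small document, well under the limit.
const small = { title: 'Short story', words: ['once', 'upon', 'a', 'time'] };
console.log(fitsInDocument(small, roughSize));
```

If the check fails, the document can be split into smaller pieces before it is inserted.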
