
MongoDB document insertion order

I have a MongoDB collection for tracking user audit data, so it will eventually hold many millions of documents.

Audits are tracked by loginID (user) and the user's activities on items. For example: userA modified 'item#13' on date/time.

Case: I need to query with filters based on user and item. That's simple. This returns many thousands of documents per item. I need to list them by latest date/time (descending order).

Problem: How can I insert new documents at the top of the stack (like a capped collection)? Or is it possible to find records from the bottom of the stack (reverse order)? I do NOT like the idea of finding and then sorting, because when dealing with thousands or millions of documents, sorting is a bottleneck.

Any solutions?

Stack: MongoDB, Node.js, Mongoose.

Thanks!

the top of the stack?

You're implying there is a stack, but there isn't - there's a tree, or more precisely, a B-Tree.

I do NOT like the idea of find and sorting

So you want to sort without sorting? That doesn't seem to make much sense. Stacks are essentially in-memory data structures; they don't work well on disk because they require huge contiguous blocks (in fact, huge stacks don't even work well in memory, and growing a stack requires copying the entire data set, which would hardly work).

sorting is a bottleneck

It shouldn't be, at least not for data that is stored closely together (data locality). Sorting is an O(m log n) operation, and since the _id field already encodes a timestamp, you already have a field that you can sort on. m is relatively small, so I don't see the problem here. Have you even tried that? With MongoDB 3.0, index intersection has become more powerful, so you might not even need _id in the compound index.
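To illustrate the point about _id: the first 4 bytes of a default ObjectId are a Unix timestamp in seconds, so ordering by _id descending is, to one-second granularity, ordering by creation time descending. The sketch below is plain Ruby with no driver involved; the helper names and the second and third sample ids are made up for the example.

```ruby
# Decode the creation time embedded in a MongoDB ObjectId.
# The first 4 bytes (8 hex characters) are seconds since the Unix epoch.
def object_id_timestamp(hex_id)
  raise ArgumentError, 'expected 24 hex characters' unless hex_id.match?(/\A\h{24}\z/)

  hex_id[0, 8].to_i(16)
end

# Order lowercase hex ids newest first. Because the timestamp is the
# leading part of the id, a plain descending sort is enough.
def newest_first(hex_ids)
  hex_ids.sort.reverse
end

# Hypothetical sample ids, listed out of order.
ids = %w[
  507f1f77bcf86cd799439011
  5f000000aaaaaaaaaaaaaaaa
  4e000000bbbbbbbbbbbbbbbb
]

newest_first(ids).each do |id|
  puts "#{id} created at #{Time.at(object_id_timestamp(id)).utc}"
end
```

This only holds for ids generated the default way; ids created within the same second, or on different clients with skewed clocks, are only approximately ordered.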

On my machine, getting the top items from a large collection, filtered by an index, takes 1ms ("executionTimeMillis" : 1) if the data is in RAM. The network overhead alone will be in the same league, even on localhost. I created the data with a simple network creation tool I built and queried it from the mongo console.
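Why an indexed "latest N" lookup can be this cheap is easier to see with a toy model. The sketch below is not MongoDB code: a sorted Ruby array stands in for a compound index on (loginID, item, timestamp), and the field values and helper name are invented for the example. Because the keys are already ordered, fetching the newest entries is a binary search followed by a short backwards walk, with no sort step.

```ruby
# A sorted array of [login_id, item_id, created_at] keys stands in for a
# compound index; a real B-tree keeps its keys in the same order.
INDEX = [
  ['userA', 'item#13', 100],
  ['userA', 'item#13', 250],
  ['userA', 'item#13', 300],
  ['userA', 'item#14', 120],
  ['userB', 'item#13', 90],
  ['userB', 'item#13', 400]
].sort.freeze

# Return the newest `limit` timestamps for one user/item pair without
# sorting: binary-search to the end of the matching range, walk backwards.
def latest_entries(index, login_id, item_id, limit)
  prefix = [login_id, item_id]
  # First position whose key sorts after every key starting with prefix.
  upper = index.bsearch_index { |key| (key[0, 2] <=> prefix) > 0 } || index.length

  results = []
  pos = upper - 1
  while pos >= 0 && results.length < limit && index[pos][0, 2] == prefix
    results << index[pos][2]
    pos -= 1
  end
  results
end

p latest_entries(INDEX, 'userA', 'item#13', 2) # => [300, 250]
```

The cost is one O(log n) search plus `limit` steps, regardless of how many documents match the filter.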

I have encountered the same problem. My solution is to create an additional collection that maintains the top 10 records. The upside is that queries are fast. The downside is that you need to keep the additional collection updated.

I found this, which inspired me. I implemented my solution with Ruby + Mongoid.

My solution:

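Collection definition:

class TrainingTopRecord
  include Mongoid::Document

  # Array of {'id' => ..., 'return' => ...} hashes, best first
  field :training_records, type: Array

  belongs_to :training

  # One top-records document per training
  index({training_id: 1}, {unique: true, drop_dups: true})
end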

Maintenance process:

# Here t appears to be the training and r the record that was just saved.
# Fetch the top-records document for this training, creating it on first use.
if t.training_top_records == nil
  training_top_records = TrainingTopRecord.create! training_id: t.id
else
  training_top_records = t.training_top_records
end
training_top_records.training_records = [] if training_top_records.training_records == nil
top_10_records = training_top_records.training_records
top_10_records.push({
  'id' => r.id,
  'return' => r.return
})
# Sort by 'return', highest first
top_10_records.sort_by! { |record| -record['return'] }
# Limit training_records' size to 10 (no-op when there are 10 or fewer)
top_10_records.slice!(10..-1)
training_top_records.save
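The "push, sort, trim" step above can also be tried out without a database. The following is a minimal sketch in plain Ruby, assuming records are hashes with a numeric 'return' key as in the snippet above; `push_top_record` is a hypothetical helper name, not part of Mongoid.

```ruby
# Add `entry` to `records` and keep only the `limit` entries with the
# highest 'return' value, highest first. Returns a new array and leaves
# the input untouched.
def push_top_record(records, entry, limit = 10)
  (records + [entry]).max_by(limit) { |record| record['return'] }
end

# Feed in a few made-up records while keeping only the top 3.
top = []
[5, 1, 9, 3].each_with_index do |value, i|
  top = push_top_record(top, { 'id' => i, 'return' => value }, 3)
end

p top.map { |record| record['return'] } # => [9, 5, 3]
```

Keeping this logic in a pure function makes it easy to test separately from the save call; the order of entries with equal 'return' values is not guaranteed.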
