
Structuring MongoDB collections for a tracking system

I need to build a system that will track our users across all of our websites.
Each new user coming to one of our websites will get an ID that is stored in a cookie.
On every activity on the site, we would like to save the relevant data.
For example, when a user registers, we will expose an API for adding the activity to the database. Later, we will build a reporting back end on top of the data.
We haven't decided on the technology yet, but we assume we will go with Node.js + Express + Mongoose.
We believe the first collection (see below) will grow by around 6 million documents a month. The other collections might grow by half of that.

I don't know whether the following data structure will work well in MongoDB.

SessionCollection

  • Id - MongoDB ObjectId, generated; will eventually be used as the cookie ID.
  • Referer - string (up to the length of a full URI including its query string)
  • LandingUrl - string (up to the length of a full URI including its query string)
  • DateTime
  • Params - key/value data parsed from LandingUrl, meant to be stored as a nested JSON tree.
    If the LandingUrl is http://s.com?a=1&b=2&c=3, the params will be:
    params : {a:'1',b:'2',c:'3'}
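The Params field described above could be derived from LandingUrl roughly as follows. This is only a sketch using Node's built-in URL class; parseLandingParams is a hypothetical helper name, and keeping the last value of a repeated key is an assumption, not something the question specifies.

```javascript
// Hypothetical helper (not part of any library): turn the query string of a
// landing URL into a flat key/value object suitable for the Params field.
function parseLandingParams(landingUrl) {
  const url = new URL(landingUrl); // throws a TypeError on an invalid URL
  const params = {};
  for (const [key, value] of url.searchParams) {
    // Values stay strings, matching the example in the question.
    // Assumption: when a key repeats, the last value wins.
    params[key] = value;
  }
  return params;
}

console.log(parseLandingParams('http://s.com?a=1&b=2&c=3'));
```

Values are kept as strings on purpose; converting them to numbers or dates would have to be decided per parameter.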

ActivityCollection

  • Id mongo ObjectId
  • SessionId - "foreign key" to SessionCollection
  • ActivityType - Short free string
  • DateTime
  • ActivityData - free KeyValue data (similar to the explanation above).
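To make the two shapes concrete, here is a minimal sketch of how such documents might be assembled as plain JavaScript objects before being handed to a driver or to Mongoose. The builder names buildSession and buildActivity are hypothetical, and nothing here talks to a database.

```javascript
// Hypothetical builders: plain objects only, no driver or Mongoose involved.
function buildSession(referer, landingUrl, params) {
  return {
    // _id is left out on purpose: an ObjectId is assigned when the document
    // is inserted without one.
    Referer: referer,
    LandingUrl: landingUrl,
    DateTime: new Date(),
    Params: params, // e.g. { a: '1', b: '2', c: '3' }
  };
}

function buildActivity(sessionId, activityType, activityData) {
  return {
    SessionId: sessionId, // the _id of the owning session document
    ActivityType: activityType, // short free string, e.g. 'register'
    DateTime: new Date(),
    ActivityData: activityData, // free key/value data
  };
}

const activity = buildActivity('session-id-placeholder', 'register', { plan: 'free' });
console.log(activity.ActivityType);
```

MongoDB does not enforce the SessionId reference, so the application has to make sure it points at an existing session.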

Both collections must be searchable on all fields, and by all I really mean all.


  1. Is this a good structure for MongoDB?
  2. Do you recognize a bad pattern here?
  3. Do you have suggestions to make it better?
  4. Can a full URL be indexed in MongoDB?

thanks

I will answer #4, since it is an interesting question without an obvious answer.

Can a full URL be indexed in MongoDB?

The answer is: most of the time, but not always.

Explanation: A URL cannot always be indexed, because MongoDB limits the size of an index key to 1024 bytes. If a value exceeds that limit, it is either left out of the index or the operation fails with an error, depending on the version and the case. A full URL may exceed this limit, since almost all browsers support URLs of at least 2000 characters. If such long URLs are possible, one solution is to index a hash of the URL instead of the URL itself.
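A minimal sketch of the hash approach, using Node's built-in crypto module. The helper names and the LandingUrlHash field are hypothetical, SHA-256 is just one reasonable choice of hash, and the length check is approximate because the real limit applies to the whole index entry rather than to the string alone. Later MongoDB releases have relaxed this limit, so check the limits page for the version in use.

```javascript
const crypto = require('crypto');

// Index key limit cited in the answer, in bytes.
const INDEX_KEY_LIMIT_BYTES = 1024;

// Hypothetical helper: fixed-length (64 hex characters) SHA-256 digest of a URL.
function hashUrl(url) {
  return crypto.createHash('sha256').update(url, 'utf8').digest('hex');
}

// Hypothetical helper: true when the UTF-8 encoded URL alone already reaches
// the limit. Approximate, since index entries carry some overhead as well.
function exceedsIndexLimit(url) {
  return Buffer.byteLength(url, 'utf8') >= INDEX_KEY_LIMIT_BYTES;
}

// Hypothetical helper: the fields to store, so that the index can be created
// on LandingUrlHash instead of on LandingUrl.
function withUrlHash(landingUrl) {
  return { LandingUrl: landingUrl, LandingUrlHash: hashUrl(landingUrl) };
}

const longUrl = 'http://s.com?q=' + 'x'.repeat(2000);
console.log(exceedsIndexLimit(longUrl)); // the URL alone is over 1024 bytes
console.log(withUrlHash(longUrl).LandingUrlHash.length);
```

A lookup would then hash the URL being searched for, query on LandingUrlHash, and compare the stored LandingUrl to rule out a collision. Note that this only supports exact matches; prefix or pattern searches cannot use an index on the hash.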

For more information on MongoDB limits and how it handles index entries larger than 1024 bytes (the behavior changed significantly from 2.6 onward), see https://docs.mongodb.org/manual/reference/limits/

For URL lengths, see "What is the maximum length of a URL in different browsers?"
