
Structuring MongoDB Collections for a tracking system

I need to build a system that will track our users across all of our websites.
Each new user coming to our website will get an ID that will be stored in a cookie.
On every activity on the site, we would like to save the relevant data.
For example, when a user registers, we will expose an API for adding the activity to the database. Later, we will build a reporting back end on the data.
We haven't decided on the technology yet, but we assume we will go with Node.js + Express + Mongoose.
We believe that the first collection (see below) will grow by around 6 million rows a month. The other collections might have half of that.

I don't know if the following data structure will work well in MongoDB.

SessionCollection

  • Id - mongo ObjectId, generated; it will eventually be the cookie ID.
  • Referer - string (length of a full query string URI)
  • LandingUrl - string (length of a full query string URI)
  • DateTime
  • Params - key-value data; it is the parsed data from LandingUrl, supposed to be a nested JSON tree.
    If the LandingUrl was http://s.com?a=1&b=2&c=3, the params will be:
    params : {a:'1',b:'2',c:'3'}
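
The Params field described above could be derived from LandingUrl roughly as follows. This is only a minimal sketch using Node's built-in URL class; the function name parseParams is hypothetical, and it produces a flat object (a repeated key keeps its last value) rather than the nested tree the question mentions.

```javascript
// Hypothetical helper: derive the Params object from a landing URL.
function parseParams(landingUrl) {
  const { searchParams } = new URL(landingUrl);
  const params = {};
  for (const [key, value] of searchParams) {
    // Values stay strings; a repeated key keeps its last value.
    params[key] = value;
  }
  return params;
}

console.log(parseParams('http://s.com?a=1&b=2&c=3'));
```

Keeping the values as strings matches the example above; converting them to numbers or dates would be a separate decision that affects how the field can be queried.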

ActivityCollection

  • Id - mongo ObjectId
  • SessionId - "foreign key" to SessionCollection
  • ActivityType - short free-form string
  • DateTime
  • ActivityData - free-form key-value data (similar to the explanation above).
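
To make the relationship between the two collections concrete, here is a rough sketch of what an activity document might look like as a plain JavaScript object. The camelCase field names, the buildActivity helper, and the fakeObjectId stand-in are all assumptions for illustration; a real MongoDB driver generates ObjectIds itself, and MongoDB does not enforce the "foreign key", so the application has to keep SessionId valid.

```javascript
const { randomBytes } = require('node:crypto');

// Stand-in for a MongoDB ObjectId: 12 random bytes as 24 hex characters.
// A real driver would generate the ObjectId itself.
function fakeObjectId() {
  return randomBytes(12).toString('hex');
}

// Hypothetical builder for an ActivityCollection document.
function buildActivity(sessionId, activityType, activityData) {
  return {
    _id: fakeObjectId(),
    sessionId,       // reference to SessionCollection's Id (not enforced by MongoDB)
    activityType,    // short free-form string, e.g. 'register'
    dateTime: new Date(),
    activityData,    // free-form key-value data
  };
}

const sessionId = fakeObjectId();
const activity = buildActivity(sessionId, 'register', { plan: 'free' });
console.log(activity.sessionId === sessionId);
```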

Both collections will be searchable on all fields; when I say all, I mean all.


  1. Is this a good structure for MongoDB?
  2. Do you recognize a bad pattern here?
  3. Do you have suggestions to make it better?
  4. Can a full URL be indexed in MongoDB?

Thanks

I will answer #4, since it is an interesting question without an obvious answer.

Can a full URL be indexed in MongoDB?

The answer is: most of the time, but not always.

Explanation: A URL cannot always be indexed in MongoDB, because MongoDB limits the length of an index key (1024 bytes). If the value is longer than that, it will either not be indexed or cause an error, depending on the version and the situation. A full URL may exceed this limit, since almost all browsers support at least 2000 characters. If such long URLs are possible in your data, one solution is to index a hash of the URL instead of the URL itself.
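
The hash approach could look roughly like this. It is a sketch under assumptions, not a definitive implementation: the field name landingUrlHash, the helper names, and the choice of SHA-256 are all illustrative. The idea is to store a fixed-length digest next to the full URL and index only the digest, so the indexed value stays far below the limit however long the URL is.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: fixed-length digest of a URL.
// A SHA-256 hex digest is always 64 characters, whatever the input length.
function urlHash(url) {
  return createHash('sha256').update(url, 'utf8').digest('hex');
}

// Store the hash next to the full URL; only the hash field would be indexed.
function withUrlHash(session) {
  return { ...session, landingUrlHash: urlHash(session.landingUrl) };
}

const longUrl = 'http://s.com?q=' + 'x'.repeat(3000);
const doc = withUrlHash({ landingUrl: longUrl });
console.log(doc.landingUrlHash.length);

// Lookup idea: query by { landingUrlHash: urlHash(candidate) }, then compare
// landingUrl itself to rule out a (very unlikely) hash collision.
```

The trade-off is that an index on the hash only helps exact-match lookups; prefix or range searches on the URL cannot use it.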

For more information on MongoDB limits and how it handles indexing of values over 1024 bytes (the behavior changed significantly from version 2.6 onward), see https://docs.mongodb.org/manual/reference/limits/

For URL lengths, see "What is the maximum length of a URL in different browsers?"
