Persisting dynamic properties and querying them
I have a requirement to implement a contact database. This contact database is special in that the user should be able to dynamically (at runtime) add properties he/she wants to track about a contact. Some of these properties are strings, others are numbers or dates. Some of the properties have pre-defined values, others are free fields, etc. The user also wants to be able to query such a structure quickly and easily. The database needs to easily handle 500,000 contacts, each having around 10 properties.
This leads to a dynamic property model in which a Contact class holds the dynamic properties.
import java.util.Collection;
import java.util.Map;

class Contact {
    private Map<DynamicProperty, Collection<DynamicValue>> propertiesAndValues;
    // other useful methods
}
The question is how I can store such a structure in "some database" (it does not have to be an RDBMS) so that I can easily express queries such as:
Get all contacts whose name starts with Martin, who are from a Company of size 5000 or less, ordered by the time the contact was inserted into the database, first 100 results only (with pagination), where each of these segments corresponds to a dynamic property.
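To make the intended semantics of that query concrete, here is a minimal in-memory sketch in Java. It is not a storage solution, only an illustration of the filter, sort and paging steps. The class name `ContactQuerySketch`, the property keys `"name"` and `"companySize"`, and the simplification of one value per property with the insertion time kept as a plain field are all assumptions made for the example.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ContactQuerySketch {

    /** Hypothetical simplified contact: one value per dynamic property, keyed by property name. */
    static final class Contact {
        final long id;
        final Instant insertedAt; // assumption: kept as a field rather than a dynamic property
        final Map<String, Object> properties = new HashMap<>();

        Contact(long id, Instant insertedAt) {
            this.id = id;
            this.insertedAt = insertedAt;
        }

        /** Sets a dynamic property and returns this contact for chaining. */
        Contact with(String property, Object value) {
            properties.put(property, value);
            return this;
        }
    }

    /** Name starts with the prefix, company size at most the limit, oldest first, one page. */
    static List<Contact> query(List<Contact> contacts, String namePrefix,
                               int maxCompanySize, int page, int pageSize) {
        return contacts.stream()
                // contacts without a string "name" property are skipped
                .filter(c -> c.properties.get("name") instanceof String
                        && ((String) c.properties.get("name")).startsWith(namePrefix))
                // contacts without an integer "companySize" property are skipped
                .filter(c -> c.properties.get("companySize") instanceof Integer
                        && (Integer) c.properties.get("companySize") <= maxCompanySize)
                .sorted(Comparator.comparing((Contact c) -> c.insertedAt))
                .skip((long) page * pageSize)
                .limit(pageSize)
                .collect(Collectors.toList());
    }
}
```

Calling `ContactQuerySketch.query(contacts, "Martin", 5000, 0, 100)` would return the first page of 100 matches. Whatever store is chosen has to support the same three operations (typed predicates per property, ordering, and offset/limit) efficiently at 500,000 contacts.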
I need:
I was considering an RDBMS, but this leads more or less to the following structure, which is quite hard to query and tends to be slow for this amount of data:
contact(id serial pk,....);
dynamic_property(dp_id serial pk, ...);
--only one of the values is not empty
dynamic_property_value(dpv_id serial pk, dynamic_property_fk int, value_integer int, date_value timestamp, text_value text);
contact_properties(pav_id serial pk, contact_id_fk int, dynamic_property_fk int);
property_and_its_value(pav_id_fk int, dpv_id int);
I am considering the following options:
Which way would you go and why?
Wikipedia has a great entry on Entity-Attribute-Value modeling, which is a data modeling technique for representing entities with arbitrary properties. It's typically used for clinical data, but might apply to your situation as well.
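As a rough illustration of the Entity-Attribute-Value idea, here is a minimal in-memory sketch in Java. Every fact is stored as one (entity, attribute, value) row, so adding a new property needs no schema change. The class `EavSketch` and its methods are hypothetical names for this example, not part of any library.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

public class EavSketch {

    /** One EAV row: which entity, which attribute, what value. */
    static final class Row {
        final long entityId;
        final String attribute;
        final Object value;

        Row(long entityId, String attribute, Object value) {
            this.entityId = entityId;
            this.attribute = attribute;
            this.value = value;
        }
    }

    private final List<Row> rows = new ArrayList<>();

    /** Recording a value for a new attribute only appends a row. */
    void put(long entityId, String attribute, Object value) {
        rows.add(new Row(entityId, attribute, value));
    }

    /** Ids of entities that have at least one value of the attribute passing the test. */
    Set<Long> idsWhere(String attribute, Predicate<Object> test) {
        Set<Long> ids = new LinkedHashSet<>();
        for (Row r : rows) {
            if (r.attribute.equals(attribute) && test.test(r.value)) {
                ids.add(r.entityId);
            }
        }
        return ids;
    }
}
```

Combining two conditions means intersecting two result sets, for example `ids.retainAll(store.idsWhere("companySize", ...))`. In a relational EAV table the equivalent is one self-join per condition, which is the main reason such schemas get hard to query as the number of predicates grows.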
Have you considered using Lucene for your querying needs? You could probably get away with just using Lucene and storing all your data in the index, although I wouldn't recommend using Lucene as your only persistence store.
Alternatively, you could use Lucene along with an RDBMS and take advantage of something like Compass.