
What data structure could be used to store objects with multiple comparable attributes

I want to build a data structure to store information about multiple houses, so that users can later retrieve the houses they want through a search query. To make searches fast, I plan to use a red-black tree. The problem I am facing is that the key of each node holds only one attribute of the house, i.e. price; the other attributes, such as number of beds, land size, etc., cannot be stored in the same tree. What would be a good data structure for this problem? Initially I thought of a tree nested in a tree; is this viable, or considered good practice?

The problem you are facing can be solved using secondary indexes on top of your data. Secondary indexes are studied intensively in the database world, and you should have no trouble finding resources that explain how they are implemented in real databases.

So, you currently have a primary key for your data: the object's memory reference, or maybe an index into a collection of references. For each attribute that you want to query, you will need a fast way of looking up matching objects. The exact data structure you use will depend on the type of queries you perform, but some kind of search tree is a good general-purpose choice and is usually efficient for updates, which is very important for a lot of databases. Your data structure should take in a query relating to the specific attribute and return references, or primary keys, to all the objects that match that query.
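As a rough illustration of one such per-attribute index, here is a minimal sketch. It assumes the primary key is the object's position in a list, and it swaps the red-black tree for a sorted list searched with `bisect`, because Python's standard library has no balanced search tree. The class name `AttributeIndex` and the sample data are made up for the example.

```python
import bisect


class AttributeIndex:
    """Secondary index on one attribute (hypothetical helper).

    A sorted list stands in for the search tree: lookups are O(log n),
    but inserts are O(n), which a real balanced tree would avoid.
    """

    def __init__(self):
        self._values = []  # attribute values, kept sorted
        self._keys = []    # primary keys, parallel to _values

    def add(self, value, primary_key):
        # Find the insertion point that keeps _values sorted.
        pos = bisect.bisect_right(self._values, value)
        self._values.insert(pos, value)
        self._keys.insert(pos, primary_key)

    def range(self, low, high):
        """Return the primary keys of all objects with low <= value <= high."""
        start = bisect.bisect_left(self._values, low)
        end = bisect.bisect_right(self._values, high)
        return set(self._keys[start:end])


# Sample data (assumed): the primary key is the index into this list.
houses = [
    {"price": 30, "beds": 2},
    {"price": 45, "beds": 4},
    {"price": 30, "beds": 4},
]

price_index = AttributeIndex()
for pk, house in enumerate(houses):
    price_index.add(house["price"], pk)

exact_matches = price_index.range(30, 30)   # equality is a one-point range
pricier = price_index.range(31, 100)
```

The index only ever hands back primary keys; the caller resolves them against `houses` when it needs the full objects.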

In your example you might have one red-black tree for price and another for number-of-beds. To answer a query for "price = 30 or number-of-beds = 4", query your price data structure, then your number-of-beds data structure; since the query has an "or", simply take the union of the primary keys returned by the two data structures (take the intersection for "and").
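The combining step can be sketched with plain Python sets. This is only an illustration: a dictionary of sets stands in for each red-black tree (enough for equality queries), the primary key is assumed to be the list position, and `build_index` and the sample houses are hypothetical.

```python
from collections import defaultdict

# Sample data (assumed): the primary key is the index into this list.
houses = [
    {"price": 30, "beds": 2},
    {"price": 45, "beds": 4},
    {"price": 30, "beds": 4},
    {"price": 60, "beds": 3},
]


def build_index(records, attribute):
    """Map each attribute value to the set of primary keys that have it."""
    index = defaultdict(set)
    for pk, record in enumerate(records):
        index[record[attribute]].add(pk)
    return index


price_index = build_index(houses, "price")
beds_index = build_index(houses, "beds")

# Query each index separately; each returns a set of primary keys.
price_matches = price_index.get(30, set())
beds_matches = beds_index.get(4, set())

# "price = 30 or number-of-beds = 4": union of the two key sets.
either = price_matches | beds_matches

# "price = 30 and number-of-beds = 4": intersection of the two key sets.
both = price_matches & beds_matches

# Resolve primary keys back to the stored objects.
both_houses = [houses[pk] for pk in sorted(both)]
```

Because each index returns primary keys rather than objects, the set operations stay cheap and each object is fetched at most once.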

Notice that if you add or update objects, you will also need to update every index that is affected. This is a trade-off you also see in real databases: faster reads in exchange for slower writes.
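To make the write cost concrete, here is a small sketch of a store that keeps its indexes in sync on every add and update. The `HouseStore` class and its methods are hypothetical, and dictionaries of sets again stand in for the search trees; with real trees each index update would be a delete plus an insert.

```python
from collections import defaultdict


class HouseStore:
    """Objects plus one secondary index per indexed attribute (a sketch)."""

    def __init__(self, indexed_attributes):
        self.houses = []  # primary key = position in this list
        self.indexes = {attr: defaultdict(set) for attr in indexed_attributes}

    def add(self, house):
        pk = len(self.houses)
        self.houses.append(dict(house))
        # Every index must learn about the new object.
        for attr, index in self.indexes.items():
            index[house[attr]].add(pk)
        return pk

    def update(self, pk, attr, new_value):
        house = self.houses[pk]
        old_value = house[attr]
        house[attr] = new_value
        # Only the index for the changed attribute needs touching.
        if attr in self.indexes and old_value != new_value:
            index = self.indexes[attr]
            index[old_value].discard(pk)
            if not index[old_value]:
                del index[old_value]  # drop empty buckets
            index[new_value].add(pk)

    def find(self, attr, value):
        """Return a copy of the primary keys whose attr equals value."""
        return set(self.indexes[attr].get(value, set()))


store = HouseStore(["price", "beds"])
pk = store.add({"price": 30, "beds": 4})
store.update(pk, "price", 35)

old_price_hits = store.find("price", 30)
new_price_hits = store.find("price", 35)
beds_hits = store.find("beds", 4)
```

Each write now does extra work proportional to the number of indexes it affects, which is the price paid for fast reads.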

A nested tree approach might work, depending on what kind of queries you are making, but it will quickly become unsuitable if the data structure is not static: updating the tree will be very slow whenever you update your objects.


 