
What data structure could be used to store objects with multiple comparable attributes?

I want to build a data structure that stores information about multiple houses, so that users can later retrieve matching houses through a search query. To make searches fast, I plan to use a red-black tree. The problem I am facing is that the key of each node holds only one attribute of the house, i.e. price; the other attributes, such as number of beds or land size, cannot be stored in the same tree. What would be a good data structure for this problem? Initially I thought of a tree nested inside a tree. Is that viable, or considered good practice?

The problem you are facing can be solved using secondary indexes on top of your data. Secondary indexes are a concept studied intensely in the database world and you should have no trouble finding resources to help you understand how they are implemented in real databases.

So, you currently have a primary key for your data: the object's memory reference, or maybe an index into a collection of references. For each attribute that you want to query, you will need a fast way of looking up matching objects. The exact data structure you use will depend on the type of queries you perform, but some kind of search tree is a good general-purpose choice and is usually efficient for updates, which is very important for a lot of databases. Your data structure should take in a query relating to the specific attribute and return references, or primary keys, to all the objects that match that query.
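As a rough sketch of what one such per-attribute index could look like, here is a minimal Python version. The class name `SortedIndex` and its methods are made up for illustration, and a sorted list with the standard `bisect` module stands in for a red-black tree: lookups are still O(log n), but inserts are O(n) because the list has to shift elements. It also assumes primary keys are integers.

```python
import bisect


class SortedIndex:
    """Hypothetical secondary index on one attribute.

    Stores (value, primary_key) pairs in sorted order. A sorted list is used
    as a stand-in for a balanced search tree such as a red-black tree.
    """

    def __init__(self):
        self._entries = []  # sorted list of (value, primary_key) tuples

    def add(self, value, pk):
        # Keep the list sorted on insertion.
        bisect.insort(self._entries, (value, pk))

    def remove(self, value, pk):
        # Binary-search for the exact pair and delete it if present.
        i = bisect.bisect_left(self._entries, (value, pk))
        if i < len(self._entries) and self._entries[i] == (value, pk):
            del self._entries[i]

    def range(self, low, high):
        """Return primary keys of all entries with low <= value <= high."""
        # (low,) sorts before every (low, pk), so this finds the first match.
        start = bisect.bisect_left(self._entries, (low,))
        # (high, inf) sorts after every (high, pk) with an integer pk.
        end = bisect.bisect_right(self._entries, (high, float("inf")))
        return {pk for _, pk in self._entries[start:end]}

    def equal(self, value):
        """Return primary keys of all entries with exactly this value."""
        return self.range(value, value)


price_index = SortedIndex()
for pk, price in enumerate([30, 45, 30, 60]):
    price_index.add(price, pk)
```

With this data, `price_index.equal(30)` and `price_index.range(30, 45)` both return sets of primary keys, which is the form the rest of the answer relies on.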

In your example you might have one red-black tree for price and another for number-of-beds. To answer a query such as "price = 30 or number-of-beds = 4", you query the price data structure, then the number-of-beds data structure, and, because the query uses "or", take the union of the primary keys they return. For an "and" query, take the intersection instead.
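The union/intersection step can be sketched as follows. For brevity this uses plain dictionaries of sets as the indexes instead of red-black trees, so it only covers equality queries; the sample `houses` data, the `build_index` helper, and the use of list positions as primary keys are all assumptions for the example.

```python
from collections import defaultdict

# Sample data; the position in the list serves as the primary key.
houses = [
    {"price": 30, "beds": 2},
    {"price": 45, "beds": 4},
    {"price": 30, "beds": 4},
    {"price": 60, "beds": 3},
]


def build_index(records, attribute):
    """Map each attribute value to the set of primary keys that have it."""
    index = defaultdict(set)
    for pk, record in enumerate(records):
        index[record[attribute]].add(pk)
    return index


price_index = build_index(houses, "price")
beds_index = build_index(houses, "beds")

# Query each index separately; each returns a set of primary keys.
price_is_30 = price_index.get(30, set())
beds_is_4 = beds_index.get(4, set())

either = price_is_30 | beds_is_4  # "or":  union        -> {0, 1, 2}
both = price_is_30 & beds_is_4    # "and": intersection -> {2}

# Resolve primary keys back to the actual objects.
matching_houses = [houses[pk] for pk in sorted(both)]
```

Because every index returns the same kind of result (a set of primary keys), the combining logic does not need to know how each index is implemented.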

Notice that if you add or update objects, you will also need to update every index that is affected by the change. This is a trade-off you also see in real databases: faster reads at the cost of slower writes.
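To make that write cost concrete, here is a small sketch of a store that keeps its indexes in sync on insert and update. `HouseStore` and its methods are hypothetical names, and the indexes are again dictionaries of sets rather than trees; the point is only that each write touches the record and every affected index.

```python
from collections import defaultdict


class HouseStore:
    """Hypothetical store that maintains one index per chosen attribute."""

    def __init__(self, indexed_attributes):
        self.records = {}  # primary key -> record dict
        self.indexes = {attr: defaultdict(set) for attr in indexed_attributes}
        self._next_pk = 0

    def insert(self, record):
        pk = self._next_pk
        self._next_pk += 1
        self.records[pk] = dict(record)
        # Every index gets an entry for the new record.
        for attr, index in self.indexes.items():
            index[record[attr]].add(pk)
        return pk

    def update(self, pk, attr, new_value):
        record = self.records[pk]
        old_value = record[attr]
        record[attr] = new_value
        index = self.indexes.get(attr)
        if index is not None and old_value != new_value:
            # Move the primary key from the old value's bucket to the new one.
            index[old_value].discard(pk)
            if not index[old_value]:
                del index[old_value]
            index[new_value].add(pk)

    def find(self, attr, value):
        # Return a copy so callers cannot mutate the index.
        return set(self.indexes[attr].get(value, set()))


store = HouseStore(["price", "beds"])
a = store.insert({"price": 30, "beds": 2})
b = store.insert({"price": 30, "beds": 4})
store.update(a, "price", 35)  # only the price index needs to change
```

After the update, looking up price 30 returns only `b`, while the beds index is untouched because that attribute did not change.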

A nested tree approach might work, depending on what kind of queries you are making, but it will quickly become unsuitable if the data is not static: updating the nested trees will be very slow whenever you update your objects.


 