
How would I model data that is hierarchical and relational in a document-oriented database system like RavenDB?

Document-oriented databases (particularly RavenDB) really intrigue me, and I want to play around with them a bit. However, as someone who is very used to relational mapping, I am trying to work out how to model data correctly in a document database.

Say I have a CRM with the following entities in my C# application (leaving out unneeded properties):

public class Company
{
    public int Id { get; set; }
    public IList<Contact> Contacts { get; set; }
    public IList<Task> Tasks { get; set; }
}

public class Contact
{
    public int Id { get; set; }
    public Company Company { get; set; }
    public IList<Task> Tasks { get; set; }
}

public class Task
{
    public int Id { get; set; }
    public Company Company { get; set; }
    public Contact Contact { get; set; }
}

I was thinking of putting this all in a Company document, as contacts and tasks have no purpose outside of companies, and most of the time a query for a task or contact will also show information about the associated company.

The issue comes with Task entities. Say the business requires that a task is ALWAYS associated with a company but optionally also associated with a contact.

In a relational model this is easy: you just have a Tasks table, with Company.Tasks relating to all tasks for the company, while Contact.Tasks only shows the tasks for the specific contact.

For modeling this in a document database, I thought of the following three ideas:

  1. Model Tasks as a separate document. This seems kind of anti-document-db, as most of the time you look at a company or contact you will want to see the list of tasks, and so would have to perform joins over documents a lot.

  2. Keep tasks that are not associated with a contact in the Company.Tasks list, and put tasks associated with a contact in the list for that individual contact. Unfortunately, this means that if you want to see all tasks for a company (which will probably be a lot) you have to combine all tasks for the company with all tasks for each individual contact. I also see this getting complicated when you want to disassociate a task from a contact, as you have to move it from the contact to the company.

  3. Keep all tasks in the Company.Tasks list, and give each contact a list of id values for the tasks it is associated with. This seems like a good approach, except for having to manually take the id values and build a sub-list of Task entities for a contact.
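
To make the trade-off in idea 3 concrete, here is a minimal in-memory sketch of that shape. It is not RavenDB-specific and not a recommended implementation. The names TaskItem (used instead of Task only to avoid confusion with System.Threading.Tasks.Task), Title, TaskIds, TasksFor and Disassociate are all hypothetical and not part of the original model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down entities for idea 3.
public class TaskItem
{
    public int Id { get; set; }
    public string Title { get; set; }
}

public class Contact
{
    public int Id { get; set; }

    // Only the ids of the tasks this contact is associated with.
    public List<int> TaskIds { get; set; } = new List<int>();
}

public class Company
{
    public int Id { get; set; }
    public List<Contact> Contacts { get; set; } = new List<Contact>();

    // Every task for the company lives here, with or without a contact.
    public List<TaskItem> Tasks { get; set; } = new List<TaskItem>();

    // Builds the contact's sub-list on demand from the stored ids.
    public List<TaskItem> TasksFor(Contact contact)
    {
        var ids = new HashSet<int>(contact.TaskIds);
        return Tasks.Where(t => ids.Contains(t.Id)).ToList();
    }

    // Disassociating only edits the id list; the task itself never moves.
    public void Disassociate(Contact contact, int taskId)
    {
        contact.TaskIds.Remove(taskId);
    }
}
```

With this shape, "all tasks for a company" is just company.Tasks, and the manual step I was worried about collapses into the single TasksFor helper, at the cost of keeping the id lists in sync by hand.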

What is the recommended way to model this data in a document-oriented database?

Use denormalized references:

http://ravendb.net/faq/denormalized-references

In essence you have a DenormalizedReference class:

public class DenormalizedReference<T> where T : INamedDocument
{
    public string Id { get; set; }
    public string Name { get; set; }

    // Lets a full document be assigned wherever a reference is expected.
    public static implicit operator DenormalizedReference<T> (T doc)
    {
        return new DenormalizedReference<T>
        {
            Id = doc.Id,
            Name = doc.Name
        };
    }
}

Your documents then look like this. I've implemented an INamedDocument interface (exposing Id and Name), but it can be whatever you need it to be:

// Id is a string here so that it matches DenormalizedReference<T>.Id above.
public class Company : INamedDocument
{
    public string Name { get; set; }
    public string Id { get; set; }
    public IList<DenormalizedReference<Contact>> Contacts { get; set; }
    public IList<DenormalizedReference<Task>> Tasks { get; set; }
}

public class Contact : INamedDocument
{
    public string Name { get; set; }
    public string Id { get; set; }
    public DenormalizedReference<Company> Company { get; set; }
    public IList<DenormalizedReference<Task>> Tasks { get; set; }
}

public class Task : INamedDocument
{
    public string Name { get; set; }
    public string Id { get; set; }
    public DenormalizedReference<Company> Company { get; set; }
    public DenormalizedReference<Contact> Contact { get; set; }
}

Now saving a Task works exactly as it did before:

var task = new Task{
    Company = myCompany,
    Contact = myContact
};

However, pulling all this back out means you only get the denormalized references for the child objects. To hydrate these I use an index:

public class Tasks_Hydrated : AbstractIndexCreationTask<Task>
{
    public Tasks_Hydrated()
    {
        Map = docs => from doc in docs
                      select new
                                 {
                                     doc.Name
                                 };

        TransformResults = (db, docs) => from doc in docs
                                         let Company = db.Load<Company>(doc.Company.Id)
                                         let Contact = db.Load<Contact>(doc.Contact.Id)
                                         select new
                                                    {
                                                        Contact,
                                                        Company,
                                                        doc.Id,
                                                        doc.Name
                                                    };
    }
}

Using the index to retrieve the hydrated tasks then looks like this:

var tasks = from c in _session.Query<Projections.Task, Tasks_Hydrated>()
                    where c.Name == "taskmaster"
                    select c;

Which I think is quite clean :)

As a design consideration, the general rule is that if you ever need to load the child documents on their own (that is, not as part of the parent document), whether for editing or viewing, you should model the child as its own document with its own Id. Using the method above makes this quite simple.

I'm new to document DBs as well, so take this with a grain of salt...

As a contrasting example: if you are on Twitter and you have a list of the people you follow, each of which contains a list of their tweets, you would not move their tweets into your Twitter account in order to read them, and if you re-tweet, you would only have a copy, not the original.

So, in the same way, my opinion is that if Tasks belong to a company, then they stay within the Company. The Company is the Aggregate Root for Tasks. The Contacts can only hold references (ids) or copies of the Tasks and cannot modify them directly. If you have your contact hold a "copy" of the task, that's fine, but in order to modify the task (e.g. mark it complete) you would modify the task through its Aggregate Root (Company). Since a copy could quickly become outdated, it seems like you would only want a copy to exist while in memory; when saving the Contact, you would only save references to the Tasks.
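
Here is a rough in-memory sketch of what I mean by going through the Aggregate Root. It ignores persistence entirely, so it says nothing about how RavenDB would serialize these classes. The names TaskItem, AddTask, CompleteTask, MarkComplete and TaskIds are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TaskItem
{
    public TaskItem(int id, string title)
    {
        Id = id;
        Title = title;
    }

    public int Id { get; }
    public string Title { get; }
    public bool IsComplete { get; private set; }

    // Internal to signal that callers outside the model go through Company.
    internal void MarkComplete()
    {
        IsComplete = true;
    }
}

public class Contact
{
    public int Id { get; set; }

    // A contact only stores references (ids), never the tasks themselves.
    public List<int> TaskIds { get; } = new List<int>();
}

public class Company
{
    private readonly List<TaskItem> _tasks = new List<TaskItem>();

    // Read-only view: tasks can be inspected but not added or removed here.
    public IReadOnlyList<TaskItem> Tasks => _tasks;

    // A task always belongs to the company; the contact is optional.
    public TaskItem AddTask(int id, string title, Contact contact = null)
    {
        var task = new TaskItem(id, title);
        _tasks.Add(task);
        if (contact != null)
        {
            contact.TaskIds.Add(id);
        }
        return task;
    }

    // The only way to change a task's state is through the aggregate root.
    public void CompleteTask(int taskId)
    {
        var task = _tasks.FirstOrDefault(t => t.Id == taskId);
        if (task == null)
        {
            throw new InvalidOperationException(
                "Task " + taskId + " does not belong to this company.");
        }
        task.MarkComplete();
    }
}
```

So marking a task complete from a contact screen would mean taking the id from contact.TaskIds and calling company.CompleteTask(id), rather than editing anything stored on the contact.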
