
Managing relationships with MongoDB in a microservices architecture

I've been working with microservices for some time now, always with relational databases. I am now looking at MongoDB and I am not sure how to handle entity relationships that span different microservices. Here is an example:

public class Employee implements Serializable {
   private String id;
   ...
}

public class Department implements Serializable {
    private String id;
    private String desc;
    private List<Employee> employees = new ArrayList<>();
    ...
}

These two entities are managed by two different microservices, with a one-to-many relationship managed by the Department entity. So far, so good.

With a relational database (since the relationship is optional and one employee may belong to several departments) I'd map this in the Departments microservice with one table containing two fields: employee_id and department_id. When the client calls getDepartmentWithEmployees(depId), this microservice reads the table and fetches the matching employees from the Employees microservice.

But in a MongoDB database, as far as I know, when I store a Department object it stores all the associated Employee objects. Isn't that duplicating the information? Is there a way to make MongoDB store only the employees' ids rather than all their data? Or is there another answer?

I am pretty sure this is a very basic question, but I am new to all this stuff.

Thanks in advance.

But, in a MongoDB database, as far as I know, when I store a Department object it stores all associated Employees. Is not that duplicating the information?

First of all, the statement above is not correct. From MongoDB's perspective, whatever you provide as BSON is stored as is. If you send the employees embedded in the department document, then yes, they are stored with it. If you send only their ids, only the ids are stored. You can also apply partial updates after creating the department (e.g. using the $set operator). But I think the scope of your question is broader than this.
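To make the "ids only" option concrete, here is a minimal sketch in plain Java, with no MongoDB driver involved. All names (DepartmentDoc, EmployeeView, EmployeeClient, DepartmentService) are hypothetical, the Map stands in for the departments collection, and EmployeeClient stands in for whatever call you make to the Employees microservice.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical types for illustration only; none of them come from the
// question or from any MongoDB API.
record DepartmentDoc(String id, String desc, List<String> employeeIds) {}

record EmployeeView(String id, String name) {}

record DepartmentWithEmployees(String id, String desc, List<EmployeeView> employees) {}

// Stand-in for a call to the Employees microservice (e.g. an HTTP client).
interface EmployeeClient {
    EmployeeView findById(String employeeId);
}

class DepartmentService {
    // Stand-in for the departments collection: only employee ids are stored.
    private final Map<String, DepartmentDoc> departments;
    private final EmployeeClient employees;

    DepartmentService(Map<String, DepartmentDoc> departments, EmployeeClient employees) {
        this.departments = departments;
        this.employees = employees;
    }

    // Reads the stored document (ids only) and resolves each id remotely.
    DepartmentWithEmployees getDepartmentWithEmployees(String depId) {
        DepartmentDoc doc = departments.get(depId);
        if (doc == null) {
            throw new IllegalArgumentException("Unknown department: " + depId);
        }
        List<EmployeeView> resolved = doc.employeeIds().stream()
                .map(employees::findById)
                .filter(Objects::nonNull) // skip ids the other service no longer knows
                .toList();
        return new DepartmentWithEmployees(doc.id(), doc.desc(), resolved);
    }
}
```

The Departments service never owns employee data this way; it only keeps the references, much like the join table you would use in a relational schema.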

IMHO, creating nano-services for each document/table in the database is not a good approach, especially when the services are only responsible for basic CRUD operations. You should first define your bounded contexts, aggregate roots, etc. In short, do not try to design tables before mapping business requirements to domain objects. What I'm trying to say is: use DDD principles :)

These are the strategies that I found so far. When designing microservices you should also consider pros and cons of each strategy. (See bottom for references.)

General Principles of Mapping Relational Databases to NoSQL

  • 1:1 Relationship
    • Embedding
    • Link with Foreign Key
  • 1:M Relationship
    • Embedding
    • Linking with Foreign Key
    • (Hybrid) Bucketing Strategy
  • N:M Relationship
    • Two-Way Referencing
    • One-Way Referencing

1:1 Relationship

The 1:1 relation can be mapped in two ways:

  • Embed the relationship as a document
  • Link to a document in a separate collection

Tables:

// Employee document
{
   "id": 123,
   "Name":"John Doe"
}

// Address document
{
   "City":"Ankara",
   "Street":"Genclik Street",
   "Nr":10
}

Example: Embedding (1:1)

  • Advantage: Address can be retrieved with a single read operation.

{
  "id": 123,
  "Name":"John Doe",
  "Address": {
    "City":"Ankara",
    "Street":"Genclik Street",
    "Nr":10
  } 
}

Example: Link with foreign key (1:1)

{
   "id": 763541685,  // link this
   "Name":"John Doe"
}

Address document with a foreign key to the employee:

{
   "employee_id": 763541685,
   "City":"Ankara",
   "Street":"Genclik street",
   "Nr":10
}

1:M Relationship

Initial:

// Department collection
{
  "id": 1,
  "department_name": "Software",
  "department_location": "Amsterdam"
}

// Employee collection
[
    {
      "employee_id": 46515,
      "employee_name": "John Doe"
    },
    {
      "employee_id": 81584,
      "employee_name": "John Wick"
    }
]

Example: Embedding (1:M)

Warning:

  • The employee list might be huge!
  • Be careful when using this approach in a write-heavy system. I/O load increases due to housekeeping operations such as indexing and replication.
  • Pagination on employees is hard!

{
  "id": 1,
  "department_name": "Software",
  "department_location": "Amsterdam",
  "employees": [
                   {
                     "employee_id": 46515,
                     "employee_name": "John Doe"
                   },
                   {
                     "employee_id": 81584,
                     "employee_name": "John Wick"
                   }
               ]
}

Example: Linking (1:M)

We can store the department_id in each employee document.

  • Advantage: Easier pagination.
  • Disadvantage: Retrieving all employees that belong to department X may need a lot of read operations!

[
    {
      "employee_id": 46515,
      "employee_name": "John Doe",
      "department_id": 1
    },
    {
      "employee_id": 81584,
      "employee_name": "John Wick",
      "department_id": 1
    }
]
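As a rough illustration of why pagination gets easier with linking, the sketch below pages through a department's employees using filter, sort, skip and limit. It runs in memory on a plain Java list, which is only a stand-in for the employee collection; LinkedEmployee and EmployeePaging are hypothetical names. In a real database you would express the same four steps as a query, and an index on department_id would likely be needed to keep it cheap.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical record mirroring the linked employee documents above.
record LinkedEmployee(int employeeId, String employeeName, int departmentId) {}

class EmployeePaging {
    // Returns one page of a department's employees, ordered by id so that
    // page boundaries stay stable between calls. pageNumber is zero-based.
    static List<LinkedEmployee> page(List<LinkedEmployee> all, int departmentId,
                                     int pageNumber, int pageSize) {
        return all.stream()
                .filter(e -> e.departmentId() == departmentId)
                .sorted(Comparator.comparingInt(LinkedEmployee::employeeId))
                .skip((long) pageNumber * pageSize)
                .limit(pageSize)
                .toList();
    }
}
```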

Example: Bucketing Strategy (Hybrid 1:M)

We'll split the employees into buckets with maximum of 100 employees in each bucket.

{
    "id":1,
    "Page":1,
    "Count":100,
    "Employees":[
        {
            "employee_id": 46515,
            "employee_name": "John Doe"
        },
        {
            "employee_id": 81584,
            "employee_name": "John Wick"
        }
    ]
}
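The splitting itself can be sketched in a few lines of Java. This assumes that "id" in the bucket document above is the department id, that "Page" is a 1-based bucket number, and that "Count" is the number of employees actually held in the bucket; the source does not spell these out, and the type names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types mirroring the bucket document above.
record BucketEmployee(int employeeId, String employeeName) {}

record EmployeeBucket(int departmentId, int page, int count, List<BucketEmployee> employees) {}

class Bucketing {
    // Splits a department's employees into consecutive buckets of at most
    // maxPerBucket entries. Pages are numbered from 1.
    static List<EmployeeBucket> toBuckets(int departmentId,
                                          List<BucketEmployee> employees,
                                          int maxPerBucket) {
        if (maxPerBucket <= 0) {
            throw new IllegalArgumentException("maxPerBucket must be positive");
        }
        List<EmployeeBucket> buckets = new ArrayList<>();
        for (int start = 0; start < employees.size(); start += maxPerBucket) {
            int end = Math.min(start + maxPerBucket, employees.size());
            List<BucketEmployee> slice = List.copyOf(employees.subList(start, end));
            buckets.add(new EmployeeBucket(departmentId, buckets.size() + 1, slice.size(), slice));
        }
        return buckets;
    }
}
```

Reading page k of a department is then a single document lookup by department id and page number, which is the point of the hybrid approach.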

N:M Relationship

To choose between Two-Way Referencing and One-Way Referencing, you must establish the maximum size of N and the maximum size of M.
For example, if N is a maximum of 3 categories per book and M is a maximum of 5,000,000 books per category, you should pick One-Way Referencing.
If N is a maximum of 3 and M is a maximum of 5, then Two-Way Referencing might work well (see "schema basics").

Example: Two-Way Referencing (N:M)

In Two-Way Referencing, we store the book foreign keys in the Books field of each author document, and the author foreign keys in the authors field of each book document.

Author collection:

[
    {
       "id":1,
       "Name":"John Doe",
       "Books":[ 1, 2 ]
    },{
       "id":2,
       "Name": "John Wick",
       "Books": [ 2 ]
    }
]

Book collection:

[
    {
       "id": 1,
       "title": "Brave New World",
       "authors": [ 1 ]
    },{
       "id":2,
       "title": "Dune",
       "authors": [ 1, 2 ]
    }
]
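The cost of two-way referencing is that every link has to be written on both sides. A minimal in-memory sketch of that bookkeeping, assuming a hypothetical TwoWayLinks helper where two maps stand in for the Books arrays of the authors and the authors arrays of the books:

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; the two maps stand in for the reference arrays
// held by the author and book documents.
class TwoWayLinks {
    private final Map<Integer, Set<Integer>> booksByAuthor = new HashMap<>();
    private final Map<Integer, Set<Integer>> authorsByBook = new HashMap<>();

    // Adding a link updates both sides so they never disagree.
    void link(int authorId, int bookId) {
        booksByAuthor.computeIfAbsent(authorId, k -> new LinkedHashSet<>()).add(bookId);
        authorsByBook.computeIfAbsent(bookId, k -> new LinkedHashSet<>()).add(authorId);
    }

    // Removing a link must also touch both sides.
    void unlink(int authorId, int bookId) {
        Set<Integer> books = booksByAuthor.get(authorId);
        if (books != null) {
            books.remove(bookId);
        }
        Set<Integer> authors = authorsByBook.get(bookId);
        if (authors != null) {
            authors.remove(authorId);
        }
    }

    // Book ids of an author, in the order they were linked.
    List<Integer> booksOf(int authorId) {
        return List.copyOf(booksByAuthor.getOrDefault(authorId, Set.of()));
    }

    // Author ids of a book, in the order they were linked.
    List<Integer> authorsOf(int bookId) {
        return List.copyOf(authorsByBook.getOrDefault(bookId, Set.of()));
    }
}
```

Against a real database the two writes go to two different documents, so you would also have to decide what happens if only one of them succeeds.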

Example: One-Way Referencing (N:M)

Example with books and categories: each book belongs to only a few categories, but a category can contain many books.

  • Advantage: Optimized read performance.
  • The references to categories are stored in the books because there are far more books in a category than categories in a book.

Category collection:

[
  {
    "id": 1,
    "category_name": "Science Fiction"
  },
  {
    "id": 2,
    "category_name": "Dystopia"
  }
]

Book collection, with foreign keys to categories:

[
    {
      "id": 1,
      "title": "Brave New World",
      "categories": [ 1, 2 ],
      "authors": [ 1 ] 
    },
    {
      "id": 2,
      "title": "Dune",
      "categories": [ 1 ],
      "authors": [ 1, 2 ] 
    }
]
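With one-way referencing, category documents hold no book ids, so "which books are in category X?" is answered from the book side. A small sketch of that lookup over an in-memory list (a stand-in for a query on the categories field; CategorizedBook and OneWayLookup are hypothetical names):

```java
import java.util.List;

// Hypothetical record mirroring the book documents above (authors omitted).
record CategorizedBook(int id, String title, List<Integer> categories) {}

class OneWayLookup {
    // Only books hold category ids, so we filter the books by the id we want.
    static List<String> titlesInCategory(List<CategorizedBook> books, int categoryId) {
        return books.stream()
                .filter(b -> b.categories().contains(categoryId))
                .map(CategorizedBook::title)
                .toList();
    }
}
```

The opposite direction, listing the categories of one book, needs no search at all because the ids are already in the book document.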

References

  1. Case study: An algorithm for mapping the relational databases to mongodb
  2. The Little MongoDB Schema Design Book
  3. 6 Rules of Thumb for MongoDB Schema Design

