简体   繁体   中英

NoSQL Database Structure for Facebook type social network

For a Facebook type social networking app, a high performing database structure is required, for storing data in Firebase(Cloud Firestore) (NoSQL).

Data to be stored :

 - Userinfo (name, email etc)
 - Friends
 - Posts
 - Comments on posts.

I am confused among the following two DB Structures regarding query performance (if the database becomes huge).

(Ref: C_xxx is Collection, D_xxx is document)

Structure 1

C_AllData
    - D_UserID-1
        name: xxxx,
        email: yyy,
        friends: [UserID-3, UserID-4]
        - C_Posts
            - D_PostId-1
                Text: hhh
                Date: zzz
                - C_Comments
                    - D_CommentId-1
                        UserID: 3
                        Text: kkk
                    - D_CommentId-2
                        UserID: 4
                        Text: kkk
            - D_PostId-2
                Text: hhh
                Date: zzz
                - C_Comments
                    - D_CommentId-3
                        UserID: 3
                        Text: kkk
                    - D_CommentId-4
                        UserID: 4
                        Text: kkk
    - D_UserID-2
        name: xxxx,
        email: yyy
        friends: [UserID-5, UserID-7]
        - C_Posts
            - D_PostId-3
                Text: hhh
                Date: zzz
                - C_Comments
                    - D_CommentId-5
                        UserID: 5
                        Text: kkk
                    - D_CommentId-6
                        UserID: 7
                        Text: kkk

Structure 2

C_AllUsers 
    - D_UserID-1
        name: xxxx,
        email: yyy
        friends: [UserID-3, UserID-4]
    - D_UserID-2
        name: xxxx,
        email: yyy
        friends: [UserID-5, UserID-7]

C_AllPosts
    - D_PostId-1
        UserID: 1
        Text: hhh
        Date: zzz
        - C_Comments
            - D_CommentId-1
                UserID: 3
                Text: kkk
            - D_CommentId-2
                UserID: 4
                Text: kkk
    - D_PostId-3
        UserID: 2
        Text: hhh
        Date: zzz
        - C_Comments
            - D_CommentId-5
                UserID: 5
                Text: kkk
            - D_CommentId-6
                UserID: 7
                Text: kkk

My Question is what are the pros and cons of the two approaches ?

Some points that i could think of are below, please correct me if i am wrong.

Structure 1 :

Is getting all the posts of a given user, Faster in Structure 1 ? Since we are pinpointing to the exact collection ( AllData/{UserID}/Posts/ )

Since entire DB is under one collection, is scalability not good ?

Structure 2 :

Divided DB -> Better Scalability ?

Divided DB -> Better Performance ?

Lesser Nesting -> Better Performance ?

AllPosts under one collection -> Slow querying ?


Or if you can suggest a better model, that would be great too.

In Firebase a rule of thumb is to keep separate entity types in separate branches. This is especially important because:

(Note: Here firebase is firebase realtime database)

  1. Firebase always loads complete nodes, and
  2. once you grant a user read access to a node, they have access to all data under that node.

For example in your first data structure, to load a list of friends, you will have to load all the posts of all friends, and all comments on all those posts too. That's a lot more data than is strictly needed, if all you wanted to do was show a list of the friends names.

In your second data structure, you are one step closer. As now you can first load the friends names, and then load their posts.

But even in that structure you have the same problem. If you want to display the list of post titles for a friend (or for all friends), you are going to have to load the entire posts and all comments. That is again way more data than is needed to show a list of post titles. So you'll definitely want to store the comments in a separate top-level list too, using the same key of the post to identify and group them.

C_AllPosts
    - D_PostId-1
        UserID: 1
        Text: hhh
        Date: zzz
    - D_PostId-3
        UserID: 2
        Text: hhh
        Date: zzz
C_AllComments
    - D_PostId-1
        - D_CommentId-1
            UserID: 3
            Text: kkk
        - D_CommentId-2
            UserID: 4
            Text: kkk
    - D_PostId-3
        - D_CommentId-5
            UserID: 5
            Text: kkk
        - D_CommentId-6
            UserID: 7
            Text: kkk

Now if you want to display a post and its comments, you will have to read two nodes. If you do this for multiple posts, you'll end up with a lot of reads, to essentially perform the NoSQL equivalent of a SQL JOIN . This is quite normal, it's essentially a client-side join, and it not nearly as slow as you may think, because Firebase pipelines the requests .

For some more introduction on this type of data modeling, I recommend:

And these answers to previous questions:

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM