简体   繁体   中英

Database design & normalization

I'm creating a messaging system for a e-learning platform and there are some design concerns that I'd like some feedback on.

First of all, it is important for me and my system to be highly modifiable in the future. As such, maintaining a fairly high normalization across my tables is important.

On to how my system will work:

  • All members (students or teachers) are part of a virtual classroom.
  • Teachers can create tasks and exercises in these classrooms and assign them to one or multiple students ( member_task table not illustrated).
  • A student can request help for a specific task or exercise by sending a message to the teachers of the classroom.
    • Messages sent by students are sent to all the teachers. They cannot address a message to a specific teacher.
    • Messages sent by teachers can be addressed to one or more students.
    • Students cannot send messages to other students.
    • Messages behave like chat, meaning that a private conversation starts between a student and all teachers when they send a message.

Here's the ER diagram I made:

在此输入图像描述

So my question is, is this table normalized properly for my purpose? Is there anything that can be done to reduce redundancy of data across my tables? And out of curiosity, is it in BCNF?

Another question: I don't intend to ever implement delete features anywhere in my system. Only "archiving" where said classroom/task/member/message/whatever is simply hidden/deactivated. So is there any reason to actually use FK?

EDIT: Also, a friend brought to my attention that the Conversations table might be redundant, and it kinda feels so. Thoughts?

Thanks.

In response to your emphasis on "modifiability" which I'm taking to mean with respect to application and schema evolution I'm actually going to suggest a fairly extreme solution. Before that some notes some aspects you've mentioned. First, foreign keys represent meaningful constraints in your data. They should always be defined and enforced. Foreign keys are not there just for cascading delete. Second, the Conversations table is arguably redundant. It would make sense if you had a notion of "session" of chat which would correspond to a Conversation. Otherwise, you just have a bunch of messages throughout time. The Conversation table could also enable a many-to-many relation between messages and tasks/exercises if you wanted to have chats that simultaneously covered multiple exercises, for example.

Now for the extreme suggestion. You could use 6NF. In particular, you might look at its incarnation in anchor modeling . The most notable difference in this approach is each attribute is modeled as a different table. 6NF supports temporal databases (supported in anchor modeling via "historized" attributes/ties). This means handling situations like a student being associated to a task now but not later won't cause all their messages to disappear. Most relevant to you, all schema modifications are non-destructive and additive, so no old code breaks when you make a change.

There are downsides. First, it's a bit weird, and in particular anchor modeling (somewhat gratuitously?) introduces a bunch of new terms. Second, it produces weird queries for most relational databases which they may not optimize well. This can sometimes be resolved with materialized views. Third, at the physical level, every attribute is effectively nullable. Finally, the tooling and support, while present, is pretty young. In particular, for MySQL, you may only be "inspired by" what's provided on the anchor modeling site.

As far as the actual database model would go, it would look roughly similar. Anchor modeling uses the term "anchor" for roughly the same thing as an entity, and "tie" for roughly the same thing as a relation. For simplicity, dropping the Conversation relation (and thus directly connecting Message to Task), the image would be similar: you'd have an anchor for Classroom, Member, Message, and Task, and a tie replacing Recipient that you might called ReceivedMessage representing the relation of " member received message message ". The attributes on your entities would be attribute nodes. Making the message attribute on the Message anchor historized would allow messages to be edited if desired and support a history of revisions.

One concern I have is that I don't see a Users table which will hold all the students and teachers info (login, email, system id, role, etc) but I assume there is something similar in our system?

Now, looking into the Members table: usually students change classes every semester or so and you don't want last semesters' students to receive new messages. I would suggest the following:

Members
=============
PK member_id 
FK class_id
FK user_id
--------------
join_date
leave_date
active
role

The last two fields might be redundant:

  • active : is an alternative solution if you want to avoid using dates. This will become false when a user stops being member of this class. Since there is not delete feature, the Members entry has to be preserved for archive purposes (and historical log).
  • role : Depends on how you setup Users table and roles in your system. If a user entry has role field(s) then this is not needed. However, this field allows for the same user to assume different roles in different classes. Example: a 3rd year student, who was a member of this class 2 years ago, is now working as TA/LA (teaching/lab assistant) for the same class. This depends on how the institution works... in my BSc we had the "rule": anyone with grade > 8.5/10 in Java could volunteer to do workshops to other students (using uni's labs). Finally, this field if used as a mask or a constant, allows for roles to be extended (future-proof)

As for FKs I will always suggest using them for data consistency. Things can get really ugly really fast without FKs. The limitations they impose can be worked around and they are usually needed: What is the purpose of archiving a message with sender_id if the sender has been deleted by accident? Also, note that in most systems FKs are indexed which improves the performance of queries/joins.

Hope the above helps and not confuse things :)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM