
social network - User profile design schema question

I am creating user profiles on my site and am lost on how to design this. There are many fields; some are 1:1, like city of residence, birthday, etc. But there are over 50 fields which are 1:many (or many-to-many?), like favorite movies, sports teams, dating preferences, screen names, phone numbers, email addresses, etc. It gets more complex with previous companies worked at, previous schools, etc. A person can belong to many companies, and there are many fields in this group, like dates worked there, department, company name, industry name, etc.

So the question is how to store all this. If we normalize all these profile fields, there will be many, many tables to join. As far as I have read, people recommend a denormalized approach for social networks. Either way, I am currently storing all user details and profile details in the main user table, so each row is a unique user. If I have to store all these multi-valued preferences there (favorite movies alone can run into the hundreds, and past companies have a whole group of fields of their own), there will be lots of duplicated data in the user table.

What approach do social networks take for this?

Social network data storage questions are really no different from data storage questions in general... normalized, related data is the best way to 'store' this data efficiently. The RDBMS is made to handle these relationships - PK-FK relationships and JOINs are the MAIN point of relational DBs... so even though YOU 'see' join, join, join, etc., the DB is (or should be) efficient at handling these joins.
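To make the PK-FK idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (users, companies, employment, and so on) and the sample rows are made up for illustration; they are not from the question. The 1:1 fields stay on the user row, while employment history gets its own table keyed back to both the user and the company.

```python
import sqlite3

# Hypothetical schema: every table/column name here is illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    city     TEXT,              -- 1:1 fields live on the user row
    birthday TEXT
);
CREATE TABLE companies (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL UNIQUE,
    industry TEXT
);
-- One row per stint a user had at a company (many-to-many with attributes).
CREATE TABLE employment (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),
    company_id INTEGER NOT NULL REFERENCES companies(id),
    department TEXT,
    start_date TEXT,
    end_date   TEXT
);
""")

# Made-up sample data.
conn.execute(
    "INSERT INTO users (id, name, city, birthday) VALUES (?, ?, ?, ?)",
    (1, "Alice", "Austin", "1990-04-02"),
)
conn.executemany(
    "INSERT INTO companies (id, name, industry) VALUES (?, ?, ?)",
    [(1, "Acme", "Manufacturing"), (2, "Globex", "Energy")],
)
conn.executemany(
    "INSERT INTO employment (user_id, company_id, department, start_date, end_date) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "Engineering", "2012-01-01", "2015-06-30"),
        (1, 2, "Research", "2015-07-01", None),
    ],
)


def employment_history(conn, user_id):
    """Return one user's work history, oldest first, via a single JOIN."""
    return conn.execute(
        """
        SELECT c.name, c.industry, e.department, e.start_date, e.end_date
        FROM employment AS e
        JOIN companies AS c ON c.id = e.company_id
        WHERE e.user_id = ?
        ORDER BY e.start_date
        """,
        (user_id,),
    ).fetchall()


history = employment_history(conn, 1)
```

Each company name and industry is stored once, and the user table keeps exactly one row per user no matter how many jobs are listed.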

From a USAGE standpoint of getting to the pertinent data - make sure your indexes are accurate and optimized - and make use of VIEWS to 'flatten' the data you need for display purposes...

So whatever application server you are using to get the data will call the VIEW - that will 'appear' to you, the developer, as a 'flatter' representation of the data, making UI and app server interaction cleaner and more efficient (both in resources and in coding).
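A rough sketch of the index-plus-view idea, again with sqlite3 and hypothetical names (favorite_movies, phone_numbers, user_profile are assumptions for the example). The view collapses each 1:many list into one comma-separated column, so the application reads a single flat row per user. Correlated subqueries are used instead of joining both child tables at once, which would multiply rows.

```python
import sqlite3

# Hypothetical tables and view; names and sample data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    city TEXT
);
CREATE TABLE favorite_movies (
    user_id INTEGER NOT NULL REFERENCES users(id),
    title   TEXT NOT NULL,
    PRIMARY KEY (user_id, title)   -- also serves as the lookup index
);
CREATE TABLE phone_numbers (
    user_id INTEGER NOT NULL REFERENCES users(id),
    number  TEXT NOT NULL
);
-- Index the foreign key so per-user lookups do not scan the whole table.
CREATE INDEX idx_phone_numbers_user ON phone_numbers(user_id);

-- Flattened, read-only representation for display purposes.
CREATE VIEW user_profile AS
SELECT u.id,
       u.name,
       u.city,
       (SELECT group_concat(m.title, ', ')
          FROM favorite_movies AS m
         WHERE m.user_id = u.id) AS movies,
       (SELECT group_concat(p.number, ', ')
          FROM phone_numbers AS p
         WHERE p.user_id = u.id) AS phones
FROM users AS u;
""")

conn.execute("INSERT INTO users (id, name, city) VALUES (1, 'Alice', 'Austin')")
conn.executemany(
    "INSERT INTO favorite_movies (user_id, title) VALUES (?, ?)",
    [(1, "Heat"), (1, "Alien")],
)
conn.execute("INSERT INTO phone_numbers (user_id, number) VALUES (1, '555-0100')")

# The application only ever touches the view.
row = conn.execute(
    "SELECT name, city, movies, phones FROM user_profile WHERE id = ?", (1,)
).fetchone()
```

Note that the order of items inside a group_concat result is not guaranteed in SQLite, so the display code should not rely on it.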

As a general guideline, flattening of data is generally considered 'acceptable' in a data warehousing environment... of course, I don't want to open up the monstrous debate of "just how normalized is 'normalized'" (first through sixth normal form...).

I guess you could think of an SN as more of an OLAP system than an OLTP one, in which case 'some' de-normalized data storage is common and acceptable. Really, YOU get to decide just how de-normalized you want things. For instance, in your examples of employment history, movies and sports, I'd think that a simple 1:many table allowing duplicate entries for such items would be fine, and probably easier to maintain...
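One way to read "a simple 1:many allowing duplicate entries" is sketched below, with sqlite3 and hypothetical names. There is no separate movies lookup table: each favorite is stored as plain text next to the user id, so the same title can appear once per user who likes it. The same pattern could be repeated for teams, screen names, and so on.

```python
import sqlite3

# Hypothetical tables; names and sample data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- Simple 1:many child table: the title is free text, repeated per user.
CREATE TABLE favorite_movies (
    user_id INTEGER NOT NULL REFERENCES users(id),
    title   TEXT NOT NULL
);
CREATE INDEX idx_favorite_movies_user ON favorite_movies(user_id);
""")

conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "Alice"), (2, "Bob")],
)
conn.executemany(
    "INSERT INTO favorite_movies (user_id, title) VALUES (?, ?)",
    [(1, "Inception"), (1, "Heat"), (2, "Inception")],
)


def favorites(conn, user_id):
    """Return one user's favorite movie titles, sorted alphabetically."""
    rows = conn.execute(
        "SELECT title FROM favorite_movies WHERE user_id = ? ORDER BY title",
        (user_id,),
    ).fetchall()
    return [title for (title,) in rows]


alice_movies = favorites(conn, 1)

# 'Inception' is stored twice, once per user: the accepted duplication.
inception_rows = conn.execute(
    "SELECT COUNT(*) FROM favorite_movies WHERE title = ?", ("Inception",)
).fetchone()[0]
```

The trade-off is that spelling variants of the same title are not reconciled, but reads and writes need only one extra table, and the user table still has one row per user.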

Hope that was helpful.

You should stick with a normalization strategy when creating your schema. The queries might be a pain, and you should handle them with extreme caution, especially when dealing with joins. If you are a .NET developer, I guess LINQ will handle the pain for you. I believe your RDBMS is smart enough to handle your queries with great performance. One thing to take note of is your query structure: write your queries with performance in mind. As I said, LINQ should do this best... cheers


 