
Design a share, re-share functionality for a website, avoiding duplication

Original post: 2014-06-29 21:00:06 · Tags: algorithm, data-structures, language-agnostic

This is an interesting interview question that I found somewhere. To elaborate:

You are expected to design classes and data structures for a website such as Facebook or LinkedIn, where your activity can be shared and re-shared. The design should avoid redundancy and duplication.

While thinking about this problem, I got stuck on the "link vs. copy" problem, as discussed here.

But since the problem states that duplication should be avoided, I decided to go the "link" way. This makes sharing and re-sharing easier but makes deleting very difficult: if the original user deletes their post, all the shares should be deleted too. (Programmatically speaking, all the objects pointing to the particular activity should be set to null. The difficult part is finding all of those pointing objects.)

2 answers

Wouldn't it be better to keep the shares? The original user deletes their post, fine, it's gone. But everyone who has linked to it should not suddenly have it disappear on them.

This could be done the way Unix handles hard links. "Deleting" just means removing one link to an object (an inode, in Unix terms). You don't remove the object itself until the link count is zero.
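As a rough illustration of the hard-link idea, here is a minimal in-memory sketch in Python. The names `Content`, `ContentStore`, `post`, `share`, and `delete` are hypothetical and chosen only for this example; a real site would keep this state in a database rather than in dictionaries.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Content:
    """The stored activity, playing the role of a Unix inode."""
    body: str
    link_count: int = 0


class ContentStore:
    """Hypothetical store: posts and shares are both just links to content."""

    def __init__(self):
        self._contents = {}   # content_id -> Content
        self._links = {}      # link_id -> content_id
        self._ids = count(1)  # ids are never reused

    def post(self, body):
        """Store the body once and return the author's link to it."""
        content_id = next(self._ids)
        self._contents[content_id] = Content(body)
        return self._add_link(content_id)

    def share(self, link_id):
        """Share or re-share: add a link to the same content, copying nothing."""
        return self._add_link(self._links[link_id])

    def _add_link(self, content_id):
        link_id = next(self._ids)
        self._links[link_id] = content_id
        self._contents[content_id].link_count += 1
        return link_id

    def delete(self, link_id):
        """Remove one link; drop the content only when no links remain."""
        content_id = self._links.pop(link_id)
        content = self._contents[content_id]
        content.link_count -= 1
        if content.link_count == 0:
            del self._contents[content_id]

    def read(self, link_id):
        return self._contents[self._links[link_id]].body

    def stored_bodies(self):
        return len(self._contents)
```

In this sketch the author's own post is just the first link, so deleting it leaves every share readable, and the body is stored exactly once no matter how many times it is re-shared.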

It's not obvious from the original specification that deletion should work as you describe. It might be desired that when the original user deletes the item, it is not deleted elsewhere; in that case you don't necessarily need to track all references. Just keep a reference count on each post, and remove it from the database only when the count hits zero.

If you do want the behavior you describe, it may be achievable by simply removing broken links as and when you encounter them, again relieving you of the need to track each reference. The cost of tracking and updating every reference to every post is replaced with the comparable cost of one failed lookup for each referring page. The latter approach is simpler to implement, though, and the cost doesn't hit your server all at once.
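The lazy clean-up described here could look roughly like the following Python sketch. The `Site` class and its method names are hypothetical, and the sketch assumes post IDs are never reused, since a reused ID would make a stale link resolve to an unrelated post.

```python
class Site:
    """Hypothetical in-memory model: feeds hold post ids, never copies."""

    def __init__(self):
        self.posts = {}  # post_id -> body
        self.feeds = {}  # user -> list of post ids (own posts and shares alike)

    def post(self, user, post_id, body):
        self.posts[post_id] = body
        self.feeds.setdefault(user, []).append(post_id)

    def share(self, user, post_id):
        # A share is only a link to the existing post.
        self.feeds.setdefault(user, []).append(post_id)

    def delete_post(self, post_id):
        # Constant-time delete: no attempt to find who shared the post.
        self.posts.pop(post_id, None)

    def render_feed(self, user):
        # Each dangling id costs one failed lookup and is pruned on the spot.
        live = [pid for pid in self.feeds.get(user, []) if pid in self.posts]
        self.feeds[user] = live
        return [self.posts[pid] for pid in live]
```

Deleting a post touches only the post itself; each feed that linked to it pays for one failed lookup the next time it is rendered, which spreads the clean-up cost over time.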

In real life, I would implement all references as bidirectional anyway, because it's likely to be needed sooner or later as you add features. For example, a "like" counter seems pretty simple, but to prevent duplicate votes you need to keep track of who has liked each item, and then if you want to remove their "like" when they delete their profile, you need to keep a list of each user's outbound "likes" too.
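To make the "like" example concrete, here is a small Python sketch of a bidirectional index. `LikeIndex` and its methods are hypothetical names for illustration; in a database this would more likely be a single table of (user, item) pairs indexed on both columns.

```python
from collections import defaultdict


class LikeIndex:
    """Hypothetical bidirectional index of likes."""

    def __init__(self):
        self.likers_of = defaultdict(set)  # item_id -> users who liked it
        self.likes_by = defaultdict(set)   # user -> item ids they liked

    def like(self, user, item_id):
        """Record a like; return False if this user already liked the item."""
        if user in self.likers_of[item_id]:
            return False
        self.likers_of[item_id].add(user)
        self.likes_by[user].add(item_id)
        return True

    def count(self, item_id):
        return len(self.likers_of.get(item_id, ()))

    def remove_user(self, user):
        """On profile deletion, follow the user's outbound links to undo likes."""
        for item_id in self.likes_by.pop(user, set()):
            self.likers_of[item_id].discard(user)
```

The forward map is what prevents duplicate votes, and the reverse map is what makes profile deletion proportional to the number of likes that user made rather than to the number of items on the site.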

It takes a lot of database activity to implement something like Facebook...