Modelling a system with users who can send messages to each other in the GAE datastore

Question

I was wondering whether you could help me work out a way to model the following in the GAE datastore so that it is scalable and can be updated frequently. I thought I had a solution, which I described in this question, but while waiting for replies I realised that it might be overly complicated. I explain below why I have kept this as a separate question.

Problem: Building a system with users who can send many messages to each other. Each user must be able to retrieve their messages, as in online chat. I would like to avoid contention when a user receives many messages in a short time.

Solution 1: As mentioned here, I am wondering whether a sharded list can be used to implement this. By this I mean that messages are stored as entities, and the sender and receiver each store the keys of these entities (the messages sent between them) in a list. I thought of sharding because a user who receives many messages would have to update the list frequently, and a sharded approach could prevent datastore contention.

Problem: what happens when the list of keys for, say, a user's received messages gets large? Will appending to it not become slow? I could split the list over several entities, but this would take careful thought about allocation schemes and ways of retrieval. I am willing to do this if it is the best way.
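To make the sharded list idea concrete, here is a minimal in-memory sketch. It is not datastore code: `ShardedInbox` and its methods are hypothetical names, and each inner list only stands in for one datastore entity holding part of a user's received-message keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// In-memory stand-in for a sharded list (hypothetical, not a datastore API).
// Each inner list plays the role of one entity holding some of the keys.
class ShardedInbox {
    private final List<List<String>> shards = new ArrayList<>();

    ShardedInbox(int shardCount) {
        for (int i = 0; i < shardCount; i++) {
            shards.add(new ArrayList<>());
        }
    }

    // A write touches one randomly chosen shard, so concurrent senders
    // rarely update the same entity.
    void append(String messageKey) {
        int shard = ThreadLocalRandom.current().nextInt(shards.size());
        shards.get(shard).add(messageKey);
    }

    // A read has to fetch every shard and merge them; ordering across
    // shards is not preserved.
    List<String> readAll() {
        List<String> all = new ArrayList<>();
        for (List<String> shard : shards) {
            all.addAll(shard);
        }
        return all;
    }
}
```

The trade-off shows up in `readAll`: writes spread out, but every read pays for all shards and has to re-sort the keys (for example by message date) itself.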

Alternative approach: Store messages as entities (as above), but this time give them indexed properties (date, from, to, etc.). Retrieve a user's messages using queries (date greater than ..., from = ..., etc.). This could work well, but I worry: will the indexes degrade as they grow extremely large with many users sending many messages? It seems like it would degrade into an SQL-like system.
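As a rough illustration of the query-based approach, the sketch below models an index over (to, date) as a sorted map and answers "to = X and date > T" with a range scan. This is an in-memory analogy only: `MessageIndex`, the key layout, and the method names are assumptions for illustration, not the datastore's real index format. It also assumes recipient ids contain no `|` and timestamps are non-negative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory stand-in for a composite index on (to, date). Hypothetical names.
class MessageIndex {
    // Sorted map: "recipient|zero-padded timestamp|messageId" -> message id.
    private final NavigableMap<String, String> index = new TreeMap<>();

    private static String key(String to, long dateMillis, String id) {
        // Zero padding makes string order agree with numeric order.
        return to + "|" + String.format("%019d", dateMillis) + "|" + id;
    }

    void put(String messageId, String to, long dateMillis) {
        index.put(key(to, dateMillis, messageId), messageId);
    }

    // Same shape as the query: to == recipient AND date > sinceMillis.
    // Results come back in ascending date order.
    List<String> receivedAfter(String to, long sinceMillis) {
        String from = key(to, sinceMillis + 1, "");
        String until = to + "|~"; // '~' sorts after every digit
        return new ArrayList<>(index.subMap(from, true, until, false).values());
    }
}
```

In this model the lookup walks only the slice of the index belonging to one recipient after one date, rather than the whole index.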

Any thoughts?

I have read about how to model complex relations in the GAE docs, but they use Python for the examples and I am having trouble abstracting the overall design pattern.

Many thanks to anyone with input on this.

PS: At the moment I am using the low-level datastore API directly.

Answer 1

I have created a system similar to this before. I implemented it with a Conversation entity that is the parent of many Message entities. A conversation has two participants (although you could have more), each stored as the key of a User entity.

Something like this (assuming ofy, i.e. Objectify):

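import com.googlecode.objectify.Key;
import com.googlecode.objectify.Ref;
import com.googlecode.objectify.annotation.Entity;
import com.googlecode.objectify.annotation.Id;
import com.googlecode.objectify.annotation.Index;
import com.googlecode.objectify.annotation.Parent;

// One entity per pair of users; root of the entity group for its messages.
@Entity public class Conversation {
    @Id Long id;
    @Index Key<User> participant1;
    @Index Key<User> participant2;
    // Generated UUIDs, so participants are not exposed to each other.
    @Index String participant1ExternalId;
    @Index String participant2ExternalId;
}

// Child of a Conversation, so its messages share one entity group.
@Entity public class Message {
    @Id Long id;
    @Parent Ref<Conversation> conversation;
    @Index String senderExternalId;
    @Index String recipientExternalId;
    String message;
}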

In this way, you can query all conversations for a participant in an eventually consistent fashion, and all messages received or sent (or both) within a conversation in a strongly consistent fashion, because the messages share the conversation as their parent. I had an extra requirement that users must not be able to identify each other, so messaging used generated UUIDs (the externalId properties).
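The two access paths can be sketched in plain Java without the datastore. This is an in-memory model only: `ConversationStore` and its methods are hypothetical names, not Objectify or datastore APIs, and it says nothing about consistency. It just shows which lookups the layout supports.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the Conversation -> Message parent/child layout.
class ConversationStore {
    static final class Conversation {
        final long id;
        final String participant1ExternalId;
        final String participant2ExternalId;

        Conversation(long id, String p1, String p2) {
            this.id = id;
            this.participant1ExternalId = p1;
            this.participant2ExternalId = p2;
        }
    }

    static final class Message {
        final String senderExternalId;
        final String recipientExternalId;
        final String message;

        Message(String sender, String recipient, String message) {
            this.senderExternalId = sender;
            this.recipientExternalId = recipient;
            this.message = message;
        }
    }

    private final Map<Long, Conversation> conversations = new HashMap<>();
    // Messages are grouped under their parent conversation.
    private final Map<Long, List<Message>> messagesByConversation = new HashMap<>();
    private long nextId = 1;

    long startConversation(String p1, String p2) {
        long id = nextId++;
        conversations.put(id, new Conversation(id, p1, p2));
        messagesByConversation.put(id, new ArrayList<>());
        return id;
    }

    void send(long conversationId, String sender, String text) {
        Conversation c = conversations.get(conversationId);
        if (c == null) {
            throw new IllegalArgumentException("unknown conversation");
        }
        String recipient;
        if (sender.equals(c.participant1ExternalId)) {
            recipient = c.participant2ExternalId;
        } else if (sender.equals(c.participant2ExternalId)) {
            recipient = c.participant1ExternalId;
        } else {
            throw new IllegalArgumentException("sender is not a participant");
        }
        messagesByConversation.get(conversationId).add(new Message(sender, recipient, text));
    }

    // Mirrors the two indexed participant properties: a match on either counts.
    List<Long> conversationsFor(String externalId) {
        List<Long> ids = new ArrayList<>();
        for (Conversation c : conversations.values()) {
            if (externalId.equals(c.participant1ExternalId)
                    || externalId.equals(c.participant2ExternalId)) {
                ids.add(c.id);
            }
        }
        Collections.sort(ids);
        return ids;
    }

    // Mirrors a query scoped to one parent conversation, filtered by recipient.
    List<Message> receivedIn(long conversationId, String recipientExternalId) {
        List<Message> received = new ArrayList<>();
        List<Message> all = messagesByConversation.getOrDefault(conversationId, new ArrayList<>());
        for (Message m : all) {
            if (m.recipientExternalId.equals(recipientExternalId)) {
                received.add(m);
            }
        }
        return received;
    }
}
```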

So in this way, sharding and the 1 write/sec limit apply at the conversation level. You could put unread counters on the conversation object for each user, or on each message if you needed to (in terms of contention it makes no real difference, so do whatever makes most sense).
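A minimal sketch of per-user unread counters kept at the conversation level, again in plain Java. `UnreadCounters` is a hypothetical helper, not part of the entities above; in a real datastore model the counters would presumably be properties on the conversation, updated together with the message write.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical unread counters for one two-person conversation.
class UnreadCounters {
    private final String participant1;
    private final String participant2;
    private final Map<String, Integer> unread = new HashMap<>();

    UnreadCounters(String participant1, String participant2) {
        this.participant1 = participant1;
        this.participant2 = participant2;
        unread.put(participant1, 0);
        unread.put(participant2, 0);
    }

    // Sending a message bumps the other participant's counter.
    void onMessageSent(String sender) {
        String recipient;
        if (sender.equals(participant1)) {
            recipient = participant2;
        } else if (sender.equals(participant2)) {
            recipient = participant1;
        } else {
            throw new IllegalArgumentException("sender is not a participant");
        }
        unread.merge(recipient, 1, Integer::sum);
    }

    // Reading the conversation resets that user's counter.
    void markRead(String reader) {
        unread.put(reader, 0);
    }

    int unreadFor(String user) {
        return unread.getOrDefault(user, 0);
    }
}
```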

If your users regularly exceed 1 message per second per conversation, you'll have a lot of other problems to solve beyond datastore contention, so this is probably a good starting point. In the general case, eventual consistency will work very well for this sort of operation (i.e. checking for new messages), so you can lean heavily on that.

Answer 1
0 2014-12-15 09:18:06
