
Google App-Engine Datastore is extremely slow

I need help understanding why the code below takes 3 to 4 seconds.

UPDATE: The use case for my application is to get a person's activity feed since their last login. This feed could contain updates from friends or new items from outside their network that they may find interesting. The Activity table stores all such activities, and when a user logs in, I run a query on the GAE Datastore to return them. My application also supports infinite scrolling, hence I need the cursor feature of GAE. At any given time I fetch around 32 items, but the Activity table could have millions of rows (as it contains data from all users).

Currently the Activity table is small, containing only 25 records, and the Java code below reads only 3 records from it.

Each record in the Activity table has 4 UUID fields.

I cannot imagine how the query would behave if the table contained millions of rows and the result contained hundreds of rows.

Is there something wrong with the code below?

(I am using Objectify and App Engine cursors.)

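import static com.googlecode.objectify.ObjectifyService.ofy;
import com.google.appengine.api.datastore.Cursor;
import com.google.appengine.api.datastore.Query.Filter;
import com.google.appengine.api.datastore.Query.FilterOperator;
import com.google.appengine.api.datastore.Query.FilterPredicate;
import com.google.appengine.api.datastore.QueryResultIterator;
import com.googlecode.objectify.cmd.Query;

Filter filter = new FilterPredicate("creatorID", FilterOperator.EQUAL, userId);
Query<Activity> query = ofy().load().type(Activity.class).filter(filter);
query = query.startAt(Cursor.fromWebSafeString(previousCursorString));
QueryResultIterator<Activity> itr = query.iterator();
while (itr.hasNext())
{
    Activity a = itr.next();
    System.out.println(a);
}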

I have gone through "Google App Engine Application Extremely slow" and verified that the response time improves if I keep refreshing my page (which calls the above code). However, the improvement is only ~30%.

Compare this with any other database, where the response time for such tiny data is in milliseconds, not even hundreds of milliseconds.

Am I wrong to expect regular database performance from the GAE Datastore?

I do not want to turn on memcache just yet, as I want to improve this layer without caching first.

I'm not exactly sure what your query is supposed to do, but it doesn't look like it requires a cursor query. In my humble opinion, the only valid use case for cursor queries is a paginated query with a limited number of result rows. Since your query does not have a limit, I don't see why you would want to use a cursor at all.

When you need millions of results, you're probably doing ad-hoc analysis of data (as no human could ever interpret millions of raw data rows), and you might be better off using BigQuery instead of the App Engine Datastore. I'm just guessing here, but normal front-end apps rarely need millions of rows in a result, only a few (maybe hundreds at times) that you filter from the total available rows.

Another thing:

Are you sure that it is the query that takes so long? It might just as well be the wrapper around the query. Since you are using cursors, you would have to re-issue the query until there are no more results. Handling this could be costly.
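One way to check this is to time the query on its own, apart from printing or any other per-row handling. The following is a minimal, generic sketch in plain Java; `TimingSketch`, `measureNanos` and `median` are hypothetical helper names, and the summing loop is only a stand-in for the real Objectify call, which is assumed to be passed in as a `Supplier`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not part of App Engine or Objectify.
class TimingSketch {

    /**
     * Runs the task `warmups` times without timing it, then `runs` more times
     * with timing. Returns the elapsed nanoseconds of the timed runs, sorted ascending.
     */
    static <T> List<Long> measureNanos(Supplier<T> task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.get(); // warm-up: lets class loading and JIT settle before measuring
        }
        List<Long> samples = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.get();
            samples.add(System.nanoTime() - start);
        }
        Collections.sort(samples);
        return samples;
    }

    /** Middle element of an already sorted, non-empty list (upper median for even sizes). */
    static long median(List<Long> sorted) {
        return sorted.get(sorted.size() / 2);
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption): replace with a lambda that runs
        // only the datastore query and drains its iterator.
        Supplier<Long> fakeQuery = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            return sum;
        };
        List<Long> samples = measureNanos(fakeQuery, 5, 20);
        System.out.printf("min=%d us, median=%d us, max=%d us%n",
                samples.get(0) / 1000,
                median(samples) / 1000,
                samples.get(samples.size() - 1) / 1000);
    }
}
```

Comparing the median for the bare query against the median for the whole request handler should show whether the time is spent in the Datastore call or around it.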

Lastly:

Are you testing on App Engine itself or on the local development server? The devserver obviously cannot simulate a cloud and thus could be slower (or faster) than the real thing at times. The devserver also does not know about instance warmup times when your query spawns new instances.

Speaking of cloud: the point of cloud databases is not that they have the best performance for very little data, but that they scale and perform consistently whether there are a couple of hundred or a couple of billion rows.

Edit:

After performing a retrieval operation, the application can obtain a cursor, which is an opaque base64-encoded string marking the index position of the last result retrieved.

[...]

The cursor's position is defined as the location in the result list after the last result returned. A cursor is not a relative position in the list (it's not an offset); it's a marker to which the Datastore can jump when starting an index scan for results. If the results for a query change between uses of a cursor, the query notices only changes that occur in results after the cursor. If a new result appears before the cursor's position for the query, it will not be returned when the results after the cursor are fetched. (Datastore Queries)
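To make the quoted cursor semantics concrete, here is a minimal in-memory sketch in Java. It is an analogy only: a sorted `TreeSet` stands in for a Datastore index, and `CursorPagingSketch`, `Page` and `fetchPage` are hypothetical names, not part of the App Engine or Objectify APIs. It shows a limit-bounded page fetch that resumes from a marker rather than an offset, and that an entry inserted before the cursor is not returned by later pages.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// In-memory analogy only; a sorted set plays the role of the index.
class CursorPagingSketch {

    /** One page of results plus the marker to resume from. */
    static final class Page {
        final List<Integer> items;
        final Integer nextCursor; // position after the last returned item

        Page(List<Integer> items, Integer nextCursor) {
            this.items = items;
            this.nextCursor = nextCursor;
        }
    }

    /** Returns up to `limit` entries that sort strictly after `cursor` (null means "from the start"). */
    static Page fetchPage(NavigableSet<Integer> index, Integer cursor, int limit) {
        // Jump directly to the marker instead of counting and skipping rows.
        NavigableSet<Integer> rest = (cursor == null) ? index : index.tailSet(cursor, false);
        List<Integer> items = new ArrayList<>();
        for (Integer key : rest) {
            if (items.size() == limit) {
                break;
            }
            items.add(key);
        }
        // If nothing was returned, the cursor stays where it was.
        Integer nextCursor = items.isEmpty() ? cursor : items.get(items.size() - 1);
        return new Page(items, nextCursor);
    }

    public static void main(String[] args) {
        NavigableSet<Integer> index = new TreeSet<>(Arrays.asList(10, 20, 30, 40, 50));

        Page first = fetchPage(index, null, 2);
        System.out.println(first.items + " cursor=" + first.nextCursor);   // [10, 20] cursor=20

        index.add(15); // lands before the cursor: later pages never see it
        index.add(45); // lands after the cursor: a later page picks it up

        Page second = fetchPage(index, first.nextCursor, 2);
        System.out.println(second.items + " cursor=" + second.nextCursor); // [30, 40] cursor=40

        Page third = fetchPage(index, second.nextCursor, 2);
        System.out.println(third.items + " cursor=" + third.nextCursor);   // [45, 50] cursor=50
    }
}
```

In this sketch the cost of a page depends on the page size rather than on how many entries precede the cursor, which is the property the quoted documentation describes for index scans.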

These two statements make me believe that query performance should be consistent with or without cursor queries.

Here are some more things you might want to check:

  • How do you register your entity classes with Objectify?
  • What does your actual test code look like? I'd like to see how and where you measure.
  • Can you share a comparison between the query with and without cursors?
  • The improvement across multiple requests could be the result of Objectify's integrated caching. You might want to disable caching for Datastore performance tests.
