How to implement application level pagination over ScalarDB
This question is part Cassandra and part ScalarDB. I am using ScalarDB, which provides ACID support on top of Cassandra. The library seems to be working well. Unfortunately, ScalarDB doesn't support pagination, so I have to implement it in the application.
Consider this scenario, in which P is the primary key, C is the clustering key and E is other data within the partition:
Partition => { P,C1,E1
P,C2,E1
P,C2,E2
P,C2,E3
P,C2,E4
P,C3,E1
...
P,Cm,En
}
In ScalarDB, I can specify start and end values of keys, so I suppose ScalarDB will get data only from the specified rows. I can also limit the number of entries fetched.
https://scalar-labs.github.io/scalardb/javadoc/com/scalar/db/api/Scan.html
Say I want to get entries E3 and E4 from P,C2. For a small number of entries, I can specify both the start and end clustering keys as C2, set the fetch limit to, say, 4, and ignore E1 and E2. But if there are several hundred records, this method will not scale.
For example, say P,C1 has 10 records, P,C2 has 100 records, and I want to implement pagination of 20 records per query. To implement this, I'll have to do the following.
Query 1 - Scan - the primary key will be P, the clustering start will be C1, and the clustering end will be Cn, as I don't know how many records there are.
Fetch P,C1. This will give 10 records.
Fetch P,C2. This will give me 20 records.
Combine P,C1's 10 with P,C2's first 10 and return the result.
I'll also have to remember that the last clustering key queried was C2 and that 10 records were fetched from it.
Query 2 (for the next pagination request) - Scan - the primary key will be P, the clustering start will be C2, and the clustering end will be Cn, as I don't know how many records there are. Now I'll fetch P,C2 and get 20 records, ignore the first 10 (as they were sent last time), take the remaining 10, do another fetch using the same Scan, and take the first 10 from that.
Is this how it should be done, or is there a better way? My concern with the above implementation is that every time I'll have to fetch loads of records and discard them. For example, say I want to get records 70-90 from P,C2; then I'll still query up to record 60 and discard the result!
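The cost described above can be made concrete with a small in-memory model. This is only a sketch, not ScalarDB code: `scan_with_limit` is a hypothetical stand-in for a scan with a fetch limit, the 100 records mirror the P,C2 example, and records 71 to 90 are used as an illustrative 20-record page.

```python
def scan_with_limit(rows, limit):
    """Hypothetical stand-in for a scan with a fetch limit: first `limit` rows."""
    return rows[:limit]


def page_by_discarding(rows, first, last):
    """Return rows first..last (1-based, inclusive) by reading from the
    start of the range and throwing away everything before `first`."""
    fetched = scan_with_limit(rows, last)
    wanted = fetched[first - 1:]
    return wanted, len(fetched), len(fetched) - len(wanted)


# 100 records under P,C2, modelled as a plain sorted list.
records = [f"E{i}" for i in range(1, 101)]

page, fetched, discarded = page_by_discarding(records, 71, 90)
print(len(page), fetched, discarded)  # prints: 20 90 70
```

In this model, serving one page of 20 means reading 90 rows and discarding 70, and the waste grows with the page number.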
Partition keys and clustering keys together compose a primary key, so your example above doesn't look right. Let's assume the following data structure.
P, C1, ...
P, C2, ...
P, C3, ...
...
Anyway, I think one way to do it could be as follows, assuming the page size is 2.
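One technique that fits this setup is keyset (cursor) pagination: remember the last clustering key returned and start the next scan just after it. The sketch below is an in-memory illustration under stated assumptions, not ScalarDB code and not necessarily the approach intended here. `scan_after` and `pages` are hypothetical names, the partition is modelled as a sorted list of clustering keys, and the scan is assumed to accept an exclusive start key and a limit.

```python
from bisect import bisect_left, bisect_right


def scan_after(keys, start=None, inclusive=True, limit=None):
    """Hypothetical stand-in for a scan inside one partition.

    keys: sorted list of clustering keys.
    start: clustering key to start from, or None for the beginning.
    inclusive: whether `start` itself is part of the result.
    limit: maximum number of keys to return, or None for no limit.
    """
    if start is None:
        i = 0
    elif inclusive:
        i = bisect_left(keys, start)
    else:
        i = bisect_right(keys, start)
    end = len(keys) if limit is None else i + limit
    return keys[i:end]


def pages(keys, page_size):
    """Yield pages, resuming each scan just after the last key returned."""
    last = None
    while True:
        page = scan_after(keys, start=last, inclusive=False, limit=page_size)
        if not page:
            return
        yield page
        last = page[-1]  # the cursor the application keeps between requests


keys = ["C1", "C2", "C3", "C4", "C5"]
result = list(pages(keys, 2))
print(result)  # prints: [['C1', 'C2'], ['C3', 'C4'], ['C5']]
```

Each request reads only the rows it returns, so nothing is fetched and discarded. The application only has to carry the last clustering key between requests.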