
How to implement application level pagination over ScalarDB

This question is part Cassandra and part ScalarDB. I am using ScalarDB, which provides ACID support on top of Cassandra. The library seems to be working well. Unfortunately, ScalarDB doesn't support pagination, so I have to implement it in the application.

Consider this scenario, in which P is the primary key, C is the clustering key, and E is other data within the partition:

Partition => { P,C1,E1
P,C2,E1
P,C2,E2
P,C2,E3
P,C2,E4
P,C3,E1
...
P,Cm,En
}

In ScalarDB, I can specify start and end values of keys, so I suppose ScalarDB will get data only from the specified rows. I can also limit the number of entries fetched.

https://scalar-labs.github.io/scalardb/javadoc/com/scalar/db/api/Scan.html

Say I want to get entries E3 and E4 from P,C2. For small volumes, I can specify both the start and end clustering key as C2, set the fetch limit to, say, 4, and ignore E1 and E2. But if there are several hundred records, this method will not scale.
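For illustration, here is a minimal Java sketch of such a Scan, assuming a table with a text partition key p and a text clustering key c, an already-started DistributedTransaction, and the older builder-style Scan API from the javadoc linked above; constructor details may differ in other ScalarDB versions.

import com.scalar.db.api.DistributedTransaction;
import com.scalar.db.api.Result;
import com.scalar.db.api.Scan;
import com.scalar.db.exception.transaction.CrudException;
import com.scalar.db.io.Key;
import com.scalar.db.io.TextValue;
import java.util.List;

class ScanExample {
  // Fetch at most 4 rows of partition P, restricted to clustering key C2, in ascending order.
  static List<Result> scanC2(DistributedTransaction tx) throws CrudException {
    Scan scan = new Scan(new Key(new TextValue("p", "P")))
        .withStart(new Key(new TextValue("c", "C2")), true)   // start clustering key, inclusive
        .withEnd(new Key(new TextValue("c", "C2")), true)     // end clustering key, inclusive
        .withOrdering(new Scan.Ordering("c", Scan.Ordering.Order.ASC))
        .withLimit(4);                                         // limit the number of entries fetched
    return tx.scan(scan);
  }
}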

For example, say P,C1 has 10 records, P,C2 has 100 records, and I want to implement pagination of 20 records per query. To implement this, I'll have to run Query 1 – a Scan where the primary key is P, the clustering start is C1, and the clustering end is Cn, since I don't know how many records there are. Then:

  • Get P,C1. This will give 10 records.
  • Get P,C2. This will give me 20 records. I'll ignore the last 10, combine P,C1's 10 records with P,C2's first 10, and return the result.

I'll also have to keep track of the fact that the last clustering key queried was C2 and that 10 records were already fetched from it.

Query 2 (for the next pagination request) – a Scan where the primary key is P, the clustering start is C2, and the clustering end is Cn, since I don't know how many records there are. Now I'll fetch P,C2 and get 20 records, ignore the first 10 (as they were sent last time), take the remaining 10, do another fetch using the same Scan, and take the first 10 from that.
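To make that bookkeeping concrete, a rough sketch of this offset-style approach might look like the following. The PageCursor holder, the column names p and c, and the helper method are all made up for illustration, and it reuses the older Scan API assumed above. It deliberately over-fetches and skips rows, which is exactly the concern described next.

import com.scalar.db.api.DistributedTransaction;
import com.scalar.db.api.Result;
import com.scalar.db.api.Scan;
import com.scalar.db.exception.transaction.CrudException;
import com.scalar.db.io.Key;
import com.scalar.db.io.TextValue;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical state the application carries between pagination requests.
class PageCursor {
  String lastClusteringKey;   // e.g. "C2": the last clustering key that was queried
  int rowsAlreadyReturned;    // e.g. 10: how many rows under that key were already sent
}

class OffsetPagination {
  // Re-scan from the remembered clustering key, over-fetch, and drop what was already returned.
  static List<Result> nextPage(DistributedTransaction tx, String partition, PageCursor cursor, int pageSize)
      throws CrudException {
    Scan scan = new Scan(new Key(new TextValue("p", partition)))
        .withStart(new Key(new TextValue("c", cursor.lastClusteringKey)), true)
        .withOrdering(new Scan.Ordering("c", Scan.Ordering.Order.ASC))
        .withLimit(cursor.rowsAlreadyReturned + pageSize);
    List<Result> rows = tx.scan(scan);
    return rows.stream().skip(cursor.rowsAlreadyReturned).collect(Collectors.toList());
  }
}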

Is this how it should be done, or is there a better way? My concern with the above implementation is that every time I'll have to fetch loads of records and throw them away. For example, say I want to get records 70-90 from P,C2: I'll still have to query up to record 60 and dump the result!

The partition key and the clustering keys together compose the primary key, so your example above doesn't look right. Let's assume the following data structure:

P, C1, ...
P, C2, ...
P, C3, ...
...

Anyway, I think one of the ways could be as follows, assuming the page size is 2 (a code sketch follows the steps below).

  1. Scan with start (P, C1) inclusive, ascending, and with limit 2. The results are stored in R1.
  2. Get the last record of R1 -> (P, C2).
  3. Scan with start at the previous last record (P, C2), not inclusive, ascending, with limit 2. ...
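A rough Java sketch of these steps, assuming a text partition key p, a text clustering key c, and the older builder-style Scan API; names are illustrative and may need adjusting for your ScalarDB version. The last clustering key of the previous page (step 2) would be read from the final Result, e.g. via Result#getValue, before calling nextPage.

import com.scalar.db.api.DistributedTransaction;
import com.scalar.db.api.Result;
import com.scalar.db.api.Scan;
import com.scalar.db.exception.transaction.CrudException;
import com.scalar.db.io.Key;
import com.scalar.db.io.TextValue;
import java.util.List;

class KeysetPagination {
  static final int PAGE_SIZE = 2;

  // Step 1: the first page starts at clustering key C1, inclusive.
  static List<Result> firstPage(DistributedTransaction tx) throws CrudException {
    Scan scan = new Scan(new Key(new TextValue("p", "P")))
        .withStart(new Key(new TextValue("c", "C1")), true)             // inclusive
        .withOrdering(new Scan.Ordering("c", Scan.Ordering.Order.ASC))
        .withLimit(PAGE_SIZE);
    return tx.scan(scan);
  }

  // Step 3: the next page starts just after the last clustering key of the previous page.
  static List<Result> nextPage(DistributedTransaction tx, String lastClusteringKey) throws CrudException {
    Scan scan = new Scan(new Key(new TextValue("p", "P")))
        .withStart(new Key(new TextValue("c", lastClusteringKey)), false)  // not inclusive
        .withOrdering(new Scan.Ordering("c", Scan.Ordering.Order.ASC))
        .withLimit(PAGE_SIZE);
    return tx.scan(scan);
  }
}

Because each page starts strictly after the previous page's last clustering key, no rows are fetched and then thrown away, which addresses the concern above about getting records 70-90.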
