How to get Cassandra set size?
I want to store information about events in Cassandra. Events belong to different groups and are also grouped by time interval (group id = partition key, interval = clustering key). Each event has an id, and within each group I want to store only events with a unique id. I am thinking of using sets to store the event ids. Something like this:
group id (PK) | time (CK) | event ids
1 | 13:00 | {0, 2, 4, 5}
1 | 14:00 | {1, 3}
1 | 15:00 | {}
2 | 13:00 | {}
2 | 14:00 | {2, 4}
When I run a select query, I want to get the event counts for a specific group within some time range. For the table above, group 1 and time range 13:00 - 15:00, the result would be:
13:00 - 4
14:00 - 2
15:00 - 0
I can select all the event sets for group 1 in the time range 13:00 - 15:00 and calculate their sizes myself. That works, but the event sets can be large, and I don't need the event ids themselves (I store them only for uniqueness), only the set sizes. Can I get the set sizes on the Cassandra side using CQL?
Don't use collections for large amounts of data.
Collection (Set): collection size: 2B (2^31); values size: 65535 (2^16-1) (Cassandra 2.1 and later, using native protocol v3)
Instead, put event_id in the primary key.
CREATE TABLE events(
group_id bigint,
time bigint,
event_id bigint,
PRIMARY KEY(group_id,time,event_id)
);
You can insert data like this:
INSERT INTO events (group_id , time , event_id ) VALUES ( 1, 13, 0);
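The reason this schema keeps event ids unique is that a Cassandra INSERT is an upsert: writing a row whose full primary key already exists overwrites that row instead of adding a second one. The following is only a toy in-memory model of that behaviour in Python, not a Cassandra client; the class name `EventsTable` and its methods are hypothetical and exist just for illustration. Uniqueness here is per (group_id, time, event_id), which matches the one-set-per-interval layout in the question.

```python
# Toy in-memory model of the events table; this is NOT a Cassandra client.
# Assumption: a row is identified by its full primary key
# (group_id, time, event_id), and INSERT overwrites an existing key (upsert).


class EventsTable:
    def __init__(self):
        # Set of primary key tuples; the table has no non-key columns.
        self._rows = set()

    def insert(self, group_id, time, event_id):
        # Re-inserting an existing key leaves the set unchanged.
        self._rows.add((group_id, time, event_id))

    def count(self, group_id, time):
        # Number of distinct event ids stored for one group and interval.
        return sum(1 for g, t, _ in self._rows if g == group_id and t == time)


table = EventsTable()
table.insert(1, 13, 0)
table.insert(1, 13, 0)  # duplicate event id in the same group and interval
table.insert(1, 13, 2)
print(table.count(1, 13))  # prints 2
```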
And you can query it like this:
SELECT * FROM events WHERE group_id = 1;
It will return all the events in the group:
group_id | time | event_id
----------+------+----------
1 | 13 | 0
1 | 13 | 1
1 | 14 | 2
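Use Spark, or write a program, to compute the counts per group and interval.
Or use one of these queries to get a count. Note that the range query returns a single total for the whole range; on Cassandra 3.10 and later, adding GROUP BY time to it returns one count per interval.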
SELECT group_id,time,count(*) FROM events WHERE group_id = 1 AND time = 13; // To count in a group and time
SELECT group_id,time,count(*) FROM events WHERE group_id = 1 AND time >= 13 AND time <= 14; // To count in a group between time 13 to 14.
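For the "write a program" route, a minimal client-side sketch in Python could look like the following. It assumes the rows have already been fetched as (group_id, time, event_id) tuples and that time is an integer bucket with step 1, as in the example table; the function name `counts_per_interval` is hypothetical. Because an interval without events has no rows at all in this schema, the sketch fills in zeros explicitly.

```python
from collections import Counter


def counts_per_interval(rows, group_id, start, end):
    """Count events per time bucket for one group, including empty buckets.

    rows: iterable of (group_id, time, event_id) tuples, e.g. query results.
    Assumes integer time buckets with step 1, as in the example table.
    """
    counts = Counter(
        t for g, t, _ in rows if g == group_id and start <= t <= end
    )
    # Buckets with no rows are absent from the table, so fill them with 0.
    return {t: counts.get(t, 0) for t in range(start, end + 1)}


# Same data as the table in the question, one tuple per event.
rows = [
    (1, 13, 0), (1, 13, 2), (1, 13, 4), (1, 13, 5),
    (1, 14, 1), (1, 14, 3),
    (2, 14, 2), (2, 14, 4),
]
print(counts_per_interval(rows, 1, 13, 15))  # prints {13: 4, 14: 2, 15: 0}
```

This reproduces the counts the question asks for (4, 2 and 0 for group 1), at the cost of transferring every row in the range to the client.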
Source: https://docs.datastax.com/en/cql/3.1/cql/cql_reference/refLimits.html