
Cassandra timeout during read query at consistency ONE/LOCAL_QUORUM

Table Structure:

CREATE TABLE tablename(
col1 text,
col2 text,
col3 timestamp,
col4 timestamp,
col5 text,
col6 timestamp,
.
.
PRIMARY KEY (col5, col6))
WITH CLUSTERING ORDER BY (col6 DESC);

CREATE CUSTOM INDEX indexname on tablename (col1) USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX indexname on tablename (col2) USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX indexname on tablename (col3) USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX indexname on tablename (col4) USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX indexname on tablename (col6) USING 'StorageAttachedIndex';

Read Query:

select col1, col2, col3, col4, col.... from tablename
where col1='text'
and col2='text'
and col3>'timestamp'
and col4>='timestamp'
and col4<='timestamp'
PER PARTITION LIMIT 1;

In Java, I have written code that executes a query to fetch 100,000 records with the following configuration:

  1. executeAsync
  2. Fetch_Size = 10000
  3. Not using ALLOW FILTERING
  4. DSE - 6.8.9
  5. CQL - 3.4.5
  6. Cassandra - 4.0.0.681
  7. Java driver - 4.6.1

When I run the code, it works perfectly and responds in around 1 min 20 sec for 100,000 rows.

But when I try to run it in more than 2 windows in parallel, only one window shows the result and the other windows throw a timeout error:

Cassandra timeout during read query at consistency ONE

"When I run the code, it works perfectly and responds in around 1 min 20 sec"

TBH I'm surprised this returns a result set at all. Cassandra was not designed to support OLAP or queries requiring filtering on many different columns.

The reason it's timing out is that queries based on a secondary index (or multiple indexes, in this case) put extra stress on one node. When they run, a "coordinator" node is selected. That node is then responsible for pulling data from all of the other nodes and assembling the result set (in RAM).

The default timeouts are set with the specific intent of stopping queries like this, because they can (and often do) cause nodes to crash. I imagine that supporting two similar queries in parallel is too much for the cluster to handle.

The way around this is to ensure that your queries always filter on a partition key (col5 in this case). Single-partition queries ensure that only a single node will be queried. That's why the idea with Cassandra is to build your tables around the intended queries. In this case, building a query table with partition keys of col1 and col2 would help to ensure that. Adding clustering keys of col3 and col4 will help for your other conditions:

PRIMARY KEY ((col1, col2),col3,col4)
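To make that key definition concrete, here is a minimal sketch of what such a query table and a single-partition read might look like. The table name tablename_by_col1_col2 and the literal values are placeholders invented for illustration, and only the columns shown in the question are listed. Rows that share all four key values would overwrite each other, so further clustering columns may be needed for uniqueness. Also note that CQL normally accepts a range restriction only on the last restricted clustering column, so this sketch applies the range to col3 and assumes the col4 bounds are checked by the application.

```cql
-- Hypothetical query table: the name and column subset are assumptions.
CREATE TABLE tablename_by_col1_col2 (
    col1 text,
    col2 text,
    col3 timestamp,
    col4 timestamp,
    col5 text,
    col6 timestamp,
    PRIMARY KEY ((col1, col2), col3, col4)
);

-- Single-partition read: both partition key columns are fixed with '=',
-- so no secondary index or ALLOW FILTERING is involved.
-- The literal values below are placeholders.
SELECT col1, col2, col3, col4, col5, col6
FROM tablename_by_col1_col2
WHERE col1 = 'text'
  AND col2 = 'text'
  AND col3 > '2024-01-01 00:00:00+0000';
```

Because the partition key is fully specified, the coordinator only needs to contact the replicas that own that one partition rather than every node in the cluster.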

Of course, I'm building that definition without an understanding of the cardinality of col1 or col2. As Cassandra has a partition limit of 2GB and 2 billion cells, it's always a good idea to keep your partition sizes much lower than that. If the partitions would otherwise grow too large, adding another partition key column and running more than one query, each over a smaller part of the data set, would be the way to go.
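One possible way to add that extra partition key is a time bucket. The sketch below assumes a hypothetical bucket_day column (the calendar day of col3, computed by the application at write time) and a hypothetical table name; neither comes from the original schema, and the right bucket size depends on how much data arrives per col1/col2 pair.

```cql
-- Hypothetical bucketed variant: bucket_day and the table name are assumptions.
CREATE TABLE tablename_by_col1_col2_day (
    col1 text,
    col2 text,
    bucket_day date,   -- assumed: calendar day of col3, set by the application on write
    col3 timestamp,
    col4 timestamp,
    col5 text,
    col6 timestamp,
    PRIMARY KEY ((col1, col2, bucket_day), col3, col4)
);

-- One small single-partition query per bucket, instead of one large query.
-- The literal values below are placeholders.
SELECT col1, col2, col3, col4, col5, col6
FROM tablename_by_col1_col2_day
WHERE col1 = 'text' AND col2 = 'text' AND bucket_day = '2024-01-01'
  AND col3 > '2024-01-01 12:00:00+0000';

SELECT col1, col2, col3, col4, col5, col6
FROM tablename_by_col1_col2_day
WHERE col1 = 'text' AND col2 = 'text' AND bucket_day = '2024-01-02';
```

The application would issue one such query for each day in the requested range and merge the results itself, which keeps each partition, and each read, bounded in size.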

I recommend checking out DataStax Academy, specifically the (free) course DS220 on Data Modeling.

