Datastax Cassandra Java driver RetryPolicy for Statement with paging

I'm running a query that fetches millions of rows (around 5,000,000). My nodes seem to be quite busy, as the coordinator returns a com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ONE (1 responses were required but only 0 replica responded). (I don't really know whether the nodes are busy or something else is going on.)

So far I've tried setting a higher read_request_timeout_in_millis on every Cassandra node and executing the query like this:

// Note: the table name is missing from the query string in the original post.
Statement statement = new SimpleStatement("SELECT * FROM where date = ? ", param1)
    .setFetchSize(pageSize)
    .setConsistencyLevel(ConsistencyLevel.ONE)
    .setReadTimeoutMillis(ONE_DAY_IN_MILLIS);
ResultSet resultSet = this.session.execute(statement);

But the exception is still being thrown. My next move is to try a custom RetryPolicy, but can someone tell me whether a read-timeout retry will execute the whole query again or retry from the page that failed?

I was trying something like this:

@Override
public RetryDecision onReadTimeout(Statement statement, ConsistencyLevel cl, int requiredResponses, int receivedResponses, boolean dataRetrieved, int nbRetry) {
    if (dataRetrieved) {
        return RetryDecision.ignore();
    } else if (nbRetry < readRetries) {
        LOGGER.info("Retry attempt {} out of {}", nbRetry, readRetries);
        return RetryDecision.retry(cl);
    } else {
        return RetryDecision.rethrow();
    }
}

where readRetries is the number of times I will retry fetching the data.

When you set a fetch size on a query, the driver never fetches the whole result up front. Even when you do not specify a fetch size, the driver uses a default of 5000 to avoid overloading memory with too many objects. What happens is that the results are fetched in chunks: the driver issues the query limited to one chunk of rows, and as you iterate over the results and reach the end of a chunk, it issues a query for the next chunk, and so on. All in all, if the number of results is bigger than the fetch size, multiple queries are issued from the driver to the cluster. A nice sequence diagram, along with further explanation, can be found on the official DataStax driver page.
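The chunked fetching described above can be pictured with a small self-contained model. This is only a sketch in plain Java and does not use the DataStax driver: `PagingModel`, `fetchPage`, `requestsFor` and the in-memory row list are hypothetical names standing in for the cluster and for the driver's page requests. It illustrates just one point, that iterating over more rows than the fetch size triggers several requests.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Toy model of driver-side paging; NOT the DataStax API. */
final class PagingModel implements Iterator<Integer> {
    private final List<Integer> table;   // stands in for the rows stored in the cluster
    private final int fetchSize;         // rows requested per page
    private List<Integer> page = new ArrayList<>();
    private int posInPage = 0;
    private int nextOffset = 0;          // plays the role of the paging state
    int requests = 0;                    // number of page requests issued so far

    PagingModel(List<Integer> table, int fetchSize) {
        if (fetchSize <= 0) {
            throw new IllegalArgumentException("fetchSize must be positive");
        }
        this.table = table;
        this.fetchSize = fetchSize;
        fetchPage();                     // executing the query fetches the first page only
    }

    /** Hypothetical page request: returns at most fetchSize rows. */
    private void fetchPage() {
        int end = Math.min(nextOffset + fetchSize, table.size());
        page = table.subList(nextOffset, end);
        nextOffset = end;
        posInPage = 0;
        requests++;
    }

    @Override
    public boolean hasNext() {
        if (posInPage < page.size()) return true;
        if (nextOffset >= table.size()) return false; // nothing left to fetch
        fetchPage();                                   // page exhausted: fetch the next one
        return posInPage < page.size();
    }

    @Override
    public Integer next() {
        if (!hasNext()) throw new NoSuchElementException();
        return page.get(posInPage++);
    }

    /** Iterates over all rows and returns how many page requests were needed. */
    static int requestsFor(int totalRows, int fetchSize) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < totalRows; i++) rows.add(i);
        PagingModel it = new PagingModel(rows, fetchSize);
        while (it.hasNext()) it.next();
        return it.requests;
    }

    public static void main(String[] args) {
        // 12 rows with a fetch size of 5 need three requests (5 + 5 + 2 rows).
        System.out.println(requestsFor(12, 5)); // prints 3
    }
}
```

In this model the number of requests grows with the row count divided by the fetch size, so a result of millions of rows with the default fetch size means on the order of a thousand separate requests, each of which can time out on its own.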

That being said, a RetryPolicy works on a single statement and knows nothing about the fetch size, so that statement is retried as many times as you define (meaning only the chunk that timed out is retried, not the whole query).
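To make the "only that chunk is retried" point concrete, here is a minimal simulation in plain Java. It is a sketch under stated assumptions, not driver code: `PageRetryModel`, `SimulatedTimeout`, `requestPage` and `run` are hypothetical names, and the timeout is injected artificially for one page. The retry rule mirrors the `nbRetry < readRetries` check from the question.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of per-page retries; NOT the DataStax API. */
final class PageRetryModel {
    /** Thrown by the simulated coordinator when a page request times out. */
    static final class SimulatedTimeout extends RuntimeException {}

    final List<Integer> requestLog = new ArrayList<>(); // page index of every request sent
    private final int failingPage;   // index of the page that times out
    private int failuresLeft;        // how many more times that page will fail

    PageRetryModel(int failingPage, int failures) {
        this.failingPage = failingPage;
        this.failuresLeft = failures;
    }

    /** Hypothetical single page request; may time out. */
    private void requestPage(int pageIndex) {
        requestLog.add(pageIndex);
        if (pageIndex == failingPage && failuresLeft > 0) {
            failuresLeft--;
            throw new SimulatedTimeout();
        }
    }

    /** Requests one page, retrying only that page while nbRetry < maxRetries. */
    private void requestPageWithRetry(int pageIndex, int maxRetries) {
        for (int nbRetry = 0; ; nbRetry++) {
            try {
                requestPage(pageIndex);
                return;
            } catch (SimulatedTimeout e) {
                if (nbRetry >= maxRetries) throw e; // retries exhausted: rethrow
            }
        }
    }

    /** Reads pageCount pages in order and returns the log of requests sent. */
    static List<Integer> run(int pageCount, int failingPage, int failures, int maxRetries) {
        PageRetryModel model = new PageRetryModel(failingPage, failures);
        for (int p = 0; p < pageCount; p++) {
            model.requestPageWithRetry(p, maxRetries);
        }
        return model.requestLog;
    }

    public static void main(String[] args) {
        // Page 2 of 4 times out twice; with up to 3 retries only page 2 is repeated.
        System.out.println(run(4, 2, 2, 3)); // prints [0, 1, 2, 2, 2, 3]
    }
}
```

Pages 0 and 1 appear once in the log even though page 2 failed twice, which is the behaviour the answer describes: rows already fetched are not requested again.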
