What should I do about "com.datastax.driver.core.exceptions.ReadTimeoutException"?
I put almost 190 million records into a Cassandra (2.1.11) cluster with 3 nodes, and the replication factor is 1. Then I wrote a client application to count all the records using DataStax's Java Driver. The code snippet is as follows:
Statement stmt = new SimpleStatement("select * from test");
System.out.println("starting to read records ");
stmt.setFetchSize(10000);
ResultSet rs = session.execute(stmt);
//System.out.println("rs.size " + rs.all().size());
long cntRecords = 0;
for (Row row : rs) {
    cntRecords++;
    if (cntRecords % 10000000 == 0) {
        System.out.println("the " + cntRecords / 10000000 + " X 10 millions of records");
    }
}
After the above variable cntRecords goes past 30 million, I always get this exception:
Exception in thread "main" com.datastax.driver.core.exceptions.ReadTimeoutException:
Cassandra timeout during read query at consistency ONE (1 responses were required but only
0 replica responded)
I found several results on Google and changed the settings for heap and GC; the following are my relevant settings:
-XX:InitialHeapSize=17179869184
-XX:MaxHeapSize=17179869184
-XX:MaxNewSize=12884901888
-XX:MaxTenuringThreshold=1
-XX:NewSize=12884901888
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseCompressedOops
-XX:+UseConcMarkSweepGC
-XX:+UseCondCardMark
-XX:+UseGCLogFileRotation
-XX:+UseParNewGC
-XX:+UseTLAB
-XX:+UseThreadPriorities
-XX:+CMSClassUnloadingEnabled
I used GCViewer to analyze the GC log files, and the throughputs are 99.95%, 98.15% and 95.75%.
UPDATED BEGIN: I used jstat to monitor one of the three nodes and found that as soon as S1's value changes to 100.00, I quickly get the above error:
/usr/java/jdk1.7.0_80/bin/jstat -gcutil 8862 1000
S0 S1 E O P YGC YGCT FGC FGCT GCT
0.00 100.00 28.57 36.29 74.66 55 14.612 2 0.164 14.776
And once S1 changes to 100.00, it never decreases again. I don't know whether this is related to the error. Or is there some property in cassandra.yaml or cassandra-env.sh that I should set for this?
What should I do to finish the task of counting all the records? Thanks in advance!
ATTACH: the following are the other JVM options:
-XX:+CMSEdenChunksRecordAlways
-XX:CMSInitiatingOccupancyFraction=75
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSParallelRemarkEnabled
-XX:CMSWaitDuration=10000
-XX:CompileCommandFile=bin/../conf/hotspot_compiler
-XX:GCLogFileSize=94371840
-XX:+HeapDumpOnOutOfMemoryError
-XX:NumberOfGCLogFiles=90
-XX:OldPLABSize=16
-XX:PrintFLSStatistics=1
-XX:+PrintGC
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCDateStamps
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintHeapAtGC
-XX:+PrintPromotionFailure
-XX:+PrintTenuringDistribution
-XX:StringTableSize=1000003
-XX:SurvivorRatio=8
-XX:ThreadPriorityPolicy=42
-XX:ThreadStackSize=256
Examine why you need to know the number of rows. Does your application really need to know this? If it can survive with "just" a good approximation, then create a counter and increment it as you load your data.
http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_counter_t.html
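A minimal sketch of that counter approach. The keyspace test_ks, table record_count, and row key 'test' are hypothetical names for illustration; the CQL counter syntax itself is standard. The sketch only builds the statement strings you would pass to session.execute(...):

```java
public class CounterSketch {
    // Hypothetical keyspace/table names; adjust to your schema.
    // A counter column can only live in a table whose non-key columns
    // are all counters.
    static final String CREATE_COUNTER_TABLE =
        "CREATE TABLE IF NOT EXISTS test_ks.record_count ("
      + "name text PRIMARY KEY, total counter)";

    // Run one increment per inserted row, or one per batch of n rows
    // during your bulk load.
    static String incrementBy(long n) {
        return "UPDATE test_ks.record_count SET total = total + " + n
             + " WHERE name = 'test'";
    }

    // Reading the count is then a single-partition read:
    // no full-table scan, so no range-read timeout.
    static final String READ_COUNT =
        "SELECT total FROM test_ks.record_count WHERE name = 'test'";

    public static void main(String[] args) {
        System.out.println(CREATE_COUNTER_TABLE);
        System.out.println(incrementBy(10000));
        System.out.println(READ_COUNT);
    }
}
```

Note that counters are not perfectly exact under retries (an increment replayed after a timeout can double-count), which is why this gives "just" a good approximation rather than an exact total.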
Things you can try:
* Select a single column instead of *, e.g. select column1 from test. This might reduce GC pressure and network consumption.
* Go to cassandra.yaml on your nodes and increase range_request_timeout_in_ms and read_request_timeout_in_ms.
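For the second suggestion, the relevant cassandra.yaml fragment looks roughly like this. The two setting names come from the answer above; the values shown are illustrative, not recommendations, and every node needs the change plus a restart:

```yaml
# cassandra.yaml
# Timeout for single-partition reads (2.1 default is 5000 ms):
read_request_timeout_in_ms: 20000
# Timeout for range scans such as "select * from test"
# (2.1 default is 10000 ms) -- this is the one a full-table
# count is most likely to hit:
range_request_timeout_in_ms: 60000
```

Keep in mind the Java driver also has its own client-side read timeout (SocketOptions.setReadTimeoutMillis, 12000 ms by default in driver 2.x); if you raise only the server-side values, the client may still give up first.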