Handling failing seed nodes in Astyanax Cassandra API
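Maybe I have misunderstood how automatic node discovery works in the Astyanax Cassandra API, but here is my problem:
I have the following setup:
2 data centers with 2 nodes each and a replication factor of 2.
DC1: N1 and N2, DC2: N3 and N4
The seed nodes are N1 and N3 (these are also the ones given to the application). Automatic discovery of the other nodes (N2 and N4) seems to work, even though they do not show up in the host pool.
If N3 fails, data is written correctly to N4 and is also synchronized correctly to N3 once that node comes back. The same holds for N1 and N2.
The problem occurs when both seed nodes (N1 and N3) fail. Data is then not written to N2 and N4 as I expected; instead, an exception causes the application to fail. (When only one seed node is down, Astyanax writes exception messages to the log, but this normally does not cause the application to fail.)
Obviously the seed nodes have to be online when the application starts, but I thought automatic node discovery in Astyanax would cope with seed nodes failing later, so that the replica nodes could take over (using a consistency level of CL_ONE).
Is there any way to avoid this failure, or have I misunderstood automatic node discovery, or am I simply doing something badly wrong?
Some additional information: the nodes mostly use the default settings in cassandra.yaml, and the tokens were generated with the Python script suggested in the documentation.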
    private AstyanaxContext<Cluster> connect(final String hosts) {
        AstyanaxConfigurationImpl asConfig = new AstyanaxConfigurationImpl();
        asConfig.setDefaultWriteConsistencyLevel(ConsistencyLevel.CL_ONE);
        asConfig.setDefaultReadConsistencyLevel(ConsistencyLevel.CL_ONE);
        AstyanaxContext<Cluster> context = new AstyanaxContext.Builder()
                .forCluster("TestSuitCluster")
                .withAstyanaxConfiguration(
                        asConfig.setDiscoveryType(NodeDiscoveryType.TOKEN_AWARE)
                                .setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE))
                .withConnectionPoolConfiguration(
                        new ConnectionPoolConfigurationImpl(
                                "CassandraConnectionPool").setSeeds(hosts)
                                .setMaxConnsPerHost(8).setMaxConns(8))
                .withConnectionPoolMonitor(new ConnectionPoolMonitor())
                .buildCluster(ThriftFamilyFactory.getInstance());
        context.start();
        return context;
    }
The stack trace shown when the last seed node goes down:
com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=127.0.0.1(127.0.0.1):9160, latency=2000(2000), attempts=1]Timed out waiting for connection
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.waitForConnection(SimpleHostConnectionPool.java:218)
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.borrowConnection(SimpleHostConnectionPool.java:185)
at com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.borrowConnection(RoundRobinExecuteWithFailover.java:66)
at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:67)
at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256)
at com.netflix.astyanax.thrift.ThriftClusterImpl.describeKeyspaces(ThriftClusterImpl.java:165)
at com.netflix.astyanax.thrift.ThriftClusterImpl.describeKeyspace(ThriftClusterImpl.java:184)
at at.dbeg.cassandra.CasandraTestSuit.deleteKeyspace(CasandraTestSuit.java:134)
at at.dbeg.cassandra.CasandraTestSuit.runTests(CasandraTestSuit.java:189)
at at.dbeg.cassandra.CasandraTestSuit.main(CasandraTestSuit.java:50)
com.netflix.astyanax.connectionpool.exceptions.ConnectionAbortedException: ConnectionAbortedException: [host=127.0.0.1(127.0.0.1):9160, latency=0(0), attempts=1]org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset by peer: socket write error
at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:193)
at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65)
at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28)
at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151)
at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69)
at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256)
at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:485)
at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:79)
at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$6$3.execute(ThriftKeyspaceImpl.java:355)
at at.dbeg.cassandra.CasandraTestSuit.testWrite(CasandraTestSuit.java:269)
at at.dbeg.cassandra.CasandraTestSuit.runTests(CasandraTestSuit.java:168)
at at.dbeg.cassandra.CasandraTestSuit.main(CasandraTestSuit.java:50)
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset by peer: socket write error
at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156)
at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65)
at org.apache.cassandra.thrift.Cassandra$Client.send_insert(Cassandra.java:833)
at org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:822)
at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$6$3$1.internalExecute(ThriftKeyspaceImpl.java:367)
at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$6$3$1.internalExecute(ThriftKeyspaceImpl.java:358)
at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
... 10 more
Caused by: java.net.SocketException: Connection reset by peer: socket write error
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
... 17 more
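I think I finally found the answer. In a cluster context this is not possible without a HostSupplier of your own. The simplest way to solve it is to iterate over all keyspaces in the cluster and use the logic of RingDescribeHostSupplier to find all hosts.
If you use this HostSupplier and set it in the AstyanaxContext, you get the expected behavior.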
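To make the idea concrete, here is a minimal, library-independent sketch of the merging logic described above. It is not the code I actually used and it does not call Astyanax: hosts are plain strings, each per-keyspace lookup is a java.util.function.Supplier standing in for a RingDescribeHostSupplier, and the class name MergedHostSupplier is made up for illustration. Falling back to the last known host list when every lookup fails is an extra assumption of this sketch, not something taken from the answer. Wiring the result into the AstyanaxContext builder is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: merges the hosts reported by several per-keyspace
 * ring lookups into one de-duplicated list and remembers the last good result.
 * Hosts are plain strings here; a real version would use the library's host type.
 */
class MergedHostSupplier implements Supplier<List<String>> {
    private final List<Supplier<List<String>>> perKeyspaceSuppliers;
    // Last non-empty result, returned when no lookup succeeds (assumption of this sketch).
    private volatile List<String> lastKnownHosts = Collections.emptyList();

    MergedHostSupplier(List<Supplier<List<String>>> perKeyspaceSuppliers) {
        this.perKeyspaceSuppliers = new ArrayList<>(perKeyspaceSuppliers);
    }

    @Override
    public List<String> get() {
        // LinkedHashSet removes duplicates while keeping first-seen order.
        Set<String> merged = new LinkedHashSet<>();
        for (Supplier<List<String>> supplier : perKeyspaceSuppliers) {
            try {
                merged.addAll(supplier.get());
            } catch (RuntimeException e) {
                // A single ring lookup may fail while a node is down; skip it.
            }
        }
        if (merged.isEmpty()) {
            // Nothing could be looked up right now: keep the previous view.
            return lastKnownHosts;
        }
        lastKnownHosts = Collections.unmodifiableList(new ArrayList<>(merged));
        return lastKnownHosts;
    }

    public static void main(String[] args) {
        List<Supplier<List<String>>> lookups = new ArrayList<>();
        // Stand-ins for two keyspaces whose rings report overlapping hosts.
        lookups.add(() -> Arrays.asList("N1", "N2", "N3"));
        lookups.add(() -> Arrays.asList("N3", "N4"));
        MergedHostSupplier hosts = new MergedHostSupplier(lookups);
        System.out.println(hosts.get());
    }
}
```

The point of the design is that the host list no longer depends on the seed nodes alone: any keyspace whose ring can still be described contributes hosts, so N2 and N4 stay known to the pool after N1 and N3 go down.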