Handling failing seed nodes in Astyanax Cassandra API

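Maybe I misunderstand how auto-discovery of nodes works in the Astyanax Cassandra API, but here is my problem:

I have the following setup:

2 data centers with 2 nodes each, and a replication factor of 2.

DC1: N1 and N2, DC2: N3 and N4

The seed nodes are N1 and N3 (these are also the hosts given to the application). Auto-discovery of the other nodes (N2 and N4) seems to work, even though they do not show up in the host pool.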

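If N3 fails, data is correctly written to N4 and is also correctly synchronized to N3 when that node comes back up. The same goes for N1 and N2.

The problem occurs when both seed nodes (N1 and N3) fail. Data is then no longer written to N2 and N4 (as I would have expected); instead, exceptions cause the application to fail. (When only one seed node is down, Astyanax writes exception messages to the log, but that does not normally cause the application to fail.)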

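Obviously the seed nodes have to be online when the application starts, but I thought that auto-discovery of nodes in Astyanax would compensate for failing seed nodes, so that the replica nodes could take over (using a consistency level of CL_ONE).

Is there any way to avoid this failure, or did I simply misunderstand node auto-discovery, or am I doing something terribly wrong?

Some additional information: the nodes mostly use the default settings from cassandra.yaml, and the tokens were generated with the Python script suggested in the documentation.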

private AstyanaxContext<Cluster> connect(final String hosts) {
    AstyanaxConfigurationImpl asConfig = new AstyanaxConfigurationImpl();
    asConfig.setDefaultWriteConsistencyLevel(ConsistencyLevel.CL_ONE);
    asConfig.setDefaultReadConsistencyLevel(ConsistencyLevel.CL_ONE);
    AstyanaxContext<Cluster> context = new AstyanaxContext.Builder()
            .forCluster("TestSuitCluster")
            .withAstyanaxConfiguration(
                    asConfig.setDiscoveryType(NodeDiscoveryType.TOKEN_AWARE)
                    .setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE))
            .withConnectionPoolConfiguration(
                    new ConnectionPoolConfigurationImpl(
                            "CassandraConnectionPool").setSeeds(hosts)
                            .setMaxConnsPerHost(8).setMaxConns(8))
            .withConnectionPoolMonitor(new ConnectionPoolMonitor())
            .buildCluster(ThriftFamilyFactory.getInstance());
    context.start();
    return context;
}

Stack traces shown when the last seed node goes away:

com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=127.0.0.1(127.0.0.1):9160, latency=2000(2000), attempts=1]Timed out waiting for connection
    at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.waitForConnection(SimpleHostConnectionPool.java:218)
    at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.borrowConnection(SimpleHostConnectionPool.java:185)
    at com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.borrowConnection(RoundRobinExecuteWithFailover.java:66)
    at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:67)
    at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256)
    at com.netflix.astyanax.thrift.ThriftClusterImpl.describeKeyspaces(ThriftClusterImpl.java:165)
    at com.netflix.astyanax.thrift.ThriftClusterImpl.describeKeyspace(ThriftClusterImpl.java:184)
    at at.dbeg.cassandra.CasandraTestSuit.deleteKeyspace(CasandraTestSuit.java:134)
    at at.dbeg.cassandra.CasandraTestSuit.runTests(CasandraTestSuit.java:189)
    at at.dbeg.cassandra.CasandraTestSuit.main(CasandraTestSuit.java:50)    
com.netflix.astyanax.connectionpool.exceptions.ConnectionAbortedException: ConnectionAbortedException: [host=127.0.0.1(127.0.0.1):9160, latency=0(0), attempts=1]org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset by peer: socket write error
    at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:193)
    at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65)
    at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28)
    at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151)
    at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69)
    at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256)
    at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:485)
    at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:79)
    at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$6$3.execute(ThriftKeyspaceImpl.java:355)
    at at.dbeg.cassandra.CasandraTestSuit.testWrite(CasandraTestSuit.java:269)
    at at.dbeg.cassandra.CasandraTestSuit.runTests(CasandraTestSuit.java:168)
    at at.dbeg.cassandra.CasandraTestSuit.main(CasandraTestSuit.java:50)
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset by peer: socket write error
    at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
    at org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:156)
    at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65)
    at org.apache.cassandra.thrift.Cassandra$Client.send_insert(Cassandra.java:833)
    at org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:822)
    at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$6$3$1.internalExecute(ThriftKeyspaceImpl.java:367)
    at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$6$3$1.internalExecute(ThriftKeyspaceImpl.java:358)
    at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
    ... 10 more
Caused by: java.net.SocketException: Connection reset by peer: socket write error
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
    at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
    ... 17 more 

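I think I finally found the answer. In a clustered environment this is not possible without supplying your own HostSupplier. The easiest way to solve it is to iterate over all keyspaces in the cluster and use the logic of RingDescribeHostSupplier to find all hosts.

If this HostSupplier is used and set in the AstyanaxContext, the expected behavior shows up.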
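
To illustrate the idea, here is a minimal sketch in plain Java with no Astyanax dependency. The names RingSource and AllKeyspacesHostSupplier are hypothetical stand-ins, not Astyanax classes, and hosts are plain strings rather than Astyanax Host objects. In a real application the ring data would presumably come from iterating the cluster's keyspaces (the stack trace above shows a describeKeyspaces call) and describing each keyspace's ring as RingDescribeHostSupplier does, and the resulting supplier would be registered on the context builder. The exact supplier type and builder method depend on the Astyanax version, so check them against your release. Remembering the last known host list is a design choice of this sketch, not something the answer prescribes.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical stand-in for "describe the ring of every keyspace in the cluster". */
interface RingSource {
    /** Maps keyspace name to one endpoint list per token range. */
    Map<String, List<List<String>>> describeRings();
}

/**
 * Supplies the union of all endpoints found in any keyspace's ring, so the
 * connection pool knows every node and not only the configured seeds.
 */
final class AllKeyspacesHostSupplier implements Supplier<List<String>> {
    private final RingSource ringSource;
    // Last successfully discovered host list; starts out as the seed list.
    private volatile List<String> lastKnown;

    AllKeyspacesHostSupplier(RingSource ringSource, List<String> seeds) {
        this.ringSource = ringSource;
        this.lastKnown = new ArrayList<>(seeds);
    }

    @Override
    public List<String> get() {
        // LinkedHashSet removes duplicates while keeping discovery order.
        Set<String> hosts = new LinkedHashSet<>();
        try {
            for (List<List<String>> tokenRanges : ringSource.describeRings().values()) {
                for (List<String> endpoints : tokenRanges) {
                    hosts.addAll(endpoints);
                }
            }
        } catch (RuntimeException e) {
            // Ring lookup failed (for example, no node reachable right now):
            // fall through and return the last known list instead.
        }
        if (!hosts.isEmpty()) {
            lastKnown = new ArrayList<>(hosts);
        }
        return new ArrayList<>(lastKnown);
    }
}
```

With the setup from the question, a ring description that mentions N1 through N4 makes the supplier return all four nodes, so the pool can still reach N2 and N4 after both seeds (N1 and N3) are gone. If the ring lookup itself fails, the sketch returns the most recent list it managed to discover.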
