
Hints are timing out in Cassandra

We are seeing a lot of hints timing out, but I don't see anything in the logs about nodes going DOWN. It seems strange to me that Cassandra is building up the hints table if it does not think any node is down. I don't see any GC pauses either.

Can someone help me solve this problem?

INFO [HintedHandoff:2] 2015-03-11 01:56:00,958 HintedHandOffManager.java (line 469) Timed out replaying hints to /1.1.1.79; aborting (0 delivered)
INFO [HintedHandoff:1] 2015-03-11 02:03:54,914 HintedHandOffManager.java (line 469) Timed out replaying hints to /1.1.1.76; aborting (0 delivered)

The fact that you have hints on that node indicates that the node itself is up. What this log says is that nodes 1.1.1.79 and 1.1.1.76 are down or, more likely, flapping. You should check their status. Run nodetool tpstats on these nodes; if they are up, look for any dropped mutations. Inspect their logs as well.
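As a rough aid for the tpstats check, here is a minimal Python sketch that pulls the dropped-message counts out of nodetool tpstats output. The function name dropped_messages and the SAMPLE text are hypothetical, the numbers in SAMPLE are invented for illustration, and the parser assumes the usual "Message type / Dropped" section layout, which may differ between Cassandra versions.

```python
def dropped_messages(tpstats_text):
    """Return {message_type: dropped_count} parsed from `nodetool tpstats` text.

    Assumes the output contains a section headed "Message type ... Dropped"
    whose rows start with the message type followed by an integer count.
    """
    dropped = {}
    in_section = False
    for line in tpstats_text.splitlines():
        if line.startswith("Message type"):
            in_section = True
            continue
        fields = line.split()
        if in_section and len(fields) >= 2 and fields[1].isdigit():
            dropped[fields[0]] = int(fields[1])
    return dropped


# Illustrative input only: these values are made up, not taken from a real node.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0        1234567         0                 0

Message type           Dropped
READ                         0
MUTATION                   412
"""

if __name__ == "__main__":
    for message_type, count in dropped_messages(SAMPLE).items():
        print(message_type, count)
```

On a real node you would feed it the text captured from running nodetool tpstats on 1.1.1.79 and 1.1.1.76; a non-zero MUTATION count there would suggest those nodes are shedding writes even though they are up.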

If you want to reproduce this behaviour, unplug the network cable from a machine for 5 seconds every 10 seconds, 10 times in a row.

Here are some extracts from another machine's /var/log/cassandra/system.log:

INFO [HintedHandoff:2] 2016-10-27 14:20:00,333 HintedHandOffManager.java:486 - Timed out replaying hints to /192.168.0.178; aborting (0 delivered)
INFO [HintedHandoff:1] 2016-10-27 14:26:13,393 HintedHandOffManager.java:367 - Started hinted handoff for host: fa16996c-722c-458b-a621-eb53efa79fb2 with IP: /192.168.0.178
INFO [HintedHandoff:1] 2016-10-27 14:28:27,959 HintedHandOffManager.java:486 - Timed out replaying hints to /192.168.0.178; aborting (28850 delivered)
INFO [HintedHandoff:2] 2016-10-27 14:36:17,398 HintedHandOffManager.java:367 - Started hinted handoff for host: fa16996c-722c-458b-a621-eb53efa79fb2 with IP: /192.168.0.178

I understand that sometimes it times out before the actual stream starts:

aborting (0 delivered)

Sometimes it aborts after the stream has started, reporting how many hints were sent and apparently setting something like a marker so it knows where to resume streaming next time:

aborting (28850 delivered)
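To see which targets time out most often, and whether any hints got through before the abort, the "Timed out replaying hints" lines can be tallied per host. This is a minimal Python sketch rather than anything Cassandra ships: the names ABORT_RE, summarize_hint_timeouts and SAMPLE_LOG are hypothetical, and the regular expression is based only on the message format seen in the log lines quoted in this thread.

```python
import re
from collections import defaultdict

# Matches e.g. "Timed out replaying hints to /1.1.1.79; aborting (0 delivered)"
ABORT_RE = re.compile(
    r"Timed out replaying hints to /([^;\s]+); aborting \((\d+) delivered\)"
)


def summarize_hint_timeouts(log_text):
    """Count hint replay timeouts per target IP found in system.log text."""
    summary = defaultdict(
        lambda: {"timeouts": 0, "zero_delivered": 0, "delivered": 0}
    )
    # findall also copes with several log entries pasted onto one line.
    for ip, delivered in ABORT_RE.findall(log_text):
        entry = summary[ip]
        entry["timeouts"] += 1
        entry["delivered"] += int(delivered)
        if int(delivered) == 0:
            entry["zero_delivered"] += 1
    return dict(summary)


# The abort messages quoted in this thread, trimmed to the relevant part.
SAMPLE_LOG = """\
Timed out replaying hints to /1.1.1.79; aborting (0 delivered)
Timed out replaying hints to /1.1.1.76; aborting (0 delivered)
Timed out replaying hints to /192.168.0.178; aborting (0 delivered)
Timed out replaying hints to /192.168.0.178; aborting (28850 delivered)
"""

if __name__ == "__main__":
    for ip, stats in summarize_hint_timeouts(SAMPLE_LOG).items():
        print(ip, stats)
```

A host that only ever shows "0 delivered" is probably unreachable or overloaded at the moment replay starts, while a host with a non-zero delivered count is accepting hints but not fast enough to finish within the timeout.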
