
Fiware: Data loss prevention

I'm working with version 0.27.0 of the context broker. I'm using the Cygnus generic enabler, and I have set up an MQTT agent that connects external devices to the context broker.

My major concern right now is how to prevent data loss. I set up the context broker and Cygnus MongoDB databases as replica sets, but that alone won't ensure that all data is persisted into the databases. I have seen that Cygnus uses Apache Flume. Looking at its configuration, the number of re-injection retries can be configured:

# Number of channel re-injection retries before a Flume event is definitely discarded (-1 means infinite retries) 
cygnusagent.sources.http-source.handler.events_ttl = -1

Is it a good idea to set the retries value to -1? I have read about events being re-injected into the channel forever. What can be done to ensure that all the data is persisted? Is there any functionality in the FIWARE ecosystem oriented to that purpose?

Regarding Cygnus, the TTL is indeed the way to control persistence retries after an error. A retry means the data is re-injected into the internal channel that connects the source (which receives Orion notifications) to the sink (which persists the data in the final storage), so that persistence can be attempted again later.

Possible values for this TTL are:

  • TTL = 0: there are no retries, i.e. if notified data cannot be persisted in the final storage at the first attempt (because of a network failure, a storage error, whatever), the data is dropped.
  • TTL > 0: there are as many retries as the configured TTL. Once the TTL is exhausted, the data is dropped.
  • TTL = -1: infinite retries, i.e. the data is re-injected into the channel forever until it is persisted or the channel gets full.
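
The three cases above can be illustrated with a small simulation. This is only a sketch of the semantics as described here, not Cygnus code: the function name `persist_with_ttl`, its parameters, and the `max_attempts` safety cap are hypothetical, and the channel filling up is not modelled.

```python
def persist_with_ttl(ttl, storage_ok_on_attempt, max_attempts=1000):
    """Simulate persistence attempts for one event under a given TTL.

    ttl: 0 (no retries), >0 (that many retries), -1 (infinite retries).
    storage_ok_on_attempt: 1-based attempt at which the storage first
        accepts the event, or None if it never recovers (hypothetical input).
    max_attempts: safety cap so the simulation ends when ttl is -1.
    Returns a tuple (outcome, attempts_made).
    """
    attempts = 0
    retries_left = ttl
    while attempts < max_attempts:
        attempts += 1
        if storage_ok_on_attempt is not None and attempts >= storage_ok_on_attempt:
            return ("persisted", attempts)
        if retries_left == 0:
            # No retries left: the event is discarded.
            return ("dropped", attempts)
        if retries_left > 0:
            retries_left -= 1
        # retries_left == -1: the event is re-injected without limit.
    return ("still-in-channel", attempts)


if __name__ == "__main__":
    print(persist_with_ttl(0, 2))      # storage fails once, no retries
    print(persist_with_ttl(2, 3))      # two retries, storage OK on 3rd attempt
    print(persist_with_ttl(-1, None))  # storage never recovers
```

With TTL = -1 and a storage that never recovers, the event never leaves the channel, which is the situation where channel capacity becomes the limiting factor.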

As noted, a TTL of -1 may exhaust the channel capacity if the final storage never recovers, which prevents newly received data from being put into the channel. Nevertheless, if the final storage never recovers, such a drawback does not matter, right? :)

Thus, we could say the rules for choosing a TTL are:

  • If you don't want retries, simply configure 0.
  • If you want retries but you don't mind losing data after a certain number of retries, then configure a positive value.
  • If you want retries and you don't want to lose data, then configure -1 and a large channel capacity, since the final storage may be down for an unknown time.
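
How large "large" should be depends on your notification rate and on how long an outage you want to survive. A rough back-of-the-envelope estimate could look like the sketch below; the function name, the example figures, and the safety factor are assumptions for illustration, not values recommended by Cygnus.

```python
import math


def min_channel_capacity(events_per_second, outage_seconds, safety_factor=1.5):
    """Estimate how many events the channel must hold during a storage outage.

    events_per_second: expected notification rate (assumed figure).
    outage_seconds: longest storage outage you want to ride out (assumed figure).
    safety_factor: extra headroom on top of the raw estimate (hypothetical default).
    """
    return math.ceil(events_per_second * outage_seconds * safety_factor)


if __name__ == "__main__":
    # Example: 10 notifications per second, storage down for one hour.
    print(min_channel_capacity(10, 3600))
```

The resulting number is what you would compare against the capacity configured for the Flume channel used by your Cygnus agent.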

In any case, the TTL feature is changing during this sprint. The behaviour will be the same, but instead of being applied to single events, it will be applied to batches of events (a batch may contain just 1 single event, of course). You'll see this change in the next release of Cygnus (0.13.0), which will be available at the end of February 2016 (at the moment of writing this, next week :)). My recommendation is to wait for that release if you want to use the TTL feature intensively.
