
How should I set up Kafka partitions vs topics in my setup?

I'm looking to use Kafka as my event store/stream for orders; here are a few attributes:

  • I have two regions to cater for: London and New York
  • An order started in London is highly likely to have further events (updates) come from London, but we do need to support cross-regional reads/writes (i.e. for an order started in London, writes can come from New York)
  • The business would benefit from lower latency, so cross-region writes (London writing to New York or vice versa) should be minimized
  • An order has a lifetime of 24h; it can be archived from the event log at that point as we no longer need it
  • Need resiliency: if the London Kafka cluster goes down, I should be able to fail over to New York and vice versa
  • Ordering of the events needs to be consistent across all regions
  • Order numbers are only in the 1000s per 24h.

So I'm trying to get my setup of Kafka correct so I can minimise the amount of work I have to do external to Kafka, so my concerns/questions are:

  1. Originating region seems like a natural partitioning key, but as far as I can see I gain nothing from partitioning a topic...I could just have 2 topics, one for London, one for New York? Am I correct?
  2. As far as I can see, in order to have the ability to fail over, I need to set up two SEPARATE clusters and use MirrorMaker to sync the two topics across regions. But this would mean I would need to build logic into my applications so that they publish an event to the correct cluster - am I understanding correctly? Is there any way I can set up Kafka so I don't have to do this, and instead just connect to the local cluster and read/write to that, letting the cluster take care of where it routes the events?

You might want to look into the "rack awareness" configuration for brokers, which enables rack-aware partition replication. It is mostly used to reduce cross-availability-zone traffic; see the Kafka documentation on follower fetching (KIP-392). The gist is that your consumers can fetch records from the "nearest" replica. In your case a consumer sitting in London might only fetch data from brokers in London, assuming you operate a single cross-region cluster.
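To make that concrete, here is a minimal config sketch, assuming a single stretched cluster with brokers in both regions and Kafka 2.4+ (the rack names are placeholders you would choose yourself):

```properties
# server.properties on each broker: tag the broker with its region
# (use "newyork" on the New York brokers)
broker.rack=london

# broker-side: enable rack-aware follower fetching (KIP-392)
replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector

# consumer config: fetch from the nearest replica instead of the leader
client.rack=london
```

With this in place, replicas are spread across both regions for resiliency, while London consumers read from London replicas.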

Concerning latency: if you don't have any sub-second requirements, I would highly recommend operating a single cluster instead of two. The latency between the US east coast and the UK shouldn't be too bad. Keep it simple; Kafka is very robust and can handle most faults within a single cluster (e.g. a broker dying). Start with a single cluster in one location; you will still be able to add a second one later and migrate your data over using MirrorMaker or a dedicated service.
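If you do later outgrow a single cluster, MirrorMaker 2 replication between two clusters is driven by a single properties file. A minimal sketch, assuming cluster aliases "london" and "newyork", a topic named "orders", and placeholder broker addresses:

```properties
# mm2.properties -- run with: bin/connect-mirror-maker.sh mm2.properties
clusters = london, newyork
london.bootstrap.servers = london-broker:9092
newyork.bootstrap.servers = newyork-broker:9092

# replicate the orders topic from London to New York
london->newyork.enabled = true
london->newyork.topics = orders
```

Note that by default MirrorMaker 2 prefixes replicated topics with the source cluster alias (the topic arrives as `london.orders` on the New York cluster), so consumers there need to subscribe to the remote name.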

This would also mean you don't end up with the "same" topic twice, once per region. Separate your topics based on their content, not their location. Otherwise you'll have lots of fun when migrating the data format you use for orders. You want to be as flexible as possible for future changes.
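On the ordering requirement: within a single topic, keying records by order ID (rather than by region) guarantees per-order event ordering, because Kafka sends all records with the same key to the same partition. A minimal sketch of that idea in Python (Kafka's actual default partitioner uses murmur2; the md5 hash and partition count here are illustrative stand-ins):

```python
import hashlib

NUM_PARTITIONS = 6  # assumed partition count for a hypothetical "orders" topic

def partition_for(order_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Deterministically map an order ID to a partition, mimicking keyed
    partitioning: same key -> same partition -> events stay ordered."""
    digest = hashlib.md5(order_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Every event for one order lands on the same partition, so Kafka
# preserves their relative order for all consumers of that partition.
update_1 = partition_for("LDN-2024-0001")
update_2 = partition_for("LDN-2024-0001")
assert update_1 == update_2
```

At your volume (thousands of orders per day), a single keyed topic with a handful of partitions is more than enough throughput-wise.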
