I have a problem with KSQL: data is lost while updating a stream (terminating its query, dropping the stream, and creating a new stream) while data is still being published to 'MainTopic'.
My KSQL architecture is:
MAIN_STREAM ----> CONDITION_STREAM
I lose the data that arrives between terminating the query / dropping the stream and creating the new CONDITION_STREAM.
Can anyone suggest a way to update CONDITION_STREAM while data is still being published to 'MainTopic', so that the new stream continues consuming from the point where the old query was terminated?
I tried 'auto.offset.reset'='earliest' for CONDITION_STREAM, but that reprocesses all of the data in MainTopic through MAIN_STREAM.
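For reference, that offset-reset setting is applied per ksql session before the CREATE statement is issued, which is why it affects every query started afterwards in that session:

```sql
-- Applies to all queries started after this point in the current ksql session.
-- 'earliest' makes a newly created query read MainTopic from the beginning,
-- which is exactly the full-reprocessing behaviour described above.
SET 'auto.offset.reset' = 'earliest';
```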
STEP 1: create MAIN_STREAM over the main topic
CREATE STREAM MAIN_STREAM (event_id VARCHAR, event VARCHAR, payload VARCHAR) WITH (KAFKA_TOPIC='MainTopic', VALUE_FORMAT='json');
STEP 2: create CONDITION_STREAM, filtering data from MAIN_STREAM
CREATE STREAM CONDITION_STREAM WITH (KAFKA_TOPIC='OutputTopic', VALUE_FORMAT='json') AS SELECT * FROM MAIN_STREAM WHERE event = 'payment';
STEP 3: terminate the query backing CONDITION_STREAM
TERMINATE <CONDITION_STREAM_QUERY_ID>;
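The query id to terminate can be looked up first; the id shown below is illustrative, as ksqlDB generates it (typically a CSAS_... name for a CREATE STREAM AS SELECT):

```sql
-- List running persistent queries; the one writing to OutputTopic
-- is the CSAS query backing CONDITION_STREAM.
SHOW QUERIES;

-- Example output contains an id such as CSAS_CONDITION_STREAM_0 (illustrative),
-- which is what gets passed to TERMINATE.
```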
STEP 4: drop CONDITION_STREAM
DROP STREAM CONDITION_STREAM;
STEP 5: create the new CONDITION_STREAM
CREATE STREAM CONDITION_STREAM WITH (KAFKA_TOPIC='OutputTopic', VALUE_FORMAT='json') AS SELECT * FROM MAIN_STREAM WHERE event = 'something' AND EXTRACTJSONFIELD(payload, '$.status') = 'init';
Recreating CONDITION_STREAM and having it pick up processing of its MAIN_STREAM input from where it left off is not currently possible (as of ksqlDB 0.10).
However, it's an area receiving a lot of thought and attention. There is a design proposal in review at the current time to take the first steps towards allowing existing stream and table views to be updated in place. See https://github.com/confluentinc/ksql/pull/5611 for more info.
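For readers on later versions: to the best of my knowledge the work linked above shipped as CREATE OR REPLACE (around ksqlDB 0.12), which upgrades a persistent query in place so it keeps its committed consumer offsets instead of being terminated and recreated. A sketch, assuming a version that supports it and the same streams as above:

```sql
-- In-place upgrade of the persistent query behind CONDITION_STREAM.
-- The running query keeps its consumer offsets, so no MainTopic records
-- are skipped or reprocessed during the change. Note that only certain
-- kinds of change (e.g. to the WHERE clause or projection) are permitted.
CREATE OR REPLACE STREAM CONDITION_STREAM
  WITH (KAFKA_TOPIC='OutputTopic', VALUE_FORMAT='json') AS
  SELECT *
  FROM MAIN_STREAM
  WHERE event = 'something'
    AND EXTRACTJSONFIELD(payload, '$.status') = 'init';
```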