
Flink maintaining configuration state

I have a use case of maintaining configuration in Flink that I don't really know how to handle.

Let's say that I have some configuration stored somewhere and I need it to do my processing. At the initialization of the Flink job, I want to load all the configuration.

This configuration can also be modified while the Flink job is running, so I must keep the state of this configuration in memory and update it when needed. The configuration updates are available from a KafkaSource.

So here is what I have:

I have a function that loads the whole configuration, keeps it in state, and associates it with my data stream:

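import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.types.Row;
import org.apache.flink.util.Collector;

public class MyConfiguration extends RichFlatMapFunction<Row, Row> {
    private transient MapState<String, MyConfObject> configuration;

    @Override
    public void open(Configuration parameters) throws Exception {
        MapStateDescriptor<String, MyConfObject> descriptor = new MapStateDescriptor<>(
                "configuration",
                BasicTypeInfo.STRING_TYPE_INFO,
                ...
        );
        configuration = getRuntimeContext().getMapState(descriptor);
        configuration.putAll(...);   // Load configuration from somewhere
    }

    @Override
    public void flatMap(Row value, Collector<Row> out) throws Exception {
        MyConfObject conf = configuration.get(...);
        ...               // Associate conf with data
        out.collect(value);
    }
}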

And my pipeline looks like this:

DataStream<Row> dataStream = ...; // My data stream
DataStream<Map<String, MyConfObject>> streamConf =
     env.addSource(new FlinkKafkaConsumer<Row>(..., ..., ...)) // The stream of configuration updates
        .map(...); 

return dataStream
    .assignTimestampsAndWatermarks(...)
    .flatMap(new MyConfiguration())

    ... //Do some processing

    .map(m -> {
        ObjectMapper objectMapper = new ObjectMapper();
        String json = objectMapper.writeValueAsString(m);
        return json.getBytes();
    });

What I want is to use the stream of configuration updates, streamConf, to update the state variable inside the MyConfiguration flat map function. How can I do that?

I'd suggest that you write a source that reads config info from Kafka and then broadcasts changes to the config via a broadcast stream to the mapping function. The mapping function would store the complete, current config in its persisted state, and the broadcast stream means that all instances of the mapping function would get all config changes.
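For illustration, here is a minimal sketch of how that suggestion could look with Flink's broadcast state API (BroadcastProcessFunction). It is not the asker's actual job: the class name ConfigBroadcastSketch, the wire helper, and the simplified types are assumptions made for the example. Data elements are plain String keys and each config update is a Tuple2 of (key, value), standing in for Row and MyConfObject.

```java
import org.apache.flink.api.common.state.BroadcastState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.state.ReadOnlyBroadcastState;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

// Hypothetical sketch: all names here are illustrative, not from the original post.
final class ConfigBroadcastSketch {

    // The same descriptor is used to create the broadcast stream and to access the state.
    static final MapStateDescriptor<String, String> CONFIG_DESCRIPTOR =
            new MapStateDescriptor<>(
                    "configuration",
                    BasicTypeInfo.STRING_TYPE_INFO,
                    BasicTypeInfo.STRING_TYPE_INFO);

    // IN1 = data element (a config key), IN2 = config update (key, value), OUT = enriched text.
    static class EnrichWithConfig
            extends BroadcastProcessFunction<String, Tuple2<String, String>, String> {

        @Override
        public void processElement(String value, ReadOnlyContext ctx, Collector<String> out)
                throws Exception {
            // The data side only gets read-only access to the broadcast state.
            ReadOnlyBroadcastState<String, String> config =
                    ctx.getBroadcastState(CONFIG_DESCRIPTOR);
            String conf = config.get(value); // null if no update for this key has arrived yet
            out.collect(value + " -> " + conf);
        }

        @Override
        public void processBroadcastElement(
                Tuple2<String, String> update, Context ctx, Collector<String> out)
                throws Exception {
            // Every parallel instance receives every update and applies it to its own copy.
            BroadcastState<String, String> config = ctx.getBroadcastState(CONFIG_DESCRIPTOR);
            config.put(update.f0, update.f1);
        }
    }

    // Connects the data stream with the broadcast config updates.
    static DataStream<String> wire(
            DataStream<String> data, DataStream<Tuple2<String, String>> updates) {
        BroadcastStream<Tuple2<String, String>> broadcast = updates.broadcast(CONFIG_DESCRIPTOR);
        return data.connect(broadcast).process(new EnrichWithConfig());
    }
}
```

In the pipeline from the question, something like wire(...) would take the place of .flatMap(new MyConfiguration()), with streamConf as the updates stream. Two caveats: MapState obtained through getRuntimeContext().getMapState(...) is keyed state and needs a keyed stream, whereas broadcast state does not; and Flink gives no ordering guarantee between the two inputs, so data elements can arrive before the matching config update, and the sketch does not handle the initial load.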
