
Flink incremental checkpoint with RocksDB uses a lot of memory

I'm using incremental checkpoints in Flink with RocksDB, running in a container environment. As far as I know, RocksDB uses a lot of memory when taking incremental checkpoints; there is already a JIRA issue (https://issues.apache.org/jira/browse/FLINK-7289) describing this problem. I have tried adjusting my RocksDB configuration, but my container is still killed because of OOM. Here is the monitoring page: my container gets killed, restarts, and is then killed again.

[Image: screenshot of the monitoring page]

Here is my configuration:

import org.apache.flink.contrib.streaming.state.OptionsFactory;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.CompactionStyle;
import org.rocksdb.DBOptions;

public class BackendOptions implements OptionsFactory {

    @Override
    public DBOptions createDBOptions(DBOptions dbOptions) {
        return dbOptions
                .setIncreaseParallelism(4)   // background threads for flush and compaction
                .setUseFsync(false)          // use fdatasync instead of fsync
                .setMaxOpenFiles(-1);        // -1: keep all files open
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions columnFamilyOptions) {
        return columnFamilyOptions
                .setCompactionStyle(CompactionStyle.LEVEL)
                .setLevelCompactionDynamicLevelBytes(true)
                .setTargetFileSizeBase(256 * 1024 * 1024)      // 256 MB
                .setWriteBufferSize(64 * 1024 * 1024)          // 64 MB per memtable
                .setMaxBytesForLevelBase(1024 * 1024 * 1024)   // 1 GB
                .setMinWriteBufferNumberToMerge(2)
                .setMaxWriteBufferNumber(5)                    // up to 5 memtables per column family
                .setOptimizeFiltersForHits(true)
                .setTableFormatConfig(
                        new BlockBasedTableConfig()
                                .setBlockCacheSize(256 * 1024 * 1024)  // 256 MB
                                .setBlockSize(128 * 1024)              // 128 KB
                                .setCacheIndexAndFilterBlocks(true)
                );
    }
}

I take a checkpoint every minute, and the state size is about 5 GB. Can somebody help me, or tell me the right way to use incremental checkpoints?
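To put the configuration above into numbers, here is a rough sketch of the native memory it can request. It assumes that memtables and the block cache dominate RocksDB's footprint, and that every state (column family) gets its own write buffers and its own block cache; the class name, the helper methods, and the figure of 4 states per slot are hypothetical and only serve the illustration. This memory is allocated outside the JVM heap, so it counts against the container limit regardless of the heap settings.

```java
public class RocksDbMemoryEstimate {

    /** Worst-case memtable bytes for one column family: all write buffers full. */
    static long memtableBytes(long writeBufferSize, int maxWriteBufferNumber) {
        return writeBufferSize * maxWriteBufferNumber;
    }

    /** Rough bound for one column family: memtables plus its own block cache. */
    static long perColumnFamilyBytes(long writeBufferSize,
                                     int maxWriteBufferNumber,
                                     long blockCacheSize) {
        return memtableBytes(writeBufferSize, maxWriteBufferNumber) + blockCacheSize;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;

        // Values taken from the question's ColumnFamilyOptions.
        long perCf = perColumnFamilyBytes(64 * mb, 5, 256 * mb);

        // Hypothetical: number of RocksDB-backed states sharing one container.
        int statesPerSlot = 4;

        System.out.println("per column family: " + perCf / mb + " MB");
        System.out.println("for " + statesPerSlot + " states: "
                + statesPerSlot * perCf / mb + " MB");
    }
}
```

With the values from the question this gives 576 MB per column family (5 x 64 MB of write buffers plus a 256 MB block cache), so a handful of states can already approach the memory limit of a small container.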

This appears to be fixed in newer versions of Flink (1.10 and above). The question dates from 2019, and the linked issue was closed in February 2020.

Details can be found here: https://issues.apache.org/jira/browse/FLINK-7289
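As far as I know, the change in Flink 1.10 is that RocksDB's block cache and write buffers can be bounded by Flink's managed memory, controlled by the `state.backend.rocksdb.memory.managed` option. The sketch below only renders the relevant `flink-conf.yaml` entries from Java to stay in the language of the question; the class and method names are hypothetical, and the values are illustrative rather than a tuned recommendation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ManagedRocksDbConfig {

    /** Flink 1.10+ configuration keys relevant to bounding RocksDB memory. */
    static Map<String, String> settings() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("state.backend", "rocksdb");
        conf.put("state.backend.incremental", "true");
        // Let Flink size RocksDB's cache and write buffers from managed memory.
        conf.put("state.backend.rocksdb.memory.managed", "true");
        return conf;
    }

    /** Renders the settings as flink-conf.yaml lines, one "key: value" per line. */
    static String toYaml(Map<String, String> conf) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            sb.append(e.getKey()).append(": ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(toYaml(settings()));
    }
}
```

After upgrading, it is worth checking whether a custom options factory like the one in the question still sets its own block cache and write buffer sizes, since those settings were written for the unbounded setup.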
