Apache Samza flush table update to changelog immediately

If I specify a changelog backing for a RocksDB Table in Samza, is there a configuration to change how long the asynchronous write to the changelog takes? I want to reduce it to a shorter time. I cannot see anything in the Config reference.

The scenario I want is to write to a changelog from a stream after bridging a legacy JMS connection. This legacy connection provides partial updates, and I want to merge them into fuller messages, building a cache of these messages in the Samza streaming application and writing them to a changelog.
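To make the merging step concrete, here is a minimal sketch in plain Java. It assumes each partial update arrives as a map of field names to values; the class name `PartialUpdateMerger`, the key `order-1`, and the field names are hypothetical, and a `HashMap` stands in for the Samza table or local store.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: merges partial updates into one fuller record per key.
// A plain HashMap stands in for the Samza table / local store (assumption).
class PartialUpdateMerger {
    private final Map<String, Map<String, String>> cache = new HashMap<>();

    // Applies a partial update on top of the cached record for this key
    // and returns a copy of the merged record.
    Map<String, String> merge(String key, Map<String, String> partialUpdate) {
        Map<String, String> merged = cache.computeIfAbsent(key, k -> new HashMap<>());
        merged.putAll(partialUpdate); // newer fields overwrite older ones
        return new HashMap<>(merged);
    }

    public static void main(String[] args) {
        PartialUpdateMerger merger = new PartialUpdateMerger();
        merger.merge("order-1", Map.of("price", "10.5"));
        Map<String, String> full = merger.merge("order-1", Map.of("quantity", "3"));
        System.out.println(full); // the record now holds both fields
    }
}
```

In the real application, the merged record would be the value put into the table so that it is eventually written to the changelog.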

If I use a changelog configured with stores.store-name.changelog, the changes I make to the Samza API Table are eventually written to the changelog, but not quickly enough for my needs, so I want to configure the maximum wait time before changes propagate to the changelog.

Alternatively, it seems that using withSideInputs to bootstrap my table each time and then using sendTo would update faster. I could also keep a LocalStore to read and write the cache, and always have the changelog as the golden source.

The reason I want the changelog to be written quickly is that other applications are reading from it.

Yes, you can configure how often changes are committed to the changelog using the config:

task.commit.ms
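As a rough illustration, a job's properties file might combine the changelog setting with a shorter commit interval. The store name, system name, topic name, and the 5000 ms value below are placeholders rather than recommendations. If the property is not set, task.commit.ms defaults to 60000.

```properties
# Hypothetical store backed by a Kafka changelog topic (all names are placeholders)
stores.profile-store.factory=org.apache.samza.storage.kv.RocksDbKeyValueStorageEngineFactory
stores.profile-store.changelog=kafka.profile-store-changelog
# Commit every 5 seconds instead of the default 60000 ms
task.commit.ms=5000
```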

Docs

Writes to the store will then be flushed when the commit happens:

profileTable.put(message.key, message.value) 

A note on this: higher volumes of input appear to result in changes going to the changelog topic before the configured commit interval has elapsed. Also be careful not to set it too low, as that will massively slow down overall throughput at higher volumes.

You can also use the low-level API to commit for a particular stream task: the TaskCoordinator provides a commit API for committing manually.
