Flink Dashboard: Operator Throughput

  1. I have a program that uses Flink (1.9), and I want to check the throughput of the instances of a Map operator using the dashboard. Of the existing metrics, numRecordsInPerSecond seems the most promising, but I guess it doesn't involve processing time. Am I wrong?

  2. I've defined my own metric (throughput) that calculates the average throughput by dividing the number of records processed by the total execution time of the OUT map(IN value) function. But this does not count anything that happens outside the map function.

  3. Another idea would be to add a meter at the end of the map function, but I suppose that if the source doesn't produce records fast enough, the calculated throughput will be lower simply because the operator sits idle much of the time. Is this correct?

Please specifically answer 1 and 2. Also, how do you usually calculate the throughput in your programs?

All of Flink's Meter metrics, such as numRecordsInPerSecond, measure rates in terms of processing time.

I'm generally content to rely on these built-in metrics for measuring throughput. But you might want to add a custom metric in the sink, since Flink always reports 0 for numRecordsOut and numRecordsOutPerSecond on sinks.
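
To make the two notions of throughput from the question concrete, here is a minimal plain-Java sketch. It deliberately does not use the Flink API: ThroughputSketch, ThroughputTracker, and simulate are hypothetical names, and the timestamps are simulated rather than read from a clock. It assumes that "in terms of processing time" means per second of elapsed wall-clock time, so time spent waiting on a slow source lowers the reported rate, while a records-divided-by-time-inside-map() metric ignores that waiting.

```java
import java.util.concurrent.TimeUnit;

/** Plain-Java sketch (no Flink dependency); all names here are hypothetical. */
public class ThroughputSketch {

    /** Accumulates a record count, time spent inside the user function, and elapsed time. */
    public static final class ThroughputTracker {
        private final long startNanos;
        private long lastEndNanos;
        private long busyNanos;
        private long records;

        public ThroughputTracker(long startNanos) {
            this.startNanos = startNanos;
            this.lastEndNanos = startNanos;
        }

        /** Registers one call to the user function that ran from callStartNanos to callEndNanos. */
        public void record(long callStartNanos, long callEndNanos) {
            records++;
            busyNanos += callEndNanos - callStartNanos;
            lastEndNanos = callEndNanos;
        }

        /** Records per second of time spent inside the function (idle time excluded). */
        public double busyRatePerSecond() {
            return records * 1e9 / busyNanos;
        }

        /** Records per second of elapsed time since the start (idle time included). */
        public double wallClockRatePerSecond() {
            return records * 1e9 / (lastEndNanos - startNanos);
        }
    }

    /** Simulates an operator that waits idleMillis for each record, then works workMillis on it. */
    public static ThroughputTracker simulate(int recordCount, long idleMillis, long workMillis) {
        ThroughputTracker tracker = new ThroughputTracker(0L);
        long now = 0L;
        for (int i = 0; i < recordCount; i++) {
            now += TimeUnit.MILLISECONDS.toNanos(idleMillis); // waiting on a slow source
            long callStart = now;
            now += TimeUnit.MILLISECONDS.toNanos(workMillis); // time inside the function
            tracker.record(callStart, now);
        }
        return tracker;
    }

    public static void main(String[] args) {
        // 100 records, each preceded by 9 ms of idle time and taking 1 ms to process.
        ThroughputTracker t = simulate(100, 9, 1);
        System.out.printf("busy-time rate:  %.1f records/s%n", t.busyRatePerSecond());
        System.out.printf("wall-clock rate: %.1f records/s%n", t.wallClockRatePerSecond());
    }
}
```

With these simulated numbers the busy-time rate comes out ten times higher than the wall-clock rate, because the operator is idle 90% of the time. Which of the two is "the" throughput depends on the question being asked: the first approximates what the function could sustain if it were never starved, the second what the pipeline actually delivered.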
