
Apache Beam Global Counting

I am trying to understand the best way of solving the following:

As a simple example scenario, I have a file that lists test names and whether each test's execution passed (true/false).

test-scenario,passed
--------------------
testA,true
testB,false

Using Apache Beam, I can read and parse the file into a PCollection<TestDetails> and then, using subsequent transforms, write the details of all tests that passed to one set of files and the details of those that failed to another.

After writing those files, I would like to generate some counts (the total number of file records processed, the number of tests that passed, and the number of tests that failed) and write them to a single file.

Should I use a global combine for this?

For this purpose you can use Beam Metrics (please see the documentation). It provides counters that cover the needs you described, and the metric values can be fetched once your pipeline has finished. Please take a look at this example. Beam also allows metrics to be exported to an external sink, if that is more convenient.
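To make the counter idea concrete, here is a minimal sketch in Java. It assumes the Beam Java SDK and a runner such as the direct runner are on the classpath. The class name `TestResultCounts`, the namespace `test-results`, the counter names, and the in-memory `Create.of(...)` input (standing in for the real file read) are all hypothetical, and lines are kept as plain strings rather than the `TestDetails` type from the question.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.metrics.Counter;
import org.apache.beam.sdk.metrics.MetricNameFilter;
import org.apache.beam.sdk.metrics.MetricQueryResults;
import org.apache.beam.sdk.metrics.MetricResult;
import org.apache.beam.sdk.metrics.Metrics;
import org.apache.beam.sdk.metrics.MetricsFilter;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;

public class TestResultCounts {
  // Hypothetical namespace used to group the three counters.
  static final String NAMESPACE = "test-results";

  // Parses a "name,passed" line and returns the passed flag.
  static boolean passed(String line) {
    return Boolean.parseBoolean(line.split(",")[1].trim());
  }

  // Pass-through DoFn that increments counters as a side effect.
  static class CountResultsFn extends DoFn<String, String> {
    private final Counter total = Metrics.counter(NAMESPACE, "total");
    private final Counter passedCount = Metrics.counter(NAMESPACE, "passed");
    private final Counter failedCount = Metrics.counter(NAMESPACE, "failed");

    @ProcessElement
    public void processElement(@Element String line, OutputReceiver<String> out) {
      total.inc();
      if (passed(line)) {
        passedCount.inc();
      } else {
        failedCount.inc();
      }
      out.output(line); // downstream transforms still see every record
    }
  }

  public static void main(String[] args) {
    Pipeline pipeline = Pipeline.create();
    pipeline
        // Stand-in for reading and parsing the real input file.
        .apply(Create.of("testA,true", "testB,false"))
        .apply(ParDo.of(new CountResultsFn()));

    PipelineResult result = pipeline.run();
    result.waitUntilFinish();

    // Query only the counters registered under our namespace.
    MetricQueryResults metrics = result.metrics().queryMetrics(
        MetricsFilter.builder()
            .addNameFilter(MetricNameFilter.inNamespace(NAMESPACE))
            .build());
    for (MetricResult<Long> counter : metrics.getCounters()) {
      // Attempted values are used here; committed values are not
      // available on every runner.
      System.out.println(counter.getName().getName() + ": " + counter.getAttempted());
    }
  }
}
```

In this sketch the counts are only printed by the driver program after the pipeline finishes. Writing them to a single file from the same place would be one option, though how reliably metrics can be queried after completion depends on the runner.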
