Global combine not producing output Apache Beam

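I am trying to write an unbounded ping pipeline that takes the output of a ping command and parses it to determine some statistics about the RTT (avg/min/max). For now, it just prints the results.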

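I have already written an unbounded ping source that outputs each line as it comes in. The ping lines are windowed into 5-second sliding windows produced every second. The windowed data is fed to a Combine.globally call to statefully process the string outputs. The problem is that the accumulators are never merged and the output is never extracted, so the pipeline never gets past this point. What am I doing wrong here?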
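To make the windowing concrete, here is a minimal standalone sketch of which 5-second windows, started every second, contain a given element timestamp. It uses only the standard library and mirrors sliding-window assignment semantics in general; `SlidingWindowSketch` and `windowStarts` are hypothetical names, not Beam classes. With Beam's default trigger, a window's combined result is emitted only once the watermark passes the end of that window.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Apache Beam. */
final class SlidingWindowSketch {

    /** Start times (ms) of every window [start, start + sizeMs) that contains timestampMs. */
    static List<Long> windowStarts(long timestampMs, long sizeMs, long periodMs) {
        List<Long> starts = new ArrayList<>();
        // Latest window start at or before the timestamp, aligned to the period
        long lastStart = timestampMs - Math.floorMod(timestampMs, periodMs);
        // Walk backwards one period at a time while the window still covers the timestamp
        for (long start = lastStart; start > timestampMs - sizeMs; start -= periodMs) {
            starts.add(start);
        }
        return starts;
    }
}
```

For example, a ping line stamped at 7300 ms falls into the five windows starting at 7000, 6000, 5000, 4000 and 3000 ms, so it contributes to five separate accumulators.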

public class TestPingIPs {
   public static void main(String[] args)
   {
      PipelineOptions options = PipelineOptionsFactory.create();
      Pipeline pipeline = Pipeline.create(options);
      String destination = "8.8.8.8";
      PCollection<PingResult> res =
              /*
              Run the unbounded ping command. Only the lines containing the result of each ping are returned.
              No statistics or startup lines are returned here.
               */
              pipeline.apply("Ping command",
                      PingCmd.read()
                              .withPingArguments(PingCmd.PingArguments.create(destination, -1)))
             /*
             Window the ping command strings into 5 second sliding windows produced every 1 second
              */
              .apply("Window strings",
                      Window.into(SlidingWindows.of(Duration.standardSeconds(5))
                              .every(Duration.standardSeconds(1))))
             /*
             Parse and aggregate the strings into a PingResult object using stateful processing.
              */
              .apply("Combine the pings",
                      Combine.globally(new ProcessPings()).withoutDefaults())
             /*
             Test our output to see what we get here
              */
              .apply("Test output",
                      ParDo.of(new DoFn<PingResult, PingResult>() {
                 @ProcessElement
                 public void processElement(ProcessContext c)
                 {
                    System.out.println(c.element().getAvgRTT());
                    System.out.println(c.element().getPacketLoss());
                    c.output(c.element());
                 }
              }));

      pipeline.run().waitUntilFinish();
   }


   static class ProcessPings extends Combine.CombineFn<String, RttStats, PingResult> {
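      private long getRTTFromLine(String line){
         // Windows ping reports "time=14ms" or, for sub-millisecond replies, "time<1ms";
         // the latter is counted as 1 ms here.
         return Long.parseLong(line.split("time[=<]")[1].split("ms")[0].trim());
      }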

      @Override
      public RttStats createAccumulator()
      {
         return new RttStats();
      }

      @Override
      public RttStats addInput(RttStats mutableAccumulator, String input)
      {
         mutableAccumulator.incTotal();
         if (input.contains("unreachable")) {
            _unreachableCount.inc();
            mutableAccumulator.incPacketLoss();
         }
         else if (input.contains("General failure")) {
            _transmitFailureCount.inc();
            mutableAccumulator.incPacketLoss();
         }
         else if (input.contains("timed out")) {
            _timeoutCount.inc();
            mutableAccumulator.incPacketLoss();
         }
         else if (input.contains("could not find")) {
            _unknownHostCount.inc();
            mutableAccumulator.incPacketLoss();
         }
         else {
            _successfulCount.inc();
            mutableAccumulator.add(getRTTFromLine(input));
         }

         return mutableAccumulator;
      }

      @Override
      public RttStats mergeAccumulators(Iterable<RttStats> accumulators)
      {
         Iterator<RttStats> iter = accumulators.iterator();
         if (!iter.hasNext()){
            return createAccumulator();
         }
         RttStats running = iter.next();
         while (iter.hasNext()){
            RttStats next = iter.next();
            running.addAll(next.getVals());
            running.addLostPackets(next.getLostPackets());
         }
         return running;
      }

      @Override
      public PingResult extractOutput(RttStats stats)
      {
         stats.calculate();
         boolean connected = stats.getPacketLoss() != 1;
         return new PingResult(connected, stats.getAvg(), stats.getMin(), stats.getMax(), stats.getPacketLoss());
      }

      private final Counter _successfulCount = Metrics.counter(ProcessPings.class, "Successful pings");
      private final Counter _unknownHostCount = Metrics.counter(ProcessPings.class, "Unknown hosts");
      private final Counter _transmitFailureCount = Metrics.counter(ProcessPings.class, "Transmit failures");
      private final Counter _timeoutCount = Metrics.counter(ProcessPings.class, "Timeouts");
      private final Counter _unreachableCount = Metrics.counter(ProcessPings.class, "Unreachable host");
   }

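I am guessing there is some issue with the CombineFn I wrote, but I can't figure out what is going wrong here! I tried following the example here, but I must still be missing something.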

EDIT: I have added the ping command implementation below. This is running on the Direct Runner while I test.

PingCmd.java:

public class PingCmd {
 public static Read read(){
      if (System.getProperty("os.name").startsWith("Windows")) {
         return WindowsPingCmd.read();
      }
      else{
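         // Fail fast instead of returning null, which would only surface later as a NullPointerException
         throw new UnsupportedOperationException("PingCmd is only implemented for Windows");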
      }
   }

WindowsPingCmd.java:

public class WindowsPingCmd extends PingCmd {
   private WindowsPingCmd()
   {
   }

   public static PingCmd.Read read()
   {
      return new WindowsRead.Builder().build();
   }


   static class PingCheckpointMark implements UnboundedSource.CheckpointMark, Serializable {
      @VisibleForTesting
      Instant oldestMessageTimestamp = Instant.now();
      @VisibleForTesting
      transient List<String> outputs = new ArrayList<>();

      public PingCheckpointMark()
      {
      }

      public void add(String message, Instant timestamp)
      {
         if (timestamp.isBefore(oldestMessageTimestamp)) {
            oldestMessageTimestamp = timestamp;
         }
         outputs.add(message);
      }

      @Override
      public void finalizeCheckpoint()
      {
         oldestMessageTimestamp = Instant.now();
         outputs.clear();
      }

      // set an empty list to messages when deserialize
      private void readObject(java.io.ObjectInputStream stream)
              throws IOException, ClassNotFoundException
      {
         stream.defaultReadObject();
         outputs = new ArrayList<>();
      }

      @Override
      public boolean equals(@Nullable Object other)
      {
         if (other instanceof PingCheckpointMark) {
            PingCheckpointMark that = (PingCheckpointMark) other;
            return Objects.equals(this.oldestMessageTimestamp, that.oldestMessageTimestamp)
                    && Objects.deepEquals(this.outputs, that.outputs);
         }
         else {
            return false;
         }
      }
   }


   @VisibleForTesting
   static class UnboundedPingSource extends UnboundedSource<String, PingCheckpointMark> {

      private final WindowsRead spec;

      public UnboundedPingSource(WindowsRead spec)
      {
         this.spec = spec;
      }

      @Override
      public UnboundedReader<String> createReader(
              PipelineOptions options, PingCheckpointMark checkpointMark)
      {
         return new UnboundedPingReader(this, checkpointMark);
      }

      @Override
      public List<UnboundedPingSource> split(int desiredNumSplits, PipelineOptions options)
      {
         // Don't really need to ever split the ping source, so we should just have one per destination
         return Collections.singletonList(new UnboundedPingSource(spec));
      }

      @Override
      public void populateDisplayData(DisplayData.Builder builder)
      {
         spec.populateDisplayData(builder);
      }

      @Override
      public Coder<PingCheckpointMark> getCheckpointMarkCoder()
      {
         return SerializableCoder.of(PingCheckpointMark.class);
      }

      @Override
      public Coder<String> getOutputCoder()
      {
         return StringUtf8Coder.of();
      }
   }


   @VisibleForTesting
   static class UnboundedPingReader extends UnboundedSource.UnboundedReader<String> {

      private final UnboundedPingSource source;

      private String current;
      private Instant currentTimestamp;
      private final PingCheckpointMark checkpointMark;
      private BufferedReader processOutput;
      private Process process;
      private boolean finishedPings;
      private int maxCount = 5;
      private static AtomicInteger currCount = new AtomicInteger(0);

      public UnboundedPingReader(UnboundedPingSource source, PingCheckpointMark checkpointMark)
      {
         this.finishedPings = false;
         this.source = source;
         this.current = null;
         if (checkpointMark != null) {
            this.checkpointMark = checkpointMark;
         }
         else {
            this.checkpointMark = new PingCheckpointMark();
         }
      }

      @Override
      public boolean start() throws IOException
      {
         WindowsRead spec = source.spec;
         String cmd = createCommand(spec.pingConfiguration().getPingCount(), spec.pingConfiguration().getDestination());
         try {
            ProcessBuilder builder = new ProcessBuilder(cmd.split(" "));
            builder.redirectErrorStream(true);
            process = builder.start();

            processOutput = new BufferedReader(new InputStreamReader(process.getInputStream()));
            return advance();
         } catch (Exception e) {
            throw new IOException(e);
         }
      }

      private String createCommand(int count, String dest){
         StringBuilder builder = new StringBuilder("ping");
         String countParam = "";
         if (count <= 0){
            countParam = "-t";
         }
         else{
            countParam += "-n " + count;
         }

         return builder.append(" ").append(countParam).append(" ").append(dest).toString();
      }

      @Override
      public boolean advance() throws IOException
      {
         // If the pings have finished, there is nothing left to read
         if (finishedPings) {
            return false;
         }
         // After maxCount elements, report "no data" once so the runner can finish the bundle.
         // Checked before reading so that no line is consumed and then dropped.
         if (currCount.get() >= maxCount) {
            currCount.set(0);
            return false;
         }
         String line = processOutput.readLine();
         // Skip empty lines and the 'Pinging <dest> with 32 bytes of data' line
         while (line != null
                 && (line.isEmpty()
                 || line.contains("Pinging " + source.spec.pingConfiguration().getDestination()))) {
            line = processOutput.readLine();
         }
         // A null line means the ping process has exited; the start of the statistics
         // block means the pings are done. Either way there is nothing more to emit.
         if (line == null || line.contains("statistics")) {
            finishedPings = true;
            return false;
         }

         current = line;
         currentTimestamp = Instant.now();
         checkpointMark.add(current, currentTimestamp);
         currCount.incrementAndGet();
         return true;
      }

      @Override
      public void close() throws IOException
      {
         if (process != null) {
            process.destroy();
            if (process.isAlive()) {
               process.destroyForcibly();
            }
         }
      }

      @Override
      public Instant getWatermark()
      {
         return checkpointMark.oldestMessageTimestamp;
      }

      @Override
      public UnboundedSource.CheckpointMark getCheckpointMark()
      {
         return checkpointMark;
      }

      @Override
      public String getCurrent()
      {
         if (current == null) {
            throw new NoSuchElementException();
         }
         return current;
      }

      @Override
      public Instant getCurrentTimestamp()
      {
         if (current == null) {
            throw new NoSuchElementException();
         }
         return currentTimestamp;
      }

      @Override
      public UnboundedPingSource getCurrentSource()
      {
         return source;
      }
   }


   public static class WindowsRead extends PingCmd.Read {
      private final PingArguments pingConfig;

      private WindowsRead(PingArguments pingConfig)
      {
         this.pingConfig = pingConfig;
      }

      public Builder builder()
      {
         return new WindowsRead.Builder(this);
      }

      PingArguments pingConfiguration()
      {
         return pingConfig;
      }

      public WindowsRead withPingArguments(PingArguments configuration)
      {
         checkArgument(configuration != null, "configuration can not be null");
         return builder().setPingArguments(configuration).build();
      }

      @Override
      public PCollection<String> expand(PBegin input)
      {
         org.apache.beam.sdk.io.Read.Unbounded<String> unbounded =
                 org.apache.beam.sdk.io.Read.from(new UnboundedPingSource(this));

         return input.getPipeline().apply(unbounded);
      }

      @Override
      public void populateDisplayData(DisplayData.Builder builder)
      {
         super.populateDisplayData(builder);
         pingConfiguration().populateDisplayData(builder);
      }

      static class Builder {
         private PingArguments config;

         Builder()
         {
         }

         private Builder(WindowsRead source)
         {
            this.config = source.pingConfiguration();
         }

         WindowsRead.Builder setPingArguments(PingArguments config)
         {
            this.config = config;
            return this;
         }

         WindowsRead build()
         {
            return new WindowsRead(this.config);
         }
      }

      @Override
      public int hashCode()
      {
         return Objects.hash(pingConfig);
      }


   }

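One thing I notice in your code is that advance() always returns true. The watermark only advances when a bundle completes, and I think it is runner-dependent whether a runner will ever complete a bundle if advance() never returns false. You could try returning false after a bounded amount of time or number of pings.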
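As a rough illustration of that suggestion, the sketch below shows an advance()-style method that returns false after a fixed number of elements. It is a standard-library stand-in, not Beam's UnboundedReader; `BundleLimitedReader` and `maxPerBundle` are hypothetical names, and the input is assumed to be a plain iterator of lines. The counter is kept per reader instance, and the limit is checked before a line is consumed so that no element is read and then discarded.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a reader's advance()/getCurrent() pair; not a Beam class. */
final class BundleLimitedReader {
    private final Iterator<String> lines;
    private final int maxPerBundle;
    private int readInBundle = 0; // per reader instance, not static
    private String current;

    BundleLimitedReader(Iterator<String> lines, int maxPerBundle) {
        this.lines = lines;
        this.maxPerBundle = maxPerBundle;
    }

    /** Returns true if a new element is available, false to signal "no data right now". */
    boolean advance() {
        // Check the limit before consuming a line, so no element is read and then dropped
        if (readInBundle >= maxPerBundle) {
            readInBundle = 0; // the next call starts counting a new batch
            return false;
        }
        if (!lines.hasNext()) {
            return false;
        }
        current = lines.next();
        readInBundle++;
        return true;
    }

    /** Returns the element read by the last successful advance(). */
    String getCurrent() {
        if (current == null) {
            throw new NoSuchElementException();
        }
        return current;
    }
}
```

With a limit of 2 and three input lines, the calls to advance() return true, true, false, true, false: the first false is the pause that gives the caller a chance to close the batch, and the second means the input is exhausted.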

You could also consider rewriting this as an SDF (Splittable DoFn).
