
Storm Bolt not printing/logging Kafka Spout

Edit: I added an .ack() to the bolt (which required me to use a rich bolt instead of the basic bolt) and am having the same issue - nothing tells me tuples are being processed by the bolt.

If it matters, I'm running this on a CentOS image on an EC2 instance. Any help would be appreciated.



I'm trying to set up a very basic HelloWorld Storm example to read messages from a Kafka cluster and print/log the messages I get.

Currently I have 20 messages in the Kafka cluster. When I run the topology (which appears to start up just fine), I can see both my Kafka spout and the echo bolt. In the Storm UI, the Kafka spout's Acked column shows 20 - which I assume is the number of messages it was able to read/access (?)

The Echo Bolt row, however, only shows that I have 1 executor and 1 task. All other columns are 0.

Looking at the Storm worker log that is generated, I see this line:

Read partition information from: /HelloWorld Spout/partition_0 --> {"topic":"helloworld","partition":0,"topology":{"id":"<UUID>","name":"Kafka-Storm test"},"broker":{"port":6667,"host":"ip-10-0-0-35.ec2.internal"},"offset":20}

The next few lines are as follows:

s.k.PartitionManager [INFO] Last commit offset from zookeeper: 0
s.k.PartitionManager [INFO] Commit offset 0 is more than 9223372036854775807 behind, resetting to startOffsetTime=-2
s.k.PartitionManager [INFO] Starting Kafka ip-10-0-0-35.ec2.internal:0 from offset 0
s.k.ZkCoordinator [INFO] Task [1/1] Finished refreshing
s.k.ZkCoordinator [INFO] Task [1/1] Refreshing partition manager connections
s.k.DynamicBrokersReader [INFO] Read partition info from zookeeper: GlobalPartitionInformation{partitionMap={0=ip-10-0-0-35.ec2.internal:6667}}

The rest of the worker log shows no log/print output for any message processed by the bolt. I'm at a loss as to why the bolt doesn't seem to be getting any of the messages from the Kafka cluster. Any help would be great. Thanks.

Building the KafkaSpout

private static KafkaSpout setupSpout() {
  BrokerHosts hosts = new ZkHosts("localhost:2181");
  // SpoutConfig(hosts, topic, zkRoot, spoutId)
  SpoutConfig spoutConfig = new SpoutConfig(hosts, "helloworld", "", "HelloWorld Spout");
  spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());
  spoutConfig.forceFromStart = true;
  spoutConfig.startOffsetTime = kafka.api.OffsetRequest.EarliestTime();
  return new KafkaSpout(spoutConfig);
}

Building the topology and submitting it

public static void main(String[] args) {
  TopologyBuilder builder = new TopologyBuilder();
  builder.setSpout("Kafka Spout", setupSpout());
  builder.setBolt("Echo Bolt", new SystemOutEchoBolt());

  try {
    System.setProperty("storm.jar", "/tmp/storm.jar");
    StormSubmitter.submitTopology("Kafka-Storm test", new Config(), builder.createTopology());
  } //catchExceptionsHere
}

Bolt

public class SystemOutEchoBolt extends BaseRichBolt {

  private static final long serialVersionUID = 1L;
  private static final Logger logger = LoggerFactory.getLogger(SystemOutEchoBolt.class);

  private OutputCollector m_collector;

  @SuppressWarnings("rawtypes")
  @Override
  public void prepare(Map _map, TopologyContext _conetxt, OutputCollector _collector) {
    m_collector = _collector;
  }

  @Override
  public void execute(Tuple _tuple) {
    System.out.println("Printing tuple with toString(): " + _tuple.toString());
    System.out.println("Printing tuple with getString(): " + _tuple.getString(0));
    logger.info("Logging tuple with logger: " + _tuple.getString(0));
    m_collector.ack(_tuple);
  }

  @Override
  public void declareOutputFields(OutputFieldsDeclarer _declarer) {}
}

The answer was simple: I was never telling the bolt which stream to subscribe to. Adding .shuffleGrouping("Kafka Spout"); fixed the issue.
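For reference, the corrected wiring in main would look something like this (a sketch only, reusing the spout and bolt names from the question; without a grouping call, the bolt subscribes to nothing and never receives tuples):

```java
// Subscribe the bolt to the spout's default stream. shuffleGrouping
// distributes the spout's tuples randomly across the bolt's tasks.
builder.setSpout("Kafka Spout", setupSpout());
builder.setBolt("Echo Bolt", new SystemOutEchoBolt())
       .shuffleGrouping("Kafka Spout");
```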

You need to call ack or fail on the tuple in your bolts; otherwise the spout doesn't know that the tuple has been fully processed. This will cause the count issues you're seeing.

public class SystemOutEchoBolt extends BaseBasicBolt {

  private static final long serialVersionUID = 1L;
  private static final Logger logger = LoggerFactory.getLogger(SystemOutEchoBolt.class);

  @Override
  @Override
  public void execute(Tuple _tuple, BasicOutputCollector _collector) {
    System.out.println("Printing tuple with toString(): " + _tuple.toString());
    System.out.println("Printing tuple with getString(): " + _tuple.getString(0));
    logger.info("Logging tuple with logger: " + _tuple.getString(0));
    // No explicit ack needed: BasicOutputCollector has no ack() method.
    // BaseBasicBolt acks the tuple automatically when execute() returns
    // without throwing a FailedException.
  }

  @Override
  public void declareOutputFields(OutputFieldsDeclarer arg0) {}
}
