
Kafka Connect HDFS Sink with Azure Blob Storage

I want to connect to Azure Blob Storage with the Kafka Connect HDFS Sink connector. So far I have:

  1. Set the kafka-connect properties:

     hdfs.url=wasbs://<my_url>
     hadoop.conf.dir={hadoop_3_home}/etc/hadoop/
     hadoop.home={hadoop_3_home}
  2. Added support for wasbs in core-site.xml:

     <property>
       <name>fs.wasbs.impl</name>
       <value>org.apache.hadoop.fs.azure.NativeAzureFileSystem</value>
     </property>
  3. Exported the HADOOP_CLASSPATH variable and added it to PATH
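Besides registering the filesystem implementation, access to a `wasbs://` URL normally also needs the storage account key in core-site.xml. A sketch only: `myaccount` is a placeholder for the actual storage account name, and the key comes from the Azure portal:

```xml
<!-- Sketch: "myaccount" is a placeholder storage account name. -->
<property>
  <name>fs.azure.account.key.myaccount.blob.core.windows.net</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
```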

But Hadoop still cannot find the class NativeAzureFileSystem:

at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
 at io.confluent.connect.hdfs.storage.StorageFactory.createStorage(StorageFactory.java:29)
 ... 11 more
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.azure.NativeAzureFileSystem not found
 at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
 at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
 at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
 at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
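This ClassNotFoundException usually means that hadoop-azure.jar (which contains NativeAzureFileSystem) and its azure-storage dependency are not on the classpath the connector actually uses. In a stock Hadoop 3 layout these jars sit under share/hadoop/tools/lib, which is not on Hadoop's default classpath. A sketch, assuming that layout (`/path/to/confluent` is a placeholder for your Confluent installation):

```shell
# Sketch: paths assume a stock Hadoop 3 layout; adjust {hadoop_3_home}.
# hadoop-azure-*.jar and azure-storage-*.jar live under tools/lib,
# which is NOT on Hadoop's default classpath.
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:{hadoop_3_home}/share/hadoop/tools/lib/*"

# Alternatively, copy the jars next to the HDFS connector's own jars
# so Kafka Connect's plugin classloader picks them up:
cp {hadoop_3_home}/share/hadoop/tools/lib/hadoop-azure-*.jar \
   {hadoop_3_home}/share/hadoop/tools/lib/azure-storage-*.jar \
   /path/to/confluent/share/java/kafka-connect-hdfs/
```

Note that exporting HADOOP_CLASSPATH only helps processes launched through the Hadoop scripts; Kafka Connect runs in its own JVM, which is why copying the jars into the connector's directory is often the more reliable route.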

Could you please help with this issue? Is it even possible?

My goal is to back up everything from Kafka to Azure Blob Storage, in any data format.

The HDFS and cloud connectors can't back up "any format". Confluent's Avro is the first-class citizen among file formats, with JSON second, but there is no "plain text" format from what I've found. I think the HDFS connector does support a "byte array" format.
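The output format is selected with the `format.class` property in the connector configuration. A sketch only: the class names below are based on the Confluent HDFS connector and may vary by version, so verify them against the connector's own documentation:

```properties
# Sketch of an HDFS Sink config targeting wasbs; verify format.class
# names against your connector version.
name=azure-blob-backup
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
topics=my_topic
hdfs.url=wasbs://<my_url>
format.class=io.confluent.connect.hdfs.avro.AvroFormat
# JSON alternative:
# format.class=io.confluent.connect.hdfs.json.JsonFormat
```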

As I mentioned in the comments, in my opinion a backup of Kafka is different from retaining the data indefinitely on a file system. Backing up Kafka to Kafka involves using MirrorMaker.
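For the Kafka-to-Kafka route, a minimal legacy MirrorMaker invocation looks like this; `consumer.properties` (pointing at the source cluster) and `producer.properties` (pointing at the backup cluster) are assumed to exist:

```shell
# Sketch: mirror every topic from the source cluster to a backup cluster.
# consumer.properties / producer.properties are assumed config files
# with bootstrap.servers for source and target respectively.
bin/kafka-mirror-maker.sh \
  --consumer.config consumer.properties \
  --producer.config producer.properties \
  --whitelist '.*'
```

This keeps the data in Kafka's own format and preserves the ability to re-consume it, which is what distinguishes a backup from a file-system dump.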

If you want to use any format, Spark, Flink, NiFi, or StreamSets have more flexibility for handling that out of the box.
