
Writing Kafka topic data to HDFS

I am trying to push Kafka topic data to HDFS. I can see the topic data in the kafka-console-consumer window.

Here is my code. The writeToWebHDFS(record) method is never called: output is printed up to "Before calling HDFS" and nothing after that. The writeToWebHDFS method contains the new landing zone URL and the code that does the writing.

val stream = KafkaUtils.createDirectStream[String, String](ssc, PreferConsistent, Subscribe[String, String](topics, kafkaParams))
stream.map(record => record.value().toString).print
print("+++++++++++++ Before calling HDFS +++++++++++++++++++++++ ")
val uploadFile = stream.map(record =>

writeToWebHDFS code snippet

def writeToWebHDFS(record: org.apache.kafka.clients.consumer.ConsumerRecord[String, String]) = {
  // Step 1: send the CREATE request; WebHDFS replies with a redirect whose Location header names the write target.
  val res = Http("https://hdfsurl:port/gateway/webhdfs/webhdfs/v1/opt/sandboxes/user/test/" + record.key().toString().toLowerCase().replaceAll(" ", "") + ".txt?op=CREATE&overwrite=true")
    .put("")
    .option(HttpOptions.allowUnsafeSSL)
    .auth("user_mail_id", " *pwd ")
    .asString

  // Step 2: PUT the record value to the location returned by step 1.
  val location = res.headers.get("Location").get(0)
  val upload = Http(location.toString())
    .put(record.value())
    .timeout(30000, 30000)
    .option(HttpOptions.allowUnsafeSSL)
    .auth("user_mail_id", " *pwd ")
    .asString

  print(" Done uploading to HDFS ")
}

Please suggest how I can call the writeToWebHDFS function.
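Some background on the symptom described above: Spark Streaming transformations such as map are lazy, so a function passed to stream.map only runs if an output operation (print, foreachRDD, and so on) consumes the result. The plain print("... Before calling HDFS ...") statement, by contrast, runs once on the driver while the job is being set up, which would explain why it is the last thing seen. The following is a minimal sketch of one way to force the call, assuming the spark-streaming-kafka-0-10 integration used in the question. HdfsUploadSketch, fileNameFor and attachUpload are hypothetical names introduced here for illustration, and write stands in for writeToWebHDFS.

```scala
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.spark.streaming.dstream.DStream

// Hypothetical helper object, not part of the original code.
object HdfsUploadSketch {

  // Builds the target file name the same way the question's code does:
  // lower-case the record key, strip spaces, append ".txt".
  def fileNameFor(key: String): String =
    key.toLowerCase.replaceAll(" ", "") + ".txt"

  // Registers an output operation on the stream so that `write`
  // is actually invoked for every record of every micro-batch.
  def attachUpload(
      stream: DStream[ConsumerRecord[String, String]],
      write: ConsumerRecord[String, String] => Unit): Unit =
    stream.foreachRDD { rdd =>
      // rdd.foreach is an action; the closure runs on the executors.
      rdd.foreach(record => write(record))
    }
}
```

With this in place, the call site would look something like HdfsUploadSketch.attachUpload(stream, writeToWebHDFS), followed by ssc.start() and ssc.awaitTermination(); nothing is processed until the streaming context is started. Note that record.key() may be null for messages produced without a key, in which case the file-name logic would need a fallback.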

I would suggest that, instead of reinventing the wheel, you use the HDFS connector for Kafka Connect. You will find more details here: https://github.com/confluentinc/kafka-connect-hdfs
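For orientation, a sink configuration for that connector might look roughly like the fragment below. This is a sketch only: the connector name, topic name, HDFS URL and flush size are placeholder values to replace with your own, and the converter and format settings, which depend on how the topic data is serialized, are not shown.

```properties
# Placeholder values throughout; adjust for your cluster.
name=hdfs-sink
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
tasks.max=1
# Kafka topic(s) to copy into HDFS (placeholder name).
topics=test_hdfs
# NameNode address (placeholder).
hdfs.url=hdfs://localhost:9000
# Number of records written before a file is committed to HDFS.
flush.size=3
```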
