
Spark Streaming checkpoint to remote HDFS

I am trying to checkpoint my Spark Streaming context to HDFS to handle a failure at some point in my application. My HDFS setup is on a separate cluster, and Spark runs on a separate standalone server. To do this, I am using:

ssc.checkpoint(directory: String)
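For context, here is a minimal sketch in Scala of the usual fault-tolerant pattern: recreate the StreamingContext from the checkpoint directory if one exists, otherwise build it fresh. The master URL, NameNode address, and socket source below are hypothetical placeholders.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CheckpointExample {
  def main(args: Array[String]): Unit = {
    // Hypothetical checkpoint path on the remote HDFS cluster; substitute your own.
    val checkpointDir = "hdfs://namenode-host:8020/user/spark/checkpoints"

    def createContext(): StreamingContext = {
      val conf = new SparkConf()
        .setAppName("CheckpointToRemoteHdfs")
        .setMaster("spark://spark-master:7077") // hypothetical standalone master URL
      val ssc = new StreamingContext(conf, Seconds(10))
      ssc.checkpoint(checkpointDir) // checkpoint metadata is written to the remote HDFS path

      // A trivial pipeline so at least one output operation is registered.
      val lines = ssc.socketTextStream("localhost", 9999) // hypothetical source
      lines.count().print()
      ssc
    }

    // Restore from the checkpoint after a failure, or create a new context.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}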

This gives me org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE when I try with the directory as "hdfs://hostname:port/pathToFolder".

How can I checkpoint to a remote HDFS path? Is it possible to add credentials to the string URI? I tried googling, but no help so far.

Thanks and appreciate any help!

You can provide the credentials using:

hdfs://username:password@hostname:port/pathToFolder
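Applied to the checkpoint call, that would look like the sketch below (hypothetical user, host, and path; this works only if the cluster accepts credentials embedded in the URI):

ssc.checkpoint("hdfs://sparkuser:secret@namenode-host:8020/user/sparkuser/checkpoints")

If the cluster uses Hadoop's simple authentication, another common workaround for the Permission denied: user=root error is to run the Spark process as, or set the HADOOP_USER_NAME environment variable to, a user that has write access to the target directory.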
