Accessing HDFS on Cloudera with Java and Kerberos Keytab from Windows

I'm trying to connect to my HDFS instance running on Cloudera. My first step was enabling Kerberos and creating keytabs (as shown here).

In the next step I would like to authenticate with a keytab.

Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://cloudera:8020");
conf.set("hadoop.security.authentication", "kerberos");

UserGroupInformation.setConfiguration(conf);
UserGroupInformation.loginUserFromKeytab("hdfs@CLOUDERA", "/etc/hadoop/conf/hdfs.keytab");

FileSystem fs = FileSystem.get(conf);
FileStatus[] fsStatus = fs.listStatus(new Path("/"));
for (int i = 0; i < fsStatus.length; i++) {
    System.out.println(fsStatus[i].getPath().toString());
}

It fails with the following error:

java.io.IOException: Login failure for hdfs@CLOUDERA from keytab /etc/hadoop/conf/hdfs.keytab: javax.security.auth.login.LoginException: Unable to obtain password from user

The question is: how do I correctly handle the keytab? Do I have to copy it to my local machine?

When running a Hadoop client on Windows to reach a kerberized cluster, you need a specific "native library" (i.e. a DLL).
As far as I can tell there is no good reason for that, because that lib is not actually used outside of some automated regression tests (!?!), so it's a pain inflicted on Hadoop users by Hadoop committers.

To add extra pain, there is no official build of that DLL (nor of the Windows "stub" that enables its use from Java). You must either (a) build it yourself from source code -- good luck -- or (b) search the internet for a downloadable Hadoop-for-Windows runtime, and pray that it does not contain any malware.
The best option (for 64-bit Windows) is here: https://github.com/steveloughran/winutils
...and the ReadMe explains why you can reasonably trust that run-time. But if you are stuck with an older 32-bit Windows, then you are on your own.

Now let's assume you deployed that run-time on your Windows box under
C:\Some Dir\hadoop\bin\
(the final bin is required; the embedded space is just extra fun)

You must point the Hadoop client to that run-time with a couple of Java properties:
"-Dhadoop.home.dir=C:/Some Dir/hadoop" "-Djava.library.path=C:/Some Dir/hadoop/bin"
(note the double quotes around each Windows arg as a whole, to protect the embedded spaces in the paths, which have been translated to Java-style forward slashes for extra fun)
(in Eclipse, just stuff these props under "VM Arguments", quotes included)
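
Before going further, it can help to confirm that the directory named by hadoop.home.dir really contains the run-time. The sketch below is only an illustration: the class HadoopHomeCheck and its method are hypothetical helpers (not part of Hadoop), and the two file names are an assumption based on what the winutils builds ship in their bin directory.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Pre-flight check for the Hadoop-for-Windows run-time (hypothetical helper, not a Hadoop class). */
final class HadoopHomeCheck {

    // Assumption: these are the files the Windows client needs under <hadoop.home.dir>/bin.
    private static final String[] REQUIRED = {"winutils.exe", "hadoop.dll"};

    /** Returns the names of the required files that are missing from hadoopHome/bin. */
    static List<String> missingFiles(Path hadoopHome) {
        List<String> missing = new ArrayList<>();
        Path bin = hadoopHome.resolve("bin");
        for (String name : REQUIRED) {
            if (!Files.isRegularFile(bin.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Reads the same property that is passed with -Dhadoop.home.dir=...
        String home = System.getProperty("hadoop.home.dir");
        if (home == null) {
            System.err.println("hadoop.home.dir is not set");
            return;
        }
        List<String> missing = missingFiles(Paths.get(home));
        System.out.println(missing.isEmpty()
                ? "Run-time looks complete under " + home
                : "Missing under " + home + "/bin: " + missing);
    }
}
```

Run it with the same VM arguments as the real client; an empty "missing" list only says the files are present, not that they match your Hadoop version.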

Now, there's the Kerberos config. If your KDC is your corporate Active Directory server, then Java should find the config parameters automatically. But if your KDC is a standalone "MIT Kerberos" install on Linux, then you have to find a valid /etc/krb5.conf file on the cluster, copy it to your Windows box, and have Java use it with an additional property...
"-Djava.security.krb5.conf=C:/Some Other Dir/krb5.conf"
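
If passing VM arguments is awkward, the same property can presumably be set from code, as long as that happens before the JVM first touches Kerberos. A minimal sketch, assuming a hypothetical helper class Krb5ConfSetup (the property name java.security.krb5.conf is the standard JDK one; everything else is illustrative):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Points the JVM at a copied krb5.conf (hypothetical helper, not part of the JDK or Hadoop). */
final class Krb5ConfSetup {

    /**
     * Checks that the file is there, then sets java.security.krb5.conf to its absolute path.
     * Assumption: this runs before any Kerberos login is attempted in this JVM.
     */
    static String useKrb5Conf(Path krb5Conf) {
        if (!Files.isRegularFile(krb5Conf) || !Files.isReadable(krb5Conf)) {
            throw new IllegalArgumentException("krb5.conf not readable: " + krb5Conf);
        }
        String location = krb5Conf.toAbsolutePath().toString();
        System.setProperty("java.security.krb5.conf", location);
        return location;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: Krb5ConfSetup <path to krb5.conf>");
            return;
        }
        System.out.println("Using Kerberos config " + useKrb5Conf(Paths.get(args[0])));
    }
}
```

Failing fast on a missing file is a deliberate choice here: a wrong path otherwise tends to surface later as a much less obvious Kerberos error.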

Then let's assume you have created your keytab file on a Linux box, using ktutil (or an Active Directory admin created it for you with some AD command), and you dropped the file under
C:\Some Other Dir\foo.keytab
Before anything else, if the keytab is for a real Windows account -- i.e. your own account -- or a Prod service account, then make sure that keytab is secure!! Use the Windows Security dialog box to restrict access to your account only (and maybe System, for backups), because that file could enable anyone, on any machine, to authenticate on the cluster (and on any Kerberos-enabled system, including Windows).

Now you can try to authenticate using
UserGroupInformation.loginUserFromKeytab("foo@BAR.ORG", "C:/Some Other Dir/foo.keytab");
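
Note that the keytab path is resolved on the machine running the client, so a Linux path such as /etc/hadoop/conf/hdfs.keytab will not exist on a Windows box; a missing or unreadable keytab is a commonly reported cause of "Unable to obtain password from user". The sketch below checks the file before the login call. KeytabPreflight is a hypothetical helper, and the header test assumes the usual keytab layout (first byte 0x05, then a version byte of 0x01 or 0x02).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Sanity check for a local keytab file (hypothetical helper, not part of Hadoop). */
final class KeytabPreflight {

    /** Returns null if the file looks usable, otherwise a short description of the problem. */
    static String problemWith(Path keytab) throws IOException {
        if (!Files.isRegularFile(keytab)) {
            return "no such file: " + keytab;
        }
        if (!Files.isReadable(keytab)) {
            return "file is not readable: " + keytab;
        }
        try (InputStream in = Files.newInputStream(keytab)) {
            int magic = in.read();
            int version = in.read();
            // Assumption: keytab files start with 0x05 followed by format version 1 or 2.
            if (magic != 0x05 || (version != 0x01 && version != 0x02)) {
                return "not a keytab (unexpected header): " + keytab;
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: KeytabPreflight <path to keytab>");
            return;
        }
        String problem = problemWith(Paths.get(args[0]));
        System.out.println(problem == null ? "Keytab file looks plausible" : problem);
    }
}
```

This only inspects the file itself; it cannot tell whether the keytab contains the principal you pass to loginUserFromKeytab or whether the keys are still valid on the KDC.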

If it does not work, enable the Kerberos debug traces with both an environment variable
set HADOOP_JAAS_DEBUG=true
...and a Java property
-Dsun.security.krb5.debug=true
(in Eclipse, set these in "Environment" and "VM Arguments" respectively)
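
To tell a keytab or KDC problem apart from a Hadoop problem, one option is to try the same keytab login with plain JAAS, without any Hadoop classes on the classpath. This is a sketch under assumptions: KeytabLoginProbe and the context name "keytab-probe" are made up for illustration, and it relies on the Krb5LoginModule bundled with Oracle/OpenJDK (other JDK vendors may use a different module and options).

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.Subject;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

/** Hadoop-free keytab login, for diagnostics only (hypothetical helper). */
final class KeytabLoginProbe {

    /** Builds the JAAS entry for a non-interactive keytab login. */
    static AppConfigurationEntry entryFor(String principal, String keytabPath) {
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "true");
        options.put("keyTab", keytabPath);
        options.put("principal", principal);
        options.put("storeKey", "true");
        options.put("doNotPrompt", "true");       // never fall back to asking for a password
        options.put("refreshKrb5Config", "true");
        options.put("debug", "true");             // login-module level traces
        return new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED, options);
    }

    /** Contacts the KDC and returns the authenticated subject, or throws LoginException. */
    static Subject login(String principal, String keytabPath) throws LoginException {
        final AppConfigurationEntry entry = entryFor(principal, keytabPath);
        Configuration config = new Configuration() {
            @Override
            public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
                return new AppConfigurationEntry[] {entry};
            }
        };
        LoginContext context = new LoginContext("keytab-probe", null, null, config);
        context.login();
        return context.getSubject();
    }

    public static void main(String[] args) throws LoginException {
        if (args.length != 2) {
            System.err.println("usage: KeytabLoginProbe <principal> <path to keytab>");
            return;
        }
        // Same effect as -Dsun.security.krb5.debug=true, provided it is set before first Kerberos use.
        System.setProperty("sun.security.krb5.debug", "true");
        Subject subject = login(args[0], args[1]);
        System.out.println("Authenticated as " + subject.getPrincipals());
    }
}
```

For example, running it with the arguments foo@BAR.ORG and "C:/Some Other Dir/foo.keytab" should either print the principal or fail with a LoginException; if the probe succeeds but the Hadoop client still fails, the Hadoop side (native run-time, configuration) is the more likely culprit.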

Have you set the proper permissions?

 chown hdfs:hadoop /etc/hadoop/conf/hdfs.keytab
 chmod 440 /etc/hadoop/conf/hdfs.keytab
