
How to enable OAuth2 in Hadoop WebHDFS

I am running Hadoop 2.8.2 and am trying to configure the OAuth 2 Client Credentials Grant flow for a WebHDFS client application. I followed the guidance in the WebHDFS REST API documentation; searching that page for OAuth2 leads to the section on configuring OAuth 2 for WebHDFS.

Here are the OAuth 2 properties I added to hdfs-site.xml:

  <!-- OAuth2 properties -->
  <property>
    <name>dfs.webhdfs.oauth2.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.webhdfs.oauth2.access.token.provider</name>
    <value>org.apache.hadoop.hdfs.web.oauth2.ConfCredentialBasedAccessTokenProvider</value>
  </property>
  <property>
    <name>dfs.webhdfs.oauth2.client.id</name>
    <value>webHdfsClient</value>
  </property>
  <property>
    <name>dfs.webhdfs.oauth2.credential</name>
    <value>secret</value>
  </property>
  <property>
    <name>dfs.webhdfs.oauth2.refresh.url</name>
    <value>https://<hostname:port of OAuth 2 token endpoint></value>
  </property>
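
For context, the Client Credentials Grant (RFC 6749, section 4.4) is a form-encoded POST to the token endpoint, which here would be the value of dfs.webhdfs.oauth2.refresh.url. The sketch below only builds the form body from the client id and credential configured above. ClientCredentialsRequest is a hypothetical helper, not a Hadoop class, and sending the secret in the body rather than in a Basic Authorization header is an assumption about how the token server expects client authentication.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Hadoop): builds the form body of an
// OAuth 2 Client Credentials Grant token request (RFC 6749, section 4.4).
class ClientCredentialsRequest {

    // Returns an application/x-www-form-urlencoded body such as
    // grant_type=client_credentials&client_id=...&client_secret=...
    static String formBody(String clientId, String clientSecret) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("grant_type", "client_credentials");
        params.put("client_id", clientId);
        // Assumption: the token server accepts the secret in the request body.
        params.put("client_secret", clientSecret);
        return params.entrySet().stream()
                .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
                .collect(Collectors.joining("&"));
    }

    private static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Values taken from the hdfs-site.xml snippet above.
        System.out.println(formBody("webHdfsClient", "secret"));
    }
}
```

Posting this body to the token endpoint by hand (for example with curl) is one way to confirm that the client id and credential are accepted, independently of Hadoop.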

Here are the properties in my core-site.xml that I believe might be related to the OAuth2 configuration:

  <property>
    <name>hadoop.http.authentication.simple.anonymous.allowed</name>
    <value>false</value>
  </property>
  <property>
    <name>hadoop.http.authentication.type</name>
    <value>simple</value>
  </property>

I figured, perhaps wrongly, that anonymous authentication should not be allowed. According to the documentation, using "simple" requires that user.name=<username> be included as a query string parameter when first accessing WebHDFS via a web console. I don't think "simple" has anything to do with a client application authenticating to WebHDFS via OAuth, but I mention it in case it does play a role.
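
To make the user.name parameter concrete, here is a minimal sketch that builds a WebHDFS REST URL carrying it. The /webhdfs/v1 prefix and op=LISTSTATUS come from the WebHDFS REST API; WebHdfsUrl is a hypothetical helper, and the user name alice is only a placeholder.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds a WebHDFS LISTSTATUS URL that passes the
// caller's identity through the user.name query parameter.
class WebHdfsUrl {

    // Assumption: path starts with "/" and contains only URL-safe characters.
    static URI listStatus(String hostPort, String path, String userName) {
        String query = "op=LISTSTATUS&user.name="
                + URLEncoder.encode(userName, StandardCharsets.UTF_8);
        return URI.create("https://" + hostPort + "/webhdfs/v1" + path + "?" + query);
    }

    public static void main(String[] args) {
        // Host and port are the ones used elsewhere in this question.
        System.out.println(listStatus("hdserver.local:44305", "/", "alice"));
    }
}
```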

I then created a Java client application to access the WebHDFS endpoint. I have configured WebHDFS for SSL so that both the WebHDFS endpoint and the token management server listen using the HTTPS protocol.

Here is the main method of a small Java application I wrote to access the root of my WebHDFS endpoint (hdserver.local):

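import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    // swebhdfs:// selects WebHDFS over HTTPS
    conf.set("fs.defaultFS", "swebhdfs://hdserver.local:44305");
    FileSystem fs = FileSystem.get(conf);

    // List the entries directly under the root directory
    FileStatus[] fsStatus = fs.listStatus(new Path("/"));

    for (FileStatus status : fsStatus) {
        System.out.println(status.getPath().toString());
    }
}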

This returns properly without requiring that I retrieve a bearer token from my token endpoint and send that along to WebHDFS for authentication. I expected the call to fail, telling me that my call wasn't authorized or was missing a bearer token with the request. Please tell me where I went wrong.
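
For reference, the kind of request I expected to be required would carry the token in an Authorization header, as bearer tokens are normally sent (RFC 6750). A minimal sketch using the JDK's java.net.http API builds such a request without sending it. BearerRequest is a hypothetical helper, the token value is a placeholder, and nothing here shows how the WebHDFS server reacts when the header is missing.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds (but does not send) a GET request to a
// WebHDFS REST URL with an OAuth 2 bearer token attached.
class BearerRequest {

    static HttpRequest listRoot(String hostPort, String accessToken) {
        URI uri = URI.create("https://" + hostPort + "/webhdfs/v1/?op=LISTSTATUS");
        return HttpRequest.newBuilder(uri)
                // Bearer token usage per RFC 6750, section 2.1
                .header("Authorization", "Bearer " + accessToken)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // "example-token" is a placeholder, not a real access token.
        HttpRequest request = listRoot("hdserver.local:44305", "example-token");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().firstValue("Authorization").orElse("(none)"));
    }
}
```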

The hadoop.http.authentication.type and hadoop.http.authentication.simple.anonymous.allowed settings relate only to Hadoop's web consoles (JobTracker, NameNode, etc.). WebHDFS, even though it runs over HTTP, is orthogonal to these settings: they do not control how WebHDFS clients authenticate. Yes, this is confusing.

The other settings appear correct. Were you able to see the OAuth2 configuration taking effect in the NameNode logs?
