Cannot create staging directory on HDFS in a folder that has permissions

There are a couple of folders in the root directory of HDFS:

  • dir1
    • subdir1
      • table1
      • table2
    • subdir2
  • dir2
    • subdir1
      • table1
      • table2
  • dir3

They all have subfolders containing different Parquet files that are queried with Hive. I can't load one of the subfolders (for example, table1 inside dir2): even though the permissions look fine to me, I get an EXECUTE permission error when I try to load it. The code runs in a Jupyter notebook, and users are organized in groups.

I added rwx permissions on the directory in question for the group with the following command:

hdfs dfs -setfacl -R -m group:user_group:rwx /dir2/subdir2

The error I get looks like this:

Cannot create staging directory 'hdfs://server:8020/dir2/subdir1/table1/.hive-staging_hive_2019-08-01_13-04-22': Permission denied: user=username, access=EXECUTE, inode="/dir2":hdfs:supergroup:drwxrwx---
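
To make the message easier to read, here is a minimal Python sketch that splits it into its fields. The regular expression is modelled only on the single message quoted above, and the names DENIED and parse_permission_denied are made up for illustration; other Hadoop versions may word the message differently.

```python
import re

# Pattern based on the one error message quoted in the question (assumption:
# the message keeps this exact layout).
DENIED = re.compile(
    r'Permission denied: user=(?P<user>[^,]+), '
    r'access=(?P<access>[A-Z_]+), '
    r'inode="(?P<inode>[^"]+)":(?P<owner>[^:]+):(?P<group>[^:]+):(?P<mode>\S+)'
)


def parse_permission_denied(message):
    """Return the fields of a 'Permission denied' message as a dict, or None."""
    match = DENIED.search(message)
    return match.groupdict() if match else None


error = (
    "Cannot create staging directory "
    "'hdfs://server:8020/dir2/subdir1/table1/.hive-staging_hive_2019-08-01_13-04-22': "
    "Permission denied: user=username, access=EXECUTE, "
    'inode="/dir2":hdfs:supergroup:drwxrwx---'
)

info = parse_permission_denied(error)
print(info)
```

The parsed fields show that the check failed on /dir2 itself (owner hdfs, group supergroup), not on the table directory where the staging directory was to be created.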

I've also added read and execute permissions on dir2 for the user group, but the error persists. Judging from the error, it looks as if the default permissions are somehow being applied, and they are ---.

To summarize: the group has read and execute permissions on the top-level directory (dir2) and read, write and execute permissions on the table directories, but the operation keeps failing with a permission error on the top-level directory.
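
For context, reaching a path in HDFS requires EXECUTE on every directory above it, which is why a problem on dir2 blocks everything underneath regardless of the permissions on the table directories. A small Python sketch that lists those directories for the staging path from the error (the helper name directories_to_traverse is made up for illustration):

```python
from pathlib import PurePosixPath


def directories_to_traverse(path):
    """List every directory above `path`, from the root downwards."""
    parents = list(PurePosixPath(path).parents)  # nearest parent comes first
    return [str(p) for p in reversed(parents)]


staging = "/dir2/subdir1/table1/.hive-staging_hive_2019-08-01_13-04-22"
for directory in directories_to_traverse(staging):
    print(directory)
```

Each directory printed needs EXECUTE for the user, and the last one (the table directory) additionally needs WRITE so that the staging directory can be created in it.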

This is what the permissions look like:

# file: /dir2
# owner: hdfs
# group: supergroup
user::rwx
user:some_group1:r-x
group::---
group:some_group2:rwx
group:user_group:r-x
group:hive:rwx
group:some_group3:r-x
group:some_group4:r-x
mask::rwx
other::---
default:user::rwx
default:user:some_group1:r-x
default:group::---
default:group:some_group2:rwx
default:group:hive:rwx
default:group:some_group3:r-x
default:group:some_group4:r-x
default:mask::rwx
default:other::---


# file: /dir2/subdir1/table1
# owner: some_user
# group: supergroup
user::rwx
user:some_group1:r-x
group::---
group:some_group2:rwx
group:user_group:rwx
group:hive:rwx
group:some_group3:r-x
group:some_group4:rwx
mask::rwx
other::---
default:user::rwx
default:user:some_group1:r-x
default:group::---
default:group:some_group2:rwx
default:group:user_group:rwx
default:group:hive:rwx
default:group:some_group3:r-x
default:group:some_group4:rwx
default:mask::rwx
default:other::---
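
As a cross-check of the listing for /dir2, here is a simplified Python model of a POSIX-style ACL check. It is only a sketch, not the code the NameNode runs: it assumes the evaluation order owner, named user, matching groups (filtered by the mask), then other, and it ignores the default: entries, which only affect newly created children. The names parse_acl and is_allowed are made up for illustration.

```python
# Access entries of /dir2, copied from the listing above.
DIR2_ACL = """\
user::rwx
user:some_group1:r-x
group::---
group:some_group2:rwx
group:user_group:r-x
group:hive:rwx
group:some_group3:r-x
group:some_group4:r-x
mask::rwx
other::---
"""


def parse_acl(text):
    """Parse getfacl-style access entries into (kind, name, perms) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "default:")):
            continue  # skip comments and default entries
        kind, name, perms = line.split(":")
        entries.append((kind, name, perms))
    return entries


def is_allowed(entries, owner, owning_group, user, groups, want):
    """Simplified check for one permission letter: 'r', 'w' or 'x'."""
    def lookup(kind, name=""):
        return next((p for k, n, p in entries if k == kind and n == name), None)

    mask = lookup("mask") or "rwx"
    if user == owner:
        return want in (lookup("user") or "")
    named_user = lookup("user", user)
    if named_user is not None:
        return want in named_user and want in mask
    # Named group entries whose group the user belongs to.
    matching = [p for k, n, p in entries if k == "group" and n in groups]
    if owning_group in groups:
        matching.append(lookup("group") or "")
    if matching:
        return any(want in p and want in mask for p in matching)
    return want in (lookup("other") or "")


entries = parse_acl(DIR2_ACL)
print(is_allowed(entries, "hdfs", "supergroup", "username", {"user_group"}, "x"))
print(is_allowed(entries, "hdfs", "supergroup", "username", set(), "x"))
```

Under this model a member of user_group gets EXECUTE on /dir2, while a user matched by no entry falls through to other::--- and is denied, which is the outcome the error reports. That does not prove a cause, but it suggests checking which groups the cluster actually resolves for the user, for example with hdfs groups username.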

The problem was eventually solved by creating new directories to replace the old ones. The new directories were created with the correct user and credentials. For example, I created subdir1_new, moved the data there, renamed subdir1 to subdir1_old, and renamed subdir1_new to subdir1. Not many folders were affected by this issue, so it didn't take long.
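
The swap described above could be scripted roughly as follows. This Python sketch only builds and prints the hdfs dfs commands rather than running them; the helper name swap_commands is made up, and it assumes the Hadoop shell expands the /* glob itself and that the commands are then run as the user who should own the new directory.

```python
import shlex


def swap_commands(parent, name):
    """Build the commands that replace parent/name with a fresh directory."""
    old = f"{parent}/{name}"
    new = f"{old}_new"
    return [
        ["hdfs", "dfs", "-mkdir", new],           # create the replacement
        ["hdfs", "dfs", "-mv", f"{old}/*", new],  # move the data across
        ["hdfs", "dfs", "-mv", old, f"{old}_old"],  # keep the old one aside
        ["hdfs", "dfs", "-mv", new, old],         # put the new one in place
    ]


for command in swap_commands("/dir2", "subdir1"):
    # shlex.join quotes the glob so a local shell would not expand it.
    print(shlex.join(command))
```

Printing the commands first makes it possible to review them before running anything against the cluster.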

I know it's not a real solution, but I couldn't figure out exactly what was happening, and this workaround did the trick.
