Cannot create staging directory on HDFS in a folder that has permissions
There are a couple of folders in the root directory of HDFS, dir2 among them.
They all have subfolders that contain different Parquet files, which are queried with Hive. I can't load one of the subfolders (for example, table1 inside dir2) even though the permissions look OK to me: I get an EXECUTE error when I try to load it. The code is running in a Jupyter notebook. Users are organized in groups.
I've added rwx permissions on the directory in question for the group with the following command:
hdfs dfs -setfacl -R -m group:user_group:rwx /dir2/subdir2
The error I'm getting looks like this:
Cannot create staging directory 'hdfs://server:8020/dir2/subdir1/table1/.hive-staging_hive_2019-08-01_13-04-22': Permission denied: user=username, access=EXECUTE, inode="/dir2":hdfs:supergroup:drwxrwx---
I've also added read and execute permissions on dir2 for the user group, but the error persists. Judging from the error, it looks as if the default permissions are somehow being applied, and those are ---.
So, to summarize: the group has read and execute privileges on the root dir (dir2), and read, write and execute privileges on the table directories, but the operation keeps failing on the permissions of the root directory.
This is how the permissions look:
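Since HDFS requires EXECUTE on every directory along the path, and the error names inode="/dir2", it may help to dump the ACL of each ancestor in turn and to check which groups the NameNode resolves for the user (group lookup happens on the NameNode, so it can differ from what the notebook host reports). The following is only a sketch: `list_ancestors` is a hypothetical helper, not part of the HDFS CLI, and the commented usage assumes the `hdfs` client is on the PATH.

```shell
#!/usr/bin/env bash
# list_ancestors PATH: print every directory along an absolute path,
# from the top-level one down to the path itself, one per line.
# Hypothetical helper for illustration; not part of the HDFS CLI.
list_ancestors() {
  local rest="${1#/}"   # strip the leading slash
  local current=""
  rest="${rest%/}"      # strip a trailing slash, if any
  while [ -n "$rest" ]; do
    current="$current/${rest%%/*}"   # append the next path component
    printf '%s\n' "$current"
    case "$rest" in
      */*) rest="${rest#*/}" ;;      # drop the component just handled
      *)   rest="" ;;                # that was the last component
    esac
  done
}

# Paths that must all grant EXECUTE before table1 can be written to.
list_ancestors /dir2/subdir1/table1

# On the cluster (assumes the hdfs client is installed and configured):
#   list_ancestors /dir2/subdir1/table1 | while read -r dir; do
#     hdfs dfs -getfacl "$dir"
#   done
#   hdfs groups username   # groups as resolved by the NameNode
```

If `hdfs groups username` does not list user_group, the ACL entries for that group would never apply to the user, whatever the entries say.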
# file: /dir2
# owner: hdfs
# group: supergroup
user::rwx
user:some_group1:r-x
group::---
group:some_group2:rwx
group:user_group:r-x
group:hive:rwx
group:some_group3:r-x
group:some_group4:r-x
mask::rwx
other::---
default:user::rwx
default:user:some_group1:r-x
default:group::---
default:group:some_group2:rwx
default:group:hive:rwx
default:group:some_group3:r-x
default:group:some_group4:r-x
default:mask::rwx
default:other::---
# file: /dir2/subdir1/table1
# owner: some_user
# group: supergroup
user::rwx
user:some_group1:r-x
group::---
group:some_group2:rwx
group:user_group:rwx
group:hive:rwx
group:some_group3:r-x
group:some_group4:rwx
mask::rwx
other::---
default:user::rwx
default:user:some_group1:r-x
default:group::---
default:group:some_group2:rwx
default:group:user_group:rwx
default:group:hive:rwx
default:group:some_group3:r-x
default:group:some_group4:rwx
default:mask::rwx
default:other::---
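To read listings like the ones above, keep in mind that a named group entry only counts after it is filtered by the mask, and that the group bits shown in the error (drwxrwx---) reflect the mask once an ACL is present, not the group:: entry. A rough way to check what a named group effectively gets is sketched below; `effective_group_perms` is a hypothetical helper that only parses text in the getfacl format, and it ignores the default: entries because those only affect newly created children.

```shell
#!/usr/bin/env bash
# effective_group_perms GROUP: read getfacl-style text on stdin and print the
# named group's access entry with the mask applied.
# Hypothetical helper for illustration; it does not talk to HDFS.
effective_group_perms() {
  awk -F: -v grp="$1" '
    $1 == "group" && $2 == grp { entry = $3 }   # access entry for the group
    $1 == "mask"               { mask = $3 }    # access mask
    END {
      if (entry == "") { print "no entry"; exit }
      if (mask == "") mask = "rwx"              # no mask line: nothing filtered
      out = ""
      for (i = 1; i <= 3; i++) {
        e = substr(entry, i, 1)
        m = substr(mask, i, 1)
        # a bit survives only if both the entry and the mask grant it
        out = out ((e != "-" && m != "-") ? e : "-")
      }
      print out
    }'
}

# Entries taken from the /dir2 listing in the question (shortened).
effective_group_perms user_group <<'EOF'
user::rwx
group::---
group:user_group:r-x
mask::rwx
other::---
default:group:some_group2:rwx
EOF
```

On the cluster the same function could be fed with `hdfs dfs -getfacl /dir2 | effective_group_perms user_group`. For the listing above, the entry for user_group on /dir2 passes through the mask unchanged, which suggests the ACL itself may not be the problem and that group membership is worth checking.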
The problem was eventually solved by creating new directories to replace the old ones. The new directories were created with the correct user and credentials. For example, I created subdir1_new, moved the data there, renamed subdir1 to subdir1_old and renamed subdir1_new to subdir1. Only a few folders were affected by this issue, so it didn't take long.
I know it's not a real solution, but I couldn't figure out what exactly was happening, and this workaround did the trick.
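For anyone who wants to repeat the workaround, the steps could look roughly like this. It is a sketch, not the exact commands I ran: `run` and `swap_directory` are hypothetical helpers, the script only prints the commands unless DRY_RUN=0 is set, and it assumes it is executed as the user who should own the new directory.

```shell
#!/usr/bin/env bash
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them against the cluster.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# swap_directory PARENT NAME: recreate PARENT/NAME as a fresh directory,
# keeping the old one as PARENT/NAME_old. Hypothetical helper.
swap_directory() {
  local parent="$1"
  local name="$2"
  # 1. create the replacement directory
  run hdfs dfs -mkdir "$parent/${name}_new" &&
  # 2. move the data; the glob is quoted so that hdfs expands it, not the local shell
  run hdfs dfs -mv "$parent/$name/*" "$parent/${name}_new/" &&
  # 3. keep the old directory under a new name
  run hdfs dfs -mv "$parent/$name" "$parent/${name}_old" &&
  # 4. put the new directory in place of the old one
  run hdfs dfs -mv "$parent/${name}_new" "$parent/$name"
}

swap_directory /dir2 subdir1
```

Checking the result with `hdfs dfs -getfacl -R /dir2/subdir1` before deleting subdir1_old seems prudent.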