
Why is hive attempting to write to /user in hdfs?

I'm working with a simple HiveQL query that looks like this:

SELECT event_type FROM {{table}} where dt=20140103 limit 10;

The {{table}} part is just interpolated with Jinja2 by the runner code I'm using. I'm running the query with the -e flag on the hive command line, using subprocess.Popen from Python.

For some reason, this setup attempts to write into the regular /user directory in HDFS. Running the command with sudo has no effect. The error produced is as follows:

Job Submission failed with exception:
org.apache.hadoop.security.AccessControlException(Permission denied:user=username, access=WRITE, inode="/user":hdfs:hadoop:drwxrwxr-x\n\tat org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:234)

Why would hive attempt to write to /user? Additionally, why would a select statement like this need an output location at all?

Hive is a SQL frontend to MapReduce, so it needs to compile and stage Java code for execution. It's not trying to put output there, but rather the program it will execute. Depending on your version of Hadoop, the staging location is controlled by the variables:

mapreduce.jobtracker.staging.root.dir

And on YARN / Hadoop 2:

yarn.app.mapreduce.am.staging-dir

These are set in mapred-site.xml.
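If editing mapred-site.xml is not an option, one thing worth trying is overriding the property for a single invocation with the hive CLI's --hiveconf flag. The following is a minimal sketch, not a confirmed fix: the helper names and the staging path are hypothetical, the user must already be able to write to that path in HDFS, and the override will be ignored if the cluster marks the property as final.

```python
import subprocess

# Property name taken from the answer above (YARN / Hadoop 2).
STAGING_PROPERTY = "yarn.app.mapreduce.am.staging-dir"


def build_hive_command(query, staging_dir):
    """Return the argv list for `hive -e`, overriding the staging root."""
    return [
        "hive",
        "--hiveconf", f"{STAGING_PROPERTY}={staging_dir}",
        "-e", query,
    ]


def run_hive(query, staging_dir):
    """Run the query and return (exit code, stdout, stderr)."""
    proc = subprocess.Popen(
        build_hive_command(query, staging_dir),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    out, err = proc.communicate()
    return proc.returncode, out, err
```

Passing the command as a list avoids shell quoting problems with the query text. On an MR1 cluster the property to override would instead be mapreduce.jobtracker.staging.root.dir.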

Your runner needs to be authenticated to the cluster and have a writable directory it can use.
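A common way to provide that writable directory is to have the HDFS superuser create a home directory for the user. The sketch below only builds the commands; it assumes the staging root is /user, that the superuser account is named hdfs (as the inode owner in the error message suggests), and that the `hdfs dfs` command is available (older Hadoop 1 installs use `hadoop fs` instead). The function name is hypothetical.

```python
def home_dir_commands(user, superuser="hdfs", root="/user"):
    """Build the commands that create and hand over an HDFS home directory."""
    home = f"{root}/{user}"
    # Run as the HDFS superuser; a plain local `sudo` (root) has no
    # special rights inside HDFS.
    as_superuser = ["sudo", "-u", superuser]
    return [
        as_superuser + ["hdfs", "dfs", "-mkdir", "-p", home],
        as_superuser + ["hdfs", "dfs", "-chown", user, home],
    ]


for argv in home_dir_commands("username"):
    print(" ".join(argv))
# sudo -u hdfs hdfs dfs -mkdir -p /user/username
# sudo -u hdfs hdfs dfs -chown username /user/username
```

This is likely also why running the hive command itself under sudo changed nothing: HDFS checks the identity of the submitting user against its own permissions, not local root privileges.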
